Senior Site Reliability Engineer (SRE)

Be Yourself at Roche

At Roche, we celebrate individuality. You’re welcomed for your unique strengths and perspectives in an inclusive culture that values openness, authenticity, and connection. Here, you’re empowered to grow both personally and professionally—because when you thrive, we all do. Together, we’re working to prevent, treat, and cure diseases while ensuring global access to healthcare for generations to come. At Roche, every voice counts.

The Role

We’re looking for a Site Reliability Engineer (SRE) who is ready to join an on-call rotation and respond to critical issues outside regular business hours. This role is essential to maintaining stability, minimizing disruptions, and keeping systems running smoothly 24/7.

About Roche

Roche is driven by a bold mission: improving lives through science and innovation. We act with integrity, prioritize patient outcomes, and champion equitable access to medical advancements. Each day, we work toward a better tomorrow by embracing scientific excellence and ethical responsibility.

Our commitment to diversity and inclusion means we build teams with varied backgrounds and experiences. This diversity fuels creativity, collaboration, and innovation—all of which help us better serve patients.

We’re expanding our global Site Reliability Engineering team to support both internal and commercial platforms. This team will bring an engineering mindset to solving reliability challenges across our systems.

Your Opportunity: Step Into the Future of IT

As an experienced SRE at Roche, you’ll apply your software engineering skills to build a more resilient, scalable, and high-performing IT infrastructure. This is your chance to shape mission-critical systems that support healthcare delivery worldwide.

Your Key Responsibilities

  • Automation & Efficiency: Create tools and scripts to automate workflows, streamline deployments, and manage complex systems effectively.

  • Collaboration: Work with development teams to design high-efficiency systems that improve performance and resource utilization.

  • Incident Management: Take the lead in managing incidents—detecting issues, responding quickly, and performing root cause analysis to prevent recurrence.

  • Monitoring & Reliability: Continuously improve observability using tools such as Datadog, VictorOps, ELK, Grafana, and Prometheus.

  • Cloud Infrastructure: Manage robust cloud environments on AWS and Azure, optimizing performance and cost-efficiency while applying best practices.

  • Scripting & Troubleshooting: Use Python (or a similar language) for automation, and troubleshoot issues across distributed and cloud systems.

  • Documentation & Process Improvement: Use Jira and ServiceNow to log incidents and capture learnings for ongoing refinement.

  • Cross-Functional Support: Collaborate across engineering, DevOps, operations, and security teams to foster a culture of resilience and continuous improvement.

  • Flexible Availability: Participate in scheduled on-call rotations, including nights and weekends, to ensure round-the-clock system support.

  • Team Development: Contribute to team growth by sharing knowledge and building a supportive, inclusive environment.

What You Bring

  • Education: Bachelor’s in Computer Science, Engineering, or related field. Advanced degrees (MBA, PhD) are a plus.

  • Certifications: AWS and/or Azure certifications preferred.

  • Experience: Approximately 5 years in SRE, DevOps, or IT operations, with deep cloud and infrastructure expertise.

  • Technical Skills:

    • Hands-on experience with AWS, Azure, Kubernetes (EKS/AKS/GKE), and Terraform.

    • Monitoring tools: Datadog, ELK, Grafana, Prometheus (Loki/Mimir/Tempo are a bonus).

    • Scripting in Python or similar.

  • Soft Skills: Strong communicator and proactive collaborator, committed to continuous improvement.

  • Language: Proficient in English, both spoken and written.

  • Diversity Advocate: Open-minded, inclusive, and passionate about working in diverse teams.

Why Join Roche?

By joining Roche, you’ll directly influence the reliability and performance of systems that support millions globally. You’ll gain professional development opportunities, work with forward-thinking experts, and help redefine what’s possible in IT infrastructure.

About Us

Roche is a global healthcare leader with over 100,000 employees worldwide. Each year, our medicines help more than 26 million people, and over 30 billion diagnostic tests are conducted using our technologies. We encourage bold ideas, creative exploration, and high ambition as we work together toward a healthier future.

Join us in building a more resilient, inclusive, and innovative world of healthcare.

Roche is proud to be an Equal Opportunity Employer.

Copyright © 2025 SRE-Jobs.com. All Rights Reserved.