Site Reliability Engineer

About the Company

Nasdaq is a dynamic and entrepreneurial organization, fostering a culture that encourages initiative, innovation, and intelligent risk-taking. Committed to diversity and inclusion, Nasdaq supports employees in bringing their authentic selves to work while building a connected, collaborative, and empowering environment. Recognized globally for shaping financial markets, Nasdaq offers an opportunity to impact the stability and scalability of systems that drive the global economy.

About the Role

The Site Reliability Engineer in Denver, CO will play a key role in maintaining and improving the reliability, scalability, and performance of Nasdaq’s critical systems. This role focuses on automating processes, optimizing cloud infrastructure, and supporting high-performance applications used by financial institutions worldwide. It offers a chance to work at the intersection of software development, operations, and cloud architecture in a fast-paced, mission-driven environment.

Responsibilities

  • Maintain uptime and performance of critical applications and infrastructure in low-latency, high-pressure environments.
  • Troubleshoot application, hardware, and software issues across AWS cloud and on-premises environments.
  • Develop and manage Linux systems, focusing on automation for operational tasks.
  • Design, deploy, and maintain scalable, highly available infrastructure using AWS and Infrastructure as Code (IaC).
  • Optimize AWS-based solutions across services including EKS, MSK, Redshift, RDS, EC2, Route 53, SES, SSM, KMS, and IAM.
  • Implement and maintain configuration management systems such as Puppet, Ansible, or Chef.
  • Build and maintain CI/CD pipelines using GitLab or similar tools.
  • Monitor applications and infrastructure using alerting and logging tools (Datadog, Splunk, Grafana, CloudWatch).
  • Maintain technical documentation and participate in the on-call rotation as required.
  • Communicate effectively with management regarding project status, problem resolution, and escalation needs.

Required Skills

  • Bachelor’s degree in Computer Science or related field.
  • 5–8 years of experience in site reliability, systems engineering, or a similar role.
  • Strong expertise in AWS cloud infrastructure and services.
  • Proficiency in containerization and orchestration technologies (Docker, EKS, ECS, Fargate, Argo).
  • Experience with configuration management tools and Infrastructure as Code (Terraform).
  • Hands-on experience with CI/CD pipelines, GitLab/GitHub, and deployment automation.
  • Advanced Linux administration skills (5–10+ years).
  • Expertise in Java and Bash (5–7+ years) with additional programming in Python, Go, JavaScript, or Ruby.
  • Solid understanding of cloud architecture, networking, and security best practices.
  • Strong problem-solving skills for complex system issues.
  • Experience with relational and NoSQL databases (Postgres, MySQL, DynamoDB, InfluxDB).

Preferred Qualifications

  • Experience with monitoring and logging platforms (Datadog, Splunk, Grafana, CloudWatch).
  • Strong understanding of automation and operational best practices.
  • Ability to communicate technical concepts clearly to diverse stakeholders.
  • Creative approach to problem-solving and delivering resilient solutions.
  • Passion for innovation, financial technology, and enterprise-grade infrastructure.

Additional Information

  • Location: Denver, Colorado, with a hybrid work model (minimum 3 days/week in office).
  • Benefits include 401(k) with 6% match, Employee Stock Purchase Program, student loan repayment, medical/dental/vision coverage, paid time off, parental leave, and wellness support.
  • Access to professional development, training programs, and career growth opportunities.
  • Inclusive workplace with zero tolerance for discrimination or harassment.

Copyright © 2025 SRE-Jobs.com. All Rights Reserved.