
We’re seeking a highly skilled Site Reliability Engineer (SRE) to join our engineering team and help ensure the reliability, scalability, and performance of our systems. As an SRE, you’ll blend software engineering with systems engineering to build and maintain resilient infrastructure, automate operations, and drive continuous improvement across our platform.
You’ll be instrumental in designing fault-tolerant systems, implementing proactive monitoring, and streamlining incident response. This role is ideal for someone who thrives in high-ownership environments and enjoys solving complex challenges with code.
Key Responsibilities
- Design, implement, and maintain scalable, reliable infrastructure across our environments
- Develop automation for deployment, monitoring, and incident response using infrastructure-as-code and scripting tools
- Collaborate with development and operations teams to define SLAs/SLOs and improve system performance
- Build observability into systems through metrics, logging, and tracing
- Lead root cause analysis and post-mortems for production incidents
- Optimize alerting workflows to reduce noise and improve signal quality
- Champion reliability best practices across engineering teams
- Contribute to capacity planning, disaster recovery, and performance tuning
- Maintain documentation and runbooks for operational processes and incident response
- Support evolving systems by performing manual operational tasks that enable critical features not yet automated or fully built
- Identify opportunities to reduce manual toil through automation, tooling, and process improvement
- Collaborate with engineering teams to translate recurring manual workflows into scalable, reliable solutions
- Document interim procedures and ensure visibility into temporary workarounds while long-term fixes are in development
Required Skills & Experience
- Experience with cloud platforms (e.g., Azure) and containerization (e.g., Docker)
- Proficiency with monitoring and observability tools (e.g., Azure Monitor)
- Familiarity with CI/CD pipelines and automation tools (e.g., GitHub Actions, Terraform)
- Solid understanding of networking, security, and system architecture
- Experience with incident management and on-call rotations
- Excellent problem-solving and communication skills
- Working knowledge of Jira and Confluence for issue tracking, documentation, and cross-team collaboration
What We Offer
- 100% remote work environment with flexible scheduling
- Competitive compensation and benefits package
- Mission-driven culture focused on empowering organizations through compliance excellence
- Professional development opportunities in a growing company at the intersection of compliance and technology
- Be part of a private equity-backed, high-growth company with opportunities for rapid advancement
- Competitive base salary + uncapped commission
- Full benefits package: health, dental, vision, 401(k), PTO, and more
- World-class sales training, mentorship, and technology
- Collaborative, mission-driven culture that’s serious about compliance, and about winning