Henry Meds is looking for a Lead Site Reliability Engineer to ensure the reliability, scalability, and performance of complex systems and cloud infrastructure. This role involves architecting and creating observability and monitoring systems, disaster recovery planning, and collaborating with engineering and security teams.
Skills
Experience in GCP
Experience managing identity and access management
Experience leading incident management processes
Experience setting up availability expectations
Experience managing cloud operations
Experience defining SLIs, SLOs, and leading on-call support
Experience in chaos engineering and resilience testing
Experience with Infrastructure as Code using Terraform
Responsibilities
Architect and create observability and monitoring systems
Create a disaster recovery plan and facilitate its testing
Oversee design and development of operational infrastructure
Assist in hiring and embedding SRE operations
Provide guidance and mentorship to SRE teams
Lead and prioritize projects, create roadmaps, and implement plans
Partner with product and engineering stakeholders to deliver solutions
Benefits
Platinum PPO Healthcare + Vision & Dental
401(k) with matching contributions
Unlimited PTO
Fully remote position with occasional travel
Impactful work helping thousands of people daily