
Site Reliability Engineering Manager

Zoox is looking for a Site Reliability Engineering Manager who will be responsible for leading and growing Zoox's Core Site Reliability Engineering team, ensuring the reliability, scalability, and performance of the critical infrastructure, cloud platform, and core services that power company-wide software engineering operations. Zoox is a robotics company, and our ethos of automation extends throughout the infrastructure components we build.


In this role, you will:
  • Lead and mentor a team of Site Reliability Engineers, fostering their growth and technical development
  • Establish and drive SLOs, monitoring strategies, and incident management frameworks across core infrastructure
  • Partner with engineering teams to architect scalable solutions and improve system reliability
  • Champion infrastructure automation and build robust observability solutions
  • Oversee incident response and build resilient on-call processes to support business-critical services


Qualifications
  • 5+ years of experience in SRE, DevOps, or similar technical roles
  • 3+ years of people management experience
  • Strong background in distributed systems and cloud infrastructure
  • Experience with modern observability tools and practices
  • Proven track record of improving system reliability and performance


Preferred Qualifications
  • Knowledge of Kubernetes, AWS, and infrastructure as code
  • Track record of building and scaling high-performing technical teams
  • Experience with real-time systems and safety-critical applications


Compensation

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary range for this position is $234,000 to $342,000. A sign-on bonus may be offered as part of the compensation package. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

 

Zoox also offers a comprehensive package of benefits including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

Average salary estimate

$288,000 per year (est.)
Min: $234,000
Max: $342,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Site Reliability Engineering Manager, Zoox

Zoox is on the lookout for an enthusiastic and highly skilled Site Reliability Engineering Manager to lead our Core Site Reliability Engineering team in Foster City, CA. This is an exciting opportunity to be at the forefront of developing the infrastructure of a pioneering robotics company. In this role, you’ll not only manage but also mentor a talented group of Site Reliability Engineers, focusing on their professional growth and technical development. You’ll establish and oversee key metrics like Service Level Objectives (SLOs) and monitoring strategies, and collaborate with cross-functional engineering teams to architect scalable solutions that enhance system reliability. Your passion for automation will shine as you champion infrastructure automation and build observability solutions that are robust and insightful. You’ll be at the helm of incident management, ensuring we have resilient on-call processes to keep vital business services running smoothly. If you have over 5 years of experience in Site Reliability Engineering or similar roles, along with a strong technical background and a passion for mentoring others, then this is the role for you. Join Zoox, where the ethos of automation and innovation drives our mission to reshape the future of transportation. We can't wait to see what you bring to the team.

Frequently Asked Questions (FAQs) for Site Reliability Engineering Manager Role at Zoox
What responsibilities does a Site Reliability Engineering Manager have at Zoox?

As a Site Reliability Engineering Manager at Zoox, you will lead and mentor the Core Site Reliability Engineering team, establish SLOs, and oversee incident management frameworks. Additionally, you will collaborate with engineering teams to create scalable solutions and improve system reliability, champion infrastructure automation, and ensure a robust observability strategy is in place.

What qualifications are required for the Site Reliability Engineering Manager position at Zoox?

To qualify for the Site Reliability Engineering Manager position at Zoox, candidates should have 5+ years of experience in Site Reliability Engineering, DevOps, or similar roles, along with 3+ years of people management experience. A strong background in distributed systems, cloud infrastructure, and experience with modern observability tools is also essential.

How does Zoox support the professional growth of Site Reliability Engineering Managers?

At Zoox, we are committed to the professional growth of our Site Reliability Engineering Managers. You'll have the opportunity to mentor and shape the careers of your team members, as well as access resources for technical development. We believe that investing in our team's growth directly contributes to the overall success of our infrastructure and services.

What compensation can a Site Reliability Engineering Manager expect at Zoox?

The compensation for a Site Reliability Engineering Manager at Zoox consists of salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary range for this position is between $234,000 and $342,000, with adjustments based on geographical location and experience. A sign-on bonus may also be offered as part of the comprehensive compensation package.

What technical skills are preferred for the Site Reliability Engineering Manager role at Zoox?

Ideal candidates for the Site Reliability Engineering Manager role at Zoox should have knowledge of Kubernetes, AWS, and infrastructure as code practices. Experience with real-time systems and safety-critical applications is preferred, alongside a proven track record of building and scaling high-performing technical teams.

Common Interview Questions for Site Reliability Engineering Manager
What methods do you use to measure system reliability in your role as Site Reliability Engineering Manager?

In answering this question, it's essential to discuss specific metrics you monitor, such as uptime, latency, and SLOs. Highlight your experience with monitoring tools, your approach to incident management, and how you utilize these metrics to improve system reliability and performance.

Can you describe a time when you improved system performance in your previous SRE role?

Here, you should provide a detailed example of a specific project. Discuss the problem you identified, the steps you took to improve performance, the technologies you used, and the measurable results. Show how your actions directly contributed to improved efficiency or reliability.

How do you handle team conflicts, especially in a technical environment?

In your response, discuss your conflict resolution strategies. Highlight your approach to listening to team members, facilitating constructive discussions, and ensuring that team goals are aligned. Stress your commitment to fostering a positive team culture and supporting each member's professional development.

What is your experience with infrastructure automation?

To answer this, share your specific experiences with automation tools and practices, emphasizing how these have improved efficiency and reliability in your past roles. Discuss any frameworks or methodologies you've implemented for automation, and their impact on system performance.

Describe your experience with cloud infrastructures such as AWS.

Outline your level of expertise with cloud platforms, specifically AWS. Discuss any significant projects or responsibilities you’ve had that involved deploying or managing cloud resources, focusing on best practices and lessons learned during those experiences.

How do you ensure effective communication with engineering teams?

Success in an SRE role requires excellent communication. Discuss your strategies for maintaining open lines of communication with engineering teams, such as regular sync meetings, collaborative incident reviews, and using shared documentation to keep everyone informed.

What role does observability play in your SRE strategy?

Discuss your understanding of observability and its importance in maintaining system reliability. Share the tools and techniques you employ to enhance observability, and how you utilize this data to address performance issues proactively.

Can you give an example of how you have established SLOs in the past?

In your response, describe the process you followed to define and implement SLOs. Highlight the metrics you selected, the stakeholders you engaged, and how you used these goals to drive accountability and improve service reliability.

What challenges have you faced in incident management, and how did you address them?

This is an opportunity to demonstrate your problem-solving and leadership abilities. Discuss a specific incident, outline the challenges you faced in managing it, and describe how your strategies for incident response and communication helped mitigate the situation successfully.

What do you feel is the key to building a high-performing SRE team?

To answer this question effectively, discuss your views on team culture, encouraging continuous learning, mentorship, and collaboration among team members. Share specific practices you value that foster a high-performing environment, such as regular training sessions and shared goals.


Zoox was founded to make personal transportation safer, cleaner, and more enjoyable—for everyone. To achieve that goal, the team created a whole new form of transportation. Zoox will provide mobility-as-a-service in dense urban environments.

Employment type: Full-time, on-site
Date posted: December 14, 2024
