
Director of Systems Reliability & Field Resilience

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses.

The Serve fleet has been delighting merchants, customers, and pedestrians along the way in Los Angeles while doing commercial deliveries. We’re looking for talented individuals who will grow robotic deliveries from surprising novelty to efficient ubiquity.

Who We Are

We are tech industry veterans in software, hardware, and design who are pooling our skills to build the future we want to live in. We are solving real-world problems leveraging robotics, machine learning and computer vision, among other disciplines, with a mindful eye towards the end-to-end user experience. Our team is agile, diverse, and driven. We believe that the best way to solve complicated dynamic problems is collaboratively and respectfully.

Serve Robotics is seeking a Director of Systems Reliability & Field Resilience, responsible for continuously improving end-to-end operational reliability across our robotic delivery operations infrastructure. In this role, you and your team will proactively identify, triage, and resolve complex, cross-domain issues that impact delivery service quality/efficiency, and will work cross-functionally to build monitoring, alerting, automation and resiliency into our platform.

In this role you will provide leadership and direction to your team while also contributing directly to defining, building, and deploying solutions. You will work closely with engineering, product, and operations to prioritize the work, and you’ll hire, allocate resources, and support your team to deliver capabilities from concept to production.

The Serve Robotics delivery platform spans a wide range of technologies: cloud and networking infrastructure that powers delivery matching, front-end solutions for robot fleet supervisors and field agents, and on-robot embedded and autonomous systems. All of these must work seamlessly together to support our daily delivery growth and economics. You will lead a team of experts with backgrounds in SRE, DevOps, and cloud infrastructure, and partner across the entire engineering organization to ensure a robust and resilient delivery infrastructure.

The ideal candidate will have a strong track record of hands-on leadership of small, highly technical software engineering teams. You will have experience hiring, mentoring, and coaching senior-level engineers and building a high-performance, collaborative team. You are a highly capable technical generalist who is comfortable working across all components of a complex system and partnering with domain experts and functional teams to identify issues, perform detailed root cause analysis, and develop strategies for short- and long-term solutions. Delivering these solutions will often require highly technical collaboration between your team and other engineering teams.

Responsibilities

  • Full-Stack Troubleshooting & System Deep Dives: Become the go-to expert for identifying root causes of service issues—whether they're in cloud APIs, robot hardware, network layers, or operational workflows—and coordinate with the respective owning teams to resolve and prevent them.

  • Build and Lead a Global Systems Reliability Team: Hire, mentor, and grow a multidisciplinary team of high-context generalists who can investigate system-wide failures, document their learnings, and drive improvements across organizational boundaries.

  • Own the On-Call & Incident Management Process: Take over and evolve the company's on-call process into a mature, well-documented, and inspectable system. Define SLAs, escalation policies, and a best-in-class paging infrastructure that aligns with our service goals.

  • Establish and Maintain a Knowledge Base: Ensure on-call responders have access to actionable documentation, playbooks, and troubleshooting guides. Make knowledge capture a core part of incident response.

  • Reliability Analytics & Intuition Building: Use incident and operational data to build a deep intuition about where our systems are most fragile. Create predictive frameworks and reliability metrics that help the organization stay ahead of failures.

  • Service Health & Performance Dashboards: Build and maintain dashboards that monitor the health of end-to-end services—not just software, but everything that supports customer delivery. Highlight systemic issues, performance regressions, and areas needing investment.

  • Cross-Functional Collaboration: Work closely with engineering, infrastructure, hardware, field ops, customer support, and leadership to align on reliability priorities and drive systemic improvement efforts.

Qualifications

  • 8+ years of experience in a technical engineering or operations role, with at least 3 years in a leadership position. Background in both software engineering and IT/DevOps a plus.

  • Deep experience with complex distributed systems, infrastructure, and system debugging, triage and root cause analysis. Familiarity with observability tools like Datadog, Grafana, Prometheus, ELK, etc. a plus.

  • Strong understanding of hardware/software integration, particularly in cloud-connected device infrastructure including robotics, consumer electronics and embedded systems

  • Proven success leading incident response or SRE-style functions, and managing on-call teams

  • Ability to drive organization-wide improvements by building trusted cross-functional relationships and technical collaboration across teams

  • Strong data and dashboarding skills; can translate operational data into clear insights and action plans

  • Excellent communication and organizational skills; comfortable writing high-quality docs and leading blameless postmortems

What Makes You Stand Out

  • Relentless Drive for Quality: You set high standards for code and system design, continually raising the bar for your team and the organization.

  • Strong Cross-Functional Communicator: You effectively collaborate with product, operations, and executive teams to ensure technology and business goals are aligned.

  • Strategic Vision Paired with Execution: You think beyond immediate tasks to chart a roadmap that ensures platform longevity and innovation. You excel at driving changes that boost overall team cohesion and performance.

  • Passion for Innovation: You bring curiosity and enthusiasm for solving complex challenges in delivery and fleet management, keeping up with the latest trends and technologies in the space.

Average salary estimate

$175,000 per year (est.); range $150,000 to $200,000



Why deliver a 2-pound burrito in a 2-ton car? Serve is the future of sustainable, self-driving delivery. Our zero-emissions rovers are designed to serve people in public spaces, starting with food delivery. We partner with platforms and merchants ...

EMPLOYMENT TYPE
Full-time, onsite
DATE POSTED
May 15, 2025
