Job Summary: As the Senior Director of Site Reliability Engineering (SRE), you will lead a team of SREs to ensure the highest level of performance and reliability of our services. You will be responsible for the end-to-end availability and performance of mission-critical services and for building automation that prevents problems from recurring. The role requires a strategic leader who can create a vision for the SRE function and drive a culture of ‘automation first’ to improve the scalability and stability of our systems.
Essential Functions:
Lead and scale the SRE team, setting objectives and key results that align with the company’s strategic goals.
Develop and implement SRE policies, standards, and best practices for enterprise-wide systems.
Define standards for building reliable applications that are highly available and resilient.
Drive the adoption of a DevSecOps culture, fostering collaboration between development and operations teams.
Oversee the design and implementation of solutions for system monitoring, logging, alerting, and incident response.
Collaborate with product development teams to ensure reliability and scalability are considered at the design phase.
Manage on-call rotations, incident management processes, and post-mortem analyses to ensure continuous improvement.
Define Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets for all critical services.
Work closely with the security team to ensure compliance with industry standards and regulatory requirements.
Lead initiatives to improve CI/CD pipelines and automate infrastructure provisioning and deployment.
Provide technical leadership and mentorship to team members, encouraging professional growth and technical excellence.
This is a hybrid position. Hybrid employees can alternate between working remotely and working from the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.
Visa is not offering relocation assistance for this role.
As the Senior Director of Site Reliability Engineering at our innovative company based in Foster City, you'll be stepping into a pivotal leadership role. Your primary mission? To ensure our services perform at the highest level while maintaining exceptional reliability. Imagine leading a talented team of Site Reliability Engineers dedicated to making sure our mission-critical services run seamlessly. Your knack for developing a strategic vision will help shape the SRE function within our organization, instilling a culture of 'automation first' that focuses on scalability and stability. From leading the SRE team to develop impactful objectives aligning with our strategic goals to defining standards for reliable applications, this role is all about driving excellence. You'll collaborate closely with product development teams to integrate reliability into the design phase, oversee proactive incident management processes, and champion best practices for system monitoring and alerting. This hybrid position allows flexibility, enabling you to balance remote work with essential office interactions. If you're ready to make a significant impact while fostering collaboration across development and operations teams, we want to hear from you!
Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...