Job Summary: As the Senior Director of Site Reliability Engineering (SRE), you will lead a team of SREs to ensure the highest level of performance and reliability of our services. You will be responsible for the end-to-end availability and performance of mission-critical services and building automation to prevent problem recurrence. The role requires a strategic leader who can create a vision for the SRE function and drive a culture of ‘automation first’ to improve the scalability and stability of our systems.
Essential Functions:
Lead and scale the SRE team, setting objectives and key results that align with the company’s strategic goals.
Develop and implement SRE policies, standards, and best practices for enterprise-wide systems.
Define standards for building reliable applications that are highly available and resilient.
Drive the adoption of a DevSecOps culture, fostering collaboration among development, security, and operations teams.
Oversee the design and implementation of solutions for system monitoring, logging, alerting, and incident response.
Collaborate with product development teams to ensure reliability and scalability are considered at the design phase.
Manage on-call rotations, incident management processes, and post-mortem analyses to ensure continuous improvement.
Define Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets for all critical services.
Work closely with the security team to ensure compliance with industry standards and regulatory requirements.
Lead initiatives to improve CI/CD pipelines and automate infrastructure provisioning and deployment.
Provide technical leadership and mentorship to team members, encouraging professional growth and technical excellence.
This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.
Visa is not offering relocation assistance for this role.
As the Senior Director of Site Reliability Engineering at our company in Foster City, you will play a pivotal role in shaping the future of our services. You’ll lead a dynamic team of Site Reliability Engineers, ensuring not just availability but exceptional performance and reliability across mission-critical systems. Your strategic vision will drive a culture centered on automation, making our infrastructure more scalable and stable. You'll set objectives and key results that align with the company’s goals while developing and implementing SRE policies and best practices across our systems. Your collaborative spirit will see you engaging with product development teams to incorporate reliability and scalability from the outset of projects. Monitoring, incident management, and continuous improvement will be part of your daily routine, as you define Service Level Objectives and maintain our DevSecOps culture. This hybrid position allows for flexibility, enabling you to balance working from home with in-office collaboration. If you’re passionate about building efficient systems and leading talented teams, this is the perfect opportunity to elevate our SRE function in an impactful way!
Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...