Job details

Lead Site Reliability Engineer

The Lead Site Reliability Engineer (SRE) is a critical part of our Visa Cloud platform strategy. In this role, you will focus on ensuring that Visa’s development platform and processes enable our software engineers to spend more time on innovation than on infrastructure. You will drive the adoption of observability best practices and implement automation for resolving recurring issues. You must be comfortable working with software engineering teams and supporting their demanding needs to ensure the security, availability, and performance of the platform. You must be capable of triaging issues on the front line as well as framing strategic initiatives from leadership. Being hands-on at the keyboard is a must for this role, with a focus on developing reliability engineering for the Visa Cloud Platform.

Essential Functions:

You will guide the instrumentation of monitoring for the Visa Cloud Platform (IaaS/PaaS/Container as a service)
You will ensure the platform target SLAs are met and implement appropriate SLIs for supporting services
You will work with developers during service transition, evaluating reliability and operability of the applications and ensuring adequate monitoring, alerting and observability
You will partner with peers within Operations & Infrastructure supporting ongoing maintenance and enhancement of the platform
To be successful in this role, you must focus on setting standards for automating routine tasks and workflows in support of the larger DevEx SRE team
The right candidate must be capable of supporting multiple internal stakeholders with a variety of technical challenges. Excelling in this role requires the ability to analyze and discern patterns in the myriad of issues that arise and propose solutions to these problems.
The Visa Cloud SRE team follows a 24/7/365 operating model, so you will be required to work shifts or take part in on-call support, including weekends.

This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager.

Average salary estimate

$135,000 / YEARLY (est.)

min: $120,000

max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Site Reliability Engineer, Visa

As the Lead Site Reliability Engineer at Visa in Ashburn, you’re stepping into a pivotal role where your expertise will drive the success of our cloud platform strategy. You’ll be at the forefront, ensuring that our software engineers can prioritize innovation while you handle the backend intricacies of infrastructure. Collaborating closely with engineering teams, you will foster a culture of observability and automation, making it easier to tackle recurring issues. Your hands-on approach to reliability engineering will shine as you work to improve the performance, security, and availability of the Visa Cloud Platform. Your responsibilities will include guiding the instrumentation of monitoring systems for our IaaS/PaaS/Container services, meeting crucial SLA targets, and aiding developers in transitioning services effectively. Emphasizing optimal workflows, you’ll streamline routine tasks, ensuring that the larger DevEx SRE team runs smoothly. Given the dynamic nature of our operations, your ability to analyze intricate issues and devise robust solutions will be key. The Visa Cloud SRE team operates around the clock, so flexibility is essential, including shifts or on-call support. This hybrid position will require collaboration in the office as determined by your hiring manager, allowing for a balance of remote and on-site work to enhance our team synergy and productivity.

Frequently Asked Questions (FAQs) for Lead Site Reliability Engineer Role at Visa

What are the primary responsibilities of a Lead Site Reliability Engineer at Visa?

At Visa, a Lead Site Reliability Engineer holds the vital responsibility of ensuring that our development platform enhances software engineers' productivity by minimizing infrastructure concerns. This includes driving the adoption of best practices for observability, addressing recurring issues through automation, and ensuring that platform SLAs are consistently met.