
Staff Site Reliability Engineer - Cloud Engineering

Visa’s Technology Organization is a community of problem solvers and innovators reshaping the future of commerce. We operate the world’s most sophisticated processing networks capable of handling more than 65k secure transactions a second across 80M merchants, 15k Financial Institutions, and billions of everyday people. While working with us you’ll get to work on complex distributed systems and solve massive scale problems centered on new payment flows, business and data solutions, cyber security, and B2C platforms.

 

The Opportunity:

As a Staff Site Reliability Engineer in Product Reliability Engineering, you will be part of a team that maintains and supports Visa's Data Platform and provides support for key cloud-based Big Data and Kafka platforms. You will be responsible for driving innovation for our partners and clients, within Visa and globally. You will work on open-source Big Data and Kafka clusters with a focus on cloud, ensuring their availability, performance, and reliability and improving operational efficiency.

 

The Work itself:

Essential Functions:

· Design, build and manage Big Data and Kafka infrastructure on AWS, GCP and Azure.

· Manage and optimize Apache Big Data and Kafka clusters for high performance, reliability, and scalability.

· Develop tools and processes to monitor and analyze system performance and to identify potential issues.

· Collaborate with other teams to design and implement solutions to improve the reliability and efficiency of the Big Data cloud platforms.

· Ensure security and compliance of the platforms within organizational guidelines.

· Other responsibilities include effective root cause analysis of major production incidents and the development of learning documentation. The person will identify and implement high-availability solutions for services with a single point of failure.

· The role involves planning and performing capacity expansions and upgrades in a timely manner to avoid any scaling issues and bugs. This includes automating repetitive tasks to reduce manual effort and prevent human errors.

· The successful candidate will tune alerting and set up observability to proactively identify issues and performance problems. They will also work closely with Level 3 teams in reviewing new use cases and cluster hardening techniques to build robust and reliable platforms.

· The role involves creating standard operating procedure documents and guidelines on effectively managing and utilizing the platforms. The person will leverage DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.

· The individual will ensure that the platforms can effectively meet performance and service level agreement requirements. They will also perform security remediation, automation, and self-healing as required.

· The individual will concentrate on developing automations and reports to minimize manual effort. This can be achieved through various automation tools such as Shell scripting, Ansible, or Python scripting, or by using any other programming language.

 

The Skills You Bring:

· Energy and Experience: A growth mindset that is curious and passionate about technologies and enjoys challenging projects on a global scale.

· Challenge the Status Quo: Comfort in pushing the boundaries, “hacking” beyond traditional solutions.

· Language Expertise: Expertise in one or more general development languages (e.g., Java, Python).

· Builder: Experience building and deploying distributed systems.

· Learner: Constant drive to learn new technologies such as cloud technologies, Kubernetes, MLOps.

· Partnership: Experience collaborating with Engineering, Application, and other functional teams.

 

We do not expect that any single candidate would fulfill all these characteristics. For instance, we have awesome team members who are really focused on building scalable systems but didn’t work with payments technology or web applications before joining Visa.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate: $135,000 / year (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Staff Site Reliability Engineer - Cloud Engineering, Visa

Join Visa, where innovation and technology meet to shape the future of commerce! As a Staff Site Reliability Engineer in Cloud Engineering, you'll dive into the exciting world of big data and cloud computing right here in Austin. In this role, you’ll be at the forefront of maintaining and enhancing Visa’s Data Platform, focusing on cutting-edge technologies like Apache Big Data and Kafka on cloud platforms like AWS, GCP, and Azure. Your mission? Ensure these systems are not only reliable and efficient but also secure, all while collaborating with a dynamic team that strives for excellence. You'll design, build, and optimize robust infrastructure, tackle complex challenges, and develop tools to monitor system performance. Moreover, root cause analysis, automating repetitive tasks, and enhancing operational efficiency will become second nature to you. At Visa, everyone’s input matters, so you'll regularly partner with Engineering, Application, and other functional teams to craft innovative, scalable solutions. With your passion for technology and your growth mindset, you’ll be challenged to think outside the box and push the envelope on what’s possible. In a hybrid work environment, you'll enjoy flexibility while still engaging with your teammates in the office a few days a week. Ready to make a global impact within a diverse and collaborative culture? Join us at Visa and help redefine the future!

Frequently Asked Questions (FAQs) for Staff Site Reliability Engineer - Cloud Engineering Role at Visa
What are the main responsibilities of a Staff Site Reliability Engineer at Visa?

As a Staff Site Reliability Engineer at Visa, you will be responsible for designing, building, and managing Big Data and Kafka infrastructure on leading cloud platforms like AWS, GCP, and Azure. You’ll need to optimize Apache Big Data and Kafka clusters for high performance, reliability, and scalability, while also developing monitoring tools to analyze system performance. Another key aspect of your role involves collaborating with various teams to enhance the reliability and efficiency of Visa’s cloud-based platforms.

What skills are required for the Staff Site Reliability Engineer position at Visa?

To succeed as a Staff Site Reliability Engineer at Visa, strong expertise in one or more general programming languages like Java or Python is essential. You should also have experience in building and deploying distributed systems. A growth mindset and eagerness to learn new technologies such as Kubernetes and MLOps can greatly benefit you in this role. Collaboration experience and a proactive approach to problem-solving are also crucial for this position.

Is prior experience in payments technology necessary for the Staff Site Reliability Engineer role at Visa?

No, Visa recognizes that candidates come from diverse backgrounds. While experience in building scalable systems is important, you don’t need prior payments technology experience to qualify for the Staff Site Reliability Engineer role. We value your unique perspective and skills, especially if you have a passion for tackling challenging projects on a global scale.

What does a typical day look like for a Staff Site Reliability Engineer at Visa?

A typical day for a Staff Site Reliability Engineer at Visa involves a blend of hands-on problem-solving and collaboration. You’ll spend time optimizing Apache Big Data and Kafka clusters, conducting root cause analyses for production incidents, and working on automating routine tasks. Additionally, you'll conduct performance monitoring and engage in discussions with other teams to ensure efficiency and reliability in Visa's cloud platforms.

What kind of work environment can Staff Site Reliability Engineers expect at Visa?

At Visa, Staff Site Reliability Engineers can expect a hybrid work environment that promotes flexibility and collaboration. You will be expected to work in the office 2-3 days a week, which helps foster teamwork and innovation. This setup allows you to connect with your colleagues while also enjoying the benefits of remote work, creating a balanced and supportive atmosphere.

Common Interview Questions for Staff Site Reliability Engineer - Cloud Engineering
How do you approach the monitoring and optimization of Apache Big Data clusters?

To monitor and optimize Apache Big Data clusters effectively, use monitoring tools tailored to your system's architecture. Make sure you can analyze performance metrics, identify bottlenecks, and use alerting systems for proactive issue detection. Understanding scalability challenges and having procedures for load testing also make large clusters easier to manage.
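
When answering, it can help to show that you understand how an alert is derived from a metric. The following is a minimal Python sketch, not a production monitor: the function name `rolling_alerts`, the window size, the threshold, and the sample consumer-lag readings are all hypothetical and chosen only for illustration. It flags the points where the rolling average of a metric stays above a threshold, which avoids alerting on a single noisy sample.

```python
from collections import deque
from statistics import mean


def rolling_alerts(samples, window=3, threshold=1000.0):
    """Return the indices where the mean of the last `window` samples exceeds `threshold`."""
    recent = deque(maxlen=window)  # holds only the most recent `window` samples
    alerts = []
    for i, value in enumerate(samples):
        recent.append(value)
        # Only evaluate once a full window of samples is available.
        if len(recent) == window and mean(recent) > threshold:
            alerts.append(i)
    return alerts


# Hypothetical consumer-lag readings (messages behind), one per scrape interval.
lag = [200, 300, 250, 1500, 1800, 2100, 400, 300]
print(rolling_alerts(lag))  # [4, 5, 6]
```

In this sketch the single spike at index 3 does not fire on its own; the alert starts only once the average over three intervals crosses the threshold.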

What experience do you have with cloud technologies, particularly AWS, GCP, or Azure?

Discuss your hands-on experience with cloud platforms, including past projects where you deployed or optimized services. Highlight your understanding of each platform's core services, such as compute and storage, and explain how you integrated various tools for seamless operation while maintaining security and compliance.

Can you explain a time when you conducted root cause analysis after a major production incident?

Describe a specific incident where your methodical approach involved gathering data, analyzing system logs, and reviewing events leading up to the incident. Explain how you utilized collaboration and documentation to ensure clear communication and shared learning within the team to avoid similar issues in the future.

How do you ensure the security and compliance of cloud-based platforms?

Emphasize your knowledge of industry best practices around cloud security. Discuss your experience in implementing security controls, compliance frameworks, and conducting vulnerability assessments while ensuring data protection standards are met for cloud platforms.

What strategies do you use for automating repetitive tasks in your engineering work?

Talk about your experience with automation tools like Ansible, Shell scripting, or Python for reducing manual operations. Provide examples demonstrating how these strategies improved efficiency and reduced error rates, while supporting continuous deployment practices.

How do you prioritize tasks when managing system performance and reliability?

Describe your task prioritization methodology, such as assessing impact and urgency. Stress the importance of a calm, analytical approach to evaluating system performance metrics and incident reports to determine which issues require immediate attention versus those that can be scheduled.
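
One way to make an impact-and-urgency methodology concrete is a simple scoring pass. The Python sketch below is only an illustration under stated assumptions: the `Issue` fields, the 1 to 5 scales, the `immediate_score` cutoff, and the example issues are hypothetical and do not come from the job description.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str
    impact: int   # assumed scale: 1 (minor) to 5 (customer-facing outage)
    urgency: int  # assumed scale: 1 (can wait) to 5 (act now)


def triage(issues, immediate_score=15):
    """Split issues into (handle now, schedule later), highest score first."""
    ranked = sorted(issues, key=lambda i: i.impact * i.urgency, reverse=True)
    now = [i.name for i in ranked if i.impact * i.urgency >= immediate_score]
    later = [i.name for i in ranked if i.impact * i.urgency < immediate_score]
    return now, later


# Hypothetical backlog for a Kafka platform team.
backlog = [
    Issue("broker disk at 95%", impact=4, urgency=5),
    Issue("stale dashboard", impact=1, urgency=2),
    Issue("under-replicated partitions", impact=5, urgency=5),
    Issue("certificate expires in 30 days", impact=4, urgency=2),
]
now, later = triage(backlog)
print(now)    # ['under-replicated partitions', 'broker disk at 95%']
print(later)  # ['certificate expires in 30 days', 'stale dashboard']
```

Multiplying the two scores is one design choice among several; a weighted sum or an explicit severity matrix would work equally well as long as you can explain the reasoning.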

What new technologies are you currently interested in and how do you plan to integrate them into your work?

Share your enthusiasm for emerging technologies by mentioning specific tools or frameworks. Explain how you plan to stay informed about advancements and evaluate their potential to enhance the platforms you manage, demonstrating an active interest in continuous learning.

How do you handle a situation where a service has a single point of failure?

Discuss strategies such as identifying critical services, conducting risk assessments, and implementing redundancy measures to mitigate single points of failure. Provide specific examples where you've successfully implemented such strategies in past roles.

How do you ensure effective communication and collaboration with other teams?

Highlight your communication techniques, such as regular syncs, collaborative tools, and gathering feedback. Emphasize your adaptability in bridging gaps between technical and non-technical teams to enhance project outcomes.

Why do you think the role of a Staff Site Reliability Engineer is important in today’s technology landscape?

Articulate the essential role SREs play in maintaining seamless user experiences through reliability and performance optimization. Discuss how SREs contribute to innovation by building resilient systems and fostering a culture of continuous improvement in engineering teams.


Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 4, 2025
