Job details

Staff Site Reliability Engineer - Cloud Engineering

Visa’s Technology Organization is a community of problem solvers and innovators reshaping the future of commerce. We operate the world’s most sophisticated processing networks, capable of handling more than 65k secure transactions a second across 80M merchants, 15k financial institutions, and billions of everyday people. While working with us, you’ll get to work on complex distributed systems and solve massive-scale problems centered on new payment flows, business and data solutions, cyber security, and B2C platforms.

 

The Opportunity:

As a Staff Site Reliability Engineer in Product Reliability Engineering, you will be part of a team that maintains and supports Visa's Data Platform and provides support for key cloud-based Big Data and Kafka platforms. You will be responsible for driving innovation for our partners and clients, within Visa and globally. You will work on open-source Big Data and Kafka clusters with a focus on cloud, ensuring their availability, performance, and reliability, and improving operational efficiency.

 

The Work itself:

Essential Functions:

· Design, build and manage Big Data and Kafka infrastructure on AWS, GCP and Azure.

· Manage and optimize Apache Big Data and Kafka clusters for high performance, reliability, and scalability.

· Develop tools and processes to monitor and analyze system performance and to identify potential issues.

· Collaborate with other teams to design and implement solutions that improve the reliability and efficiency of the Big Data cloud platforms.

· Ensure security and compliance of the platforms within organizational guidelines.

· Other responsibilities include effective root cause analysis of major production incidents and the development of learning documentation. The person will identify and implement high-availability solutions for services with a single point of failure.

· The role involves planning and performing capacity expansions and upgrades in a timely manner to avoid any scaling issues and bugs. This includes automating repetitive tasks to reduce manual effort and prevent human errors.

· The successful candidate will tune alerting and set up observability to proactively identify issues and performance problems. They will also work closely with Level 3 teams in reviewing new use cases and cluster hardening techniques to build robust and reliable platforms.

· The role involves creating standard operating procedure documents and guidelines on effectively managing and utilizing the platforms. The person will leverage DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.

· The individual will ensure that the platforms can effectively meet performance and service level agreement requirements. They will also perform security remediation, automation, and self-healing as required.

· The individual will concentrate on developing automations and reports to minimize manual effort. This can be achieved with automation tools such as shell scripting, Ansible, or Python scripting, or with any other programming language.

 

The Skills You Bring:

· Energy and Experience: A growth mindset that is curious and passionate about technologies and enjoys challenging projects on a global scale.

· Challenge the Status Quo: Comfort in pushing the boundaries, “hacking” beyond traditional solutions.

· Language Expertise: Expertise in one or more general development languages (e.g., Java, Python).

· Builder: Experience building and deploying distributed systems.

· Learner: Constant drive to learn new technologies such as cloud technologies, Kubernetes, MLOps.

· Partnership: Experience collaborating with Engineering, Application, and other functional teams.

 

We do not expect that any single candidate would fulfill all these characteristics. For instance, we have awesome team members who are really focused on building scalable systems but didn’t work with payments technology or web applications before joining Visa.

This is a hybrid position. Hybrid employees can alternate between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 / year (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Staff Site Reliability Engineer - Cloud Engineering, Visa

Join Visa as a Staff Site Reliability Engineer - Cloud Engineering and become a key player in shaping the future of commerce! At Visa’s Technology Organization, we pride ourselves on being a community of innovative problem solvers dedicated to creating sophisticated processing networks that handle over 65,000 secure transactions every second. In this role based in Austin, you'll dive into a world of complex distributed systems, tackling massive scale problems involving new payment flows, cyber security, and developing data solutions. Your primary focus will be on maintaining and supporting Visa's Data Platform and ensuring key cloud-based technologies, like Big Data and Kafka platforms, run seamlessly. You’ll design, build, and manage these infrastructures on AWS, GCP, and Azure, enhancing their performance and reliability. Collaborating with a dynamic team, you will drive innovations that improve operational efficiency while supporting our partners and clients globally. You’ll also tackle root cause analysis of production incidents, automate repetitive tasks, and develop tools for monitoring system performance. Ideal candidates bring a growth mindset and are passionate about learning and challenging traditional solutions, making this an exciting opportunity to influence the evolution of technology in payments. If you have experience with DevOps disciplines and are eager to work within a hybrid model, this role could be your next big adventure at Visa!

Frequently Asked Questions (FAQs) for Staff Site Reliability Engineer - Cloud Engineering Role at Visa
What are the main responsibilities of a Staff Site Reliability Engineer - Cloud Engineering at Visa?

As a Staff Site Reliability Engineer - Cloud Engineering at Visa, your primary responsibilities will revolve around the design, management, and optimization of Big Data and Kafka infrastructure hosted on cloud platforms like AWS, GCP, and Azure. You will ensure these platforms' performance, reliability, and security while proactively identifying issues through monitoring tools. Collaboration with other teams to enhance operational efficiency and performing root cause analysis of production incidents are also key functions of this position.

What skills are essential for success as a Staff Site Reliability Engineer - Cloud Engineering at Visa?

To succeed as a Staff Site Reliability Engineer - Cloud Engineering at Visa, candidates should demonstrate expertise in cloud technologies, distributed systems, and various programming languages such as Java and Python. Familiarity with DevOps principles, including incident management and automation tools like Shell scripting and Ansible, is crucial. A growth mindset and the ability to collaborate effectively with Engineering and Application teams will also support your success in this innovative role.

How does the hybrid work model function for the Staff Site Reliability Engineer - Cloud Engineering role at Visa?

Visa's hybrid work model for the Staff Site Reliability Engineer - Cloud Engineering role allows employees to balance remote work with office presence. Employees are expected to work from the office 2-3 days a week, determined by leadership's guidance, with a general expectation of being in the office 50% or more of the time depending on business needs. This combination offers flexibility while ensuring strong teamwork and collaboration.

What technologies will I work with as a Staff Site Reliability Engineer - Cloud Engineering at Visa?

In your role as a Staff Site Reliability Engineer - Cloud Engineering at Visa, you will engage with various technologies including cloud platforms like AWS, GCP, and Azure, as well as open-source Big Data solutions and Kafka clusters. You will also have the opportunity to explore Kubernetes, MLOps, and automation tools, all while continuously learning and adapting to new technologies in a rapidly evolving field.

What kind of projects will I be involved in as a Staff Site Reliability Engineer - Cloud Engineering at Visa?

As a Staff Site Reliability Engineer - Cloud Engineering at Visa, you will be involved in a variety of exciting projects that focus on improving the performance and resilience of cloud-based Big Data and Kafka platforms. You will tackle challenges such as capacity planning, incident response, and automation, aiming to enhance the overall reliability and efficiency of services while driving innovation in payment technologies.

Common Interview Questions for Staff Site Reliability Engineer - Cloud Engineering
Can you explain your experience with cloud technologies relevant to the Staff Site Reliability Engineer position?

When answering this question, provide specific examples of your work with cloud platforms, highlighting how you've designed or managed infrastructure in AWS, GCP, or Azure. Mention any projects involving cloud security and compliance that demonstrate your understanding of these critical elements in system reliability.

How do you approach incident management in your role as a Site Reliability Engineer?

Discuss your incident management practices, emphasizing your methods for identifying, analyzing, and resolving incidents. Cite specific tools or processes you’ve utilized to monitor performance and ensure a proactive approach to mitigating potential issues.

What tools do you use for automation in your current role?

Explain the various automation tools you are familiar with, such as Shell scripting, Ansible, and any programming languages you've used, like Python. Provide examples of tasks you've automated to improve efficiency and reduce manual effort.

How do you ensure that your systems are secure and compliant?

Outline your strategies for ensuring security and compliance, such as conducting regular security audits and following industry standards. Discuss any experience you have with security remediation measures and how you've collaborated with teams to maintain compliance.

Can you provide an example of a challenging project you worked on?

Share a detailed account of a challenging project, focusing on the obstacles you faced and the creative solutions you implemented. Highlight your technical skills and teamwork in achieving the project's objectives.

Describe your experience with monitoring and observability tools.

Discuss the monitoring and observability tools you’ve used, including any specific metrics you monitor to assess system health. Mention how you leverage these tools to proactively identify issues before they escalate.

How do you approach collaboration with other teams?

Explain your collaborative approach, including how you effectively communicate and coordinate with Engineering, Application, and other functional teams to achieve shared goals. Provide an example of a successful collaboration that enhanced project outcomes.

What has been your experience with Apache Kafka?

Depending on your background, discuss your experience deploying and managing Kafka clusters, including any specific projects you've contributed to, how you ensured their reliability, and any challenges you overcame while working with Kafka.

How do you stay current with new technologies and trends in SRE?

Share your strategies for continuous learning in the fast-evolving field of Site Reliability Engineering, whether through online courses, attending conferences, or participating in tech communities. Mention any recent technologies or practices you are excited about.

What measures do you take to improve system performance?

Discuss specific techniques you’ve employed to enhance system performance, including optimizing configurations, scaling resources, and automating processes. Provide data or outcomes from previous roles to demonstrate the impact of your efforts.


Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025
