Job details

Staff Site Reliability Engineer - Cloud Engineering

Visa’s Technology Organization is a community of problem solvers and innovators reshaping the future of commerce. We operate the world’s most sophisticated processing networks capable of handling more than 65k secure transactions a second across 80M merchants, 15k Financial Institutions, and billions of everyday people. While working with us you’ll get to work on complex distributed systems and solve massive scale problems centered on new payment flows, business and data solutions, cyber security, and B2C platforms.

 

The Opportunity:

As a Staff Site Reliability Engineer in Product Reliability Engineering, you will be part of a team that maintains and supports Visa's Data Platform and provides support for key cloud-based Big Data and Kafka platforms. You will be responsible for driving innovation for our partners and clients, within Visa and globally. You will work on open-source Big Data and Kafka clusters with a focus on cloud, ensuring their availability, performance, and reliability, and improving operational efficiency.

 

The Work itself:

Essential Functions:

· Design, build and manage Big Data and Kafka infrastructure on AWS, GCP and Azure.

· Manage and optimize Apache Big Data and Kafka clusters for high performance, reliability, and scalability.

· Develop tools and processes to monitor and analyze system performance and to identify potential issues.

· Collaborate with other teams to design and implement solutions that improve the reliability and efficiency of the Big Data cloud platforms.

· Ensure security and compliance of the platforms within organizational guidelines.

· Other responsibilities include effective root cause analysis of major production incidents and the development of learning documentation. The person will identify and implement high-availability solutions for services with a single point of failure.

· The role involves planning and performing capacity expansions and upgrades in a timely manner to avoid any scaling issues and bugs. This includes automating repetitive tasks to reduce manual effort and prevent human errors.

· The successful candidate will tune alerting and set up observability to proactively identify issues and performance problems. They will also work closely with Level 3 teams in reviewing new use cases and cluster hardening techniques to build robust and reliable platforms.

· The role involves creating standard operating procedure documents and guidelines on effectively managing and utilizing the platforms. The person will apply DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.

· The individual will ensure that the platforms meet performance and service level agreement requirements. They will also perform security remediation, automation, and self-healing as required.

· The individual will concentrate on developing automations and reports to minimize manual effort. This can be achieved through various automation tools such as Shell scripting, Ansible, or Python scripting, or by using any other programming language.

 

The Skills You Bring:

· Energy and Experience: A growth mindset that is curious and passionate about technologies and enjoys challenging projects on a global scale.

· Challenge the Status Quo: Comfort in pushing the boundaries, “hacking” beyond traditional solutions.

· Language Expertise: Expertise in one or more general development languages (e.g., Java, Python).

· Builder: Experience building and deploying distributed systems.

· Learner: Constant drive to learn new technologies such as cloud technologies, Kubernetes, and MLOps.

· Partnership: Experience collaborating with Engineering, Application, and other functional teams.

 

We do not expect that any single candidate would fulfill all these characteristics. For instance, we have awesome team members who are really focused on building scalable systems but had not worked with payments technology or web applications before joining Visa.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 / year (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Staff Site Reliability Engineer - Cloud Engineering, Visa

At Visa, the role of Staff Site Reliability Engineer - Cloud Engineering in Austin is more than just a job; it's an opportunity to be at the forefront of technical innovation! If you enjoy tackling complex distributed systems and want to play a crucial role in shaping the future of commerce, this position is perfect for you. You’ll be diving into the world of Big Data and cloud-based Kafka platforms, ensuring seamless availability and performance. What does that look like? You'll be designing and managing infrastructure on platforms like AWS, GCP, and Azure, optimizing Apache Big Data and Kafka clusters, and collaborating with cross-functional teams to enhance reliability. Your mission will also include developing tools that monitor system performance, analyzing potential issues, and implementing high-availability solutions. Not to mention, you’ll get to automate those repetitive tasks that can sometimes lead to human error. At Visa, we value a growth mindset and encourage a culture of learning. You may have expertise in programming languages like Java or Python and experience with distributed systems, but we’re most excited about your curiosity and willingness to challenge the status quo. Your journey at Visa will empower you to build impactful solutions on a global scale, all while working collaboratively with other talented individuals. With a hybrid working model, you'll enjoy a flexible work environment that promotes both teamwork in the office and productivity at home. If you are ready to push the boundaries of technology and create a future of secure transactions, apply today and become a part of our innovative team!

Frequently Asked Questions (FAQs) for Staff Site Reliability Engineer - Cloud Engineering Role at Visa
What are the primary responsibilities of a Staff Site Reliability Engineer - Cloud Engineering at Visa?

As a Staff Site Reliability Engineer - Cloud Engineering at Visa, your key responsibilities include designing, building, and managing Big Data and Kafka infrastructure on platforms like AWS, GCP, and Azure. You will also monitor and optimize Apache Big Data and Kafka clusters for high performance and reliability, while collaborating with teams to enhance the cloud platforms' efficiency.

What qualifications do you need to be a Staff Site Reliability Engineer at Visa?

To excel as a Staff Site Reliability Engineer at Visa, you should have solid experience in one or more programming languages such as Java or Python, and ideally have experience building distributed systems. Familiarity with cloud technologies, Kubernetes, and MLOps is also beneficial, along with a strong collaborative spirit to work effectively with various engineering teams.

How does Visa ensure security and compliance for their cloud platforms in the Staff Site Reliability Engineer role?

In the role of Staff Site Reliability Engineer, security and compliance are prioritized through adherence to organizational guidelines. Responsibilities include ensuring platform security, performing security remediation, automating processes, and implementing self-healing capabilities to maintain system integrity, which helps meet both security and performance standards.

What tools and technologies are utilized by the Staff Site Reliability Engineer at Visa?

A Staff Site Reliability Engineer at Visa will utilize a variety of tools and technologies, including but not limited to Shell scripting, Ansible, Python, and other programming languages for automation. Knowledge of cloud services (AWS, GCP, Azure) and distributed systems is critical as you work to build resilient Kafka clusters and Big Data infrastructure.

What does the hybrid work model look like for the Staff Site Reliability Engineer role at Visa?

The Staff Site Reliability Engineer position at Visa operates on a hybrid work model, where you can alternate between remote work and office presence. Employees are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Common Interview Questions for Staff Site Reliability Engineer - Cloud Engineering
Can you explain your experience with building distributed systems as a Staff Site Reliability Engineer?

When answering this question, highlight specific projects where you contributed to building distributed systems. Discuss the technologies you used, the challenges you faced, and how you overcame them. Demonstrating a clear understanding of both the technical and collaborative aspects will showcase your qualifications.

How would you optimize Big Data and Kafka clusters for performance?

Focus on discussing various tuning techniques you've employed, such as configuring cluster parameters, optimizing data pipelines, and implementing monitoring tools to detect and resolve performance issues proactively. Sharing quantitative results from your optimizations can strengthen your response.

Describe a time you handled a major production incident.

Use the STAR method (Situation, Task, Action, Result) to structure your response. Detail the incident, explain your role in resolving it, and highlight any improvements made to prevent future occurrences. This showcases your analytical skills under pressure and your commitment to system reliability.

What automation tools are you familiar with, and how have you used them in your work?

Elaborate on your experience with automation tools like Shell scripting, Ansible, or Python. Discuss specific use cases where you've implemented automation to streamline processes, reduce manual work, or enhance system reliability, emphasizing the business impact of your actions.

How do you ensure compliance and maintain security in your engineering practices?

Discuss your understanding of compliance standards relevant to cloud engineering and security best practices. Explain methodologies you've employed to ensure security, such as regular audits, implementing role-based access controls, and keeping documentation up to date, to demonstrate your proactive approach.

What does your process look like for troubleshooting system performance issues?

Outline your systematic approach to troubleshooting, mentioning tools or methodologies you employ to diagnose and resolve performance issues. Discuss your experience in identifying bottlenecks and how you collaborate with others to implement solutions effectively.

How comfortable are you with programming and what languages do you prefer?

Be honest about your programming competencies. Highlight your preferred languages (like Java or Python) and provide examples of how you've used them in previous roles, particularly in relation to site reliability engineering tasks, emphasizing your adaptability in learning new technologies.

Can you give an example of how you've collaborated with cross-functional teams?

Utilize specific examples to describe how you've worked with teams across various functions to develop solutions. Stress the importance of communication and how you facilitated collaboration to meet a common goal, emphasizing your teamwork and interpersonal skills.

What challenges have you faced in a hybrid work model, and how did you overcome them?

Reflect on your experiences in a hybrid environment and address the unique challenges, such as communication and collaboration. Share strategies you implemented to maintain productivity and engagement with your team, illustrating your adaptability to new work conditions.

Why do you believe you are a good fit for the Staff Site Reliability Engineer position at Visa?

This is your opportunity to match your skills and experiences with the role's requirements. Emphasize your passion for technology, your collaborative mindset, and specific experiences that align well with Visa's mission and the responsibilities of a Staff Site Reliability Engineer.


Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025
