
Staff Site Reliability Engineer - Cloud Engineering

Visa’s Technology Organization is a community of problem solvers and innovators reshaping the future of commerce. We operate the world’s most sophisticated processing networks, capable of handling more than 65k secure transactions a second across 80M merchants, 15k financial institutions, and billions of everyday people. While working with us, you’ll get to work on complex distributed systems and solve massive-scale problems centered on new payment flows, business and data solutions, cyber security, and B2C platforms.

 

The Opportunity:

As a Staff Site Reliability Engineer in Product Reliability Engineering, you will be part of a team that maintains and supports Visa's Data Platform and provides support for key cloud-based Big Data and Kafka platforms. You will be responsible for driving innovation for our partners and clients, within Visa and globally. You will work on open-source Big Data and Kafka clusters in the cloud, ensuring their availability, performance, and reliability and improving operational efficiency.

 

The Work itself:

Essential Functions:

· Design, build and manage Big Data and Kafka infrastructure on AWS, GCP and Azure.

· Manage and optimize Apache Big Data and Kafka clusters for high performance, reliability, and scalability.

· Develop tools and processes to monitor and analyze system performance and to identify potential issues.

· Collaborate with other teams to design and implement solutions that improve the reliability and efficiency of the Big Data cloud platforms.

· Ensure security and compliance of the platforms within organizational guidelines.

· Other responsibilities include effective root cause analysis of major production incidents and the development of learning documentation. The person will identify and implement high-availability solutions for services with a single point of failure.

· The role involves planning and performing capacity expansions and upgrades in a timely manner to avoid any scaling issues and bugs. This includes automating repetitive tasks to reduce manual effort and prevent human errors.

· The successful candidate will tune alerting and set up observability to proactively identify issues and performance problems. They will also work closely with Level 3 teams in reviewing new use cases and cluster hardening techniques to build robust and reliable platforms.

· The role involves creating standard operating procedure documents and guidelines on effectively managing and utilizing the platforms. The person will leverage DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.

· The individual will ensure that the platforms can effectively meet performance and service level agreement requirements. They will also perform security remediation, automation, and self-healing as required.

· The individual will concentrate on developing automations and reports to minimize manual effort. This can be achieved with automation tools such as shell scripting, Ansible, or Python scripting, or with any other programming language.

 

The Skills You Bring:

· Energy and Experience: A growth mindset that is curious and passionate about technologies and enjoys challenging projects on a global scale.

· Challenge the Status Quo: Comfort in pushing the boundaries, “hacking” beyond traditional solutions.

· Language Expertise: Expertise in one or more general development languages (e.g., Java, Python).

· Builder: Experience building and deploying distributed systems.

· Learner: Constant drive to learn new technologies such as cloud technologies, Kubernetes, and MLOps.

· Partnership: Experience collaborating with Engineering, Application, and other functional teams.

 

We do not expect that any single candidate would fulfill all these characteristics. For instance, we have awesome team members who are really focused on building scalable systems but didn’t work with payments technology or web applications before joining Visa.

This is a hybrid position. Hybrid employees can alternate time between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$140,000 / year (est.)
Min: $120,000
Max: $160,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Staff Site Reliability Engineer - Cloud Engineering, Visa

Join Visa's innovative Technology Organization as a Staff Site Reliability Engineer - Cloud Engineering in Austin, where we're committed to revolutionizing the future of commerce. You'll be immersed in complex distributed systems and tackle massive scale challenges, focusing on new payment flows and robust data solutions. In this pivotal role, you will help maintain and support Visa's Data Platform, particularly our cloud-based Big Data and Kafka Platforms. Your primary responsibilities will include designing and managing Big Data and Kafka infrastructures across AWS, GCP, and Azure, ensuring their optimal performance and reliability. Imagine driving innovation not just for Visa, but for clients around the world! You'll create and optimize tools to monitor system performance, collaborate across teams to enhance platform efficiency, and ensure compliance with our security standards. Additionally, you’ll dive deep into root cause analysis of production incidents and champion automation to minimize manual tasks. You are encouraged to embrace a growth mindset, constantly learning about the latest technologies while being comfortable with challenging traditional solutions. We pride ourselves on a diverse team where unique backgrounds enrich our project outcomes. This hybrid position offers flexibility, allowing you to alternate between remote work and in-office collaboration, striking the perfect balance to meet business needs. Ready to shape the future of commerce? Let's innovate together at Visa!

Frequently Asked Questions (FAQs) for Staff Site Reliability Engineer - Cloud Engineering Role at Visa
What are the core responsibilities of a Staff Site Reliability Engineer at Visa?

As a Staff Site Reliability Engineer at Visa, your core responsibilities include designing, building, and managing Big Data and Kafka infrastructures across cloud platforms like AWS, GCP, and Azure. You'll oversee the optimization of these systems for high performance and reliability, develop monitoring tools, and collaborate with cross-functional teams to enhance the efficiency and security of our platforms.

What skills are required for the Staff Site Reliability Engineer position at Visa?

The ideal candidate for the Staff Site Reliability Engineer position at Visa should possess a growth mindset with expertise in general development languages such as Java or Python. Experience in building distributed systems and familiarity with cloud technologies, including Kubernetes and MLOps, are highly valued. Moreover, effective collaboration with other engineering and functional teams is crucial for success.

How does the Staff Site Reliability Engineer ensure system performance at Visa?

To ensure system performance, the Staff Site Reliability Engineer at Visa implements tools to monitor and analyze system metrics, conducts root cause analysis of incidents, and actively collaborates with other teams to devise solutions for improved reliability. You'll also work on capacity planning, automation of repetitive tasks, and setting up alerting and observability frameworks.

What qualifications are preferred for the Staff Site Reliability Engineer role at Visa?

While Visa does not expect candidates to meet every qualification, preferred qualifications for the Staff Site Reliability Engineer role include expertise in one or more general development languages, experience building and deploying distributed systems, and a drive to learn cloud technologies. Additionally, a passion for continuous learning and innovative problem-solving is essential.

What does the work environment look like for a Staff Site Reliability Engineer at Visa?

As a Staff Site Reliability Engineer at Visa, you can expect a hybrid work environment that combines the flexibility of remote work with in-office collaboration. You'll be expected to work from the office 2-3 set days a week, with a general guidepost of being in the office 50% or more of the time based on business needs.

Common Interview Questions for Staff Site Reliability Engineer - Cloud Engineering
Can you describe your experience with cloud technologies in reliability engineering?

Highlight specific projects where you've designed and managed cloud infrastructures. Discuss the challenges faced and your approach to ensuring availability and performance, demonstrating your problem-solving skills in cloud environments.

How do you approach capacity planning for cloud-based systems?

Explain the strategies you use for capacity planning, including analyzing traffic patterns, historical data, and using monitoring tools. Emphasize the importance of scalability and proactive planning to avoid bottlenecks.

What steps would you take to troubleshoot a production incident?

Detail your systematic approach to root cause analysis, including gathering logs, collaborating with teams to identify issues, and documenting the process to prevent recurrence. Highlight the importance of communication during incident management.

How do you ensure the security and compliance of cloud platforms?

Discuss the security best practices you follow, such as regular audits, using encryption, and adhering to compliance regulations. Mention your experience in implementing security measures in cloud environments.

Can you provide an example of optimization you've done for Big Data or Kafka systems?

Share a specific example of how you've optimized Big Data or Kafka infrastructures, detailing the methods used (e.g., configurations, performance tuning) and the resulting impact on efficiency and system reliability.

What kind of automation tools do you prefer to use in your daily operations?

Mention your expertise in automation tools such as Ansible, Shell scripting, or Python. Share examples of tasks you've automated to enhance efficiency and reduce manual effort, emphasizing the benefits of automation.

How do you stay updated with the latest technologies in your field?

Discuss your continuous learning initiatives, such as attending workshops, pursuing certifications, reading industry publications, or participating in communities. Highlight your passion for remaining at the forefront of technology.

What experience do you have in collaborating with cross-functional teams?

Share examples of projects where you've effectively collaborated with different teams, focusing on how communication and teamwork contributed to project success. Highlight your ability to understand diverse perspectives.

How do you approach system reliability and performance monitoring?

Explain the performance metrics you track and how you use monitoring tools to gather data. Discuss your proactive measures in identifying potential issues before they affect users and your learning from past incidents.

What is your philosophy on challenging the status quo in engineering?

Express your belief in innovation and continuous improvement. Provide examples of when you've pushed traditional boundaries to find creative solutions, emphasizing the importance of adaptability and forward-thinking approaches.


Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 4, 2025
