Job details

Lead Systems Engineer, High-Performance Computing

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and High-Performance Compute Platform Engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:

  • GPU as a Service and High-Performance Compute Platform Support: Expertise in deploying, managing, and optimizing GPU as a Service (GaaS) and high-performance compute platforms to support advanced computational workloads.
  • Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
  • Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
  • Enterprise Server and Component Expertise: In-depth knowledge of server components (storage/network controllers, HBA, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
  • Processor and GPU Systems Proficiency: Strong grasp of Intel/AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
  • Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, BIOS settings, and their impact on system performance and security in high-performance computing environments.
  • Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
  • Infrastructure Management and Automation: Proficient in installing, configuring, supporting, and maintaining compute infrastructure management tools, with skills in Ansible for automation to streamline deployment and operational tasks.
  • Performance Benchmarking and Tech Evaluation: Capable of running performance benchmarks and evaluating new technologies for various platforms (Linux, Windows, containerized, and virtualized) to ensure optimal performance.
  • Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
  • Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. Hybrid employees can alternate between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 per year (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Systems Engineer, High-Performance Computing, Visa

We are excited to announce an amazing opportunity to join our team as a Lead Systems Engineer in High-Performance Computing at our Ashburn location! In this role with the IaaS Systems and Storage & Engineering (ISSE) team, you will be part of a dynamic group dedicated to pioneering innovations in server infrastructure design and automation. Your primary responsibility will be to harness your extensive technical engineering skills to create robust server infrastructures tailored to our business needs. You’ll work closely with technology domain experts in a highly complex and secure environment, ensuring that our high-performance computing platforms operate at peak efficiency. If you’re passionate about GPU as a Service, possess advanced knowledge of x86 technologies, and thrive in a fast-paced environment, we want to hear from you! The job demands expertise in managing distributed IT infrastructures, as well as hands-on experience with hardware lifecycle management and automation tools like Ansible. Strong scripting skills in PowerShell and Python will also set you apart. Plus, with the hybrid nature of this position, you'll have the flexibility to balance remote work and time spent in the office. If you're excited about contributing to a mission that prioritizes high-quality server infrastructure in a collaborative and innovative setting, consider applying for the Lead Systems Engineer position today!

Frequently Asked Questions (FAQs) for Lead Systems Engineer, High-Performance Computing Role at Visa
What are the main responsibilities of the Lead Systems Engineer at the High-Performance Computing team?

As a Lead Systems Engineer in the High-Performance Computing team at our Ashburn location, your main responsibilities will include designing and automating server infrastructure, managing GPU as a Service platforms, and maintaining the security and availability of complex IT systems. You'll also conduct performance benchmarking, perform hardware lifecycle management, and contribute to the overall efficiency of our advanced computational workloads.

What qualifications do I need to apply for the Lead Systems Engineer role?

To apply for the Lead Systems Engineer position in High-Performance Computing, you should possess advanced technical knowledge of high-performance computing systems, experience with x86 technologies, and a strong grasp of Intel and AMD architectures. Proficiency in scripting languages like PowerShell and Python is also highly beneficial, along with hands-on experience in managing distributed IT infrastructures and hardware lifecycle management.

What skills are essential for a Lead Systems Engineer at the High-Performance Computing team?

Essential skills for the Lead Systems Engineer position at our High-Performance Computing team include expertise in deploying and optimizing GPU as a Service, knowledge of server components and their functionalities, and proficiency in automation tools such as Ansible. Additionally, a deep understanding of performance benchmarking and experience with both Linux and Windows platforms are essential for maximizing system performance.

Is remote work an option for the Lead Systems Engineer at the High-Performance Computing team?

Yes, the Lead Systems Engineer role in High-Performance Computing offers a hybrid working model. You can alternate between remote work and office visits, typically spending 2-3 days per week in the office, depending on business needs. This flexibility allows you to maintain a work-life balance while contributing to our team's success.

What kind of team dynamics can I expect as a Lead Systems Engineer at High-Performance Computing?

As a Lead Systems Engineer at the High-Performance Computing team, you can expect a supportive and collaborative environment where teamwork is valued. You will work alongside talented professionals who share a commitment to innovation and excellence in server infrastructure. Your role also involves mentoring junior staff, allowing you to develop leadership skills and help shape the next generation of engineers.

Common Interview Questions for Lead Systems Engineer, High-Performance Computing
Can you explain your experience with GPU as a Service?

In answering this question, highlight specific projects where you've deployed or managed GPU as a Service platforms, including any challenges you faced and how you overcame them. Discuss your approach to optimizing these environments and any performance improvements achieved.

How do you ensure high availability and security in distributed IT infrastructures?

Share your strategies for achieving high availability, such as redundancy measures, load balancing, and regular performance monitoring. For security, discuss the protocols you implement to safeguard systems and any compliance standards you've adhered to in previous roles.

What is your approach to automating infrastructure tasks?

Detail your experiences with automation tools, especially Ansible. Discuss specific tasks you've automated, the challenges you encountered in those processes, and how automation has improved operational efficiency within your teams.

How do you handle performance benchmarking for new technologies?

Walk the interviewer through your methodology for conducting performance benchmarks, including the tools you use, the metrics you focus on, and how you interpret the results to make informed decisions about technology implementation.

What scripting languages are you proficient in, and how have you used them?

Discuss your proficiency in PowerShell and Python, providing examples of scripts you've developed to automate tasks or solve specific problems in your previous roles. Highlight any significant impacts these scripts had on project outcomes.

Can you provide an example of a complex problem you resolved independently?

Identify a specific challenge you've faced in high-performance computing and describe how you approached it. Focus on the analytical skills you used, the steps you took to diagnose and solve the issue, and the outcome of your efforts.

How do you stay up-to-date with advancements in high-performance computing?

Emphasize your commitment to continuous learning through attending relevant conferences, participating in training workshops, engaging with professional communities, and following industry trends. This showcases your dedication and proactive approach to personal growth.

What methods do you use to mentor junior engineers?

Discuss your mentorship philosophy, focusing on how you encourage junior engineers through knowledge sharing, providing constructive feedback, and involving them in projects where they can learn and grow. Highlight any successful outcomes from your mentorship efforts.

What are the most critical aspects of hardware lifecycle management?

Talk about your knowledge of firmware updates, OS driver certifications, and strategies for ensuring the longevity and reliability of compute resources. Emphasize how effective lifecycle management contributes to high-performance compute environments.

How would you approach team collaboration in a hybrid work environment?

Share your strategies for effective communication and project management in a hybrid setting. Include examples of tools you use to facilitate collaboration and how you ensure all team members are aligned and engaged, whether in-person or remotely.


Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025
