Lead Systems Engineer, High-Performance Computing

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and High-Performance Compute platform engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:

  • GPU as a Service and High-Performance Compute Platform Support: Expertise in deploying, managing, and optimizing GPU as a Service (GaaS) and high-performance compute platforms to support advanced computational workloads.
  • Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
  • Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
  • Enterprise Server and Component Expertise: In-depth knowledge of server components (storage/network controllers, HBA, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
  • Processor and GPU Systems Proficiency: Strong grasp of Intel/AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
  • Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, BIOS settings, and their impact on system performance and security in high-performance computing environments.
  • Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
  • Infrastructure Management and Automation: Proficient in installing, configuring, supporting, and maintaining compute infrastructure management tools, with skills in Ansible for automation to streamline deployment and operational tasks.
  • Performance Benchmarking and Tech Evaluation: Capable of running performance benchmarks and evaluating new technologies for various platforms (Linux, Windows, containerized, and virtualized) to ensure optimal performance.
  • Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
  • Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 / year (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range in their job posting, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Systems Engineer, High-Performance Computing, Visa

As a Lead Systems Engineer in High-Performance Computing at our Ashburn location, you’ll be at the heart of the IaaS Systems and Storage & Engineering (ISSE) team, an essential part of our Operations & Infrastructure technology organization. This role is perfect for tech enthusiasts who aspire to shape the future of server infrastructure design and automation. Our vision is clear – we aim to create secure and efficient operational environments that foster both business success and technological innovation. In this position, you'll use your strong technical skills to design and automate server infrastructure tailored to the needs of our business while collaborating closely with domain experts. Your responsibilities will include deploying and optimizing GPU as a Service (GaaS), managing extensive datacenter infrastructures, and ensuring high availability and performance of our computing systems. You'll need advanced technical knowledge, particularly of x86 technologies and high-performance computing, alongside experience in hardware lifecycle management and automation tools like Ansible. We also value scripting expertise in languages like PowerShell and Python to streamline our operations. This hybrid position allows you to work both remotely and in the office, giving you a balanced work life. If you're a motivated team player and love tackling complex challenges, we can’t wait for you to join our vibrant engineering group!

Frequently Asked Questions (FAQs) for Lead Systems Engineer, High-Performance Computing Role at Visa
What are the primary responsibilities of a Lead Systems Engineer in High-Performance Computing at our company?

The Lead Systems Engineer in High-Performance Computing at our company is responsible for deploying and managing GPU as a Service (GaaS), ensuring high availability in our datacenter infrastructures, and implementing server infrastructure based on business requirements. The role demands strong technical skills to maintain efficient operations and encourage technological advancement.

What qualifications are needed for the Lead Systems Engineer, High-Performance Computing position?

To qualify for the Lead Systems Engineer position in High-Performance Computing, candidates should possess extensive knowledge in high-performance computing systems, expertise with x86 technologies, and experience in server and hardware lifecycle management. Proficiency in scripting and familiarity with automation tools like Ansible is also essential.

How does the Lead Systems Engineer contribute to security and availability in high-performance computing environments?

The Lead Systems Engineer contributes to security and availability by applying advanced technical knowledge to design secure computing systems and maintain high operational standards. Their work includes collaboration with technology experts to ensure stringent security measures and optimal performance across all environments.

What skill sets are crucial for the Lead Systems Engineer role in High-Performance Computing?

Key skill sets for the Lead Systems Engineer role include advanced knowledge in high-performance computing, experience in managing complex IT infrastructures, scripting proficiency in PowerShell and Python, and expertise in hardware lifecycle management and automation. Strong analytical skills and the ability to work both independently and in teams are also vital.

What is the work environment like for the Lead Systems Engineer, High-Performance Computing in Ashburn?

The work environment for the Lead Systems Engineer, High-Performance Computing in Ashburn is hybrid, allowing flexibility in working remotely and in the office. Employees are expected to work in-office 2-3 days a week, balancing teamwork and independent work with a strong emphasis on collaboration and innovation.

Common Interview Questions for Lead Systems Engineer, High-Performance Computing
Can you explain your experience with GPU as a Service in high-performance computing environments?

In your response, highlight specific projects you've worked on involving GPU as a Service, detailing how you deployed and managed the GaaS environment, the challenges faced, and how you ensured its effectiveness in handling advanced computational workloads.

How do you approach performance benchmarking in complex IT infrastructures?

Discuss your methodology for conducting performance benchmarks, the tools you utilize, and how you analyze results to evaluate and optimize system performance across different platforms.

What automation tools have you used, and how have they improved efficiency in your previous roles?

Share specific examples of automation tools you've implemented, such as Ansible, and explain how they streamlined processes and improved operational efficiency in your previous projects.

Describe your experience with hardware lifecycle management.

Explain your role in managing the lifecycle of hardware from procurement to decommissioning, focusing on strategies you've used for firmware updates, OS driver certifications, and ensuring system reliability.

How do you ensure the security of server infrastructure in high-performance computing?

Talk about your approach to integrating security measures throughout your systems. Discuss specific protocols, tools, or best practices you've employed to safeguard the integrity and availability of computing environments.

Can you provide examples of how you've collaborated with technology domain experts in previous positions?

Share a story that illustrates your collaborative efforts with domain experts. Highlight the outcomes of such collaborations and how they positively impacted project goals.

What is your experience with scripting languages like PowerShell or Python?

Discuss specific tasks you’ve automated using PowerShell or Python, including any challenges faced and how your scripting skills enhanced operational effectiveness.

What key metrics do you consider important when evaluating the performance of a high-performance computing environment?

Mention key performance indicators (KPIs) such as availability, latency, and throughput, and explain how you monitor and improve performance against these metrics in your infrastructure.

How do you stay updated with emerging technologies in high-performance computing?

Describe your practices for staying informed about new developments in high-performance computing, such as attending webinars, following industry journals, or engaging in professional networks.

What strategies do you use to mentor junior staff in engineering roles?

Explain your mentorship approach, detailing specific strategies you've employed to support junior staff, enhance their learning, and foster their growth within the engineering team.

Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025