Job details

Lead Systems Engineer, High-Performance Computing

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and High-Performance Compute Platform Engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:

  • GPU as a Service and High-Performance Compute Platform Support: Expertise in deploying, managing, and optimizing GPU as a Service (GaaS) and high-performance compute platforms to support advanced computational workloads.
  • Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
  • Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
  • Enterprise Server and Component Expertise: In-depth knowledge of server components (storage/network controllers, HBA, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
  • Processor and GPU Systems Proficiency: Strong grasp of Intel/AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
  • Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, BIOS settings, and their impact on system performance and security in high-performance computing environments.
  • Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
  • Infrastructure Management and Automation: Proficient in installing, configuring, supporting, and maintaining compute infrastructure management tools, with skills in Ansible for automation to streamline deployment and operational tasks.
  • Performance Benchmarking and Tech Evaluation: Capable of running performance benchmarks and evaluating new technologies for various platforms (Linux, Windows, containerized, and virtualized) to ensure optimal performance.
  • Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
  • Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 / yearly (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Systems Engineer, High-Performance Computing, Visa

If you're passionate about high-performance computing and looking to take your career to the next level, join us as a Lead Systems Engineer with our IaaS Systems and Storage & Engineering (ISSE) team in Ashburn! Here, we’re all about pioneering the design and automation of server infrastructure, so you’ll play a vital role in creating secure and efficient operations that drive both business success and technological advancement. You’ll leverage your expertise in deploying and optimizing GPU as a Service and high-performance compute platforms to support advanced computational workloads. With extensive datacenter experience under your belt, you’ll manage complex IT infrastructures to ensure they run smoothly and securely. Your deep understanding of x86 technologies, server components, and Intel/AMD architectures will be essential as you work on enhancing system performance and reliability. We're looking for someone who’s well-versed in out-of-band management and has a knack for hardware lifecycle management. You'll also get to flex your scripting skills in PowerShell and Python for automation while collaborating with technology domain experts. The best part? This is a hybrid position, giving you the flexibility to split your time between home and the office, with a general expectation of being in the office at least 50% of the time. So, if you're ready for an exciting challenge and want to make a real impact in the tech world, this role is perfect for you!

Frequently Asked Questions (FAQs) for Lead Systems Engineer, High-Performance Computing Role at Visa
What are the responsibilities of a Lead Systems Engineer at ISSE?

As a Lead Systems Engineer at ISSE, your primary responsibilities include designing and automating server infrastructure based on business requirements, managing high-performance computing platforms, and providing support for GPU as a Service. You'll also manage datacenter infrastructure and liaise with technology experts to ensure high security and availability.

What qualifications are needed for the Lead Systems Engineer position at ISSE?

To excel as a Lead Systems Engineer at ISSE, candidates typically need advanced technical knowledge of high-performance computing systems using x86 technologies and protocols. You'll also need in-depth expertise in server components, processor architectures, and scripting skills in languages like PowerShell and Python for automation tasks.

How does a Lead Systems Engineer ensure availability and performance in high-performance computing environments?

A Lead Systems Engineer ensures availability and performance by managing complex IT infrastructures, utilizing advanced server components, and implementing effective hardware lifecycle management strategies. Regular performance benchmarking and tech evaluation also play a crucial role in maintaining optimal system efficiency.

What is the work environment like for a Lead Systems Engineer at ISSE?

The work environment for a Lead Systems Engineer at ISSE is dynamic and collaborative. This hybrid position allows for a blend of remote work and in-office collaboration, which fosters teamwork and innovation while ensuring the flexibility needed to meet personal work preferences.

What technical skills are essential for a Lead Systems Engineer at ISSE?

Essential technical skills for a Lead Systems Engineer at ISSE include a strong understanding of high-performance computing systems, expertise in GPU systems, proficiency in scripting for automation, and hands-on experience with hardware lifecycle management. Candidates should also be familiar with out-of-band management and relevant BIOS/UEFI settings.

Common Interview Questions for Lead Systems Engineer, High-Performance Computing
Can you describe your experience with GPU as a Service?

In answering this question, emphasize your hands-on experience deploying and optimizing GPU as a Service, detailing any specific projects or environments in which you improved performance or efficiency.

What methodologies do you use for infrastructure automation?

Discuss your preferred tools and methodologies, such as Ansible, and any scripting languages you use, like PowerShell or Python, providing examples of how you've automated tasks to enhance efficiency.

How do you approach troubleshooting complex systems?

When asked this, outline your analytical process—from identifying the issue to executing a solution. Share any past scenarios where your troubleshooting skills made a significant impact.

What is your experience with performance benchmarking?

In your response, provide specific examples of performance benchmarks you've run, the objectives you aimed to achieve, and how those benchmarks informed your decisions in system optimization.

How do you keep up with the latest technologies in high-performance computing?

Showcase your strategies for staying current, whether through continuous education, attending industry conferences, or engaging with professional communities online.

Describe your experience with hardware lifecycle management.

Be prepared to discuss your familiarity with hardware lifecycle management processes, detailing how you ensure that systems remain reliable and up to date throughout their lifecycle.

How do you ensure system security in a high-performance computing environment?

Discuss your approach to security, including the strategies you utilize to monitor, manage, and enhance security protocols in line with industry standards in high-performance computing.

Share a project where you had to lead a team toward a technical goal.

Highlight your leadership skills and project management abilities, explaining your role in guiding the team, overcoming obstacles, and achieving the project objectives.

What challenges have you faced in managing datacenter environments, and how did you overcome them?

Describe specific challenges you faced and outline the solutions you implemented. This shows your problem-solving skills and adaptability in dynamic environments.

Can you explain the impact of server components on system performance?

Articulate your understanding of how various server components work together, and discuss your experiences optimizing these components to enhance overall system performance.


Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 2, 2025
