Lead Systems Engineer, High-Performance Computing

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and High-Performance Compute Platform Engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:

  • GPU as a Service and High-Performance Compute Platform Support: Expertise in deploying, managing, and optimizing GPU as a Service (GaaS) and high-performance compute platforms to support advanced computational workloads.
  • Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
  • Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
  • Enterprise Server and Component Expertise: In-depth knowledge of server components (storage/network controllers, HBA, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
  • Processor and GPU Systems Proficiency: Strong grasp of Intel/AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
  • Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, BIOS settings, and their impact on system performance and security in high-performance computing environments.
  • Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
  • Infrastructure Management and Automation: Proficient in installing, configuring, supporting, and maintaining compute infrastructure management tools, with skills in Ansible for automation to streamline deployment and operational tasks.
  • Performance Benchmarking and Tech Evaluation: Capable of running performance benchmarks and evaluating new technologies for various platforms (Linux, Windows, containerized, and virtualized) to ensure optimal performance.
  • Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
  • Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. Hybrid employees can alternate time between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$140,000 / year (est.)
Min: $120,000
Max: $160,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Systems Engineer, High-Performance Computing, Visa

As a Lead Systems Engineer specializing in High-Performance Computing at a leading technology firm in Ashburn, you'll be at the forefront of innovation, transforming engineering principles into practice. You will join our IaaS Systems and Storage & Engineering (ISSE) team, a vital part of the Operations & Infrastructure technology organization. In this role, your expertise will help us design and deploy cutting-edge server infrastructure while ensuring that security and availability are top priorities in all complex operational environments. Your creative problem-solving skills will be essential in managing and optimizing our GPU as a Service and high-performance compute platforms, allowing us to excel in supporting advanced computational workloads. With your extensive datacenter experience, you will ensure high availability and seamless performance across our geographically distributed IT infrastructures. You'll get to work hands-on with advanced technologies, from Intel/AMD architectures to out-of-band management, always focusing on both performance benchmarking and tech evaluation. If you thrive in a hybrid working environment and enjoy a blend of collaboration and independent problem-solving, we’d love to find out more about you. Together, let's drive business efficiency and explore the evolving landscape of high-performance computing.

Frequently Asked Questions (FAQs) for Lead Systems Engineer, High-Performance Computing Role at Visa
What qualifications are needed for the Lead Systems Engineer, High-Performance Computing position in Ashburn?

To excel as a Lead Systems Engineer in High-Performance Computing in Ashburn, candidates should have a robust background in systems engineering, particularly within high-performance computing environments. Essential qualifications include extensive experience with GPU as a Service, knowledge of x86 technologies and protocols, and proficiency in scripting languages like PowerShell and Python. Familiarity with datacenter management and hardware lifecycle management is also critical to succeed.

What are the main responsibilities of a Lead Systems Engineer in High-Performance Computing?

The primary responsibilities of a Lead Systems Engineer in High-Performance Computing involve designing and automating server infrastructure suited to business needs. You'll manage GPU as a Service platforms, engage in performance benchmarking, and oversee infrastructure management and automation. Mentoring junior staff while maintaining high security and availability in complex systems is also a key responsibility.

Is the Lead Systems Engineer role in High-Performance Computing a remote position?

The Lead Systems Engineer role in High-Performance Computing offers a hybrid working arrangement. While you can enjoy remote work flexibility, you are also expected to be in the office 2-3 days a week, depending on business needs, which suits those looking for a balance of remote and in-office work.

How does the Lead Systems Engineer in High-Performance Computing contribute to business success?

The Lead Systems Engineer significantly contributes to business success by creating efficient and secure operational environments that support advanced computational workloads. By leveraging their technical expertise in high-performance systems and infrastructure automation, they enhance operational performance, drive technological advancement, and improve overall business efficiency.

What skills are essential for success as a Lead Systems Engineer specializing in High-Performance Computing?

Essential skills for a Lead Systems Engineer in High-Performance Computing include advanced knowledge of high-performance computing systems, strong scripting abilities, expertise in server components, and excellent analytical and troubleshooting capabilities. Additionally, skills in team collaboration and independent problem-solving are crucial for success in this role.

Common Interview Questions for Lead Systems Engineer, High-Performance Computing
Can you describe your experience with GPU as a Service?

In answering this question, focus on specific projects you have managed that involved GPU as a Service. Highlight your role in optimizing performance, addressing challenges, and collaborating with key stakeholders. Discuss any specific technologies or platforms you've used and their impact.

How do you ensure high availability in distributed IT infrastructures?

To effectively answer this question, describe your methodologies for ensuring high availability, such as redundancy plans, load balancing, and failover systems. Provide real-world examples of how you've implemented these strategies in previous positions.

What scripting languages are you proficient in, and how have you used them effectively in your projects?

For this question, detail your experience with scripting languages such as PowerShell and Python. Share specific examples of scripts you've written to automate tasks or solve infrastructure challenges, illustrating the efficiencies gained.

Can you give an example of a performance benchmark you conducted?

When responding, provide a detailed account of a particular benchmarking project. Discuss the objectives, the benchmarks used, how you gathered and analyzed the data, and the outcomes that informed future technology decisions.

What measures do you take to maintain security in high-performance computing environments?

Stress your understanding of security protocols, including out-of-band management, and how you've implemented them in past roles. Share examples of security challenges you faced and the strategies used to mitigate risks effectively.

How do you approach hardware lifecycle management?

In your response, discuss your systematic approach to hardware lifecycle management, including monitoring hardware performance, coordinating upgrades, and ensuring firmware and driver certifications are up-to-date. Citing specific experiences will strengthen your answer.

Describe your experience with enterprise server components.

When answering this question, detail your hands-on experience with various server components and technologies, emphasizing any troubleshooting you've performed or complex configurations you've managed.

How do you prioritize tasks in a complex IT environment?

For this question, provide insight into your time management strategies and decision-making processes. Discuss how you assess urgency and importance and prioritize tasks effectively to meet deadlines.

What project management tools or practices do you use?

Mention specific project management methodologies (like Agile or Waterfall) and tools (such as JIRA or Trello) that you have utilized in your work to manage technical projects effectively. Share how they helped streamline communication and track progress.

How do you mentor junior engineers in your team?

When addressing this question, share your approach to mentoring, including any structured training or hands-on experience you provide. Discuss how you foster an environment of learning and innovation, highlighting any success stories from past mentoring relationships.

Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025