Job details

Lead Systems Engineer, High-Performance Computing

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and High-Performance Compute Platform Engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:

  • GPU as a Service and High-Performance Compute Platform Support: Expertise in deploying, managing, and optimizing GPU as a Service (GaaS) and high-performance compute platforms to support advanced computational workloads.
  • Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
  • Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
  • Enterprise Server and Component Expertise: In-depth knowledge of server components (storage/network controllers, HBA, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
  • Processor and GPU Systems Proficiency: Strong grasp of Intel/AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
  • Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, BIOS settings, and their impact on system performance and security in high-performance computing environments.
  • Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
  • Infrastructure Management and Automation: Proficient in installing, configuring, supporting, and maintaining compute infrastructure management tools, with skills in Ansible for automation to streamline deployment and operational tasks.
  • Performance Benchmarking and Tech Evaluation: Capable of running performance benchmarks and evaluating new technologies for various platforms (Linux, Windows, containerized, and virtualized) to ensure optimal performance.
  • Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
  • Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. Hybrid employees can alternate between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 / year (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Systems Engineer, High-Performance Computing, Visa

If you're a passionate tech wizard with a knack for automation and server infrastructure design, we at the IaaS Systems and Storage & Engineering (ISSE) team are excited to welcome you as our Lead Systems Engineer in High-Performance Computing based in Ashburn! Our Distributed Compute Engineering (DCE) unit is where you'll get to utilize your hands-on technical skills to redefine operational environments for efficiency and security. Imagine being at the forefront of deploying and optimizing GPU as a Service (GaaS) while managing complex datacenter infrastructures across diverse geographic locations. With your expertise in x86 technologies, BIOS settings, and hardware lifecycle management, you’ll maintain high availability in highly secure environments, driving technological advancement and business success. Your love for innovation will shine as you utilize your advanced scripting skills in PowerShell and Python to automate processes and elevate our performance benchmarks. We believe in teamwork, so while you’ll often work independently, collaboration with various tech domain experts will be crucial to ensure that we're always on the cutting edge. Plus, we appreciate the flexibility of hybrid work environments and believe that embracing the balance between office and remote work can foster creativity and productivity. If you’re ready to take your career to the next level and make a significant impact in the High-Performance Computing realm, come join us on this exciting journey!

Frequently Asked Questions (FAQs) for Lead Systems Engineer, High-Performance Computing Role at Visa
What are the main responsibilities of a Lead Systems Engineer in High-Performance Computing at ISSE?

As a Lead Systems Engineer in High-Performance Computing at ISSE, your primary responsibilities include designing and automating the implementation of server infrastructure to meet business requirements. You'll manage GPU as a Service (GaaS) and high-performance compute platforms, ensuring optimal performance in secure environments. You'll also be involved in infrastructure management, performance benchmarking, and collaborating with technology domain experts to uphold rigorous security and availability standards.

What qualifications are necessary for the Lead Systems Engineer position at ISSE?

To qualify for the Lead Systems Engineer position in High-Performance Computing at ISSE, you should have a profound understanding of high-performance computing systems, extensive datacenter experience, and expertise in managing IT infrastructures. Proven skills in scripting, automation (like Ansible), and knowledge of x86 technologies, particularly around server components, are essential. A background in evaluating new technologies and a proactive approach to hardware lifecycle management will also be beneficial.

How does a Lead Systems Engineer contribute to security in high-performance computing environments at ISSE?

In the role of Lead Systems Engineer at ISSE, you contribute to security in several ways. Your work in maintaining high availability and security in operational environments is vital. You will utilize your advanced knowledge of hardware-level security, BIOS settings, and out-of-band management practices to enhance system reliability and mitigate risks, ensuring that the infrastructure is robust and resilient against vulnerabilities.

What tools and technologies should a Lead Systems Engineer at ISSE be familiar with?

A Lead Systems Engineer in High-Performance Computing at ISSE should be familiar with a variety of tools and technologies, including Ansible for automation, PowerShell and Python for scripting, and key protocols like NVMe and PCIe for high-performance systems. Proficiency in managing datacenter environments, as well as a strong grasp of Intel/AMD architectures and GPU systems, will form the foundation of your role in optimizing compute platforms.

What is the work culture like for a Lead Systems Engineer at ISSE?

The work culture for a Lead Systems Engineer at ISSE is collaborative and innovative, emphasizing teamwork while also valuing individual contributions. With a hybrid work model, you'll enjoy the flexibility of alternating between remote and office work. The environment is dynamic, offering opportunities for continuous learning and professional development, where creativity and efficiency are championed to drive the overall success of high-performance computing initiatives.

Common Interview Questions for Lead Systems Engineer, High-Performance Computing
Can you explain your experience with GPU as a Service?

When answering this question, detail specific projects where you deployed and optimized GaaS. Discuss the architecture you used, challenges you faced, and how you overcame them, showcasing your problem-solving skills and technical expertise.

What strategies do you employ for ensuring high availability in distributed IT infrastructures?

Describe your approach, including redundant systems, failover processes, and disaster recovery plans. Highlight tools and technologies you've used to monitor and maintain system health to minimize downtime and maximize availability.

How do you keep your technical knowledge up-to-date in the rapidly evolving field of high-performance computing?

Mention specific resources like industry blogs, forums, webinars, or certifications that you utilize. Highlight your engagement in professional communities and how continuous learning influences your work, demonstrating a proactive approach to professional growth.

What experience do you have with automation tools like Ansible?

Share detailed examples of how you’ve used Ansible to streamline deployment processes or reduce manual tasks. Discuss specific playbooks you’ve written and the impact of automation on enhancing operational efficiencies in your previous roles.

How do you evaluate new technologies for deployment in high-performance computing environments?

Discuss your systematic approach to evaluating new technologies, including performance benchmarking, testing methodologies, and criteria for compatibility with existing infrastructure. Outline how collaborative evaluations with your team enhance the decision-making process.

Can you provide an example of a complex problem you resolved in a high-performance computing environment?

Prepare a STAR (Situation, Task, Action, Result) format response. Explain the context, what you identified as the issue, how you addressed it, and the positive outcome that resulted from your efforts.

What role does scripting play in your work as a Systems Engineer?

Highlight the scripting languages you are proficient in and how you use them to automate tasks, improve efficiency, and enhance system interactions. Provide examples of specific scripts you've crafted to solve problems or streamline processes.

How do you approach hardware lifecycle management in a high-performance setup?

Discuss your strategies for lifecycle management, such as regular audits, firmware updates, and OS driver certifications. Highlight how these practices ensure optimal performance and reliability of the computing resources in your environment.

Describe your experience with statistical analysis of performance metrics.

Share how you collect, analyze, and leverage performance metrics to make informed decisions about system improvements, resource allocations, and workload distributions, emphasizing the significance of data-driven insights.

What is your approach to mentoring junior staff in your team?

Discuss your philosophy on mentorship, which could include regular check-ins, knowledge sharing sessions, and providing constructive feedback. Highlight how you help junior staff develop their skills and confidence while fostering a collaborative team environment.

Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025
