
Lead Systems Engineer, High-Performance Computing

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and High-Performance Compute Platform Engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:

  • GPU as a Service and High-Performance Compute Platform Support: Expertise in deploying, managing, and optimizing GPU as a Service (GaaS) and high-performance compute platforms to support advanced computational workloads.
  • Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
  • Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
  • Enterprise Server and Component Expertise: In-depth knowledge of server components (storage/network controllers, HBA, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
  • Processor and GPU Systems Proficiency: Strong grasp of Intel/AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
  • Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, BIOS settings, and their impact on system performance and security in high-performance computing environments.
  • Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
  • Infrastructure Management and Automation: Proficient in installing, configuring, supporting, and maintaining compute infrastructure management tools, with skills in Ansible for automation to streamline deployment and operational tasks.
  • Performance Benchmarking and Tech Evaluation: Capable of running performance benchmarks and evaluating new technologies for various platforms (Linux, Windows, containerized, and virtualized) to ensure optimal performance.
  • Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
  • Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Average salary estimate

$135,000 / yearly (est.)
Min: $120,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Lead Systems Engineer, High-Performance Computing, Visa

As a Lead Systems Engineer for High-Performance Computing at our Ashburn office, you'll play a pivotal role in our IaaS Systems and Storage & Engineering (ISSE) team, part of the Operations & Infrastructure technology organization. Here, you'll be at the forefront of Distributed Compute engineering, driving innovation in our high-performance computing platform. Your mission is to design and automate server infrastructure that not only meets complex business requirements but also enhances security and availability in a fast-paced operational environment. You'll leverage your extensive experience with GPU as a Service, particularly in deploying, managing, and optimizing our computing platforms to support advanced computational workloads. We value a strategic mindset paired with strong technical expertise in datacenter management. Your familiarity with x86 technologies and with NVMe, GPU, and PCIe protocols will be vital in ensuring our systems operate efficiently. Additionally, your skills in infrastructure management and automation, especially with tools like Ansible, will streamline our deployment processes. If you're excited about performance benchmarking across multiple platforms and enjoy collaborating with cross-functional teams while also thriving in independent tasks, this hybrid position could be a perfect fit for you. Join us in our mission to become industry leaders in secure and efficient operations environments!

Frequently Asked Questions (FAQs) for Lead Systems Engineer, High-Performance Computing Role at Visa
What are the main responsibilities of a Lead Systems Engineer for High-Performance Computing?

The Lead Systems Engineer for High-Performance Computing is primarily responsible for the design and automation of server infrastructure. This includes deploying and managing GPU as a Service, ensuring high availability in complex IT infrastructures, and performing advanced technical evaluations to optimize performance across multiple platforms. The role requires collaboration with domain experts to maintain stringent security protocols while improving operational efficiency.

What qualifications are needed for the Lead Systems Engineer role in High-Performance Computing?

To excel as a Lead Systems Engineer for High-Performance Computing, candidates should possess extensive experience in high-performance computing systems, particularly with x86 architectures and GPU technologies. A strong understanding of datacenter management, in-depth knowledge of server components, and proficiency in scripting languages like PowerShell or Python are essential. Additionally, familiarity with automation tools like Ansible is a significant advantage.

What skills are essential for a successful Lead Systems Engineer in High-Performance Computing?

Key skills for the Lead Systems Engineer position in High-Performance Computing include advanced technical knowledge of server components and architectures, strong analytical skills for troubleshooting complex issues, and proficiency with performance benchmarking tools. A collaborative spirit combined with the ability to work independently is also critical, alongside a commitment to maintaining a secure and efficient operational environment.

How does teamwork play a role in the Lead Systems Engineer position in High-Performance Computing?

Teamwork is fundamental to the Lead Systems Engineer role in High-Performance Computing. You'll collaborate with various technology domain experts to ensure that the server infrastructure aligns with security and operational requirements. While individual contributions are valued, the ability to mentor junior staff and promote a culture of teamwork makes the engineering processes more effective.

What is the work environment for the Lead Systems Engineer in High-Performance Computing?

The work environment for the Lead Systems Engineer in High-Performance Computing is hybrid, allowing for a mix of remote work and onsite presence. This flexibility aims to enhance productivity while facilitating collaboration among team members. Team members are expected to work from the office 2-3 preset days a week, ensuring effective communication and teamwork while fulfilling business needs.

Common Interview Questions for Lead Systems Engineer, High-Performance Computing
Can you describe your experience with high-performance computing systems?

When answering this question, highlight specific projects where you designed, implemented, or optimized high-performance computing systems. Discuss the tools you used, the challenges you faced, and how you resolved them, and relate each point back to the hands-on skills and technical knowledge relevant to the Lead Systems Engineer role.

How do you ensure high availability and security in server infrastructure?

Detail your strategies for ensuring high availability and security within server infrastructure. Discuss specific monitoring tools, protocols, and standards you apply, such as your experience with GPU as a Service or x86 technologies, and how you implement robust security measures during deployment and configuration.

What automation tools have you used in your previous roles?

Indicate the automation tools you have experience with, particularly emphasizing Ansible or any scripting languages such as PowerShell and Python. Provide examples of how you’ve used them to streamline deployments or optimize operational tasks, which is critical for the Lead Systems Engineer position.

How do you approach performance benchmarking for computational workloads?

Describe your methodology for performance benchmarking. Give examples of benchmarks you've performed, the metrics you measured, and how this impacted overall system performance. Relating your approach to the technologies relevant to the Lead Systems Engineer role will demonstrate your practical experience.

Can you discuss a time you resolved a complex technical issue?

Share a specific instance where you encountered a challenging technical problem, outlining the steps you took to diagnose and solve it. Focus on your analytical skills and technical expertise, tying the experience back to competencies needed in the Lead Systems Engineer role.

What is your experience with GPU as a Service deployments?

Talk about any direct experience you have with the deployment and management of GPU as a Service. Discuss the challenges you faced, the configurations you implemented, and any optimizations you achieved that will underline your value for the role of Lead Systems Engineer.

How do you prioritize tasks in a fast-paced technical environment?

Explain your approach to task prioritization, especially in situations with competing deadlines or high-pressure scenarios. Share specific techniques you employ—such as using task management tools or creating priority matrices—to ensure efficient and effective work outputs relevant to the Lead Systems Engineer's responsibilities.

What is your understanding of hardware lifecycle management?

Explain your comprehension of hardware lifecycle management, highlighting your experience with firmware updates, OS driver certifications, and maintaining compute resources. Show how this knowledge is pertinent for the Lead Systems Engineer role by discussing its significance in ensuring system reliability.

Describe your scripting experience and how it applies to your technical work.

Discuss the languages you are proficient in, like PowerShell or Python, and provide examples of how you've used these skills to solve problems, automate tasks, or enhance efficiency in past roles. Relating this back to the Lead Systems Engineer position will demonstrate your practical capabilities.

How do you handle feedback and mentoring of junior engineers?

Share your philosophy on feedback and mentoring, citing specific instances where you guided junior engineers or colleagues. Highlight the significance of teamwork in the Lead Systems Engineer role, showcasing your capacity to cultivate a supportive learning environment.

Visa Inc. operates as a payments technology company worldwide. The company facilitates commerce through the transfer of value and information among consumers, merchants, financial institutions, businesses, strategic partners, and government entiti...

Employment type: Full-time, hybrid
Date posted: April 3, 2025
