
Senior High-Performance LLM Training Engineer

We are now looking for a Senior High-Performance LLM Training Engineer!

NVIDIA is seeking experienced engineers specializing in performance analysis and optimization to improve the efficiency of LLM training workloads, which are shaping the world's most advanced computing systems. This position focuses on optimizing NVIDIA’s LLM software stack in frameworks like PyTorch and JAX for high-performance training on thousands of GPUs, while also helping shape hardware roadmaps for the next generation of GPUs powering the AI revolution.

What you will be doing:

  • Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.

  • Understand the big picture of training performance on GPUs, prioritizing and then solving problems across all state-of-the-art neural networks.

  • Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.

  • Build and support NVIDIA submissions to the MLPerf Training benchmark suite.

  • Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.

  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

What we want to see:

  • PhD in Computer Science, Electrical Engineering, or Computer Engineering with 5+ years of meaningful work experience; or MS (or equivalent experience) with 8+ years.

  • Strong background in deep learning and neural networks, in particular training.

  • A deep background in computer architecture and familiarity with the fundamentals of GPU architecture.

  • Proven experience in analyzing and tuning application performance and in processor- and system-level performance modeling.

  • Programming skills in C++, Python, and CUDA.

GPU computing is the most productive and pervasive platform for deep learning and AI. It begins with the most advanced GPUs and the systems and software we build on top of them. We integrate and optimize every deep learning framework. We work with the major systems companies and every major cloud service provider to make GPUs available in data centers and in the cloud. We craft computers and software to bring AI to edge devices, such as self-driving cars and autonomous robots. AI has the potential to spur a wave of social progress unmatched since the industrial revolution.

Widely considered to be one of tech's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. Additionally, this opportunity offers you the ability to collaborate with some of the most forward-thinking and hard-working people in the world, shaping the future of AI in a creative and autonomous work environment that encourages innovation. If you're excited to work across the full hardware & software stack—from GPU architecture to application code—to achieve optimal performance, we want to hear from you!

#LI-Hybrid

The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA Glassdoor Company Review: 4.6 stars
NVIDIA DE&I Review: No rating
CEO of NVIDIA: Jensen Huang

Average salary estimate

$270,250 / year (est.)
Min: $184,000
Max: $356,500

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Senior High-Performance LLM Training Engineer, NVIDIA

NVIDIA is on the lookout for a Senior High-Performance LLM Training Engineer in Santa Clara, CA! If you're passionate about pushing the limits of AI and deep learning, this could be the perfect opportunity for you. In this role, you'll optimize NVIDIA's LLM software stack, particularly within frameworks like PyTorch and JAX, so that training workloads run efficiently on thousands of GPUs. You'll analyze and profile AI training workloads, tackle performance issues across state-of-the-art neural networks, and implement production-quality software across multiple layers of NVIDIA's deep learning platform. You'll also contribute to NVIDIA's submissions to the MLPerf Training benchmark suite and implement training workloads in processor and system simulators to enable future architecture studies.

To thrive in this role, you should hold a PhD in Computer Science, Electrical Engineering, or Computer Engineering with 5+ years of experience, or an MS (or equivalent experience) with 8+ years, alongside a solid background in deep learning and GPU architecture. Programming skills in C++, Python, and CUDA are essential. At NVIDIA, you'll collaborate with some of the brightest minds in tech in a creative, autonomous work environment, with a competitive salary and excellent benefits. If you're ready to leave your mark on the future of AI, we invite you to join us!

Frequently Asked Questions (FAQs) for Senior High-Performance LLM Training Engineer Role at NVIDIA
What are the main responsibilities of a Senior High-Performance LLM Training Engineer at NVIDIA?

As a Senior High-Performance LLM Training Engineer at NVIDIA, you will primarily focus on optimizing the high-performance LLM software stack to enhance training workloads on GPUs. This includes analyzing training performance across various neural networks, implementing production-quality software, and contributing to benchmark submissions for MLPerf. You'll collaborate closely with innovative hardware and software teams to tackle complex performance challenges.

What qualifications do I need to apply for the Senior High-Performance LLM Training Engineer position at NVIDIA?

To qualify for the Senior High-Performance LLM Training Engineer role at NVIDIA, you typically need a PhD in Computer Science, Electrical Engineering, or Computer Engineering, along with at least 5 years of relevant experience. Alternatively, a Master's degree with significant experience (8+ years) will also be considered. A strong foundation in deep learning, neural networks, and GPU architecture is critical.

What programming skills are required for the Senior High-Performance LLM Training Engineer role at NVIDIA?

Candidates for the Senior High-Performance LLM Training Engineer position at NVIDIA should be proficient in programming languages such as C++, Python, and CUDA. These skills are vital for effectively optimizing and analyzing performance on NVIDIA's deep learning platform and ensuring high-performance training on GPUs.

How does NVIDIA support diversity and inclusion in the workplace for the Senior High-Performance LLM Training Engineer role?

NVIDIA is committed to fostering a diverse and inclusive work environment. As a Senior High-Performance LLM Training Engineer, you will be part of a workplace that values diversity and does not discriminate on the basis of race, gender, or any other characteristic protected by law. This commitment extends to hiring and promotion practices and to creating an inclusive culture where every employee can thrive.

How does the compensation for the Senior High-Performance LLM Training Engineer position at NVIDIA compare in the industry?

The compensation for the Senior High-Performance LLM Training Engineer position at NVIDIA is competitive within the industry, with a base salary range of $184,000 to $356,500, depending on experience and location. Alongside competitive salaries, NVIDIA also offers equity options and a comprehensive benefits package that is designed to support employee well-being.

Common Interview Questions for Senior High-Performance LLM Training Engineer
What strategies do you use to optimize LLM training workloads?

To optimize LLM training workloads, I analyze profiling data to identify bottlenecks within the training process. This involves using tools to monitor GPU utilization and memory usage, and implementing techniques such as data parallelism and mixed precision training to enhance performance.

Can you describe your experience with performance analysis in deep learning?

Performance analysis has been central to my work in deep learning. I've used various profiling tools to assess computational efficiency, identified critical areas for improvement, and implemented optimizations that led to significant speed-ups in training times across multiple neural networks.

How do you approach implementing production-quality software?

Implementing production-quality software starts with a thorough understanding of system requirements and architecture. I focus on writing clean, maintainable code, conducting rigorous testing, and utilizing version control systems to ensure robust deployment in high-performance environments.

What is your experience with NVIDIA's deep learning frameworks?

I have worked with NVIDIA's deep learning software stack, particularly on optimizing workloads in frameworks like PyTorch and JAX. My experience includes integrating new features, enhancing performance, and ensuring that these frameworks effectively leverage NVIDIA's GPU capabilities.

Discuss a challenging performance issue you've encountered and how you resolved it.

One challenging performance issue I faced involved suboptimal GPU utilization during deep learning training. After profiling the workload, I discovered data transfer delays between CPU and GPU. I resolved the issue by optimizing data pipelines and employing asynchronous data loading, significantly improving model training efficiency.
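
As a rough illustration of the asynchronous loading idea in this answer, here is a minimal Python sketch using only the standard library. The names `prefetch` and `load_batches` are hypothetical, and the toy batches stand in for real I/O and preprocessing. An actual training pipeline would more likely rely on a framework's own data loader, so treat this as a sketch of the overlap pattern rather than a recommended implementation.

```python
import queue
import threading


def prefetch(batches, depth=2):
    """Yield items from `batches` while a background thread loads ahead.

    `depth` (an assumed tuning knob) bounds how many batches may wait in
    the queue, so the producer cannot run arbitrarily far ahead.
    """
    q = queue.Queue(maxsize=depth)
    sentinel = object()  # marks the end of the stream

    def producer():
        try:
            for batch in batches:
                q.put(batch)  # blocks while the queue is full
        finally:
            q.put(sentinel)  # always signal completion to the consumer

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item


def load_batches(n):
    """Hypothetical stand-in for reading and preprocessing n batches."""
    for i in range(n):
        yield [i, i + 1]


# The consumer (here a trivial sum) runs while the next batch is prepared.
results = [sum(batch) for batch in prefetch(load_batches(5))]
```

The bounded queue is the key design choice: it lets loading overlap with compute while capping memory use. Batches still arrive in their original order because a single producer feeds a FIFO queue.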

How do you stay current with advancements in GPU architecture and performance optimization?

To stay current with advancements in GPU architecture, I regularly read research papers, attend conferences, and participate in online courses. Engaging with developer communities also provides insights into the latest techniques in performance optimization that I can apply to my projects.

What tools do you find essential for workload analysis?

For workload analysis, I find tools like NVIDIA Nsight Systems and TensorBoard invaluable. These tools provide rich profiling capabilities, enabling detailed tracking of resource usage and pinpointing areas that need optimization for better performance.

How do you implement automation in workload optimization?

I implement automation in workload optimization by developing scripts and using tools to streamline repetitive tasks, such as performance benchmarking and workload analysis. This enables faster iteration cycles and allows the team to focus on high-level optimization strategies.
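
For instance, a small benchmarking script of the kind described here might look like the sketch below. Everything in it is an assumption made for illustration: the `benchmark` helper, its `warmup` and `iters` parameters, and the `toy_step` workload are hypothetical. Timing real GPU work would also need device synchronization, which this CPU-only example leaves out.

```python
import statistics
import time


def benchmark(fn, warmup=2, iters=10):
    """Time `fn` repeatedly and return simple summary statistics."""
    # Warm-up runs are discarded so one-time costs do not skew the samples.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {
        "iters": iters,
        "min_s": min(samples),
        "median_s": statistics.median(samples),
    }


def toy_step():
    """Hypothetical stand-in for one training step."""
    return sum(i * i for i in range(10_000))


report = benchmark(toy_step)
print(f"median {report['median_s']:.6f}s over {report['iters']} runs")
```

Reporting the median rather than the mean keeps a single slow outlier from dominating the result, which matters when the same script is rerun to compare changes.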

What considerations do you keep in mind when shaping hardware roadmaps for new GPU architectures?

When shaping hardware roadmaps for new GPU architectures, I consider current performance metrics, workload requirements, and user feedback. It's essential to analyze trends in AI applications to anticipate future needs and ensure that the architecture can address next-generation workloads effectively.

Explain your understanding of mixed precision training.

Mixed precision training involves using both 16-bit and 32-bit floating-point numbers during model training to balance performance and accuracy. This allows for faster computations and reduced memory consumption, particularly beneficial when training large deep learning models on GPUs.
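
To make the trade-off concrete, the sketch below uses Python's standard `struct` module to emulate 16-bit and 32-bit rounding. It is an illustration under simplifying assumptions, not how any framework implements mixed precision: the helper names, the loss scale of 1024, and the toy gradient and update sizes are all made up for the example. It shows two commonly cited reasons for keeping some values in 32-bit: tiny gradients underflow in 16-bit unless they are scaled, and small weight updates are rounded away when the weights themselves are stored in 16-bit.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 1) Loss scaling: a tiny gradient underflows to zero in FP16 ...
tiny_grad = 1e-8
loss_scale = 1024.0                        # assumed scale factor
unscaled = to_fp16(tiny_grad)              # underflows to 0.0
# ... but survives if it is scaled up before the FP16 rounding.
scaled = to_fp16(tiny_grad * loss_scale)
recovered = scaled / loss_scale            # unscale in higher precision

# 2) Master weights: small updates vanish when the weight lives in FP16.
update = 1e-4                              # smaller than FP16 spacing near 1.0
w16 = to_fp16(1.0)
w32 = to_fp32(1.0)
for _ in range(1000):
    w16 = to_fp16(w16 + update)            # rounds back to 1.0 every step
    w32 = to_fp32(w32 + update)            # accumulates as expected

print(unscaled, recovered, w16, w32)
```

In this toy setup the FP16 weight never moves, while the FP32 copy ends close to 1.1, which is why mixed precision schemes typically keep a 32-bit copy of the weights alongside the 16-bit compute.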

Similar Jobs
Posted 3 days ago

Join NVIDIA as a Senior ASIC Design Engineer to work on innovative DFT solutions for complex semiconductor chips.

NVIDIA is looking for a Senior VLSI CAD R&D Engineer to enhance and innovate algorithms for advanced gate-level analysis tools.

Posted 10 days ago

Lead AECOM’s Civil and Structural Engineering Design Team for nuclear projects, shaping the industry's future through innovative solutions.

Medtronic Hybrid Dexter, Michigan, United States of America
Posted 4 days ago

Join Medtronic as a Manufacturing Engineer II, where you'll enhance manufacturing processes for critical healthcare equipment in a collaborative environment.

Posted 12 days ago

As a Senior Manager, Engineering at Beacon Biosignals, you will lead a team of engineers in a mission-driven company transforming brain treatment through innovative technology.

Boeing Hybrid US, Saint Louis County, MO; Missouri, Berkeley, MO
Posted 6 days ago

Join Boeing as a Senior Level Ground Hardware Architect to lead innovative developments in Ground Segment hardware architecture.

Join Charm Industrial as a Pyrolyzer Operator, where you'll operate innovative machinery to contribute to impactful climate solutions.

A dynamic Solution Architect opportunity awaits with a leading global food and services company, focusing on comprehensive solution design.

Seeking a skilled Commissioning Engineering Manager to lead commissioning processes in data center construction projects in San Antonio.

Posted 3 days ago

Become a part of NVIDIA's innovative team as a Senior VLSI CAD Engineer and help shape the future of AI hardware design.

NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.

359 jobs
BADGES
Changemaker, Diversity Champion, Family Friendly, Global Citizen, Work & Life Balance
CULTURE VALUES
Customer-Centric
Mission Driven
Inclusive & Diverse
Rise from Within
Diversity of Opinions
Work/Life Harmony
Growth & Learning
Transparent & Candid
BENEFITS & PERKS
Medical Insurance
Paid Time-Off
Maternity Leave
Mental Health Resources
Equity
Child Care stipend
Paternity Leave
WFH Reimbursements
Flex-Friendly
Dental Insurance
Vision Insurance
Life insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
401K Matching
Military leave
EMPLOYMENT TYPE
Full-time, on-site
DATE POSTED
April 9, 2025
