At NVIDIA, we are at the forefront of the constantly evolving field of large language models and their application in agentic and reasoning use cases. As the scale and complexity of these LLM systems continue to increase, we are seeking outstanding engineers to join our team and help shape the future of LLM inference.
Our team is dedicated to pushing the boundaries of what's possible with LLMs by improving the algorithmic performance and efficiency of systems that represent them. We constantly reflect on how to improve these systems, developing new inference algorithms and protocols, improving existing models, and seamlessly integrating improvements to ensure NVIDIA's solutions can efficiently handle large-scale, sophisticated tasks.
What you'll be doing:
Research and Development: Explore and incorporate contemporary research on generative AI, agents, and inference systems into the NVIDIA LLM software stack.
Workload Analysis and Optimization: Conduct in-depth analysis, profiling, and optimization of agentic LLM workloads to significantly reduce request latency and increase request throughput while maintaining workflow fidelity.
System Design and Implementation: Design and implement scalable systems to accelerate agentic workflows and efficiently handle sophisticated datacenter-scale use cases.
Collaboration and Communication: Inform future iterations of NVIDIA software, hardware, and systems by engaging with a diverse set of teams at NVIDIA and external partners, and by formalizing the strategic requirements presented by their workloads.
What we need to see:
BS, MS, or PhD in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).
8+ years of experience in deep learning and deep learning systems design.
Proficiency in Python and C++ programming.
Strong understanding of computer architecture and GPU/parallel datacenter computing fundamentals.
Proven interest in analyzing, modeling, and tuning application performance.
Ways to stand out from the crowd:
Experience in building large-scale LLM inference systems, especially those involving compound AI.
Experience with processor and system-level performance modeling.
GPU programming experience with CUDA or OpenCL.
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent.
The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.