Job details

Senior On-Device Model Inference Optimization Engineer

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are seeking a highly skilled Senior On-Device Model Inference Optimization Engineer to join our team and lead efforts to improve the performance and efficiency of the AI models that enable the next generation of autonomous vehicle technology at NVIDIA!

What you'll be doing:

Develop and implement strategies to optimize AI model inference for on-device deployment.
Employ techniques like pruning, quantization, and knowledge distillation to minimize model size and computational demands.
Optimize performance-critical components using CUDA and C++.
Collaborate with cross-functional teams to align optimization efforts with hardware capabilities and deployment needs.
Benchmark inference performance, identify bottlenecks, and implement solutions.
Research and apply innovative methods for inference optimization.
Adapt models for diverse hardware platforms and operating systems with varying capabilities.
Create tools to validate the accuracy and latency of deployed models at scale with minimal friction.
Recommend and implement model architecture changes to improve the accuracy-latency balance.

What we need to see:

MSc or PhD in Computer Science, Engineering, or a related field, or equivalent experience.
Over 5 years of proven experience specializing in model inference and optimization.
10+ years of overall work experience in a relevant area.
Expertise in modern machine learning frameworks, particularly PyTorch, ONNX, and TensorRT.
Proven experience in optimizing inference for transformer and convolutional architectures.
Strong programming proficiency in CUDA, Python, and C++.
In-depth knowledge of optimization techniques, including quantization, pruning, distillation, and hardware-aware neural architecture search.
Skilled in building and deploying scalable, cloud-based inference systems.
Passionate about developing efficient, production-ready solutions with a strong focus on code quality and performance.
Meticulous attention to detail, ensuring precision and reliability in safety-critical systems.
Strong collaboration and communication skills for working effectively across multidisciplinary teams.
A proactive, diligent mindset with a drive to tackle complex optimization challenges.

Ways to stand out from the crowd:

Publications or industry experience in optimizing and deploying model inference at scale.
Hands-on expertise in hardware-aware optimizations and accelerators such as GPUs, TPUs, or custom ASICs.
Active contributions to open-source projects focused on inference optimization or machine learning frameworks.
Experience in designing and deploying inference pipelines for real-time or autonomous systems.

The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA

NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.
