We’re building the infrastructure that powers GR00T, NVIDIA’s general-purpose humanoid robotics platform. This is not a typical DevOps job. You’ll help engineer the cloud-native backend that drives simulation, synthetic data generation, multi-stage model training, and robotic deployment, all at massive scale. Our orchestration system, NVIDIA OSMO, is built to handle real-time robotics workflows in cloud environments across thousands of GPUs. We’re looking for a pragmatic Kubernetes-native backend and infrastructure engineer who excels at solving complex orchestration problems in distributed AI/ML systems.
What you’ll be doing:
Architect, develop, and deploy backend services supporting NVIDIA GR00T using Kubernetes and cloud-native technologies.
Collaborate with ML, simulation, and robotics engineers to deploy scalable, reproducible, and observable multi-node training and inference workflows.
Extend and maintain OSMO’s orchestration layers to support heterogeneous compute backends and robotic data pipelines.
Develop Helm charts, controllers, CRDs, and service mesh integrations to support secure and fault-tolerant system operation.
Implement microservices written in Go or Python that power GR00T task execution, metadata tracking, and artifact delivery.
Optimize job scheduling, storage access, and networking across hybrid and multi-cloud Kubernetes environments (e.g., OCI, Azure, on-prem).
Build tooling that simplifies deployment, debugging, and scaling of robotics workloads.
What we need to see:
BS, MS, or PhD degree in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).
5+ years of work experience in DevOps, backend, or cloud infrastructure engineering.
Hands-on experience building and deploying microservices in Kubernetes-native environments.
Proficiency in Golang or Python, especially for backend systems and operators.
Experience with Helm or other Kubernetes templating and config management tools.
Familiarity with GitOps workflows, observability stacks (e.g., Prometheus, Grafana), and container CI/CD pipelines.
Strong understanding of container networking, storage (e.g., PVCs, ephemeral), and scheduling.
Ways to stand out from the crowd:
Experience with ML training workflows and distributed job orchestration (e.g., MPI, Ray, Triton Inference Server).
Knowledge of robotics frameworks (e.g., ROS2) or simulation tools (e.g., Isaac Sim, Omniverse).
Background with GPU cluster management and scheduling across cloud providers.
Contributions to open-source Kubernetes projects or custom operators/controllers.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you are creative and autonomous, we want to hear from you!
The base salary range is 148,000 USD - 287,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.