At Replicate, we’re on a mission to redefine AI infrastructure. We’re not just another AI company; we’re a team of developers, engineers, and innovators from organizations like Docker, Spotify, Dropbox, GitHub, Heroku, NVIDIA, and more. We’ve built foundational technologies like Docker Compose and OpenAPI, and now, we’re applying that expertise to make AI deployment as intuitive and reliable as web deployment.
Our goal is straightforward: build the best platform for creating, deploying, and running machine learning models. As an Infrastructure Engineer on the Platform team, you’ll play a key role in making generative AI available to everyone.
The Platform team at Replicate oversees the entire lifecycle of models, from packaging and deployment to serving, scaling, and monitoring. You’ll be developing the infrastructure that supports thousands of models and powers millions of predictions daily. This is a chance to build something truly innovative, where each decision you make has a tangible impact and allows your creativity to shine.
What you’ll be doing:
Designing and building our deployment and model-serving platform.
Building technology to operate the latest advancements in the ML and AI space.
Designing systems to maximize the utilization and reliability of our Kubernetes clusters and GPUs, including multi-regional traffic shifting and failover capabilities.
Owning and optimizing fair and reliable task allocation and queuing across a diverse set of customers with heterogeneous workloads.
Working with our Models team to speed up model inference through techniques like caching, weights management, machine configurations, and runtime optimizations in Python and PyTorch.
Working with technologies such as:
Python, Go, and Node.js
Kubernetes and Terraform
Redis, Google BigQuery, and PostgreSQL
We're looking for the right person rather than someone who just checks boxes, but it's likely you have…
Experience building platforms at scale.
Worked in complex systems with many moving parts; you have opinions on monoliths vs. services.
Designed and implemented developer-friendly APIs to enable scalable and reliable integration.
Hands-on experience setting up and operating Kubernetes.
A passion for building tools that empower developers.
Strong communication and collaboration skills, with the ability to understand customer needs and distill complex topics into clear, actionable insights. We believe that most of programming isn’t just about writing code; building a platform requires a collaborative approach.
At least 3 years of full-time software engineering experience.
These aren’t hard requirements, but we definitely want to talk with you if…
You have worked on machine learning platform teams in the past.
You have experience working with or on teams that have put ML/AI into production, even though this role does not entail building ML models directly.
You have some exposure to serving Generative AI features where GPUs are costly commodities and workloads can take significant time to finish.
This role can be remote (anywhere in the United States) or in-person. We have a strong preference for people in PST. If possible, we like people to come into our San Francisco office at least 3 days a week.
Machine learning can now do some extraordinary things, but it's still hard to use. You spend all day battling with messy Python scripts, broken Colab notebooks, perplexing CUDA errors, misshapen tensors. It's a mess. The reason machine learning is s...