Inference.net is seeking a Senior Distributed ML Systems Engineer to join our team. This role involves developing large-scale, fault-tolerant distributed systems that handle millions of large language model inference requests per day. If you are passionate about developing next-generation ML systems that operate at scale, we want to hear from you.
You will be responsible for designing and implementing the core systems that power our globally distributed LLM inference network. You'll work on problems at the intersection of distributed systems, machine learning, and resource optimization.
About Inference.net
We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute that can be used for running large language models like DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network.
We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in-person from our office in downtown San Francisco. Our investors include a16z CSX and Multicoin. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard and deeply enjoy the work that we do.
Key Responsibilities
Design and implement scalable distributed systems for our inference network
Develop models for efficient resource allocation across a network of heterogeneous hardware with a rapidly changing topology
Optimize network latency, throughput, and availability
Build robust logging and metrics systems to monitor network health and performance
Conduct reviews of architecture and system design to ensure use of best practices
Collaborate with founders, engineers, and other stakeholders to improve our infrastructure and product offerings
What We're Looking For
Very strong problem-solving skills and ability to work in a startup environment
5+ years of experience building high-performance systems
Strong programming skills in TypeScript, Python, and one of Go, Rust, or C++
Solid understanding of distributed systems concepts
Knowledge of orchestrators and schedulers like Kubernetes and Nomad
Use of AI tooling in development workflow (ChatGPT, Claude, Cursor, etc.)
Experience with LLM inference engines like vLLM or TensorRT-LLM is a plus
Experience with GPU programming and optimization (CUDA experience is a plus)
Compensation
We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $180,000 - $250,000, plus equity and benefits, depending on experience.
Equal Opportunity
Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.