At Modal, we build foundational technology, including an optimized container runtime, a GPU-aware scheduler, and a distributed file system.
We're a small team based out of New York, Stockholm, and San Francisco, and have raised over $23M. Our team includes creators of popular open-source projects (e.g., Seaborn, Luigi), academic researchers, international olympiad medalists, and engineering and product leaders with decades of experience.
We are looking for strong engineers with experience making ML systems performant at scale. If you are interested in contributing to open-source projects and to Modal's container runtime to push language and diffusion models toward higher throughput and lower latency, we'd love to hear from you!
What we offer:
Work in person in our NYC, San Francisco, or Stockholm office
Full medical, dental, and vision insurance
Competitive salary and equity
What we're looking for:
5+ years of experience writing high-quality production code.
Experience working with PyTorch, Hugging Face libraries, and modern inference engines (e.g., vLLM or TensorRT).
Familiarity with NVIDIA GPU architecture and CUDA.
Familiarity with low-level operating system foundations (Linux kernel, file systems, containers, etc.).
Experience with ML performance engineering (tell us a story of when you pushed GPU utilization higher!).
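For a flavor of that last point, here is a minimal micro-benchmark sketch of the kind of measurement ML performance work starts from (the function name and parameters are ours, not Modal's): time a batch of matmuls and convert to achieved TFLOP/s, which you can then compare against the device's peak to estimate utilization.

```python
import time
import torch

def matmul_tflops(n: int = 1024, iters: int = 10) -> float:
    """Time n x n matmuls and return achieved TFLOP/s.

    Multiplying two n x n matrices costs ~2 * n^3 FLOPs, so the
    achieved rate divided by the device's peak rate gives a rough
    utilization number.
    """
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    a = torch.randn(n, n, device=device, dtype=dtype)
    b = torch.randn(n, n, device=device, dtype=dtype)

    for _ in range(3):  # warm-up so one-time init isn't timed
        a @ b
    if device == "cuda":
        torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    if device == "cuda":
        # GPU kernel launches are asynchronous; wait before stopping the clock
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    return (2 * n**3 * iters) / elapsed / 1e12
```

Note the synchronize calls: without them, the timer only measures kernel launch overhead rather than execution, a classic pitfall in GPU benchmarking.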