Let’s get started
By clicking ‘Next’, I agree to the Terms of Service
and Privacy Policy
Jobs / Job page
GPU Performance Engineer image - Rise Careers
Job details

GPU Performance Engineer

About the team

Adaptive ML is helping companies build singular generative AI experiences by democratizing the use of reinforcement learning. We are building the foundational technologies, tools, and products that allow models to learn directly from user interactions and self-improve based on simple guidelines. Our founders come from diverse backgrounds and previously worked together in creating state-of-the-art open-access large language models. We closed a $20M seed with Index & ICONIQ in early 2024 and are live with our first enterprise customers.

Our Technical Staff develops the foundational technology that powers Adaptive ML in alignment with requests and requirements from our Commercial and Product teams. We are committed to building robust, efficient technology and conducting at-scale, impactful research to drive our roadmap and deliver value to our customers.

About the role

As a GPU Performance Engineer in our Technical Staff, you will help ensure that our LLM stack (Adaptive Harmony) delivers state of the art performance across a wide variety of settings; from latency-bound regimes where serving requests with sub-second response times is key, to throughput-bound regimes during training and offline inference. You will help build the foundational technology powering Adaptive ML by delivering performance improvements directly to our clients as well as to our internal workloads. We are looking for self-driven, business-minded, and ambitious individuals interested in supporting real-world deployments of a highly technical product. As this is an early role, you will have the opportunity to shape our research efforts and product as we grow.

This role is ideally in-person at our Paris or New York office, but we are also open to fully remote work.

Your responsibilities

  • Build and maintain fast and robust GPU code, focusing on delivering performance improvements in real world applications;

  • Write high-quality software in CUDA, CUTLASS, or Triton with a focus on performance and robustness;

  • Profile dedicated GPU kernels, optimizing across latency/compute-bound regimes for complex workloads;

  • Contribute to our product roadmap, by identifying promising trends that can improve performance;

  • Report clearly on your work to a distributed collaborative team, with a bias for asynchronous written communication.

Your (ideal) background

The background below is only suggestive of a few pointers we believe could be relevant. We welcome applications from candidates with diverse backgrounds; do not hesitate to get in touch if you think you could be a great fit, even if the below doesn't fully describe you.

  • A M.Sc. /Ph.D. in computer science, or demonstrated experience in software engineering, preferably with a focus on GPU-optimization;

  • Strong programming skills, preferably with a focus on systems and general purpose GPU programming;

  • A track record of writing high performance kernels, having preferably demonstrated ability to reach state of the art performance on well defined tasks;

  • Contributions to relevant open-source projects, such as CUTLASS, Triton and MLIR;

  • Passionate about the future of generative AI, and eager to build foundational technology to help machines deliver more singular experiences.

Benefits

  • Comprehensive medical (health, dental, and vision) insurance;

  • 401(k) plan with 4% matching (or equivalent);

  • Unlimited PTO — we strongly encourage at least 5 weeks each year;

  • Mental health, wellness, and personal development stipends;

  • Visa sponsorship if you wish to relocate to New York or Paris.

Average salary estimate

$125000 / YEARLY (est.)
min
max
$100000K
$150000K

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About GPU Performance Engineer, Adaptive ML

At Adaptive ML, we're on a mission to revolutionize the AI landscape, and as a GPU Performance Engineer, you would be at the heart of it all! Based in the vibrant New York City, you'll join a talented team that is democratizing reinforcement learning to create exceptional generative AI experiences. This role requires you to dive deep into our LLM stack, Adaptive Harmony, ensuring that it performs exceptionally across various scenarios. Imagine optimizing GPU code to handle everything from millisecond latency during user requests to maximizing throughput during training sessions! Your expertise in CUDA, CUTLASS, or Triton will shine as you collaborate with both our Commercial and Product teams to implement performance enhancements that directly impact our clients and internal projects. Furthermore, being an early entrant means you'll have the unique opportunity to shape our product and research initiatives, facilitating real-world deployments of sophisticated technologies. If you're self-motivated, passionate about the future of generative AI, and equipped with skills in writing high-performance kernels, we want to hear from you! Plus, with perks like unlimited PTO, comprehensive health insurance, and even the possibility of visa sponsorship, this could be the chance you've been waiting for. Come help us build the future at Adaptive ML!

Frequently Asked Questions (FAQs) for GPU Performance Engineer Role at Adaptive ML
What does a GPU Performance Engineer do at Adaptive ML?

As a GPU Performance Engineer at Adaptive ML, you’ll optimize and enhance the performance of our LLM stack, Adaptive Harmony, by delivering robust GPU code and implementing performance improvements that support both our clients and internal workloads.

Join Rise to see the full answer
What qualifications are needed for the GPU Performance Engineer position at Adaptive ML?

Candidates for the GPU Performance Engineer role at Adaptive ML typically hold a M.Sc. or Ph.D. in computer science or have robust experience in software engineering with a focus on GPU optimization, showcasing strong programming skills, especially in CUDA or Triton.

Join Rise to see the full answer
What programming skills are essential for a GPU Performance Engineer at Adaptive ML?

Essential programming skills for a GPU Performance Engineer at Adaptive ML include proficiency in CUDA, CUTLASS, or Triton, along with a proven track record of writing high-performance GPU kernels that deliver exceptional performance in complex tasks.

Join Rise to see the full answer
How does the GPU Performance Engineer contribute to Adaptive ML's product roadmap?

The GPU Performance Engineer at Adaptive ML actively contributes to the product roadmap by identifying trends and insights that can lead to significantly improved performance, directly influencing the future trajectory of our technology and its applications.

Join Rise to see the full answer
What is the work environment like for a GPU Performance Engineer at Adaptive ML?

The work environment for a GPU Performance Engineer at Adaptive ML is dynamic and collaborative, often involving asynchronous communication with a distributed team, where self-driven individuals are encouraged to share their ideas and innovations.

Join Rise to see the full answer
Common Interview Questions for GPU Performance Engineer
Can you explain your experience with CUDA and how it relates to GPU optimization?

When answering this question, detail specific projects where you utilized CUDA for performance improvements. Highlight your strategies for optimizing GPU workloads and any challenges you encountered, showcasing your problem-solving skills.

Join Rise to see the full answer
What strategies do you use to profile and optimize GPU kernels?

Discuss specific tools and methodologies you've employed to profile GPU kernels, such as Nsight or similar profiling tools. Explain your approach to balancing between latency and throughput optimizations, providing insights about any benchmarks you've achieved.

Join Rise to see the full answer
What challenges have you faced while working on performance improvements in software engineering?

Reflect on a challenging situation related to performance improvements, outlining the problem, your approach to finding a solution, and the successful outcome. This illustrates your resilience and technical problem-solving abilities.

Join Rise to see the full answer
How do you approach writing high-performance GPU code?

Explain your coding practices that prioritize performance, such as minimizing memory access and ensuring coalescing in your code. You may also want to touch on how you adopt best practices from relevant open-source projects.

Join Rise to see the full answer
Why are you passionate about generative AI and how do you see its future?

Share your thoughts on generative AI's potential impact in various industries, incorporating recent advancements you find exciting. This demonstrates your engagement with the field and vision for how your work can contribute to it.

Join Rise to see the full answer
Describe your experience contributing to open-source projects.

Highlight specific open-source contributions, focusing on what your role was and the impact of your work. This showcases your collaborative spirit and ability to work within a community, which is crucial for technical roles.

Join Rise to see the full answer
How do you communicate complex technical information to a non-technical audience?

Discuss your approach to simplifying complex concepts, perhaps through analogies or visuals. Provide examples of previous experiences where you successfully conveyed technical ideas to stakeholders or clients.

Join Rise to see the full answer
What’s your experience with cutting-edge tools in GPU programming?

Discuss your familiarity with tools like Triton or MLIR and how you've applied them in your work. Provide examples of any outcomes that resulted from your use of these tools in improving performance or workflow efficiency.

Join Rise to see the full answer
How do you stay updated with advancements in GPU architecture?

Mention resources you rely on, such as journals, online courses, or community forums. Highlighting proactive learning shows your commitment to personal growth and adapting to changes in the tech landscape.

Join Rise to see the full answer
What role do you think a GPU Performance Engineer plays in a growing company like Adaptive ML?

Articulate the crucial role a GPU Performance Engineer has in driving product innovation and performance, emphasizing how such roles are foundational in tech development and align with business objectives in a startup-like environment.

Join Rise to see the full answer
Similar Jobs
Photo of the Rise User
Posted 2 days ago
Photo of the Rise User
Posted yesterday
Photo of the Rise User
Posted 6 days ago
Photo of the Rise User
Techo-Bloc Hybrid Douglassville, PA 19518, USA
Posted 8 days ago
Photo of the Rise User
Devoteam Remote Claude Debussylaan 10, 1082 MD Amsterdam, Netherlands
Posted 8 days ago
Photo of the Rise User
Posted 2 days ago
Photo of the Rise User
Aerotek Hybrid Rio Rancho, NM
Posted 3 days ago
Photo of the Rise User
Mission Driven
Social Impact Driven
Passion for Exploration
Reward & Recognition

Every user interaction, advancing your use case. Test, serve, monitor, and iterate on your large language models in your cloud with Adaptive ML.

3 jobs
MATCH
Calculating your matching score...
FUNDING
DEPARTMENTS
SENIORITY LEVEL REQUIREMENT
TEAM SIZE
EMPLOYMENT TYPE
Full-time, hybrid
DATE POSTED
December 25, 2024

Subscribe to Rise newsletter

Risa star 🔮 Hi, I'm Risa! Your AI
Career Copilot
Want to see a list of jobs tailored to
you, just ask me below!