Job details

Software Engineer, SystemML - AI Networking

In this role, you will be a member of the AI Networking Software team, part of the larger DC networking organization. The team develops and owns the software stack around NCCL (NVIDIA Collective Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL is integrated into PyTorch and sits on the critical path of multi-GPU distributed training. In other words, nearly every distributed GPU-based ML workload in Meta production goes through the software stack the team owns. At a high level, the team aims to enable Meta-wide ML products and innovations to leverage our large-scale GPU training and inference fleet through an observable, reliable, and high-performance distributed AI/GPU communication stack. Currently, one of the team's focus areas is building customized features, software benchmarks, performance tuners, and software stacks around NCCL and PyTorch to improve full-stack distributed ML reliability and performance (e.g. large-scale GenAI/LLM training), from the trainer down to the inter-GPU and network communication layer. We are seeking engineers to work on GenAI/LLM scaling reliability and performance.

Responsibilities

Tech-lead the development of the collective communication library on Meta's large-scale GPU training infrastructure, with a focus on GenAI/LLM scaling

Qualifications

Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
Proven C/C++ and Python programming skills
Proven track record of leading successful projects
Effective leadership and communication skills
Specialized experience in one or more of the following machine learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, performance optimizations, or Machine Learning frameworks (e.g. PyTorch)
PhD in Computer Science, Computer Engineering, or relevant technical field
Experience with NCCL and distributed GPU performance analysis on RoCE/Infiniband
Experience working with DL frameworks like PyTorch, Caffe2 or TensorFlow
Experience with both data parallel and model parallel training, such as Distributed Data Parallel, Fully Sharded Data Parallel (FSDP), Tensor Parallel, and Pipeline Parallel
Experience in AI framework and trainer development on accelerating large-scale distributed deep learning models
Experience in HPC and parallel computing
Knowledge of GPU architectures and CUDA programming
Knowledge of ML, deep learning and LLM

Average salary estimate

$135,000 / yearly (est.)
Min: $110,000
Max: $160,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

Meta's mission is to build the future of human connection and the technology that makes it possible.

CULTURE VALUES
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Take Risks
Collaboration over Competition
Fast-Paced
Growth & Learning
Transparent & Candid
Feedback Forward
Dare to be Different
BENEFITS & PERKS
Medical Insurance
Paid Time-Off
Maternity Leave
Mental Health Resources
Equity
Paternity Leave
Flex-Friendly
Snacks
Social Gatherings
Company Retreats
Fitness Stipend
Paid Holidays
Summer Fridays
Work Visa Sponsorship
Bias Training
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Dental Insurance
Life insurance
EMPLOYMENT TYPE
Full-time, on-site
DATE POSTED
April 22, 2025
