Senior / Principal Recommendations Infrastructure Engineer - ML Platform

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.

At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We’re on a mission to connect a billion people with optimism and civility, and looking for amazing talent to help us get there. 

A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.

As a Senior / Principal Recommendations Infrastructure Engineer on ML Platform you will build the next generation of ML Ecosystem Tooling for recommendation systems. ML Platform today supports billions of requests per day across our homepage, marketplace, economy, and more. We are looking for accomplished engineers to help build out the next generation of ML platform tooling for recommender systems in a quickly innovating space.

You Are:

  • An engineer with 4+ years of professional experience and a tool chest of system design experience to draw on when building scalable, reliable platforms for all of Roblox.
  • Significantly experienced in running large-scale recommendation systems that recommend hundreds of millions of items to millions of users.
  • Experienced in building complex distributed systems that scale to real-time ML inference serving millions of QPS, particularly for real-time recommendation systems.
  • Passionate about supporting and working cross functionally with internal partners (Data Scientists and ML Engineers) to meet and understand their needs.
  • A reliability nut: you love digging into tricky postmortems and identifying and fixing weaknesses in complicated systems.
  • Ideally familiar with ML model inference frameworks like Triton Inference Server, TensorRT, KServe.
  • A holder of a Bachelor's degree or higher in Computer Science, Computer Engineering, Data Science, or a similar technical field.

You Will:

  • Set technical strategy and oversee development of high-scale, reliable infrastructure for recommender systems, especially as we scale up both inference QPS and model size.
  • Dig into performance bottlenecks all along the recommendation inference stack, spanning from model optimizations to infrastructure optimizations.
  • Stay abreast of industry trends in machine learning and infrastructure to ensure the adoption of leading-edge technologies and practices.
  • Bootstrap and maintain infrastructure for ML Platform components: Serving Layer, Metadata Store, Model Registry, and Pipeline Orchestrator.
  • Partner across organizations to build tooling, interfaces, and visualizations that make ML@Roblox a delight to use.

For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future. All full-time employees are also eligible for equity compensation and for benefits.

Annual Salary Range
$233,840 - $283,780 USD

Roles that are based in our San Mateo, CA Headquarters are in-office Tuesday, Wednesday, and Thursday, with optional in-office on Monday and Friday (unless otherwise noted).

You’ll Love: 

  • Industry-leading compensation package
  • Excellent medical, dental, and vision coverage
  • A rewarding 401k program
  • Flexible vacation policy (varies by exemption status)
  • Roflex - Flexible and supportive work policy 
  • Roblox Admin badge for your avatar
  • At Roblox HQ: 
    • Free catered lunches five times a week and several fully stocked kitchens with unlimited snacks
    • Onsite fitness center and fitness program credit
    • Annual CalTrain Go Pass

Roblox provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. Roblox also provides reasonable accommodations for all candidates during the interview process.

Average salary estimate

$258,810 / YEARLY (est.)
min: $233,840
max: $283,780

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Senior / Principal Recommendations Infrastructure Engineer - ML Platform, Roblox

At Roblox, the role of Senior / Principal Recommendations Infrastructure Engineer for our ML Platform opens up a thrilling opportunity to make a massive impact on how millions of users interact with our ecosystem. Every day, tens of millions of people dive into Roblox's vibrant 3D experiences, and our mission is to connect a billion people with optimism and civility. In this position, you will be at the heart of building the next generation of machine learning tooling for recommendation systems that influence everything from our homepage to our thriving marketplace. With your experience in designing scalable and reliable platforms, you will tackle unique technical challenges while collaborating closely with data scientists and ML engineers to understand and meet their needs. Your work will include managing complex distributed systems that handle real-time ML inference serving millions of queries per second. As a key player on our team, you’ll not only set the technical strategy but also identify performance bottlenecks and implement solutions that enhance our infrastructure. Moreover, you'll thrive in a collaborative environment where your creativity and insight can shine, ultimately crafting a delightful experience for our users. If you're passionate about leveraging cutting-edge technologies and practices while fostering a safe and respectful digital landscape, then the Senior / Principal Recommendations Infrastructure Engineer role at Roblox is the perfect fit for you.

Frequently Asked Questions (FAQs) for Senior / Principal Recommendations Infrastructure Engineer - ML Platform Role at Roblox
What are the main responsibilities of a Senior / Principal Recommendations Infrastructure Engineer at Roblox?

As a Senior / Principal Recommendations Infrastructure Engineer at Roblox, your primary responsibilities will include setting technical strategies for scalable infrastructure systems for recommendation systems, optimizing performance across the entire inference stack, and building tooling and visualizations that enhance the ML experience. You will spearhead initiatives to improve reliability and efficiency, engaging with teams across the organization to ensure effective collaboration.

What qualifications are necessary for the Senior / Principal Recommendations Infrastructure Engineer position at Roblox?

To qualify for the Senior / Principal Recommendations Infrastructure Engineer role at Roblox, you should have a Bachelor's degree or higher in Computer Science, Computer Engineering, Data Science, or a related technical field. Additionally, you will need at least 4 years of professional experience in system design for scalable platforms, alongside significant experience running large-scale recommendation systems and complex distributed systems.

What skills are essential for a Senior / Principal Recommendations Infrastructure Engineer at Roblox?

Essential skills for the Senior / Principal Recommendations Infrastructure Engineer role at Roblox include a deep understanding of machine learning frameworks, experience with real-time ML inference serving, and proficiency in building distributed systems. Familiarity with ML model inference frameworks like Triton Inference Server and a passion for collaboration with cross-functional teams are also crucial.

What kind of work environment can a Senior / Principal Recommendations Infrastructure Engineer expect at Roblox?

The work environment for a Senior / Principal Recommendations Infrastructure Engineer at Roblox is dynamic and collaborative, with a strong focus on innovation and technical excellence. You can expect a mix of in-office days and flexible work arrangements, along with a supportive culture that values employee well-being and professional growth.

What benefits does Roblox offer to its Senior / Principal Recommendations Infrastructure Engineers?

Roblox provides an industry-leading compensation package for Senior / Principal Recommendations Infrastructure Engineers, including excellent medical, dental, and vision coverage, a 401k program, and a flexible vacation policy. Additional perks include catered lunches, free snacks, an onsite fitness center, and opportunities for equity compensation, making Roblox a fantastic place to grow your career.

Common Interview Questions for Senior / Principal Recommendations Infrastructure Engineer - ML Platform
Can you explain your experience with building scalable recommendation systems?

When answering this question, detail specific projects that showcase your ability to design and implement scalable recommendation systems. Highlight technologies used, challenges faced, and outcomes achieved, emphasizing teamwork and the impact on users.

Describe a time you identified and resolved a performance bottleneck in a system.

Use the STAR method to structure your answer: describe the Situation, Task, Action, and Result. Be specific about how you diagnosed the bottleneck, the steps you took to resolve it, and the measurable improvements gained.

How do you stay updated with the latest machine learning trends?

Discuss your habits for staying informed, such as reading industry journals, participating in conferences, engaging with online ML communities, or taking relevant courses. Demonstrating a proactive attitude towards learning signals your passion and dedication to the field.

What tools and technologies are you proficient in for ML infrastructure?

List the tools and technologies you’ve worked with, especially those mentioned in the job description like Triton Inference Server or TensorRT. Share projects where you applied these tools and explain the impact on system performance.

How do you approach collaborating with data scientists and ML engineers?

Emphasize the importance of communication and understanding each other's needs. Share specific examples of how you’ve successfully collaborated in the past, leading to enhanced efficiencies or better project outcomes.

What strategies do you employ to ensure the reliability of ML systems?

Talk about your approach to regular testing, monitoring performance metrics, and conducting postmortems to identify issues. Highlight any frameworks you use for mitigating risks and ensuring uptime.

Describe your experience with real-time ML inference systems.

Outline your knowledge and experience with real-time ML inference systems, focusing on architecture, design choices, and the scale of operations. Explain how you've ensured efficient and low-latency responses to user queries.

How do you prioritize tasks when working on complex projects?

Discuss your workflow and prioritization methods, such as using agile methodologies or project management tools. Provide an example of a complex project and how your prioritization led to successful outcomes.

What challenges have you faced when scaling infrastructure systems, and how did you overcome them?

Be specific about past challenges, such as managing resource allocation or maintaining performance during peak loads. Discuss your thought process and the innovative solutions you implemented to address these challenges.

What do you consider when evaluating new technologies for your projects?

Discuss criteria such as performance, scalability, ease of integration, and community support. Illustrate your answer with an example of a technology you evaluated and the decision-making process that led to its adoption or rejection.

Roblox's mission is to connect a billion people with optimism and civility. Our vision is to reimagine the way people come together.

EMPLOYMENT TYPE
Full-time, hybrid
DATE POSTED
December 5, 2024
