Research Engineer, Evaluation

Captions is the leading video AI company, building the future of video creation. Over 10 million creators and businesses have used Captions to create videos for social media, marketing, sales, and more. We're on a mission to serve the next billion.

We are a rapidly growing team of ambitious, experienced, and devoted engineers, researchers, designers, marketers, and operators based in NYC. You'll join an early team and have an outsized impact on the product and the company's culture.

We’re very fortunate to have some of the best investors and entrepreneurs backing us, including Index Ventures (Series C lead), Kleiner Perkins (Series B lead), Sequoia Capital (Series A and Seed co-lead), Andreessen Horowitz (Series A and Seed co-lead), Uncommon Projects, Kevin Systrom, Mike Krieger, Lenny Rachitsky, Antoine Martin, Julie Zhuo, Ben Rubin, Jaren Glover, SVAngel, 20VC, Ludlow Ventures, Chapter One, and more.

Check out our latest financing milestone and some other coverage:

The Information: 50 Most Promising Startups

Fast Company: Next Big Things in Tech

The New York Times: When A.I. Bridged a Language Gap, They Fell in Love

Business Insider: 34 most promising AI startups

Time: The Best Inventions of 2024

** Please note that all of our roles will require you to be in-person at our NYC HQ (located in Union Square) **

Overview

Captions is seeking an exceptional Research Engineer to advance the state-of-the-art in large-scale multimodal video diffusion models. You'll conduct novel research on generative modeling architectures, develop new training techniques, and scale models to billions of parameters. As a key member of our ML Research team, you'll work at the cutting edge of multimodal generation while building systems that enable natural, controllable video creation. We're already training large-scale models with demonstrated product impact, and we're excited to continue expanding the scope and capabilities of our research.

We're especially excited about pushing the boundaries of audio-video generation, with a focus on realistic and charismatic human behavior that enables natural storytelling and creative iteration. Our models power creative tools used by millions of creators, and we're tackling fundamental challenges in how to generate compelling human motion, expression, and speech. 

Key Responsibilities

Research & Architecture Development:

  • Design and implement novel architectures for large-scale video and multimodal diffusion models

  • Develop new approaches to multimodal fusion, temporal modeling, and video control

  • Research temporal video editing techniques and controllable generation

  • Research and validate scaling laws for video generation models

  • Create new loss functions and training objectives for improved generation quality

  • Drive rapid experimentation with model architectures and training strategies

  • Validate research directly through product deployment and user feedback

Model Training & Optimization:

  • Train and optimize models at massive scale (10s-100s of billions of parameters)

  • Develop sophisticated distributed training approaches using FSDP, DeepSpeed, Megatron-LM

  • Design and implement model surgery techniques (pruning, distillation, quantization)

  • Create new approaches to memory optimization and training efficiency

  • Research techniques for improving training stability at scale

  • Conduct systematic empirical studies of architecture and optimization choices

Technical Innovation:

  • Advance the state of the art in video model architecture design and optimization

  • Develop new approaches to temporal modeling for video generation

  • Create novel solutions for multimodal learning and cross-modal alignment

  • Research and implement new optimization techniques for generative modeling and sampling

  • Design and validate new evaluation metrics for generation quality

  • Systematically analyze and improve model behavior across different regimes

Preferred Qualifications

Research Experience:

  • Master's or PhD in Computer Science, Machine Learning, or related field

  • Track record of research contributions at top ML conferences (NeurIPS, ICML, ICLR)

  • Demonstrated experience implementing and improving upon state-of-the-art architectures

  • Deep expertise in generative modeling approaches (diffusion, autoregressive, VAEs, etc.)

  • Strong background in optimization techniques and loss function design

  • Experience with empirical scaling studies and systematic architecture research

Technical Expertise:

  • Strong proficiency in modern deep learning tooling (PyTorch, CUDA, Triton, FSDP, etc.)

  • Experience training diffusion models with 10B+ parameters

  • Experience with very large language models (200B+ parameters) is a plus

  • Deep understanding of attention, transformers, and modern multimodal architectures

  • Expertise in distributed training systems and model parallelism

  • Proven ability to implement and improve complex model architectures

  • Track record of systematic empirical research and rigorous evaluation

Engineering Capabilities:

  • Ability to write clean, modular research code that scales

  • Strong software engineering practices including testing and code review

  • Experience with rapid prototyping and experimental design

  • Strong analytical skills for debugging model behavior and training dynamics

  • Facility with profiling and optimization tools

  • Track record of bringing research ideas to production

  • Experience maintaining high code quality in a research environment

Team Culture

You'll work directly alongside our research and engineering teams in our NYC office. We've intentionally built a culture where technical innovation and research excellence are highly valued: your success will be measured by your contributions to improving our models and advancing the field, not by your ability to navigate politics. We're a team that loves diving deep into complex technical problems and emerging with practical breakthroughs.

  • Our team values:

    • Open technical discussions and collaboration

    • Rapid iteration and practical solutions

    • Deep technical expertise and continuous learning

    • Direct impact on research and product outcomes

  • What sets us apart:

    • Opportunity to advance the state-of-the-art in video generation

    • Direct impact on products used by millions of creators

    • Access to massive compute resources and diverse, large-scale datasets

    • Environment that values both research excellence and practical impact

    • Ability to validate research through direct product feedback

Benefits:

  • Comprehensive medical, dental, and vision plans

  • 401K with employer match

  • Commuter Benefits

  • Catered lunch multiple days per week

  • Dinner stipend every night if you're working late and want a bite!

  • DoorDash DashPass subscription

  • Health & Wellness Perks (Talkspace, Kindbody, One Medical subscription, HealthAdvocate, Teladoc)

  • Multiple team offsites per year with team events every month

  • Generous PTO policy and flexible WFH days

Captions provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Please note benefits apply to full time employees only.

Captions Glassdoor Company Review: 3.2 stars
Captions DE&I Review: 3.8 stars

Average salary estimate

$125,000 per year (est.)
Min: $100,000
Max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Research Engineer, Evaluation, Captions

Are you passionate about pushing the boundaries of technology in video creation? Join Captions as a Research Engineer in Evaluation in New York, where you'll be part of a vibrant team dedicated to revolutionizing how videos are made. As a leader in video AI, Captions has empowered over 10 million creators and businesses, and we’re gearing up to transform the industry even further! In this role, you’ll dive deep into the realm of large-scale multimodal video diffusion models, advancing state-of-the-art techniques that will allow creators to tell stories in increasingly engaging and controllable ways. You'll collaborate closely with a dynamic Machine Learning Research team to design and implement novel architectures, develop innovative training techniques, and validate your research directly through product deployments. Your efforts won't just be theoretical; they will influence products used by millions, enabling natural and charismatic human behaviors in video content. With access to massive compute resources and a culture that fosters technical innovation and collaboration, your work will truly make an impact. Our NYC headquarters offers a passionate environment where your success is measured by your contributions and ideas. If you're ready to take on the challenge and drive innovation in video generation, Captions is the place for you!

Frequently Asked Questions (FAQs) for Research Engineer, Evaluation Role at Captions
What are the key responsibilities of the Research Engineer, Evaluation position at Captions?

As a Research Engineer, Evaluation at Captions, your key responsibilities will include designing and implementing novel architectures for large-scale video diffusion models, developing new multimodal training techniques, and conducting systematic empirical studies to enhance model optimization and evaluation. You will also work on creating new loss functions, scaling laws for video generation models, and exploring temporal video editing techniques to improve generation quality.

What qualifications are required for the Research Engineer position at Captions?

The ideal candidate for the Research Engineer, Evaluation position at Captions will hold a Master’s or Ph.D. in Computer Science, Machine Learning, or a related field. A proven track record in research contributions at renowned ML conferences, deep expertise in generative modeling approaches, and proficiency in modern deep learning tools, such as PyTorch and CUDA, are also essential qualifications for this role.

How does Captions support the professional growth of its Research Engineers?

At Captions, we prioritize the continuous development of our Research Engineers by promoting a culture of open technical discussions and collaborative learning. You'll have access to state-of-the-art tools and resources, participate in innovative projects, and benefit from direct feedback on your research through product deployments, facilitating rapid iteration and practical solutions.

What is the team culture like for the Research Engineer role at Captions?

The team culture at Captions emphasizes collaboration, deep technical expertise, and rapid iteration. As a Research Engineer, you will work closely with both research and engineering teams, engaging in vibrant debates over complex technical problems. We celebrate individual contributions and prioritize research excellence and practical impact, creating an environment where your ideas are valued.

What benefits does Captions offer to its employees working as Research Engineers?

Captions provides a comprehensive benefits package for its employees, including medical, dental, and vision plans, a 401K with employer match, and ample health and wellness perks. Additionally, you'll enjoy catered lunches multiple times per week, dinner stipends for late-night work, and opportunities for team offsites and events, alongside a generous PTO policy.

Common Interview Questions for Research Engineer, Evaluation
Can you explain your experience with generative modeling approaches?

When answering this question, be specific about the generative modeling techniques you’ve employed, such as diffusion models or VAEs, and provide examples of projects where you've implemented these techniques. Highlight your role in both the theoretical and practical aspects, emphasizing how your contributions improved model performance.

How do you approach designing and implementing novel architectures for large-scale models?

Discuss your thought process when innovating architectures, including how you identify challenges and validate your designs. Mention your use of empirical studies to support architectural choices, and give an example of a model architecture you developed, explaining its impact on a project.

What strategies do you use to optimize model training and improve efficiency?

Outline your approach to optimizing model training, including techniques such as model surgery or memory optimization. Share specific tools you’ve utilized, like DeepSpeed or FSDP, and discuss a time when your optimizations led to significant improvements in training stability or speed.

How do you handle debugging complex model behaviors during experimentation?

Explain the debugging techniques you utilize, such as profiling tools and systematic empirical studies, to analyze model performance. Share a concrete example of a debugging challenge you faced and how your problem-solving skills led to successful model adjustments or improvements.

Describe a project where you had to validate your research with real-world applications.

Prepare to talk about a specific project where your research translated into a deployed product. Emphasize how you collected user feedback, iterated on your designs based on that feedback, and the impact your research had on user experience.

What is your experience with collaborative research in a team environment?

Share your experiences working in teams, emphasizing effective communication and collaboration strategies you employed. Provide examples from previous roles where collaboration led to breakthroughs or advancements in research.

Can you discuss a time when you had to iterate quickly on a research idea?

Narrate a scenario where rapid iteration was crucial. Focus on how you adapted your approach based on early findings and feedback, and explain the positive outcomes that resulted from your quick iterations.

How do you stay updated with the latest advancements in machine learning and generative modeling?

Talk about your strategies for keeping abreast of the latest research, such as regularly attending conferences, participating in online courses, following prominent research publications, or contributing to relevant discussions in academic communities.

What challenges do you foresee in advancing video generation technology?

Demonstrate your industry knowledge by discussing potential challenges, such as scaling models or ensuring realistic human behaviors in generated content. Explain your thoughts on tackling these challenges through innovative research and collaboration within teams.

Why do you want to work at Captions as a Research Engineer?

Express your enthusiasm for Captions’ mission and innovative environment. Focus on your alignment with the company’s goals in transforming video creation and how your skills and interests suit the role, demonstrating your eagerness to contribute to their vision.

Employment type: Full-time, on-site
Date posted: January 10, 2025
