Research Engineer (Pre-training & Post-training)

Job title: Research Engineer (Pre-training & Post-training) / Member of Technical Staff

Who We Are
WaveForms AI is an audio large language model (LLM) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging, and immersive.

Role overview: The Research Engineer (Pre-training & Post-training) role spans all phases of the AI model lifecycle: data preparation, pre-training, and post-training. This position involves building and optimizing large-scale data pipelines, handling multimodal datasets (audio and text), conducting pre-training with a focus on compute efficiency and scalability, and refining models with techniques such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and generative modeling. The ideal candidate will leverage advanced methods, including GANs and diffusion models, to push the boundaries of multimodal AI systems focused on audio and text.

Key Responsibilities

  • Lead the pre-training and fine-tuning of large language models (LLMs), maximizing compute efficiency and scaling infrastructure.

  • Optimize model performance using advanced techniques, including RLHF, reward modeling (RM), instruction-tuning, distillation, GANs, and diffusion models.

  • Develop robust evaluation pipelines to monitor, refine, and improve model performance throughout training phases.

  • Build and optimize scalable, distributed data pipelines to support multimodal (audio + text) AI training.

  • Handle and process massive datasets (PiB scale) for pre-training and post-training, ensuring efficient preparation, annotation, and data flow.

  • Collaborate with research and engineering teams to ensure seamless integration of data preparation and training workflows for multimodal systems.

Required Skills & Qualifications

  • Proven experience in training large language models (LLMs), including pre-training, fine-tuning, and post-training optimization.

  • Strong background in distributed systems, compute efficiency, and scaling model training infrastructure.

  • Expertise in designing and managing large-scale, distributed data pipelines for multimodal datasets, particularly audio + text.

  • Proficiency in advanced techniques such as RLHF, instruction-tuning, reward modeling, distillation, GANs, and diffusion models.

  • Proficiency in Python, PyTorch, and distributed training frameworks (e.g., Fully Sharded Data Parallel).

  • Familiarity with cloud platforms like AWS, GCP, or Azure for managing distributed environments.

  • Knowledge of multimodal AI systems combining audio and text for training and evaluation.

What You Should Know About Research Engineer (Pre-training & Post-training), WaveForms AI

At WaveForms AI, we are on a mission to revolutionize the field of audio intelligence, and we're looking for a talented Research Engineer (Pre-training & Post-training) to join our team. As a vital member of the technical staff, you'll immerse yourself in every phase of the AI model lifecycle, from the exciting stages of pre-training to the impactful work of post-training. Your role will involve building and optimizing large-scale data pipelines that efficiently handle multimodal datasets, combining audio and text. With a focus on maximizing compute efficiency and scalability, you'll conduct pre-training and refine our models using cutting-edge techniques like supervised fine-tuning and reinforcement learning from human feedback. We're particularly enthusiastic about candidates who can harness advanced methods, including GANs and diffusion models, to push the boundaries of what multimodal AI systems can achieve. In this collaborative environment, you'll work alongside brilliant researchers and engineers, making meaningful contributions to the training workflows that drive our innovative products. If you have a passion for AI and are excited to tackle challenges at the intersection of audio and text, this is the perfect opportunity for you to shine at WaveForms AI.

Frequently Asked Questions (FAQs) for Research Engineer (Pre-training & Post-training) Role at WaveForms AI
What are the main responsibilities of a Research Engineer (Pre-training & Post-training) at WaveForms AI?

As a Research Engineer (Pre-training & Post-training) at WaveForms AI, you'll be tasked with leading the pre-training and fine-tuning of large-scale language models (LLMs). Your responsibilities will also include optimizing model performance using advanced techniques, developing evaluation pipelines, and ensuring the seamless integration of data preparation and training workflows for multimodal systems.

What qualifications are required to become a Research Engineer (Pre-training & Post-training) at WaveForms AI?

To qualify for the Research Engineer (Pre-training & Post-training) position at WaveForms AI, candidates should have proven experience in training large language models, a strong background in distributed systems, and expertise in managing large-scale data pipelines. Proficiency in advanced techniques such as RLHF and GANs, along with experience in tools like Python and PyTorch, is also essential.

How does WaveForms AI ensure efficient data management for the Research Engineer (Pre-training & Post-training) role?

WaveForms AI emphasizes the importance of building and optimizing distributed data pipelines as part of the Research Engineer (Pre-training & Post-training) role. You will manage massive datasets and ensure efficient preparation, annotation, and data flow, which are crucial for successful AI model training.

What technologies should a Research Engineer (Pre-training & Post-training) at WaveForms AI be familiar with?

A Research Engineer (Pre-training & Post-training) at WaveForms AI should be proficient in Python, PyTorch, and distributed frameworks like Fully Sharded Data Parallel. Familiarity with a cloud platform such as AWS, GCP, or Azure for managing distributed environments is also listed among the required qualifications.

What is the significance of multimodal AI systems in the Research Engineer (Pre-training & Post-training) role at WaveForms AI?

Multimodal AI systems, which combine audio and text, are central to the research and development at WaveForms AI. As a Research Engineer (Pre-training & Post-training), you will leverage these systems to enhance human-AI interactions, making them more engaging and immersive—a key focus for our innovative audio intelligence products.

Common Interview Questions for Research Engineer (Pre-training & Post-training)
Can you explain your experience with training large language models?

When answering this question, highlight specific projects where you have been involved in training LLMs. Focus on the techniques you used, the scale of the models, and any challenges you overcame, particularly with pre-training and post-training processes.

What techniques do you use to optimize model performance?

Discuss various optimization techniques such as reinforcement learning from human feedback (RLHF), instruction-tuning, or distillation. Provide examples of when you've implemented these methods and the impact they had on model performance.

How do you approach building data pipelines for large-scale datasets?

Share your best practices for building effective, scalable data pipelines. Mention any tools or frameworks you have used and how you've ensured data quality and efficiency throughout the pipeline.

What is your experience with multimodal datasets?

Discuss your familiarity with multimodal datasets, particularly those that include both audio and text. Provide examples of the challenges they present and how you have successfully managed these types of data in previous projects.

Why is compute efficiency important in AI model training?

Explain how compute efficiency affects training times, costs, and the environmental impact of AI operations. Provide examples from your experience where optimizing compute efficiency led to better resource allocation and performance.

Can you describe a challenging project you've worked on related to AI model lifecycle management?

Share details about a specific project, the obstacles faced, and how you navigated them. Focus on your individual contributions and the results achieved to showcase your problem-solving skills.

How do you stay current with advancements in AI and machine learning?

Discuss the resources you rely on—journals, conferences, online courses, or networking with other professionals—to stay updated. Mention specific topics of interest and how you’ve applied new knowledge to your work.

What role does collaboration play in your work as a Research Engineer?

Emphasize the importance of cross-functional teamwork in AI projects. Provide examples of how you've collaborated with research teams and engineers to achieve project goals and the value it added to the final outcome.

How would you handle a situation where a model isn't performing as expected?

Illustrate your troubleshooting process, including evaluating the training data, considering adjustments to model parameters, and possibly implementing alternative techniques. This shows your analytical skills and resilience.

What are your long-term goals as a Research Engineer in the AI space?

Share your aspirations to advance your skills, contribute to significant projects, or lead research initiatives. Connecting your goals with the mission of WaveForms AI would reflect your alignment and passion for the role.

Employment type: Full-time, remote
Date posted: December 9, 2024