Do you have a passion for computer vision, large language models, and deep learning? The Video Engineering Data Analytics and Quality (DAQ) group is looking for an experienced Data Scientist with a strong background in computer vision, machine learning, and multi-modal LLMs (MM-LLMs) to join our dynamic team. The ideal candidate will be responsible for evaluating machine learning and MM-LLM models, developing performance metrics, and conducting thorough failure analysis. This role requires a deep understanding of ML algorithms, data processing, model optimization techniques, and modern evaluation approaches for vision-language models.
Description
Our organization supports a diverse array of programs focused on evaluating ML algorithms and assessing model quality at scale, across domains like computer vision, audio, and multi-modal systems. You will collaborate with cross-functional teams, including domain experts and engineering leads, and adapt methodologies as new insights emerge.
In this role you will:
- Evaluate ML & MM-LLM Models: Analyze and validate computer vision, multi-modal, and large language models to ensure they meet accuracy, robustness, and usability standards.
- Develop Metrics: Design and implement metrics to measure the efficiency and accuracy of models.
- Failure Analysis: Conduct in-depth analysis on model failures across CV and MM-LLM pipelines to surface root causes and improvement areas.
- Data Processing: Clean, transform, and curate large-scale datasets for model evaluation and benchmarking.
- Model Optimization: Apply innovative techniques to optimize models for scalability and real-world deployment.
- Collaborate Cross-Functionally: Work closely with software engineers, product managers, and other data scientists to integrate models into production.
- Communicate Results: Present findings clearly and effectively to collaborators across levels of technical understanding.
Minimum Qualifications
BS in a quantitative field and a minimum of 3 years of relevant industry experience.
Proven background in data science, machine learning, computer vision and statistical data analysis.
Advanced programming skills in data manipulation & processing (SQL & Python preferred).
Demonstrated experience in in-depth analysis of machine learning model failures.
Experience crafting, conducting, analyzing, and interpreting experiments and investigations.
Expertise in data wrangling and in developing data visualizations and reporting with tools such as Tableau, Superset, and AWS.
Preferred Qualifications
Experience working with multi-modal foundation models such as GPT-4o, Gemini 2.5, Claude 3/4, LLaVA, Flamingo, etc.
Familiarity with machine learning interpretability methods and standard practices.
Exposure to evaluating vision-language models in production or research settings.
Experience handling complex programs and collaborating across engineering, product, and data teams.
Detail-oriented, with the ability to keep track of and understand the workings of sophisticated algorithms.
Strong attention to detail in working with large datasets and complex ML systems.
Curious, self-motivated, and able to drive improvements to model evaluation pipelines and annotation programs.
Outstanding communication skills – both written and verbal – with experience presenting to leadership.