
Senior Machine Learning Engineer, RAG

👋 Klue Engineering is hiring!

We're looking for a Senior Machine Learning Engineer to join our ML team in Toronto, focusing on building and optimizing state-of-the-art RAG (Retrieval Augmented Generation) systems. You'll be joining us at an exciting time as we reinvent our RAG systems, making this an excellent opportunity for someone with strong ML and IR fundamentals who wants to dive deep into practical LLM applications.

💡 FAQ

Q: Klue who?

A: Klue is a VC-backed, capital-efficient, growing SaaS company. Tiger Global and Salesforce Ventures led our US$62M Series B in the fall of 2021. We’re creating the category of competitive enablement: helping companies understand their market and outmaneuver their competition. We benefit from having an experienced leadership team working alongside several hundred risk-taking builders who elevate every day.

We’re one of Canada’s Most Admired Corporate Cultures (Waterstone HC), a Deloitte Technology Fast 50 & Fast 500 winner, and a recipient of both the Startup of the Year and Tech Culture of the Year awards at the Technology Impact Awards.

Q: What are the responsibilities, and how will I spend my time? 

A: In this role, you'll focus on optimizing our RAG systems with scientific rigor and reproducible results. You'll measure and improve retrieval systems across the spectrum from BM25 to semantic search, using comprehensive evaluation metrics including Recall@K and Precision@K. A key challenge will be developing optimal chunking and enrichment strategies for diverse data sources including news articles, website changes, documents, CRM entries, call recordings and internal communications. You'll explore how different data types and formats impact retrieval performance and develop strategies to maintain high relevance across all sources.
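For illustration only, here is a minimal sketch of the kind of retrieval evaluation described above. It is not Klue's actual tooling; the document IDs and relevance judgments are made-up placeholders.

```python
# Minimal sketch of Recall@K / Precision@K, assuming each query has a set of
# relevant document IDs (ground truth) and a ranked list of retrieved IDs.
from typing import Dict, List, Set, Tuple


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc_id in top_k if doc_id in relevant) / len(top_k)


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = retrieved[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant) / len(relevant)


# Placeholder evaluation data: query -> (ranked retrieval, relevant set).
runs: Dict[str, Tuple[List[str], Set[str]]] = {
    "q1": (["d3", "d7", "d1", "d9"], {"d1", "d3"}),
    "q2": (["d2", "d5", "d8", "d4"], {"d4", "d6"}),
}

k = 3
print("mean P@%d:" % k,
      sum(precision_at_k(r, rel, k) for r, rel in runs.values()) / len(runs))
print("mean R@%d:" % k,
      sum(recall_at_k(r, rel, k) for r, rel in runs.values()) / len(runs))
```

The same harness extends naturally to comparing BM25 against semantic search runs, since both reduce to ranked lists scored against the same relevance judgments.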

Beyond RAG and retrieval, you'll work on prompt engineering to effectively utilize the retrieved context. This includes developing zero-shot and few-shot prompts with structured inputs/outputs, and implementing tight iteration loops with the right evaluation metrics. 
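As a rough illustration of structured few-shot prompting, here is a minimal sketch; the classification task, examples, and JSON schema are hypothetical and not Klue's actual prompts.

```python
# Minimal sketch: assembling a few-shot prompt that asks for structured JSON
# output. The task, examples, and schema are hypothetical placeholders.
import json

FEW_SHOT_EXAMPLES = [
    {"snippet": "Competitor X cut its enterprise pricing by 15%.",
     "output": {"topic": "pricing", "sentiment": "negative_for_us"}},
    {"snippet": "Competitor Y's mobile app release was delayed again.",
     "output": {"topic": "product", "sentiment": "positive_for_us"}},
]


def build_prompt(snippet: str) -> str:
    """Return a few-shot prompt asking the model for a JSON object."""
    lines = ["Classify the competitive intel snippet. "
             'Respond with JSON: {"topic": ..., "sentiment": ...}.', ""]
    for ex in FEW_SHOT_EXAMPLES:
        lines.append(f"Snippet: {ex['snippet']}")
        lines.append(f"Answer: {json.dumps(ex['output'])}")
        lines.append("")
    lines.append(f"Snippet: {snippet}")
    lines.append("Answer:")
    return "\n".join(lines)


print(build_prompt("Competitor Z announced a new analytics dashboard."))
```

Keeping the prompt assembly in plain code like this makes it easy to swap examples in and out during the tight iteration loops mentioned above.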

You'll also work on training and fine-tuning smaller, more efficient models that can match the performance of large LLMs at a fraction of the cost. This includes creating labeled datasets (sometimes using prompts), conducting careful hyperparameter optimizations, and building automated training pipelines. In addition, you'll deploy and monitor these models in production, optimize their latency, and implement comprehensive offline/online metrics to track their performance.
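For a rough sense of the fine-tuning side, here is a minimal sketch using the Transformers library (listed under technologies below); the base model, dataset, and hyperparameters are placeholders, not Klue's actual pipeline.

```python
# Minimal sketch of fine-tuning a small transformer classifier with the
# Transformers library. The base model, dataset, and hyperparameters below
# are stand-ins for whatever the real task requires.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # placeholder "smaller" model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=2)

dataset = load_dataset("imdb")  # stand-in for an internally labeled dataset


def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)


tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="checkpoints",
    per_device_train_batch_size=16,
    num_train_epochs=2,
    learning_rate=2e-5,  # a typical starting point to sweep over
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```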

Throughout all this work, you'll apply your deep understanding of the latest breakthroughs in the field to connect new research advances to practical improvements in our systems. Working closely with backend engineers, you'll help build scalable, production-ready systems that turn cutting-edge ML experiments into reliable business value.

Q: What experience are we looking for? 

  • Master's or PhD in Machine Learning, NLP, or a related field

  • 2+ years building and optimizing retrieval systems

  • 2+ years training/fine-tuning transformer models

  • Strong foundation in evaluating RAG systems - both retrieval and generation

  • Deep understanding of retrieval metrics and their trade-offs

  • Strong grasp of embedding models, semantic similarity techniques, and clustering similar content

  • Knowledge of query augmentation and content enrichment strategies

  • Expertise in automated LLM evaluation, including LLM-as-judge methodologies

  • Skilled at prompt engineering - including zero-shot, few-shot, and chain-of-thought

  • Experience deploying models to production and monitoring both system health and prediction quality

  • Knowledge of ML infrastructure, model serving, and observability best practices

  • Proven ability to balance scientific rigor with driving business impact

  • Track record of staying current with ML research and breakthrough papers

Q: What makes you thrive at Klue? 

A: We're looking for builders who:

  • Take ownership and run with ambiguous problems

  • Jump into new areas and rapidly learn what's needed to deliver solutions

  • Bring scientific rigor while maintaining a pragmatic delivery focus

  • See unclear requirements as an opportunity to shape the solution

Q: What technologies do we use? 

  • LLM platforms: OpenAI, Anthropic, open-source models

  • ML frameworks: PyTorch, Transformers, spaCy

  • Search/Vector DBs: Elasticsearch, Pinecone, PostgreSQL

  • MLOps tools: Weights & Biases, MLflow, Langfuse

  • Infrastructure: Docker, Kubernetes, GCP

  • Development: Python, Git, CI/CD

How We Work at Klue:

  • Hybrid. Best of both worlds (remote & in-office)

  • Our main Canadian hubs are in Vancouver and Toronto. Ideally, this role would be located in Toronto.

  • You and your team will be in office at least 2 days per week.

Klue Glassdoor Company Review: 4.9 / 5
Klue DE&I Review: 4.8 / 5
CEO of Klue: Jason Smith
EMPLOYMENT TYPE: Full-time, hybrid
DATE POSTED: November 7, 2024
