Software Engineer, ML - Document Processing

Why Harvey

Harvey is a secure AI platform for legal and professional services that augments productivity and automates complex workflows. Harvey uses algorithms with reasoning-adept LLMs that have been customized and developed by our expert team of lawyers, engineers, and research scientists. We’ve found product-market fit and are scaling our team very quickly. Some reasons to join Harvey are:

  • Exceptional product-market fit: We have partnered with the largest law firms and professional service providers in the world, including Paul Weiss, A&O Shearman, Ashurst, O'Melveny & Myers, PwC, KKR, and many others.

  • Strategic investors: Raised over $500 million from strategic investors including Sequoia, Google Ventures, Kleiner Perkins, and OpenAI.

  • World-class team: Harvey is hiring the best talent from DeepMind, Google Brain, Stripe, FAIR, Tesla Autopilot, Glean, Superhuman, Figma, and more.

  • Partnerships: Our engineers and researchers work directly with OpenAI to build the future of generative AI and redefine professional services.

  • Performance: 4x ARR in 2024.

  • Competitive compensation.

Role Overview

Harvey has found a massive product-market fit within the legal space, and we are significantly expanding the scale and capabilities of our offering as we grow. Many use cases in legal involve asking questions or extracting information from a collection of documents (either in-house documents, client documents, or publicly available data), so ingesting & processing documents for use in our AI systems is a critical component of our product. As we work with the biggest firms on their most complex projects, we envision building systems that seamlessly store, index, and process hundreds of millions of documents, and retrieve the right information in a fraction of a second. 

In this role, you will build at the boundary of what is possible in document understanding and incorporate new advancements in OCR, semantic chunking, and vector storage into Harvey’s core system.

The ideal candidate for this role has strong backend fundamentals (distributed systems, data processing) and experience in building production systems that require experimentation. We’re looking for someone who is hands-on and execution-focused in their approach to experimentation - you get things done and can navigate trade-offs between precision, cost, and speed.

What You’ll Do

  • Design and build a robust evaluation system for document understanding. Build and extend our large set of complex documents, like handwritten text from decades-old governing law or large Excel files containing the complex calculations of a corporate merger. Establish reliable baseline labels by working with legal domain experts or leveraging synthetic labeling.

  • Iterate on representation schemes for different data types: what’s the best way to represent a spreadsheet cell in a retrieval database? How should models treat strike-throughs? 

  • Benchmark and implement modern advancements across various modalities, like vision and audio models, into the Harvey stack.

  • Improve the scalability, observability, and fault tolerance of our document processing service.

What You Have 

  • 3+ years of experience (post-BS/MS) in an engineering or research role.

  • Demonstrated experience working cross-functionally with other engineering teams: you’ll need to prioritize investment in processing quality based on our product needs.

  • Experience using a data-driven approach to guide engineering decisions, such as choosing between recommendation engines or LLM providers.

  • Experience with search infrastructure or vector databases is a plus.

  • Track record of shipping reliable products and a strong attention to detail.

  • Grit - experience working at early-stage startups is a plus.

Harvey is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.

We are in the early innings of a generational company. Joining early at a hypergrowth startup has proven to lead to exponential growth in responsibility, access, and ability. Apply here today!

Average salary estimate

$140,000 / year (est.)
Min: $120,000
Max: $160,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Software Engineer, ML - Document Processing, Harvey

Join Harvey as a Software Engineer focused on Machine Learning for Document Processing in San Francisco! At Harvey, we are pioneering a secure AI platform designed for legal and professional services, making complex workflow automation a reality. Our unique algorithms enhanced by reasoning-adept Large Language Models (LLMs) have already captured the interest of major law firms and professional service providers worldwide. As we scale rapidly, we are looking for talented engineers to contribute to our mission of revolutionizing the legal landscape through innovative AI. In this role, you will dive into the intricate details of document ingestion and processing, working on projects that involve vast collections of documents—from handwritten legal texts to complex spreadsheets. Your expertise will help us build systems that can reliably retrieve the right information in mere seconds. We value hands-on execution; if you thrive on iteration and experimentation, we want you to help enhance our document understanding capabilities through modern advancements in OCR, semantic chunking, and database management. With strong backend fundamentals and a knack for collaboration, you'll navigate the challenges of trade-offs between precision and performance. This is your chance to join a world-class team that’s redefining professional services, grow with us, and help shape the future of generative AI. Let’s build something great together!

Frequently Asked Questions (FAQs) for Software Engineer, ML - Document Processing Role at Harvey
What are the key responsibilities of a Software Engineer, ML - Document Processing at Harvey?

As a Software Engineer specializing in ML for Document Processing at Harvey, you will be responsible for designing and building robust evaluation systems that handle complex documents, establishing baseline labels in collaboration with legal professionals, and improving the scalability and fault tolerance of our services. Your role will involve iterating on representation schemes for various data types and benchmarking advancements in machine learning across different modalities.

What qualifications are required for the Software Engineer, ML - Document Processing role at Harvey?

To qualify for the Software Engineer, ML - Document Processing position at Harvey, candidates should have 3+ years of experience in a relevant engineering or research role, experience in cross-functional collaboration, and a data-driven approach to decision-making. Additional skills in search infrastructure or vector databases, as well as a demonstrated track record of shipping reliable products, are highly valued.

How does Harvey utilize AI in Document Processing for legal services?

At Harvey, AI plays a crucial role in Document Processing for legal services by automating the extraction of information from large collections of documents. Our platform uses advanced ML techniques to process documents quickly and accurately, enabling legal professionals to retrieve information in real-time, which greatly enhances productivity and efficiency within the legal sector.

What type of projects will a Software Engineer work on at Harvey?

A Software Engineer at Harvey will engage in challenging projects that involve ingesting and processing diverse document types, such as handwritten texts and large complex spreadsheets. You will work on enhancing our document understanding systems and incorporating cutting-edge machine learning advancements to enable innovative solutions for our clients in the legal domain.

What is the work culture like for a Software Engineer at Harvey?

The work culture at Harvey is dynamic and energetic, characterized by a strong team-oriented environment. As a hypergrowth startup, we prioritize innovation and collaboration, ensuring that every team member feels empowered to contribute ideas and take ownership of their projects. Our goal is to foster an inclusive culture where everyone's talents are valued and growth is encouraged.

Common Interview Questions for Software Engineer, ML - Document Processing
Can you explain your experience with data processing and distributed systems?

When answering this question, highlight specific projects where you implemented data processing techniques. Describe the tools and technologies you used to build distributed systems, emphasizing your role in ensuring efficiency and scalability. Make sure to mention how your experience aligns with Harvey's focus on handling large volumes of documents.

How do you approach collaboration with cross-functional teams?

To effectively answer this, provide examples of past interactions with other engineering teams or departments. Discuss strategies you've used to prioritize project needs and align your work with product requirements, showcasing your ability to communicate effectively and work towards common goals.

What machine learning models have you implemented in past projects?

Discuss specific machine learning models you've worked on, focusing on their application in document processing or information retrieval. Highlight the methodologies you employed, the performance metrics you tracked, and any challenges you overcame during implementation.

What challenges have you faced when working on document understanding projects?

Provide examples of unique challenges encountered in past document understanding projects, such as handling mixed data types, optimizing for speed versus accuracy, or ensuring compliance with legal standards. Discuss how you approached these challenges and what solutions you implemented.

How do you stay current with advancements in machine learning and AI?

Share your commitment to continuous learning through resources like research papers, online courses, and industry conferences. Mention any specific tools or communities you engage with to stay up-to-date with the latest trends or technologies relevant to machine learning and document processing.

What strategies do you use to test and validate machine learning models?

Explain your approach to testing ML models, including methods like cross-validation, A/B testing, and using benchmarks. Discuss how these strategies help ensure the reliability of models and the importance of maintaining quality in document processing applications at Harvey.

Can you describe a time when you had to troubleshoot a production issue?

Use the STAR (Situation, Task, Action, Result) framework to structure your response. Outline a specific scenario where you identified, analyzed, and resolved a production issue efficiently, demonstrating your problem-solving skills and technical expertise in a pressure situation.

What are your thoughts on the role of AI in the future of legal services?

Share your vision for how AI could transform the legal industry, drawing on current trends and technological advancements. Discuss the balance between automation and human expertise, and how organizations like Harvey can position themselves as leaders in integrating AI into legal services.

How do you prioritize tasks during a project with tight deadlines?

Describe your approach to time management and prioritizing tasks effectively under pressure. Share techniques like agile methodologies, sprint planning, or prioritization frameworks that have helped you meet deadlines while maintaining high-quality outputs.

What excites you about working in a startup environment like Harvey?

Articulate your enthusiasm for the fast-paced nature of startups and the opportunity for innovation. Discuss the potential for personal and professional growth, the chance to work closely with other talented professionals, and how you look forward to contributing to Harvey's mission.

Harvey is a trusted generative AI company headquartered in San Francisco, California. We provide a suite of AI products tailored to lawyers and law firms across practice areas and workflows.

Employment type: Full-time, on-site
Date posted: March 4, 2025