
Software Engineer, Data

Why Harvey

Harvey is a secure AI platform for professionals in law, tax, and finance that augments productivity and automates complex workflows. Harvey uses algorithms with reasoning-adept LLMs that have been customized and developed by our expert team of lawyers, engineers and research scientists. We’ve found product market fit and are scaling our team very quickly. Some reasons to join Harvey are:

  • Exceptional product market fit: We have partnered with the largest law firms and professional service providers in the world, including Paul Weiss, A&O Shearman, Ashurst, O'Melveny & Myers, PwC, KKR, and many others.

  • Strategic investors: Raised over $200 million from strategic investors including Sequoia, Google Ventures, Kleiner Perkins, and the OpenAI Startup Fund.

  • World-class team: Harvey is hiring the best talent from DeepMind, Google Brain, Stripe, FAIR, Tesla Autopilot, Glean, Superhuman, Figma, and more.

  • Partnerships: Our engineers and researchers work directly with OpenAI to build the future of generative AI and redefine professional services.

  • Performance: Grew from $0 to $30M ARR in the last 18 months.

  • Compensation: Top of market cash and equity compensation.

Role Overview

As a Software Engineer, Data on the Engineering team at Harvey, you will own and lead engineering projects across our product lines. We are looking for individuals who have strong backend and infrastructure fundamentals and have experience building products where data is a core component.

This role is based in San Francisco, CA. We use an in-person work model and offer relocation assistance to new employees.

What You’ll Do

  • Develop distributed crawlers, data pipelines, and storage infrastructure to ingest data from numerous sources including: websites, APIs, law firm knowledge bases, and Harvey’s data partners. These must handle real-time updates, while being performant and robust.

  • Work directly with domain experts and customers to understand how to structure complex, referential datasets in Legal, Tax, Finance, etc., then translate that understanding into technical data systems.

  • Build and scale our Retrieval platform which provides knowledge and citations to our widely used RAG products.

Representative Projects

What You Have

  • 3+ YoE (post-BS/MS) in an engineering role.

  • Experience shipping and scaling an impactful product powered by data, e.g., data pipelines, databases, and backend platforms.

  • Experience with search infrastructure or vector databases is a plus.

  • Track record of shipping reliable products and a strong attention to detail.

  • Grit - experience working at early-stage startups is a plus.

Harvey is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.


Harvey is a trusted generative AI company headquartered in San Francisco, California. We provide a suite of AI products tailored to lawyers and law firms across practice areas and workflows.

EMPLOYMENT TYPE
Full-time, on-site
DATE POSTED
November 24, 2024

What You Should Know About Software Engineer, Data, Harvey

Join Harvey as a Software Engineer, Data, and be at the forefront of revolutionizing productivity in the legal, tax, and finance sectors! Based in the vibrant city of San Francisco, this exciting role offers you the opportunity to lead engineering projects that drive our innovative AI platform. At Harvey, we pride ourselves on our exceptional product market fit and strong partnerships with some of the largest law firms and professional service providers globally. Your work will focus on developing robust data pipelines, distributed crawlers, and scalable storage infrastructure that empower our team to scrape, structure, and index complex datasets from various sources. Collaborating with domain experts, you will gain insights into handling legal, tax, and financial data, which will fuel the development of impactful tools for our clients. With a world-class team coming from top-tier companies and a commitment to cutting-edge technology, you will contribute to projects that transform how professionals access information. Plus, our generous compensation package reflects our belief in the value of top talent. If you're ready to make a significant impact in a fast-paced environment, while enjoying the perks of working alongside brilliant minds at Harvey, let's chat about this amazing opportunity!

Frequently Asked Questions (FAQs) for Software Engineer, Data Role at Harvey
What qualifications are needed for the Software Engineer, Data position at Harvey?

To thrive as a Software Engineer, Data at Harvey, you should possess at least 3 years of experience in an engineering role, with a strong background in shipping and scaling impactful data-driven products. Experience with data pipelines and backend systems is highly beneficial, and familiarity with search infrastructure or vector databases can give you an edge. Being detail-oriented, having grit, and being flexible to adapt to the early-stage startup culture will also set you up for success!

What responsibilities does the Software Engineer, Data role at Harvey include?

As a Software Engineer, Data at Harvey, you'll take the reins on various engineering projects aimed at handling extensive datasets. Key responsibilities include developing distributed crawlers, creating data pipelines, and building storage infrastructure while ensuring real-time updates. You will work closely with both technical teams and domain experts to construct complex data systems tailored to legal, finance, and tax sectors, driving the advancement of our innovative products.

Is the Software Engineer, Data position at Harvey remote or in-person?

The Software Engineer, Data role at Harvey is based in-person in San Francisco, CA. This setup encourages collaboration and innovation within our talented engineering team. We appreciate the synergy that comes from working together physically, and we also offer relocation assistance for successful candidates to make this transition smoother.

What makes Harvey a unique company to work for as a Software Engineer, Data?

Harvey stands out as a unique employer due to its exceptional product market fit and partnerships with industry-leading law firms and professional service providers. We are constantly innovating, working with cutting-edge technology in generative AI. Our investment from renowned firms like Sequoia and Google Ventures further emphasizes our commitment to shaping the future. A world-class team ensures you're surrounded by the best talent in the industry, providing an inspiring environment to learn and grow.

Can you describe the team culture at Harvey for the Software Engineer, Data role?

The team culture at Harvey is vibrant and collaborative! As a Software Engineer, Data, you'll join a world-class group of engineers and researchers who are passionate about what they do. We foster an environment that encourages innovation and creativity, with opportunities for continuous learning and professional growth. Grit and resilience are valued qualities here, as we tackle challenges and drive new initiatives in the fast-evolving field of AI.

What projects might I work on as a Software Engineer, Data at Harvey?

As a Software Engineer, Data at Harvey, you'll engage in a variety of exciting projects! Some examples include building a comprehensive legal dataset for AI applications, ingesting tax codes from multiple jurisdictions, and scaling data infrastructure to unlock efficient search and retrieval capabilities. You will also explore groundbreaking embedding search technologies to enhance our offerings, making a tangible difference in the way our clients manage their information.

What are the compensation benefits for the Software Engineer, Data position at Harvey?

Harvey is committed to attracting and retaining top talent, which is reflected in our compensation package for the Software Engineer, Data position. We offer top-of-market cash and equity compensation, along with additional benefits that prioritize work-life balance and professional development. Our focus on creating an inclusive and rewarding workplace makes Harvey a fantastic place to advance your career.

Common Interview Questions for Software Engineer, Data
Can you explain your experience with data pipelines in previous projects?

When discussing your experience with data pipelines, be specific about the projects you've worked on. Highlight how you designed, developed, and optimized data flows, the technologies involved, and how those pipelines improved performance or usability. Explain any challenges you faced and how you overcame them, emphasizing your problem-solving skills.

How do you approach designing scalable data storage solutions?

In an interview for Software Engineer, Data at Harvey, focus on your methodology for evaluating requirements and data characteristics before selecting a storage solution. Elaborate on your experience with different storage options (SQL, NoSQL, cloud-based solutions), the trade-offs between them, and how you ensure data integrity and optimal performance.

What steps do you take to maintain data integrity and security?

Discuss your approach to data security and integrity, which may include implementing encryption, access controls, and regular audits. Share specific examples of how you've implemented these measures in previous roles and the impact they had on the overall security posture of your projects. Your commitment to safeguarding data should be clear.

Have you worked with legal datasets or complex databases? How did you manage them?

Discuss any experience you have with legal datasets or similarly complex databases. Talk about the specific tools or frameworks you used, how you structured the data for efficient access, and what challenges arose during those projects. Highlight your familiarity with regulatory compliance in managing sensitive information if applicable.

Can you describe a project where you built a data crawler? What were the main challenges?

When answering this question, provide details about the design, technologies used, and the data sources targeted. Discuss the challenges you faced such as handling real-time updates, ensuring performance, and data cleaning processes. Highlight the results of the project and any lessons learned that could be applied to future crawlers.

How do you stay updated with emerging technologies in data engineering?

Emphasize your commitment to professional development, mentioning specific resources like online courses, conferences, or influential publications you follow. Talk about any community involvement, projects, or contributions that demonstrate your proactive approach to staying ahead of industry trends and advancements in data engineering.

What methodologies do you use for testing data pipelines?

Discuss your systematic approach to testing data pipelines, from unit testing individual components to integration testing full workflows. Highlight any testing frameworks or tools you frequently use, and provide examples of how thorough testing has led to successful deployments or uncovered issues before they became critical.
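
To make the unit-versus-integration distinction concrete in an answer, a small sketch like the one below can help. It is only an illustration under stated assumptions: the record fields (`title`, `published`) and the function names `normalize_record` and `run_pipeline` are hypothetical and do not describe Harvey's actual systems.

```python
from datetime import date


def normalize_record(raw: dict) -> dict:
    """Hypothetical transform step: trim the title and parse an ISO date."""
    title = raw.get("title", "").strip()
    if not title:
        raise ValueError("record is missing a title")
    return {"title": title, "published": date.fromisoformat(raw["published"])}


def run_pipeline(raw_records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Hypothetical workflow: normalize each record and set rejects aside."""
    clean, rejected = [], []
    for raw in raw_records:
        try:
            clean.append(normalize_record(raw))
        except (KeyError, ValueError):
            # Missing fields or malformed values go to a reject list
            # instead of stopping the whole run.
            rejected.append(raw)
    return clean, rejected


def test_normalize_record_unit():
    # Unit test: one component, one behavior, no other steps involved.
    rec = normalize_record({"title": "  Tax Code  ", "published": "2024-11-24"})
    assert rec == {"title": "Tax Code", "published": date(2024, 11, 24)}


def test_run_pipeline_integration():
    # Integration test: the full flow, including how bad input is routed.
    raw = [
        {"title": "A", "published": "2024-01-01"},
        {"title": "", "published": "2024-01-01"},  # empty title
        {"title": "B"},                            # missing date
    ]
    clean, rejected = run_pipeline(raw)
    assert [r["title"] for r in clean] == ["A"]
    assert len(rejected) == 2


if __name__ == "__main__":
    test_normalize_record_unit()
    test_run_pipeline_integration()
```

The unit test pins down a single transform, while the integration test checks that malformed records are diverted rather than silently dropped, which is the kind of issue thorough testing tends to surface before deployment.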

Describe your experience working with stakeholders to understand data requirements.

Illustrate your communication skills by sharing experiences where you've collaborated with various stakeholders (e.g., legal experts, clients). Explain how you gathered their requirements and ensured that the data solutions you built met their needs. Emphasize your ability to bridge the gap between technical and non-technical terms effectively.

How do you prioritize tasks in fast-paced project environments?

Share strategies you use to manage your workload effectively in a fast-paced environment. Discuss tools like Agile methodologies or project management software that you find useful. Provide examples of how you’ve successfully managed multiple priorities while maintaining the quality and timely delivery of your projects.

What excites you about working with data in the AI field?

Be genuine in your response, discussing your enthusiasm for the potential of data in driving AI innovations. Share specific aspects, like the opportunity to create impactful user experiences through data-driven insights, and highlight any relevant projects or experiences that shaped your interests in AI and data engineering.
