Job details

AI Engineer

About Us

Our mission is to make healthcare reimbursement fair and transparent, so providers can spend more time caring for patients and less time haggling over costs. We specifically focus on the most complex AI challenges that require novel R&D, building products that are fit for purpose in healthcare. We are backed by some of the top healthcare investors and growing fast. Join us!

The Role

As an AI Engineer on our team, you will architect and optimize the training and inference infrastructure that underpins our healthcare language models. You'll collaborate closely with research scientists, product teams, and end-users to ensure our AI solutions are robust, scalable, and deployable for real-world clinical applications. You will work with state-of-the-art open-source LLMs running on GPUs, helping us advance healthcare NLP in production environments.

We are looking for a candidate who can come into our NYC (Soho) office 3+ days per week.

Key Responsibilities

LLM Training & Inference Infrastructure:
Develop and maintain GPU-accelerated systems for large-scale training and inference, ensuring high throughput and low latency. Optimize distributed training pipelines, handle multi-node clusters, and evaluate state-of-the-art frameworks for open-source language models.
Model Optimization & Deployment:
Implement techniques such as model parallelism, quantization, knowledge distillation, and efficient serving to deliver cost-effective and fast inference for mission-critical healthcare applications.
Collaboration with AI Research:
Work with our NLP research team to integrate new model architectures, fine-tuned weights, and evaluation benchmarks into production pipelines. Establish best practices for version control, reproducible experiments, and continuous model improvement.
Healthcare Data Integration:
Collaborate with data engineering teams to ingest and preprocess large clinical datasets (EHR, claims data, etc.) in GPU-friendly formats. Help define secure and scalable data workflows to comply with healthcare regulations.
Monitoring & Scalability:
Set up monitoring, logging, and alerting for AI systems in production, ensuring uptime and performance metrics are met. Implement strategies for autoscaling and distributed resource management.
Technical Leadership & R&D:
Stay current with the latest research in large-scale machine learning, GPU acceleration, and MLOps. Champion best practices to the broader team, sharing insights through presentations, docs, and code reviews.

About You

Technical Expertise

Educational Background:
MS/PhD in Computer Science, Electrical Engineering, or a related field (or equivalent industry experience).
Hands-on Experience:
2+ years of experience building and optimizing ML infrastructure for large-scale training and inference. Familiarity with GPU-accelerated computing, distributed systems, and open-source LLMs.
Deep Knowledge of ML & MLOps:
Proficiency in Python and frameworks like PyTorch for large-scale model training. Experience with containerization (Docker/Kubernetes), experiment tracking, CI/CD, and monitoring for AI systems.
Performance Tuning & Deployment:
Track record of improving inference efficiency and throughput via techniques like model parallelism, quantization, or knowledge distillation.
Startup Mindset:
Comfortable with ambiguity, rapid iteration, and owning projects end-to-end. Driven to deliver meaningful outcomes and iterate quickly on user feedback.

Benefits

Competitive Compensation:
Top-of-market salary plus equity.
Flexible PTO:
Generous vacation policy and a culture that supports work-life balance.
Team Culture:
Collaborative environment with regular team-building events. Mission-driven work that makes a tangible impact in healthcare.

Hiring Process

Initial Application:
Submit your resume/LinkedIn and a brief statement about why you’re interested.
Intro Call:
Discuss your background, career goals, and our mission to see if there’s a mutual fit.
Technical Interviews (2x):
Includes a programming or system design exercise focused on large-scale training/inference and GPU workflows.
Referees:
Provide 2 references who can speak to your professional/technical accomplishments.
Culture Interview:
Explore ways of working and team fit, with time for you to ask questions.
Offer:
We’ll extend a competitive offer for the right candidate to join our growing team.

Average salary estimate

$140,000 / year (est.)

Min: $120,000

Max: $160,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About AI Engineer, Phare Health

Are you ready to make a significant impact in the healthcare sector? We're on the lookout for an AI Engineer to join our dynamic team focused on revolutionizing healthcare reimbursement processes. As an AI Engineer, your primary responsibility will be to architect and optimize cutting-edge training and inference infrastructures that drive our healthcare language models. You'll collaborate with top-notch research scientists and product teams to ensure our AI solutions are not only robust but also scalable for real-world clinical applications. Your expertise in working with state-of-the-art open-source large language models (LLMs) running on GPUs will be crucial in enhancing healthcare natural language processing (NLP) in production settings. We're a fast-growing company based in NYC (Soho), and we believe that great ideas come from diverse teams. If you're passionate about pushing boundaries in AI and healthcare, we want to hear from you! Join us in transforming how healthcare operates so providers can focus on what truly matters—caring for patients.

Frequently Asked Questions (FAQs) for AI Engineer Role at Phare Health

What are the responsibilities of an AI Engineer at this company?

As an AI Engineer at our company, you will be responsible for developing and maintaining GPU-accelerated systems for large-scale training and inference. You'll optimize distributed training pipelines, work closely with the NLP research team to integrate new model architectures, and collaborate with data engineering teams to handle large clinical datasets. Additionally, you will set up monitoring, logging, and performance metrics for AI systems in production.