JOB SUMMARY
In this role, you’ll play a pivotal part in building and optimizing data pipelines that transform large, multi-modal datasets into high-quality training inputs for cutting-edge AI models for drug discovery. You’ll help evolve our data pipeline and storage infrastructure to support faster, more reliable turnarounds for research and development of new models.
You’ll join a multidisciplinary team, collaborating closely with ML scientists, software developers and DevOps engineers to improve the performance and reliability of Python-based workflows. As a key contributor, you’ll participate in the design, testing, and maintenance of core software systems, conduct thoughtful code reviews, and champion engineering best practices—including version control, testing, and documentation.
This role is remote, with a preference for candidates on the East Coast or in the UK.
KEY RESPONSIBILITIES
Design and improve data pipelines that process large, multi-modal datasets from a variety of internal and external sources into training datasets for AI models.
Evolve our data storage layer to support analytics, schema evolution, reproducibility, and efficient data access.
Collaborate with ML engineers to improve the performance and reliability of Python-based data processing workflows.
Collaborate on the creation, testing, and maintenance of software systems.
Review pull requests in adjoining areas.
Maintain and mentor others in software best practices, including version control, testing, and documentation.
Communicate work clearly in meetings and company demos, at a level suited to the audience.
QUALIFICATIONS
Minimum of 8 years of related experience with a Bachelor’s degree; or 6 years with a Master’s degree; or 3 years with a PhD; or equivalent experience.
Proven ability to design flexible, maintainable ETL systems.
Experience with data pipeline orchestration tools such as Prefect, Airflow, Argo, Databricks, or Spark.
Understanding of the ML model lifecycle; prior work with scientific or ML workflows is a plus.
Hands-on experience with multi-terabyte scale data processing.
Familiarity with AWS; Kubernetes experience is a bonus.
Knowledge of data lake technologies such as Parquet, Iceberg, and AWS Glue.
Strong Python software engineering skills.
Pragmatic mindset: able to evaluate tradeoffs and find solutions that empower ML researchers to move quickly.
Background in bioinformatics or chemistry is a plus.
ABOUT IAMBIC THERAPEUTICS
Founded in 2019 and headquartered in San Diego, California, Iambic Therapeutics is disrupting the therapeutics landscape with its unique AI-driven drug-discovery platform. Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters with strong track records of success in delivering clinically validated therapeutics. The Iambic platform has been demonstrated to deliver high-quality, differentiated therapeutics to clinical stage with unprecedented speed and across multiple target classes and mechanisms of action. The Iambic team is advancing an internal pipeline of clinical assets to address urgent unmet patient needs. Learn more about the Iambic team, platform, and pipeline at iambic.ai.
MISSION & CORE VALUES
The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.
PAY AND BENEFITS
We offer a competitive compensation package, pension contributions, and flexible holiday allowances to our team. Our UK office provides a modern and collaborative work environment, right in the centre of Bristol.
As a Software Data Engineer at Iambic Therapeutics, you will be at the forefront of revolutionizing drug discovery through innovative AI technologies. Based in the vibrant city of Bristol, you'll play a key role in building and optimizing data pipelines that convert complex, multi-modal datasets into high-quality training inputs for our state-of-the-art AI models. Working closely with a dynamic team of ML scientists, software developers, and DevOps engineers, you'll not only help evolve our data storage infrastructure to ensure speedy and reliable results but also elevate the performance of Python-based workflows. Your responsibilities will include designing and implementing flexible ETL systems, maintaining best practices in software development, and conducting thoughtful code reviews. We value oral communication, so your ability to present your work clearly will be essential in meetings and company demos. Ideally, we seek candidates with a strong background in data processing, particularly with experience in tools like Prefect, Airflow, or Spark. At Iambic, we believe that diversity fuels innovation, and our inclusive culture fosters unmatched collaboration. If you have a pragmatic mindset coupled with solid Python programming skills and a passion for improving research efficiency, we want to hear from you! Join us in making a significant impact on patient care with disruptive therapeutics. Make a difference with Iambic Therapeutics!