Biology Data Quality Engineer
Bioptimus is building the first universal AI foundation model for biology to fuel breakthrough discoveries and accelerate innovation in biomedicine. With more than $75M in funding, Bioptimus is a fast-growing start-up headquartered in Paris, incorporated in October 2023. Backed by leading international venture capitalists, our world-class team of scientists and engineers is redefining the frontiers of AI and life sciences.
The Opportunity
We are looking for a meticulous and detail-oriented Biology Data Quality Engineer to ensure the integrity and usability of the varied and complex datasets that are central to our mission. In this critical role, you'll leverage your expertise in biology, data science, and machine learning to ensure the quality and consistency of the biological data used to train and evaluate our foundation models. You'll work in collaboration with the R&D team and our engineers, using your skills to ensure our data meets the highest standards.
What You'll Be Doing
As a Biology Data Quality Engineer, you will own the following tasks:
- Data Validation Pipeline Development: Develop and implement comprehensive data validation protocols for diverse biological datasets (histology, omics, clinical). Ensure data integrity, consistency, and accuracy through rigorous quality checks. Design and implement automated data quality pipelines to streamline data validation and identify potential issues early in the data processing workflow.
- Data Curation & Standardization: Establish and enforce data standardization practices to facilitate seamless integration and analysis across different data types. Curate datasets to enhance their usability for machine learning.
- Collaboration & Communication: Work closely with the R&D team to understand data requirements and address data quality concerns. Communicate data quality findings and recommendations effectively to technical and non-technical stakeholders. Communicate and synchronize with external data providers.
- Documentation & Reporting: Maintain detailed documentation of the data-quality assessment procedures, validation results, and data specifications. Generate regular reports on data quality metrics and trends.
- Data Source Evaluation: Evaluate and validate external public data sources, ensuring they meet our quality standards and are suitable for inclusion in our foundation model training.
- Continuous Improvement: Stay up to date with the latest data quality best practices and tools in the biological domain. Propose and implement improvements to our data-quality assessment processes and pipelines.
What You'll Bring
The successful candidate will have a ‘team-first’ attitude; be independent, curious, and detail-oriented; thrive in a dynamic, fast-paced environment; and be fun to work with. We value individuals who bring strong domain expertise in biology alongside strong hands-on computational skills.
- Omics Data Expertise: Deep understanding of transcriptomics data types (bulk, single-cell, spatial) and their specific quality considerations. Good knowledge of genomics and proteomics data.
- Data Quality Management: Proven experience in implementing data quality control procedures and pipelines. Familiarity with data validation tools and techniques.
- Analytical Skills: Strong analytical and problem-solving skills to identify and resolve data quality issues.
- Programming & Data Analysis: Proficiency in Python and good knowledge of data visualization libraries (e.g. matplotlib).
- Communication Skills: Excellent written and verbal communication skills to effectively convey data quality findings and recommendations.
- Educational Background: MSc in Biology, Computational Biology, or Bioinformatics.
How to Stand Out
- Computational Pathology Data Expertise: Experience in machine learning analysis of histology images.
- Cloud expertise: Experience working with AWS.
- Data Annotation Experience: Experience with developing and implementing data annotation guidelines and processes. Experience with data ontologies.
- Large-Scale Data Collections: Proven experience building or contributing to large-scale data collections (e.g. Human Cell Atlas).
- Multimodal Alignment: Experience with spatial alignment of multimodal datasets (e.g. alignment between different imaging modalities).
The Candidate Journey
- Screening: Once you have applied, the hiring team will review your application to determine if your work experience and skills align with the necessary proficiencies of this position.
- Technical Assessment: Given the technical nature of the role, you will be expected to complete a short set of Python exercises on a dedicated platform to assess your proficiency.
- Interviews:
  - Data Curation Project: The hiring team will have an introductory call with you to share expectations for the role and to provide you with a Data Curation assignment. This assignment covers one or more key data modalities and requires you to submit Python code and a detailed report about your work. You will present your work to the team and have an interview exploring your knowledge of data modalities and relevant technical artefacts.
  - Team Fit: The hiring team will have a call with you to discuss your motivations and expectations for the role, followed by a general Q&A based on your CV and previous experience.
- Offer: Following the completion of the interviews, our hiring team will make a final decision and will be in touch to share the outcome of your interviews. If the team would like to move forward, the recruiter will discuss the details of our proposed offer with you.
- Onboarding: We are happy to have you joining the team. Once you have accepted and signed your offer, we will be in touch to begin the process of onboarding you to Bioptimus.
Why This is a Unique Opportunity
- Be part of a trailblazing team working at the intersection of AI, biotech, and biomedical research.
- Take on a high-impact leadership role, shaping the future of biomedical AI through strategic data partnerships.
- Work in a collaborative, innovation-driven environment with top researchers and industry experts.
We believe that the unique contributions of all Bioptimists create our success. To ensure that our culture continues to incorporate everyone’s perspectives and experience, we never discriminate based on race, religion, national origin, gender identity or expression, sexual orientation, age, marital status, or disability status. Decisions related to hiring are made fairly, and we provide equal employment opportunities to all qualified candidates. We take responsibility for always striving to create an inclusive environment that makes every employee and candidate feel welcome.