Senior Data & AI Engineer
bit.bio is an award-winning spinout from the University of Cambridge. Our breakthrough technology combines synthetic and stem cell biology for the precise, efficient and consistent reprogramming of human cells used in research, drug discovery, and cell therapy. At bit.bio, we are passionate about engineering human cells that will enable the medicine of the future. To do this we need talented and curious people who want to make an impact on the future of science and therapeutics.
As a team of individuals, we value science, collaboration, openness, curiosity and creativity. We are united by trust and respect for each other.
Location: Babraham Research Campus, Cambridge
Type: Full time, permanent
Start: Immediate
Salary: Competitive / Hours: 40 p/w
Office-based position - Cambridge, UK, with hybrid working available
Your role in our team:
At bit.bio, we’re setting our sights on embedding data and AI into our technology stack in new and innovative ways. We are looking for an experienced and passionate Senior Data & AI Engineer to partner with teams across the company, address their data management challenges and help us implement an easy-to-use, compliant data and AI platform. In this highly collaborative role you will help acquire, curate, store and retrieve any type of data and metadata required by the teams. You will be working with data from cutting-edge, novel biological assays and experiments at never-before-seen scales.
As Senior Data & AI Engineer you will apply the latest technologies in highly translational biotech research, contributing directly to cutting-edge discoveries. You will support the development of a data pipeline for continuous ingestion of new datasets from external and internal sources into the relevant databases so that they can be integrated into decision making and discovery. You will work closely with multiple teams within the company to standardise, industrialise and maintain our data platforms and ensure they are AI-compatible, supporting our vision to create a seamless, AI-first data infrastructure that supercharges our ability to innovate and discover.
Your key responsibilities will include:
- Create and maintain AI-ready data flows and cross-talk from different sources (AWS, LIMS/ELN, Google Workspace, Confluence, Veeva, laboratory instrumentation).
- Develop and oversee the design of data architectures, including data models, data integration, and data storage solutions. Collaborate with the IT team to implement robust data infrastructure and ensure seamless, AI-compatible data flow across systems.
- Enable the use of AI-driven data analytics and business intelligence tools to derive valuable insights from data. Promote data-driven decision-making across the organisation.
- Operationalise ML, GenAI & RAG - strengthen our LLM-based assistant so colleagues can ask plain-English questions and get clear answers.
- Implement data quality management processes to identify and rectify data errors or inconsistencies. Monitor data quality metrics and take corrective actions as needed. Ensure data accuracy and reliability.
- Advise on optimised data analysis infrastructure and resources.
- Develop libraries and APIs for access to data.
- Provide secure AI tools and access to data.
- Keep abreast of the evolving AI landscape, identify potential opportunities that align with company objectives, and present them to the team.
- Champion “AI-first” best practices.
You…
- Have a Bachelor's degree in a field with a strong quantitative and informatics aspect (such as Computer Science / Mathematics / Statistics / AI) or equivalent experience, followed by significant experience developing and working with a variety of databases and data sets, preferably within the Life Science sector.
- Have demonstrable experience independently supporting the data needs of small groups of data scientists and informaticians.
- Are a strong collaborator, used to working cross-functionally across all levels within a growing organisation and able to manage multiple requests and priorities in a fast-paced environment.
- Have a deep interest in data as a resource, with instinct and passion for supporting missions that are data-driven and depend on getting data provisioning right.
- Have a proactive problem-solving approach and a high level of initiative, along with excellent organisational skills.
- Are highly proficient in spoken and written English.
With essential experience in…
- RAG architectures (LangChain/LlamaIndex) and vector DBs (pgvector), including embedding model fine-tuning and embeddings workflows.
- Developing and working with a variety of databases and data sets.
- Developing / optimising high-volume data pipelines, large datasets and big-data architectures.
- Deploying or supporting AI/ML workflows.
- Successfully building processes for transforming data, creating unique data structures to suit end uses, ensuring sufficiency of metadata, and developing methods for automated delivery of data sets (software tools, APIs).
- Working on building and using data stores in AWS.
- Working with a variety of stakeholders and cross-functional teams, analysing and documenting their data requirements.
- Big data tools and stream-processing systems such as: Hadoop, Spark, Kafka, Storm, Spark-Streaming.
- Relational SQL and NoSQL databases, including Postgres and Cassandra.
- Designing and implementing knowledge graphs for data integration and analysis.
- Data pipeline and workflow management tools: Luigi, Airflow, etc.
- AWS cloud services: EC2, S3, Glue, Athena, API Gateway, Redshift.
- Object-oriented and scripting languages: Python, R.
- Designing and building APIs (RESTful, etc.)
- Understanding of FAIR principles.
and possibly...
- Genomics or life-science datasets.
- Experience measuring ROI of GenAI tools (usage analytics, time‑saved metrics, user‑feedback loops).
- Experience working within a rapidly expanding, scale-up biotech environment.
- Practical experience working to ISO standards.
More reasons to join us:
bit.bio provides a vibrant and dynamic work environment in an exciting, fast-moving time for biology. We work with cutting-edge technologies and with our world-leading scientific advisory board. We conduct pioneering work with real-world impact.
We trust our people to make significant contributions early on with opportunities to be involved in projects that are key to the success and growth of our young company. We invest in people, creating opportunities for personal development in an inclusive multi-skilled team with ambitious goals that provide opportunities to learn on the job from each other.
We encourage creativity and open minds so that everyone can contribute to the success of the company.
For information on how we will manage your data, please see our Candidate Privacy Notice.