Data Quality Engineer (Cell Therapy Domain)
Company Background
Our client is revolutionizing the field of cell therapy manufacturing by developing an advanced, scalable, and cost-effective solution that improves accessibility for patients in need. With a mission-driven culture and a multidisciplinary team, they are dedicated to accelerating life-saving therapies using cutting-edge technologies and innovation.
Project Description
You will be part of the data engineering team responsible for maintaining the accuracy, reliability, and integrity of data within a cloud-based Databricks Lakehouse platform. The focus will be on implementing automated data validation across the entire pipeline, ensuring data quality from ingestion through processing in the Bronze, Silver, and Gold layers of the Medallion architecture. This is a highly visible, cross-functional role in a fast-paced biotech environment.
Technologies
- Python, Pytest, SQL
- Databricks, Delta Lake, Lakehouse architecture
- Azure (Blob Storage, Data Services)
- Unity Catalog (data governance & lineage)
What You'll Do
- Build and maintain automated data validation tests using Databricks notebooks and Pytest (see the first sketch after this list)
- Validate data ingestion, transformation, and loading processes across the Medallion architecture
- Test for data accuracy, completeness, consistency, timeliness, and uniqueness
- Perform data reconciliation between source systems and target tables (see the second sketch after this list)
- Integrate automated data quality checks into the CI/CD pipeline to prevent regressions
- Monitor data freshness, schema evolution, and volume anomalies (see the third sketch after this list)
- Collaborate with data engineers, product owners, and business stakeholders to ensure quality requirements are met
- Support data governance initiatives using tools like Unity Catalog (masking, sensitivity, lineage)
- Report progress and findings during daily stand-ups and agile ceremonies
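To illustrate the first responsibility above, here is a minimal Pytest sketch of an automated completeness and uniqueness check against a Delta table. It assumes a Databricks/PySpark environment, and the table and column names (silver.patient_samples, sample_id) are hypothetical placeholders rather than the client's actual schema.

```python
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # On a Databricks cluster an active session already exists;
    # getOrCreate() simply returns it.
    return SparkSession.builder.getOrCreate()


def test_sample_id_is_never_null(spark):
    # Completeness: the business key must be populated on every row.
    df = spark.table("silver.patient_samples")  # hypothetical table
    assert df.filter(df.sample_id.isNull()).count() == 0


def test_sample_id_is_unique(spark):
    # Uniqueness: no duplicate business keys should survive the Silver layer.
    df = spark.table("silver.patient_samples")
    assert df.count() == df.select("sample_id").distinct().count()
```

Checks like these can run from a Databricks notebook during development and from a CI/CD job on every deployment.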
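The second sketch shows source-to-target reconciliation, comparing a row count and a column-level checksum between a Bronze source and a Silver target; again, the table and column names (bronze.raw_batches, silver.batches, cell_count) are assumptions for illustration.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

source = spark.table("bronze.raw_batches")  # hypothetical source table
target = spark.table("silver.batches")      # hypothetical target table

# Row counts should match once rejected records are accounted for.
assert source.count() == target.count(), "row count mismatch"

# Column-level checksum: the same numeric measure aggregated on both sides.
src_total = source.agg(F.sum("cell_count")).collect()[0][0]
tgt_total = target.agg(F.sum("cell_count")).collect()[0][0]
assert src_total == tgt_total, "cell_count checksum mismatch"
```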
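The third sketch covers freshness and volume monitoring, assuming each curated table carries an ingestion timestamp column; the table name, column name, and thresholds below are all placeholders.

```python
import datetime

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.table("gold.therapy_metrics")  # hypothetical table

# Freshness: the newest record should be no older than 24 hours.
latest = df.agg(F.max("ingested_at")).collect()[0][0]
assert latest >= datetime.datetime.now() - datetime.timedelta(hours=24)

# Volume anomaly: today's load should fall within an expected band.
today_rows = df.filter(F.to_date("ingested_at") == F.current_date()).count()
assert 1_000 <= today_rows <= 100_000  # placeholder thresholds
```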
Job Requirements
- 5+ years of experience in data quality engineering, data validation, or similar roles
- Strong Python skills for scripting automated data checks (Pytest preferred)
- Deep hands-on experience with Databricks and Delta Lake
- Proficiency in SQL for querying, troubleshooting, and data validation
- Solid understanding of data warehouse concepts, including dimensional modeling (star/snowflake schemas)
- Experience working with Azure data services
- Proven experience in testing strategy design and execution for data pipelines
- Ability to work independently and communicate effectively in cross-functional teams
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong problem-solving and analytical skills
- English level: B1+ or higher (both spoken and written)
What We Offer
The global benefits package includes:
- Technical and non-technical training for professional and personal growth;
- Internal conferences and meetups to learn from industry experts;
- Support and mentorship from an experienced colleague to foster your professional growth and development;
- Internal startup incubator;
- Health insurance;
- English courses;
- Sports activities to promote a healthy lifestyle;
- Flexible work options, including remote and hybrid opportunities;
- Referral program for bringing in new talent;
- Work anniversary program and additional vacation days.