About this role
Role Overview
Collaborate with scientific teams to design scalable data capture systems and user-friendly visualizations of experimental results
Design and implement database architectures, data models, and automated data pipelines across platforms (e.g., Amazon Web Services (AWS), laboratory information management systems (LIMS), Benchling, and contract research organization (CRO) systems)
Develop structured data models capturing microbial strain metadata, growth conditions, and experimental outputs
Partner with researchers to standardize experimental metadata, ontologies, and data structures to improve data quality, reproducibility, and traceability
Build and maintain data warehouses supporting probiotic growth, strain characterization, and clinical datasets
Develop and implement robust data quality and integrity checks across data ingestion workflows, including those that integrate externally generated clinical data
Build pipelines integrating multi-modal datasets (e.g., sequencing, metabolomics, in vitro assays, and clinical endpoints)
Partner cross-functionally to prototype and deploy digital solutions that enhance research and development efficiency and scalability
Requirements
Ph.D. in Data Engineering, Computational Biology, Data Science, or a related field with relevant experience; or M.S. with equivalent professional experience
Strong SQL skills and experience with database systems such as SQL Server, PostgreSQL, or Oracle
Proficiency in Python, R, or Spark for data processing and analysis
Experience working with biological or experimental datasets with complex metadata structures
Familiarity with statistical or computational analysis of biological systems
Demonstrated ability to manage multiple projects and work independently in a dynamic environment
Strong communication and collaboration skills, with experience partnering effectively with scientific teams
Experience designing scalable data architectures and optimizing data workflows for downstream users
Tech Stack
AWS
Oracle
Postgres
Python
Spark
SQL
Benefits
Opportunity to work on impactful, real-world data and digital transformation projects
Exposure to cutting-edge microbiome and probiotic research
Collaborative environment with highly skilled scientists, engineers, and data experts
Access to learning and development opportunities to grow technical and domain expertise
Inclusive and purpose-driven culture focused on innovation and sustainability