Ellipsis Health is creating cutting-edge AI/ML products that solve healthcare staffing issues and administrative burdens using conversation-based software and patented voice biomarker technology. They are seeking an experienced Senior Data Platform Engineer to lead the design and development of a scalable data platform that supports analytics and ML Ops while collaborating with various teams to implement end-to-end pipelines.

Responsibilities:

Lead the design, development, and operation of a scalable and secure data platform to support analytics, ML Ops, and business intelligence
Collaborate closely with Data Science, Machine Learning, Application and DevOps teams to implement end-to-end ML Ops pipelines
Architect and manage data warehousing solutions using Databricks, dbt, and Spark
Develop and maintain ETL/data pipelines that handle structured and unstructured data across diverse sources
Optimize data storage, access, and processing for cost-efficiency and performance in GCP and AWS Cloud environments
Build and maintain dashboards and analytics solutions using tools such as Sigma, Metabase, and other BI platforms
Ensure compliance with data governance, security, and privacy best practices, including HIPAA, SOC-2, and other regulatory requirements
Evaluate and integrate third-party anonymization and security solutions to protect sensitive data
Provide strategic guidance on the evolution of the data platform to meet the company's growth and technical needs
Design and implement scalable infrastructure for Large Language Model (LLM) operations, including training, fine-tuning, and inference workflows
Collaborate with AI/ML teams to build and optimize LLM serving platforms for real-time and batch processing
Develop monitoring and observability solutions for LLMs, ensuring model performance, cost-efficiency, and compliance with ethical AI guidelines
Evaluate and integrate state-of-the-art LLM technologies into existing data platforms to enhance analytics and decision-making

Requirements:

Bachelor's or Master's Degree in Computer Science or equivalent experience
5+ years of industry experience in designing and building large-scale data platforms
Strong expertise in SQL, Data Modeling, and Data Warehousing (Databricks, Snowflake, Redshift, BigQuery, etc.)
Proficiency in writing advanced SQL and tuning query performance
Strong proficiency in Python for building, optimizing, automating, and maintaining data pipelines and services
Deep experience with Apache Spark and distributed data processing frameworks
Hands-on experience with modern ETL/Orchestration frameworks such as Airflow, dbt, and others
Knowledge of business intelligence tools such as Sigma, Metabase, Tableau, and Looker
Strong familiarity with cloud-based infrastructure and managed data services in GCP and AWS Cloud
Experience with CI/CD pipelines to automate testing, deployment, and release of data engineering and analytics workflows using GitLab, GitHub, etc.
Experience with tools like Kubernetes, Terraform, Pub/Sub, and Debezium
Exposure to building data quality frameworks and automation
Understanding of data governance, privacy, and regulatory frameworks (HIPAA, SOC-2, HITRUST)
Experience working with ML Ops platforms and supporting Data Science teams
Experience with ML Ops tools such as MLflow, Streamlit, and vector databases
Familiarity with healthcare data standards (FHIR, HL7)
Experience in real-time data processing and event-driven architectures
Expertise in implementing data access controls and anonymization techniques
