Nexaminds is on a mission to redefine industries with AI, focusing on innovation and collaboration. They are seeking a Senior Data Engineer to lead the development, optimization, and scaling of data solutions, particularly using Databricks in a fast-paced environment.

Responsibilities:

Design and build the core reusable ingestion engine in Python and ADF — parameterised, config-driven, zero hardcoding
Build Python ingestion modules: file readers, schema validators, format handlers (CSV, EDI X12, FHIR R4, Parquet, JSON)
Implement PySpark / Scala transformation components for batch and streaming at scale on Azure Databricks
Write config-driven SQL data models for Bronze, Silver, Gold medallion transformations
Develop a metadata-driven validation layer: null checks, type enforcement, range rules, referential integrity
Build reusable utility libraries: logging, error handling, retry logic, dead-letter routing
Implement Databricks notebooks and DLT (Delta Live Tables) pipelines for declarative transformations
Build and maintain the onboarding template library v1 and v2 — parameterised, documented, production-ready
Onboard Provider, Claims, Member, Eligibility, and Reference data domains using the framework
Write unit tests, integration tests, and data contract tests (pytest, Great Expectations or equivalent)
Optimise Spark jobs: partitioning, caching, broadcast joins, Z-ordering on Delta tables
Participate in code review, follow GitHub branching standards, and contribute to documentation

Requirements:

5+ years of data engineering in production Azure environments, using Python, SQL, and Spark
Python: production-grade OOP, config-driven design, no hardcoding, type annotations
PySpark / Spark: DataFrames, schema enforcement, partitioning, performance tuning
SQL: advanced window functions, CTEs, incremental load patterns, Delta Lake DML (MERGE, UPDATE, DELETE)
Azure Data Factory: parameterised pipelines, linked services, triggers, integration runtime (IR) configuration
Azure Databricks: notebooks, Jobs API, DLT, cluster configuration, Unity Catalog access
ADLS Gen2, Delta Lake / Parquet format, Medallion store patterns
Testing discipline: pytest, unit and integration tests, data quality assertions
Git: feature branching, PR workflow, commit discipline, code review
Scala: Spark Dataset API, typed transformations, sbt build tooling
Healthcare data formats: EDI X12 (837/835/834), FHIR R4 resource parsing
Delta Lake: schema evolution, time travel, OPTIMIZE, VACUUM, Z-ordering
dbt (data build tool) for SQL transformation layering and lineage documentation
Databricks Asset Bundles (DABs) for pipeline-as-code deployment
DP-203 Azure Data Engineer Associate certification

Sr Data Engineer (Databricks) (Canada)
