Silverchair is the premier independent platform partner for scholarly and professional publishers, dedicated to expanding the reach of the world’s most valuable knowledge. The Data Engineer will build and maintain data pipelines that turn scholarly publishing activity into insights for clients, ensuring reliable data flow and resolving production data issues.

Responsibilities:

Design, build, and maintain data pipelines that ensure reliable data flow from source systems through transformation layers to reporting
Integrate data quality checks and validation into the pipeline workflow
Implement error handling, logging, and retry capabilities to keep pipelines robust and recoverable
Develop SQL and Python-based transformations that cleanse, enrich, and structure data for analytical use
Design and implement dimensional models including fact tables and dimension tables
Monitor and tune pipeline and query performance
Use execution plans and profiling tools to identify bottlenecks and improve throughput and efficiency
Troubleshoot and resolve production data issues using logs, monitoring tools, and systematic debugging
Ensure pipelines run reliably and data is delivered on schedule
Work closely with your scrum team and cross-functional partners across analytics, product, and engineering
Document pipeline designs, data lineage, and business rules
Participate in code reviews and contribute to team knowledge sharing

Requirements:

3-5 years of professional experience in data engineering or a closely related role
Bachelor's degree in Computer Science, Data Science, Information Systems, or a related field, or equivalent practical experience
Strong SQL skills including complex joins, CTEs, window functions, aggregations, views, functions, and stored procedures
Ability to write clean, modular Python using functions and classes
Experience designing dimensional models (star schema, fact/dimension tables)
Hands-on experience building data pipelines with orchestration tools
Production experience with Azure Data Factory and Azure Synapse Analytics (Dedicated SQL Pool, Serverless, Spark) is required
Understanding of data partitioning, shuffling, and distribution strategies
Proficient with Git for branching, merging, and pull request workflows
Comfortable working in an Agile/Scrum environment with CI/CD practices
Microsoft DP-700 (Fabric Data Engineer Associate) or Databricks Data Engineer Associate certification is a nice-to-have
Hands-on experience with modern lakehouse or unified analytics platforms (e.g., Databricks, Microsoft Fabric, Snowflake)
Familiarity with Kafka-based event streaming (we use Confluent)
Experience with Change Data Capture (CDC), incremental ingestion strategies, and preservation of historical data
Familiarity with BI tools such as Power BI, including an understanding of how dimensional models support semantic models and reporting
Comfortable using AI coding tools as part of your workflow (we use Claude Code)
Ability to work within Eastern Time Zone hours (8 a.m. to 5 p.m.)
