Edmunds is a trusted leader in the car buying industry, innovating ways to empower car shoppers. The Data Engineer role involves architecting and scaling foundational data platforms to support analytics and business intelligence, while collaborating with various teams to enhance data-driven decision making.
Responsibilities:
- Pipeline Architecture: Create and maintain scalable, maintainable, and reliable data pipelines that process very large quantities of structured and unstructured data in both batch and real time
- Platform Leadership: Enhance and maintain the data lakehouse that powers the core of the company's decision-making process
- Core Systems Engineering: Work hands-on with our transactions and pricing pipelines, building and optimizing data workflows using Spark, Databricks, SQL, Scala, and Python
- Process Engineering: Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability
- Infrastructure & Cloud: Support infrastructure and build processes, working within AWS to deploy and maintain reliable, scalable data systems
- Cross-functional Collaboration: Work with stakeholders including the Executive, Product, and Data teams to assist with data-related technical issues and support their data infrastructure needs
- Technical Guidance: Design solutions, troubleshoot pipeline issues, and ensure data quality across critical business systems. Collaborate with team members to deepen your knowledge of the systems and to ensure that code changes meet business goals and technology best practices
Requirements:
- High proficiency in at least one object-oriented or functional programming language (almost all of our codebase is in Scala and Python)
- Fluency in SQL and demonstrated experience writing ETL jobs and working with data at scale
- Experience writing and maintaining real-time/streaming data pipelines
- Familiarity with some of the following: Spark, Scala, Python, AWS, Databricks, Airflow
- Demonstrated ability to design and write maintainable software, paired with an understanding of software engineering best practices, object-oriented analysis and design, design patterns, and algorithms
- Experience enhancing and evolving existing systems
- Demonstrated problem-solving, troubleshooting, and communication skills, especially in a hybrid and remote environment
- Familiarity with cloud-based data platforms (particularly Databricks and AWS) and CI/CD build pipelines
- Desire to learn new technologies
- Exposure to AI/ML workflows or an interest in integrating machine learning into data pipelines