Johns Hopkins University is seeking a Research Data Engineer II to support research investigators and leadership by designing and deploying complex data architectures. The role involves managing and supporting data pipelines for enterprise research data products and collaborating with stakeholders to understand requirements and implement solutions.
Responsibilities:
- Contribute to the design, production, and maintenance of data pipelines for data acquisition, management, and transformation, and to the back-end code development that powers data web applications and converts raw data into usable information
- Write and maintain ETL/ELT processes that operate on a variety of structured and unstructured sources
- Develop and maintain web data scraping systems for automatic data acquisition
- Help design data architecture and provide ongoing support
- Load data into and extract data from databases, and run queries
- Create scripts to clean, transform, and analyze data
- Deploy data pipelines to production using data warehousing systems
- Create and implement production software to monitor data quality and detect data anomalies
- Perform daily manual data quality assurance tasks
- Support, maintain, and troubleshoot the software infrastructure
- Source data, conduct analyses, visualize data, and generate insights to support ongoing research projects and other requests across the organization
- Collaborate with developers, analysts, data scientists, researchers, policy experts, and other partners
- Communicate with Division leadership and others on the team
- Collaborate with external partners, contractors, and vendors
- Other duties as assigned
Requirements:
- Bachelor's Degree
- Five years of related work experience focused on database management and design and on business requirements gathering
- Additional education may substitute for required experience, and additional related experience may substitute for required education beyond a high school diploma/graduation equivalent, to the extent permitted by the JHU equivalency formula
- Experience with data standards such as controlled vocabularies (e.g., SNOMED, LOINC, ICD) and the OMOP common data model
- Experience working with EHR data, particularly Epic data models such as Clarity and Caboodle
- Experience working with a variety of data types such as semi-structured and unstructured data
- Experience working in a highly decentralized, consensus-driven environment, such as an academic institution
- Experience directly engaging with end users to understand requirements and implement successful architectures
- Thorough knowledge of data warehouse and data management principles and processes, and of database development
- Strong proficiency in SQL programming, query writing, query performance tuning, and database technologies