Design, build, and maintain scalable data pipelines that support continuous data flow, analytics, and machine learning use cases
Develop and implement data models and schemas to ensure data integrity, accessibility, and usability
Manage and optimize relational and NoSQL database systems for performance and availability
Integrate data from multiple sources while ensuring consistency, quality, and reliability
Design and manage ETL processes to move and transform data across systems
Implement data governance practices to support data security, privacy, and compliance
Collaborate with data scientists, machine learning engineers, analysts, platform engineers, and business stakeholders to understand requirements and deliver effective solutions
Tune database and pipeline performance for efficient processing and querying of large datasets
Maintain clear documentation for data processes, models, architecture, and operational workflows
Partner with internal platform teams to build reliable, scalable, and reusable data solutions
Requirements
Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent practical experience
Experience programming in Python, Java, Scala, or a similar language
Strong SQL skills and experience working with relational databases such as MySQL or PostgreSQL
Experience with NoSQL databases such as MongoDB, Cassandra, or similar technologies
Understanding of data modeling, data architecture, and ETL processes
Experience with big data technologies and frameworks such as Kafka, Flink, Parquet, Iceberg, or similar tools
Experience using workflow orchestration or ETL tools such as Apache Airflow
Experience with cloud platforms such as AWS, Azure, or Google Cloud
Familiarity with cloud data services such as AWS Glue, EMR, Redshift, S3, BigQuery, or similar offerings
Experience with data warehousing solutions such as Snowflake, Redshift, or BigQuery
Experience with version control systems such as Git, and with CI/CD pipelines
Tech Stack
Airflow
Amazon Redshift
Apache
AWS
Azure
BigQuery
Cassandra
Cloud
ETL
Java
Kafka
MongoDB
MySQL
NoSQL
Postgres
Python
Scala
SQL
Benefits
From health and financial benefits to time away and everyday wellness, we give Autodeskers the best, so they can do their best work.