Wheelhouse is a revenue management platform for the flex rental space that aims to empower operators of short and mid-length stays. The company is seeking a Senior Data Engineer to oversee and optimize its data ingestion processes and to ensure the reliability and scalability of the data infrastructure behind its multi-terabyte dataset.
Responsibilities:
- Architect, optimize, and supervise the daily ingestion pipelines that process calendar data for 14+ million listings
- Tune and scale our AWS RDS PostgreSQL and Aurora databases to handle extremely high-throughput read/write operations and multi-TB storage
- Optimize background job orchestration to ensure timely and efficient data processing
- Design and build robust ETL processes to integrate new, complex data sources into our ecosystem
- Collaborate with application engineers to make diverse versions and aggregations of our data easily accessible to the core application
- Apply DevOps best practices to maintain and improve our AWS infrastructure, CI/CD pipelines, and infrastructure-as-code
- Build comprehensive monitoring, alerting, and observability tooling to catch data bottlenecks before they impact the business
Requirements:
- 10+ years of overall engineering experience at technology companies, including 5+ years of Data Engineering or Backend Engineering experience with massive, high-velocity datasets (multi-TB scale)
- Deep database expertise: Advanced knowledge of relational databases, specifically PostgreSQL and AWS Aurora. You must know how to tune databases, optimize complex queries, and manage large-scale indexing
- Strong programming skills: Proficiency in Ruby and Ruby on Rails (or a strong willingness to learn, backed by expert-level experience in a similar language like Python) to navigate our core stack
- Job Orchestration: Experience with background processing frameworks at scale (Resque, Sidekiq, Celery, or similar)
- DevOps & Cloud: Hands-on experience with AWS cloud services, infrastructure provisioning (Terraform/CloudFormation), and CI/CD pipelines
- Data Modeling: Strong ability to design data models that balance fast ingestion with efficient application querying
- Problem Solver: A track record of identifying performance bottlenecks in complex distributed systems and implementing elegant solutions
- Remote DNA: Demonstrated ability to work autonomously, communicate asynchronously, and manage your own 'healthy hustle'