ZoomInfo, where careers accelerate, is seeking a Senior Software Engineer to join its Web Data team. The role involves building the next generation of ZoomInfo's web crawling and data extraction infrastructure, with a focus on engineering execution and collaboration.
Responsibilities:
- Design and implement components of scalable, fault-tolerant web crawling and extraction pipelines
- Write clean, production-grade code in Java and Python
- Build and operate ETL/ELT pipelines for large-scale data extraction and transformation
- Work with cloud infrastructure on GCP and AWS, primarily on GKE
- Improve observability, reliability, and operational excellence across the systems you contribute to
- Partner with product and data science teams to deliver impactful solutions
- Contribute to code reviews, documentation, and knowledge sharing across the team
- Stay current with evolving web technologies, anti-crawling mechanisms, and AI-powered extraction approaches
Requirements:
- 5+ years of professional software engineering experience building production systems
- Strong CS fundamentals: algorithms, data structures, concurrency, distributed systems
- Proficiency in Java and/or Python
- Track record of owning features end-to-end from design through deployment and operation
- Comfortable making sound architectural decisions at the component level
- Hands-on experience with cloud data warehouses such as BigQuery or Snowflake
- Experience designing and operating large-scale ETL/ELT pipelines
- Experience with orchestration tools such as Apache Airflow
- Experience with streaming or event-driven systems such as Apache Kafka
- Production experience on GCP (preferred) or AWS; multi-cloud exposure is a plus
- Hands-on experience with Kubernetes (GKE/EKS) for distributed workloads
- Familiarity with infrastructure-as-code tooling such as Terraform
- Strong communicator who can explain technical decisions clearly
- Comfortable operating in ambiguity and iterating quickly
- Bias toward action and pragmatic problem solving
- Self-starter who thrives in fast-paced, evolving environments
- Experience with web crawling at scale (Scrapy or similar frameworks)
- Familiarity with proxy infrastructure, rotation strategies, or anti-bot evasion techniques
- Experience extracting structured and unstructured web data from diverse site architectures
- Knowledge of SERP (Search Engine Results Page) extraction
- Comfort with AI/LLM-based extraction approaches, such as applying language models to HTML at scale
- Experience working in a B2B data company or data-as-a-product environment