Sand Technologies is a global Physical AI company that partners with governments and enterprises to enhance essential systems across various industries. The Senior Data Engineer will design, build, and maintain scalable data architecture to support decision-support applications, collaborating with cross-functional teams to drive data initiatives.
Responsibilities:
- Architect and build a secure, scalable urban data platform integrating multi-agency and infrastructure datasets at scale
- Design resilient cloud-native architectures supporting batch, streaming, and near-real-time operational workloads
- Lead development of high-performance ingestion and transformation pipelines across legacy systems, APIs, IoT/telemetry, and structured data sources
- Implement distributed and event-driven processing systems (e.g., Spark, Kafka, or equivalent) for large-scale analytical and operational use cases
- Establish platform reliability standards, including observability, automated data quality validation, lineage, monitoring, and defined SLAs/SLOs
- Design and enforce strong data governance and access control frameworks, including RBAC, encryption, auditability, and secure data handling practices
- Build modern lakehouse or equivalent architectures that enable advanced analytics, GIS, and production-grade machine learning
- Partner closely with data scientists, ML engineers, and senior stakeholders to operationalize AI and analytics at scale
- Optimize platform performance, scalability, and cost efficiency as adoption grows
- Contribute to long-term architectural direction and mentor engineering team members
Requirements:
- 6+ years designing and operating large-scale semi-distributed data platforms (hybrid centralized and distributed) in cloud or hybrid environments
- Proven experience architecting modern data systems (lakehouse, data mesh, or equivalent) supporting both analytical (descriptive and predictive) and operational workloads
- Deep hands-on expertise with distributed processing frameworks (e.g., Spark) and streaming/event systems (e.g., Kafka or similar)
- Strong experience building secure, governed data environments with robust access controls, encryption, lineage, and audit capabilities
- Experience designing secure data platforms in regulated or government environments, with strong understanding of compliance, auditability, and data protection standards
- Experience integrating heterogeneous data sources, including legacy systems, APIs, telemetry/IoT systems, and relational databases
- Demonstrated ability to design highly available, observable, production-grade data systems
- Experience enabling machine learning and advanced analytics through robust data infrastructure and feature pipelines
- Strong proficiency in Python and SQL, ideally with dbt, and a track record of writing clean, production-quality code
- Experience deploying and operating solutions in AWS, Azure, or GCP, including CI/CD and infrastructure-as-code, is beneficial
- Ability to operate effectively in complex, multi-stakeholder environments
- Strong systems-thinking mindset with a focus on scalability, modularity, and long-term platform evolution
- Experience designing data platforms in U.S. public sector or highly regulated environments, with working knowledge of applicable federal and state data privacy and security requirements (e.g., HIPAA, CJIS, FERPA, state-level privacy acts), and the ability to embed compliance, auditability, and data governance principles into architectural design