Defense Unicorns is a company focused on delivering secure solutions for continuous software integration and delivery. They are seeking a Data Engineer to work closely with mission heroes in Department of Defense environments, deploying and integrating data capabilities while ensuring operational stability and compliance.

Responsibilities:

Deploy and configure UDS Data Capability in the mission hero's environment. Stand up the UDS Store (Iceberg, Rook/Ceph, pgvector, Postgres), wire up UDS Transit for air-gap data movement, configure UDS Govern policies (Pepr/Lula), and integrate UDS Connect (Strimzi/Kafka) where streaming or legacy connectors are required
Build and support integrations with existing mission systems. Connect UDS Data Capability to legacy databases, flat-file drops, SOAP/REST endpoints, message buses, existing object storage, and identity providers (Keycloak, mission-side SSO). Prior experience integrating with these types of systems is more relevant than experience building on them
Contribute to mapping complex data landscapes. Some engagements involve hundreds of interconnected systems of record with overlapping schemas and deeply interdependent data flows. You'll help trace how data moves across systems and identify dependencies alongside senior engineers and government stakeholders
Build pipelines that move data through classification boundaries, including ingestion, transformation, catalog registration, model/dataset packaging via Zarf, cross-domain transit, and eventual consistency across DDIL conditions
Implement data provenance, lineage, and governance practices. Track where data came from, how it transformed, and who can access it
Operate what you deploy. Day-2 ownership includes capacity, performance, backup/restore (Velero), observability (Vector/Loki), incident response, and upgrade paths. Hand off to the mission hero's ops team once it's stable
Generate accreditation artifacts, including STIG evidence, cATO documentation, FIPS validation notes, and policy mappings. You produce the evidence the mission hero's ISSM/ISSO needs to run this in IL4/IL5
Contribute field feedback to product and engineering. File issues, write postmortems, and surface what's working and what's breaking in the mission environment
Support training and knowledge transfer. Contribute to runbooks, architecture docs, and working sessions that leave the mission hero's team self-sufficient

Requirements:

Unstructured data at scale. Experience storing and querying large unstructured datasets using data lake architectures. Spark experience preferred
Streaming & integration. Experience with stream processing infrastructure (Kafka, Redpanda, Flink, or equivalent) and bridging data from heterogeneous sources into modern pipelines
Data warehousing. Open-source data warehousing platform experience. These environments do not support proprietary platforms, so you need to be comfortable building without them
Pipelines & orchestration. Airflow, Dagster, Argo Workflows, or similar. Comfort building, scheduling, monitoring, and recovering production data pipelines
Data modeling & SQL. Fluent in SQL. Comfortable designing schemas for both analytical and operational workloads
Open-source orientation. You are comfortable building on open-source tooling and contributing back to it
U.S. citizenship and the ability to obtain and maintain a DoD security clearance. Clearance sponsorship available for the right candidate
Comfort working directly with mission heroes and government stakeholders. Clear communication with both technical and non-technical audiences
Comfort with periodic on-site work, sometimes for days at a stretch, and equal comfort working remotely
Bias toward delivery. Preference for shipping a working integration over perfecting a design that hasn't met a real workload
Self-direction. You will encounter environments and problems that are not yet documented and will need to work through them independently

Preferred qualifications:

Data provenance, lineage, and governance. Experience with lineage tracking, data catalogs, provenance systems, or governance frameworks. Depth here will be weighted heavily
DoD or defense program experience
Active Secret clearance (or higher)
Lakehouse & storage: Apache Iceberg (or Delta/Hudi), object storage (Ceph/S3-compatible), Postgres (including extensions like pgvector), columnar/OLAP engines (Trino, DuckDB, ClickHouse, Spark SQL)
Change Data Capture: Debezium or similar CDC patterns
Governance, catalog & access: REST catalogs (Iceberg REST, Polaris/Gravitino/Nessie family), ABAC/RBAC patterns, OIDC/OAuth, lineage and audit
Kubernetes awareness. Deep K8s expertise is not required; general familiarity with deployments, operators, and how applications run on Kubernetes is valuable
Linux fundamentals, container runtime behavior, networking, TLS, secrets management
IaC (Terraform, Pulumi, or similar) and GitOps patterns (Flux, ArgoCD)
Familiarity with the CNCF ecosystem, including the distinction between foundation projects and single-vendor projects
AI/ML awareness. General understanding of how data infrastructure supports model training, versioning, provenance, and AI operations
Familiarity with Air Force or Space Force systems of record (e.g., MILPDS, ARMS) and how data flows between them

FDE Data Engineer - Space
