GE Aerospace, a leading company in the aerospace industry, is seeking a Senior AI Data Engineer to design and build the AWS-native data foundation for its enterprise AI applications. The role combines leading the design strategy for AI systems with hands-on engineering and mentoring team members, while ensuring data quality and governance.
Responsibilities:
- Lead the design and evolution of the knowledge graphs and ontologies powering our AI's reasoning, retrieval, and explainability
- Align enterprise data (engineering handbooks, parts, service manuals, DMAIC records, user files) into a coherent, queryable graph with clear provenance across structured, semi-structured, and unstructured sources
- Own the retrieval substrate — graph queries, vector indexes, and hybrid retrieval — and drive measurable improvements in grounding quality
- Curate grounding corpora, eval datasets, and retrieval benchmarks for LLM-based features
- Instrument metrics for retrieval quality, grounding accuracy, and freshness; drive regressions down over time
- Shape training and inference data contracts with AI engineers, including feedback loops from user signals
- Produce conceptual, logical, and physical data models for operational and analytical workloads; establish modeling standards, naming conventions, and reuse patterns
- Build ingestion and transformation pipelines in Python and SQL using AWS services — Glue, Lambda, Step Functions, S3, Athena, OpenSearch, Neptune — and AI services such as Bedrock and Bedrock Knowledge Bases
- Author infrastructure as code in CloudFormation (CDK welcome) and apply AWS best practices for IAM, security, cost, and observability
- Profile sources, identify data quality gaps, and design automated validation, monitoring, metadata, and lineage
- Partner with security and platform teams to integrate data access with enterprise identity and access policies as we modernize for AI
- Define data contracts, attributes, and metadata that policy engines can reason over for attribute- and context-based access control
- Contribute to the technical data dictionary, business glossary, and data catalog
- Set the design direction for data and semantic modeling across the team
- Mentor engineers and citizen developers on modeling, ontology design, and retrieval engineering
- Communicate tradeoffs and value clearly to product, business, and executive stakeholders
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a STEM field with 3+ years of data engineering experience; OR high school diploma / GED with 7+ years of equivalent experience
- Legal authorization to work in the U.S. is required. We will not sponsor individuals for employment visas, now or in the future, for this job
- 5+ years of hands-on data engineering with a track record of designing — not just implementing — data models and semantic layers
- Production experience with knowledge graphs and ontologies (Neo4j, Neptune, TigerGraph, RDF/SPARQL, or similar) and graph query languages (Cypher, Gremlin, SPARQL)
- Strong AWS proficiency required: CloudFormation (or CDK), Glue, Lambda, Step Functions, S3, IAM, Bedrock, Bedrock Knowledge Bases; OpenSearch and Neptune a plus
- Strong Python and SQL; comfort across relational, graph, vector, and document stores
- Experience supporting AI/ML or LLM systems — RAG pipelines, embeddings, eval datasets, grounding corpora
- Experience integrating data access with enterprise identity and policy systems
- Strong cross-functional collaboration and communication, including technical presentations to non-data audiences