Peraton is a next-generation national security company that drives missions of consequence spanning the globe. They are seeking a Data Scientist / ML Platform Engineer to contribute across the full ML development lifecycle, focusing on applied data science and MLOps while collaborating with dedicated infrastructure engineers.

Responsibilities:

Develop, train, and evaluate ML models (classification, regression, clustering, anomaly detection) and contribute to LLM-based capabilities such as RAG pipelines and prompt evaluation
Support model governance and deployment practices using MLflow, including experiment tracking, model versioning, registry promotion workflows, and automated testing across the ML lifecycle
Contribute to production ML operations: model performance monitoring, drift detection, automated alerting, and incident escalation to maintain reliability and SLA compliance
Build and improve model serving infrastructure, feature pipelines, and lifecycle automation to support reproducible, scalable model development and inference
Apply explainability techniques (e.g., SHAP, LIME) and produce technical documentation to support stakeholder transparency and compliance requirements
Contribute to data ingestion, ELT/ETL transformation, and pipeline reliability using Spark and SQL-based frameworks within Snowflake and Databricks environments
Support pipeline orchestration, medallion architecture conventions, and data stewardship practices (metadata management, PII handling, lineage tracking in Unity Catalog)
Perform occasional system administration tasks in collaboration with platform teams, including environment configuration, access management, compute troubleshooting, and secrets handling using platform-native tools

Requirements:

Associate's degree with 6 years, Bachelor's degree with 4+ years, or Master's degree with 2+ years of relevant experience; a High School diploma with 8 years of experience is accepted in lieu of a degree
Demonstrated experience with SQL and Python, including Python-based ML frameworks (e.g., scikit-learn, XGBoost, PyTorch, or TensorFlow)
Hands-on experience with MLflow or equivalent tools for experiment tracking, model governance, and lifecycle management
Strong understanding of SDLC fundamentals and experience with GitHub or equivalent version control
Experience with distributed compute environments (e.g., Spark, Databricks) and cloud-native services
Basic proficiency with Bash or shell scripting for automation and environment setup
Ability to collaborate across multidisciplinary teams and communicate technical concepts to varied audiences
Ability to obtain and maintain a Public Trust clearance
US citizenship or Green Card holder status required; must have resided in the USA for 3 of the last 5 years
Experience with MLOps practices including CI/CD for ML, containerization, feature pipeline automation, and model deployment frameworks
Experience with Databricks E2 components (Unity Catalog, Feature Store, Delta Live Tables) and/or model serving and drift monitoring tools (e.g., Databricks Model Serving, Evidently)
Experience with LLM frameworks (e.g., LangChain, LlamaIndex, Hugging Face Transformers) and familiarity with model explainability libraries (e.g., SHAP, LIME)
Advanced Spark performance optimization experience and/or API development using Databricks REST APIs
Experience with healthcare analytics data (preferably Medicare or Medicaid) and familiarity with HIPAA or FedRAMP compliance constraints
Experience building data pipelines in a Snowflake or Databricks environment
Familiarity with orchestration tools (Airflow, Databricks Workflows)
Exposure to streaming data patterns using Spark Structured Streaming, Delta Live Tables, or Kafka
Familiarity with environment reproducibility tooling (Docker, conda) and scripting (Python, Bash) to support automation and CI/CD tasks
