Transflo is seeking a Data Scientist / Data Analytics Engineer to design, build, and operationalize advanced analytics solutions that enhance its transportation and logistics operations. This role involves delivering predictive and operational analytics, managing data engineering tasks, and collaborating with various stakeholders to ensure the effectiveness of analytics products.
Responsibilities:
- Design, train, validate, and deploy predictive models (regression, classification, time-series forecasting, survival analysis, clustering, anomaly detection, and gradient-boosted / deep learning approaches as appropriate to the problem)
- Lead model selection, hyperparameter tuning, cross-validation, and rigorous performance evaluation using metrics aligned to business objectives (precision/recall trade-offs, MAPE, RMSE, lift, calibration, etc.)
- Develop data products in areas relevant to transportation, including operational metrics, fraud signals, pricing analytics, and industry trends
- Establish model monitoring, drift detection, retraining cadence, and explainability practices (SHAP, feature importance, partial dependence) to keep production models trustworthy and operationally self-sustaining
- Produce point-in-time analytics, KPI scorecards, and exception reporting to support daily operational decisions across dispatch, fleet, customer success, finance, and product teams
- Partner with business stakeholders to translate questions into well-scoped analyses; deliver clear, defensible insights with documented assumptions and data lineage
- Build and maintain reusable analytical datasets, semantic layers, and certified metrics so the organization works from a consistent source of truth
- Build and maintain data pipelines (batch and streaming) on AWS using services such as Redshift, S3, Glue, Lambda, Step Functions, Kinesis / MSK, EMR, Athena, and SageMaker
- Implement medallion (bronze / silver / gold) architecture patterns to progressively refine raw operational data into analytics-ready and ML-ready datasets
- Apply star-schema (dimensional) modeling and related techniques to build performant, business-friendly data models in Redshift and the broader warehouse layer
- Drive data selection, curation, profiling, and quality enforcement: define source-of-truth datasets, document lineage, and codify data contracts and validation tests
- Collaborate with data engineering and platform teams on CI/CD for data and ML assets, infrastructure-as-code (e.g., Terraform / CloudFormation), and cost-aware design on AWS
- Take customer-facing analytics features and products from idea to implementation — partnering with product management, design, and engineering to turn ambiguous business questions into shipped capabilities embedded in customer-facing applications
- Contribute to product discovery: customer interviews, opportunity sizing, prototyping, and rapid iteration on analytical concepts before committing to full build-out
- Own the analytical correctness of customer-facing metrics, models, and visualizations — including definitions, edge cases, performance under real-world data conditions, and how results are explained to non-technical end users
- Define and instrument success metrics for shipped analytics features (adoption, engagement, accuracy in production, customer outcomes) and drive iterative improvements post-launch
- Translate complex analytical results into clear narratives, visualizations, and recommendations for both technical and non-technical audiences, including executive leadership and customers
- Partner cross-functionally with product, engineering, operations, and commercial teams to embed analytics into workflows, applications, and customer-facing products
- Mentor analysts and engineers on statistical rigor, modeling best practices, and modern data architecture
Requirements:
- Bachelor's degree in Statistics, Mathematics, Supply Chain Management, or Computer Science. Master's degree preferred but not required
- Demonstrated professional experience in the transportation, trucking, freight, logistics, or broader supply chain industry, with working knowledge of the underlying operational data (loads, stops, shipments, ELD/telematics, TMS, dispatch, billing, etc.)
- Proven track record of taking customer-facing analytics products or features from idea through implementation and launch — including product discovery, scoping, model and metric design, partnering with product/engineering, and supporting the feature in production with real customers. Candidates should be prepared to walk through at least one concrete example end-to-end
- Strong applied experience building advanced analytical models end-to-end, including problem framing, data selection and curation, feature engineering, model training and validation, and deployment
- Hands-on experience with AWS PaaS / analytics tooling, including Amazon Redshift and other relevant services such as S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, and SageMaker
- Proficiency in SQL (advanced window functions, performance tuning on Redshift or comparable MPP warehouses) and at least one analytics-grade programming language — Python strongly preferred — with libraries such as pandas, scikit-learn, statsmodels, XGBoost/LightGBM, and PyTorch or TensorFlow as appropriate
- Experience designing and operating production data pipelines, with a clear understanding of orchestration, idempotency, observability, and data quality
- Solid grounding in statistical methods: hypothesis testing, experimental design, regression, time-series, and uncertainty quantification
Preferred Qualifications:
- Master's degree in Statistics, Mathematics, Operations Research, Supply Chain, Computer Science, or a closely related quantitative field
- Experience implementing medallion architecture (bronze / silver / gold) in a cloud data lakehouse or warehouse environment
- Experience designing star-schema dimensional models for analytics consumption
- Experience with streaming and event-driven data (Kinesis, Kafka/MSK) for near-real-time analytics on transportation events
- Experience deploying and monitoring ML models in production using SageMaker, MLflow, or equivalent MLOps tooling
- Familiarity with BI / visualization tools (e.g., QuickSight, Power BI, Looker) and semantic layer / metrics layer concepts
- Exposure to optimization and operations research techniques (linear / mixed-integer programming, routing, network flow) applied to transportation problems
- Experience working with ELD/HOS data, telematics feeds, geospatial data, TMS / dispatch system data, or brokerage data, plus a general understanding of transportation back-office operations and business processes