About this role
Role Overview
Work on Sophea AI across LLM pre-training, training from scratch, fine-tuning, evaluation, and continuous model improvement
Build production-grade ML pipelines for inference, serving, deployment, monitoring, and model lifecycle management
Optimize model performance in production for latency, throughput, and cost efficiency, using techniques such as quantization and efficient GPU utilization
Work with datasets, experiments, benchmarks, and evaluation methods to improve language model quality and domain-specific performance
Requirements
Strong hands-on experience with LLMs, including pre-training, training from scratch, fine-tuning, evaluation, and performance improvement
Strong ML engineering background, including Python, PyTorch, Docker, and production ML practices
Experience with model serving, inference optimization, quantization, GPU workloads, and frameworks such as vLLM, SGLang, NVIDIA Triton, TensorRT, TGI, or similar tools
Ability to build production-grade ML systems, not only research prototypes, scripts, basic RAG applications, or high-level AI integrations
Nice to have
Experience with ASR systems, speech models, or speech-to-text pipelines
Experience working with non-English language models, multilingual models, or low-resource language adaptation
Experience with MLOps infrastructure, experiment tracking, model serving pipelines, and GPU workload management
Contributions to open-source ML projects or published research in AI/ML
Tech Stack
Docker
Python
PyTorch
Benefits
Compensation: competitive package aligned with talent benchmarks
Impact: hands-on role working on Sophea AI, one of the most ambitious Greek-focused AI products in the market
Work format: remote work option, with relocation support available for candidates open to working from our Athens office
AI-native environment: real challenges across LLMs, training, fine-tuning, inference optimization, GPU workloads, and production AI systems
NVIDIA ecosystem: access to related conferences, certifications, internal knowledge sharing, and advanced AI infrastructure through Kiefer’s strategic collaboration
Culture: engineering-first, high autonomy, low bureaucracy, and space to build meaningful AI products