Persona AI is developing and commercializing rugged, multi-purpose humanoid robots that perform real work. They are seeking a highly skilled Data Pipeline Engineer to architect the systems that turn raw, unstructured multimodal data into high-fidelity training assets for their robots.
Responsibilities:
- Architect highly efficient, scalable pipelines to ingest, decode, and process thousands of hours of high-resolution egocentric video in sync with rich sensor streams (IMUs, force-torque sensors, tactile pads, and joint proprioception)
- Develop sophisticated post-processing algorithms to analyze force interactions and infer unobservable or missing states from raw data. This includes calibrating and cleaning direct force-aware data collections, estimating contact forces from object deformation, tracking occluded objects during complex manipulation, and applying inverse kinematics to fill in missing joint trajectories
- Develop algorithms to translate 3D human hand tracking, wrist motion, and pose estimation into the specific 6DoF and joint-space coordinates of our humanoid’s end-effectors, using sensor fusion to maximize precision
- Implement robust data augmentation strategies (spatial transformations, temporal scaling, synthetic viewpoints, and sensor noise injection) to expand expert trajectories and improve the robustness of our learning models
- Work closely with the Hardware Teleoperation Team (UMI & Console operators) to precisely align human-robot play data (haptics, force profiles, video, audio, telemetry) with large-scale pre-training datasets