Design the methodology to evaluate the fitness-for-purpose of real-world data (RWD) sources for insights or evidence generation
Lead and execute feasibility assessments for RWD sources (electronic health records, administrative claims, patient registries, wearable/digital health data) to determine suitability for specific research/business objectives
Develop and apply structured data assessment frameworks to evaluate data quality dimensions, including accuracy, completeness, validity, timeliness, longitudinal consistency, and integrity
Assess the availability and representativeness of patient populations within RWD sources available at Sanofi for both internal decision-making and regulatory-grade evidence generation
Evaluate the feasibility of extracting structured and unstructured data elements (e.g., clinical scores, patient-reported outcomes) from EHR systems, including NLP-based extraction from clinical notes
Document assessment outcomes in standardized feasibility reports and communicate findings clearly to cross-functional stakeholders
Identify and articulate limitations of RWD sources, such as proxy endpoint constraints and population coverage gaps
Design methodologically sound recommendations and minimize misuse of RWD that could lead to unreliable insights or evidence generation
Ensure appropriate use of ICD codes, procedure codes, and other medical coding standards (sourced from peer-reviewed references such as PubMed, Embase, and Orphanet) for patient identification, healthcare provider segmentation, clinical site identification, and phenotyping
Apply advanced epidemiological and biostatistical methods including propensity score methods, time-to-event analyses, sensitivity analyses, and bias assessment
Provide methodological input on the use of clinical score proxies and surrogate endpoints in RWD contexts, clearly delineating their applicability for internal versus regulatory/publication use
Provide methodological advice to ensure that deliverables from RWD Foundation, RWD Science, and RWD Products are based on medical evidence and guidelines and are clinically and contextually relevant
Work closely with analysts and data scientists to ensure methodological recommendations are realistic and implementable
Partner with R&D, Business units (Vaccines, General Medicine, and Specialty Care), and Digital teams on data identification and appropriate use of RWD for insights and evidence generation across the drug lifecycle
Serve as the methodological point of contact for fit-for-purpose data assessment inquiries from internal stakeholders
Collaborate with RWD Foundation, RWD Product Owners, and RWD Data Sciences to ensure RWD are used appropriately to inform reliable decision-making and to provide knowledge transfer on data domain expertise
Manage external data vendors and technology partners (e.g., EHR, claims, registries) to understand data limitations and to verify methodological recommendations when required
Requirements
Advanced degree (Master's or PhD) in Epidemiology, Biostatistics, Health Informatics, Health Economics, Pharmacoepidemiology, or a closely related quantitative discipline
Minimum of 4-5 years (Master's degree holders) or 2-4 years (doctoral degree holders) of relevant experience in real-world data, commercial analytics, real-world evidence, health outcomes research, fit-for-purpose feasibility assessment, data quality assessment, or a related field within the pharmaceutical, biotech, or health technology industry
Experience in predictive modeling using RWD to identify at-risk patient populations, with a publication record in peer-reviewed journals
Experience in patient and healthcare provider segmentation to inform Medical and Commercial strategy
Demonstrated expertise in epidemiological study design and statistical methods such as propensity score matching, descriptive statistics, regression analysis, and predictive modeling
Strong proficiency in statistical programming languages: SQL, Python, R, and/or SAS
Solid working knowledge of Snowflake for database querying and data extraction
Familiarity with medical coding systems (ICD-10, CPT, SNOMED CT, LOINC, RxNorm) and experience with or knowledge of the OHDSI OMOP CDM standardized data model for healthcare data
Understanding of US EHR, claims, disease registry, and public health surveillance data, as well as the US healthcare billing system
Knowledge of automation tools such as Power Automate and Power Apps (an asset, not required)