
Real World Data in India: The Landscape, the Gaps, and How Pharma Can Generate Credible RWD

A practical map of India’s fragmented real world data ecosystem — and the pathways pharma teams use to generate evidence that holds up in payer and HTA review.

Executive Summary (TL;DR)

• The Reality: Real World Data in India is fragmented across hospital EMRs, private insurance claims, government scheme data, and disease registries — none of which were designed for pharmaceutical evidence generation.

• The Strategy: Credible RWD generation in India requires a multi-source approach: hospital partnerships for clinical depth, claims data for population breadth, and disease registries for longitudinal follow-up.

• The Imperative: Brands that build India-specific RWD partnerships now will have payer-grade evidence ready when PMJAY, state schemes, and the HTA cell require it; brands that wait will face submission delays and tier downgrades.

OneAlphaMed Research Desk

Pharma & Life Sciences Practice • Brand Strategy Intelligence

Updated: May 12, 2026


Fig 1. India’s RWD ecosystem is fragmented across data types and ownership models, requiring a multi-source approach.

A pharma brand prepares its India submission to PMJAY’s reimbursement panel. The HTA cell asks for real-world performance data. The team has international RWD from US claims databases and EU registries. The committee responds with a single observation: this data does not reflect Indian treatment patterns, Indian patient profiles, or Indian healthcare delivery. The submission stalls.

This scenario is now routine. Real World Data in India is at a tipping point: the data ecosystem is fragmented but evolving quickly, and payer expectations have outpaced most pharma RWD strategies. The brands that win Indian formulary access over the next three years will be those that form India-specific RWD partnerships now, while the ecosystem is still consolidating. This guide charts which data sources are available, which gaps need to be bridged, and what practical options pharma teams have for producing credible evidence in this environment.

1. The State of India’s RWD Ecosystem

India’s Real World Data ecosystem comprises four broad data categories, each with distinct ownership, accessibility, and analytic value. Hospital electronic medical records sit primarily within private hospital chains and tertiary government institutions, capturing detailed clinical data on the patients those facilities treat. Insurance claims data spans private health insurers, government schemes such as PMJAY and CGHS, and corporate group health plans – each with different coverage populations and data depth. Pharmacy and prescription data is generated through retail pharmacies, hospital pharmacy systems, and increasingly through digital pharmacy platforms. Disease registries, although limited to select therapeutic areas, often provide the most valuable long-term patient follow-up data.

What truly defines India’s RWD landscape, however, is its fragmentation. Unlike markets such as the US, no single database captures a significant portion of the country’s 1.4 billion population. Linking data across systems (for example, connecting hospital records with insurance claims and prescription histories) remains both operationally challenging and legally sensitive under the Digital Personal Data Protection Act, 2023. As a result, organizations expecting highly integrated, centralized healthcare datasets often find that India operates very differently.

The opportunity is the consolidation underway. Ayushman Bharat Digital Mission has created the unique health identification framework (ABHA), private hospital chains have invested heavily in EMR systems over the past five years, and India’s HTA cell has signalled clear acceptance of multi-source RWD evidence in its submissions framework.

Key Insight

“India's RWD ecosystem covers an estimated 25–35% of the patient population through some structured data source — but no single source covers more than 10% on its own, making multi-source partnerships the only credible path to representative evidence.”
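The arithmetic behind this insight can be sketched briefly. Taking the figures quoted above (a ceiling of about 10% per source and 25–35% combined coverage) and assuming, optimistically, that sources do not overlap at all, a ceiling division gives a lower bound on the number of sources involved. The function name and the zero-overlap assumption are illustrative only and are not drawn from any published methodology.

```python
import math


def min_sources_needed(target_coverage_pct: float, max_single_source_pct: float) -> int:
    """Lower bound on the number of sources needed to reach a coverage target.

    Assumes sources share no patients, which is optimistic: real sources
    overlap, so the number actually required is usually higher.
    """
    if max_single_source_pct <= 0:
        raise ValueError("max_single_source_pct must be positive")
    return math.ceil(target_coverage_pct / max_single_source_pct)


# Figures quoted in the article: 25-35% combined coverage, at most 10% per source.
low = min_sources_needed(25, 10)
high = min_sources_needed(35, 10)
print(f"At least {low} to {high} non-overlapping sources are needed")
```

Because real sources do share patients, this is a floor rather than an estimate of how many partnerships a given programme will need.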

2. Hospital EMR Data: Available and Missing

Private hospital chains anchor India’s clinical RWD landscape. Apollo, Manipal, Fortis, Max, Medanta, and a growing tier of regional chains have built EMR systems that capture diagnosis codes, procedure data, prescriptions, lab results, and clinical notes. The depth of these data sources rivals US academic medical centre EMRs in many therapeutic areas — particularly oncology, cardiovascular disease, and complex surgical interventions.

The limitations are equally significant. Hospital EMR data captures patients during episodes of inpatient or specialist outpatient care; it does not capture the longitudinal primary care record that Western EMR networks provide. Coding consistency varies across hospital systems, with ICD-10 adoption uneven and clinical documentation following hospital-specific conventions. Patient follow-up after discharge is limited, particularly when patients transfer care to other facilities or to community physicians outside the chain.

Tier-2 and tier-3 city representation is the larger gap. Indian RWD drawn primarily from tier-1 metropolitan private hospitals reflects an affluent, urban patient demographic that differs systematically from the broader treated population. Brands generating Indian RWD for PMJAY submissions or state scheme dossiers must address this representativeness question directly, through partnerships with tier-2 and tier-3 hospital systems, government tertiary institutions, or regional medical colleges that capture the populations payers actually fund.
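One way to make the representativeness question concrete is to compare the tier mix of a hospital cohort with the tier mix of the population a payer funds, and to compute post-stratification weights from the two. The sketch below is a minimal illustration: the stratum labels, the cohort counts, and the reference shares are all hypothetical numbers chosen for the example, not figures from any Indian dataset.

```python
def post_stratification_weights(sample_counts, population_shares):
    """Return a weight per stratum equal to population share / sample share.

    A weight above 1 means the stratum is under-represented in the sample;
    a weight below 1 means it is over-represented.
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample is empty")
    weights = {}
    for stratum, pop_share in population_shares.items():
        n = sample_counts.get(stratum, 0)
        if n == 0:
            # Reweighting cannot repair a stratum with no patients at all;
            # that gap needs additional data partnerships.
            raise ValueError(f"no sample patients in stratum {stratum!r}")
        weights[stratum] = pop_share / (n / total)
    return weights


# Hypothetical cohort dominated by tier-1 hospitals.
sample_counts = {"tier1": 700, "tier2": 200, "tier3": 100}
# Hypothetical reference mix for the funded population.
population_shares = {"tier1": 0.30, "tier2": 0.35, "tier3": 0.35}

weights = post_stratification_weights(sample_counts, population_shares)
for stratum, weight in weights.items():
    print(f"{stratum}: weight {weight:.2f}")
```

Large weights on thinly sampled strata make estimates unstable, which is one reason the partnerships described above matter more than statistical adjustment after the fact.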

3. Claims Data: India’s Underutilised Asset

Insurance claims data is the most underutilised RWD source in India for pharmaceutical evidence generation. Private health insurers — Star Health, HDFC ERGO, Niva Bupa, ICICI Lombard, Bajaj Allianz, and the consolidating public insurers — collectively cover a meaningful share of India’s middle and upper-middle-class population. PMJAY adds large-scale claims data for India’s lower-income population. Corporate group health plans cover formal-sector employees through TPAs that maintain detailed claims records.

Each source has distinct analytic value. Private insurer data captures hospitalisation, treatment pattern, and outcome variables for working-age and older Indian populations. PMJAY claims data covers procedures and hospitalisations for an estimated 500 million eligible beneficiaries — a scale comparable to major US Medicaid datasets. CGHS data covers central government employees and pensioners. State scheme data, available through agreements with state health departments, captures regional variation that national datasets miss.

Operational challenges are real but tractable. Claims coding follows India-specific conventions that require local analytic expertise. De-identification and data governance frameworks under DPDPA must be built into every claims data partnership. Linkage to clinical outcomes typically requires hospital partnership augmentation. Sponsors that build these partnerships early — through TPA agreements, insurer data-sharing arrangements, or PMJAY’s evolving research access framework — generate the comparator and treatment-pattern evidence Indian payers expect.
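The linkage and de-identification steps mentioned above can be illustrated with a keyed-hash join. This is a minimal sketch under strong assumptions: both sources are assumed to hold the same patient identifier, the field names (`patient_id`, `claim_code`, `outcome`) are hypothetical, and keyed hashing is shown only as a common pseudonymisation technique. Whether it meets DPDPA obligations for a given partnership is a legal and governance question that the code does not answer.

```python
import hashlib
import hmac


def pseudonymous_key(patient_id: str, secret: bytes) -> str:
    """Derive a stable pseudonymous key from an identifier using HMAC-SHA256."""
    normalised = patient_id.strip().upper()
    return hmac.new(secret, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(claims, hospital_records, secret):
    """Join claims to hospital outcomes on the pseudonymous key only.

    The raw identifier is never copied into the linked output.
    """
    outcomes_by_key = {
        pseudonymous_key(rec["patient_id"], secret): rec["outcome"]
        for rec in hospital_records
    }
    linked = []
    for claim in claims:
        key = pseudonymous_key(claim["patient_id"], secret)
        if key in outcomes_by_key:
            linked.append({
                "link_key": key,
                "claim_code": claim["claim_code"],
                "outcome": outcomes_by_key[key],
            })
    return linked


# Hypothetical records; identifiers differ only in case and whitespace.
claims = [
    {"patient_id": " ab123 ", "claim_code": "X1"},
    {"patient_id": "zz9", "claim_code": "Y2"},
]
hospital_records = [{"patient_id": "AB123", "outcome": "recovered"}]

for row in link_records(claims, hospital_records, secret=b"demo-secret"):
    print(row["claim_code"], row["outcome"])
```

In practice the secret would be held by a trusted party rather than by the analyst, and many Indian sources lack a shared identifier, so probabilistic matching or hospital partnership augmentation may be needed instead.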

→ Build a multi-source India RWD partnership before your next payer submission. → Engage OneAlphaMed Medical Affairs

4. Patient Registries and Disease-Specific Cohorts

India’s disease registry landscape has grown meaningfully in recent years, particularly in therapeutic areas where tracking patients over the long term matters most. Some of the most useful real-world evidence available today comes from these registries, because they capture more than a snapshot of a patient’s condition at a single point in time.

These cohorts show how diseases unfold, which treatments hold up over time, and how patients actually fare outside the controlled setting of a clinical trial. The Indian Council of Medical Research (ICMR) is central to this effort and is responsible for national initiatives including the National Cancer Registry Programme and stroke-related registries. In parallel, several specialty bodies have built their own dedicated databases for conditions such as diabetes, kidney disease, and heart disorders. Together, these efforts are building a more coherent picture of disease patterns across the country.

Pharma companies are leaning on registries more heavily too, especially in oncology, rare diseases, and advanced cardiovascular care. Many now set up a prospective registry soon after a therapy launches, and the logic is straightforward: understand how the drug actually performs once it is in routine use. Does the patient stay on treatment? Does it work as well as the trial suggested? What do outcomes look like a year or two in? The answers matter because the same data often feeds safety monitoring, health technology assessments, and reimbursement discussions.

What sets registries apart from other data sources is the continuity they offer. A hospital record captures one visit. A claims database logs one transaction. A registry, on the other hand, follows a patient over time and that longitudinal view is what allows researchers and healthcare decision-makers to understand how diseases actually progress, and how people respond to therapy in the real world rather than in ideal conditions.

That said, running a registry is a serious undertaking. Keeping data collection consistent over years, coordinating with hospitals across multiple sites, managing ethics approvals and patient consent — it all adds up. Maintaining data quality doesn’t get easier with time, either. This is why companies that commit to registries tend to treat them as long-term infrastructure – not a quick one-off project, but a sustained investment in building evidence that holds up.

5. How Pharma Can Generate Credible RWD in India

Generating payer-grade real world data in India requires a deliberate multi-source strategy rather than reliance on any single data partner. Five principles guide that strategy.

Start with the evidence question, not the data source. The brand’s HTA submission, formulary defence, or label extension question dictates which data sources matter. A comparative effectiveness question against a specific competitor demands claims data with that competitor’s prescribing volume; a real-world adherence question demands prescription-fill data linked to claims; a long-term outcomes question demands registry-level follow-up.

Build cross-source partnerships during Phase 3, not after launch. Hospital EMR partnerships, insurer claims agreements, registry collaborations, and ICMR research partnerships each take six to eighteen months to negotiate and stand up. Brands waiting until post-launch lose the launch-window evidence opportunity.

Address representativeness explicitly. Indian payers know that tier-1 private hospital data over-represents affluent urban populations. Generating evidence that includes tier-2 city hospitals, government tertiary institutions, and PMJAY beneficiaries strengthens credibility substantially.

Invest in India-specific analytic capability. Indian claims coding, EMR conventions, and registry structures differ from Western data sources. Sponsors that hire or partner with India-resident health-data analysts generate evidence that payer reviewers find credible; sponsors that rely on global teams produce analyses with subtle errors that reviewers consistently flag.

Plan data governance under DPDPA from day one. India’s data protection framework now governs every patient-level data partnership. Consent mechanisms, de-identification standards, and data residency requirements must be built into RWD partnership architecture rather than retrofitted at submission.
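The first principle above, starting from the evidence question, can be written down as a simple lookup. The sketch below encodes only the three pairings named in that principle; the dictionary keys, source labels, and function name are hypothetical labels chosen for illustration, and a real planning exercise would cover more question types.

```python
# Pairings taken from the first principle above; labels are illustrative.
EVIDENCE_QUESTION_TO_SOURCES = {
    "comparative_effectiveness": ["claims"],
    "adherence": ["prescription_fill", "claims"],
    "long_term_outcomes": ["registry"],
}


def sources_for(questions):
    """Return the de-duplicated data sources needed for a set of questions,
    preserving the order in which each source is first required."""
    needed = []
    for question in questions:
        if question not in EVIDENCE_QUESTION_TO_SOURCES:
            raise KeyError(f"no source mapping defined for {question!r}")
        for source in EVIDENCE_QUESTION_TO_SOURCES[question]:
            if source not in needed:
                needed.append(source)
    return needed


# A brand preparing both an adherence analysis and a long-term outcomes analysis.
print(sources_for(["adherence", "long_term_outcomes"]))
```

Listing the required sources this way makes it easier to see which partnerships, each with its six-to-eighteen-month lead time, need to be opened first.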

The Strategic Imperative

Real World Data in India is moving from a fragmented research curiosity to a structured evidence pipeline that payers, HTA bodies, and regulatory authorities increasingly expect. The brands that build India-specific RWD capability now will own the evidence narrative when PMJAY, state schemes, and HTA cell submissions ramp up over the next three years. The brands that defer this investment will arrive at submission with the wrong data — global RWD that does not reflect Indian populations, or single-source data that does not satisfy multi-source representativeness expectations.

The economics increasingly favour early investment. Hospital EMR partnerships, claims data agreements, and registry collaborations cost a fraction of the formulary access value they protect. The strategic question is no longer whether to generate India RWD — it is whether the evidence will be ready when the next payer committee meets.

OneAlphaMed helps Medical Affairs and Market Access teams build multi-source RWD partnerships across India’s hospital, claims, and registry ecosystem. Explore our Medical Affairs practice →

Frequently Asked Questions

What is Real World Data, and where does it come from in India?

RWD is essentially patient information generated through routine healthcare - not clinical trials, but everyday interactions with the system. In India, it comes from several places: hospital EMRs from private networks and government institutions, insurance claims from private insurers and public schemes like Ayushman Bharat PMJAY, pharmacy and prescription data from retail and digital platforms, and disease registries run by organisations like ICMR, medical societies, and pharma companies.

Each source captures something different. Hospital records offer clinical depth, claims data reflects treatment patterns and access, and registries are particularly useful for tracking outcomes over time. No single source gives you the full picture - which is part of what makes working with RWD in India genuinely complex.

Why is Real World Data in India so fragmented?

India generates an abundance of healthcare data. The real problem is that this data is scattered across systems that do not interconnect: hospital records, insurance claims, pharmacy data, and registries all sit in separate environments and rarely talk to each other. In other words, the data is vast but fragmented.

In the US or Europe, large integrated databases can cover a significant share of the population in one place. India doesn't have that yet. Linking datasets here comes with both operational and regulatory challenges, so most organisations end up piecing together insights from multiple smaller sources - which works, but demands considerably more effort and care.

How can pharma companies generate credible RWD in India?

Credible Indian RWD generation uses a multi-source approach: hospital EMR partnerships for clinical depth, claims data for population breadth, registries for longitudinal follow-up, and explicit attention to tier-2 and tier-3 representativeness. Partnerships should begin during Phase 3, with India-resident analytic capability and DPDPA-compliant data governance built in from the start.

Why does PMJAY claims data matter for pharma evidence generation?

PMJAY claims data covers an estimated 500 million eligible beneficiaries, providing scale comparable to major US Medicaid datasets and capturing populations that private hospital data systematically under-represents. Access to PMJAY data is evolving through formal research frameworks, and brands that establish this access early secure evidence that government scheme payers find credible for formulary and tier-placement decisions.
