More diverse biobanks are reshaping precision medicine — and correcting a genetics problem hiding in plain sight
Precision medicine is often described as the future of healthcare: predict disease earlier, diagnose more accurately and tailor care to the biology of each patient. On paper, it is an elegant idea. In practice, though, it has long carried a serious weakness. Much of the genetic data used to build this future has come disproportionately from people of European ancestry.
That imbalance is not a minor technical footnote. It has scientific and clinical consequences. When the genetic reference base does not reflect human diversity, important disease-linked variants can be missed, predictive tools may work less well outside the groups they were trained on, and precision medicine risks becoming most precise for the people already best represented in the data.
That is why growing diversity in biobanks matters so much. This is not only about collecting more samples. It is about building a version of genomic medicine that is more complete, more reliable and more equitable.
The supplied evidence supports that framing well. Larger and more diverse biobanks improve genetic discovery across diseases, help make risk prediction more accurate across ancestry groups, and reduce dependence on a model of precision medicine built mainly around European ancestry data.
What goes wrong when the genetic base is too narrow
Human genetics has advanced remarkably over the past two decades, but it has not advanced on an equally representative foundation. Many large genome-wide association studies have relied heavily on participants of European ancestry.
That has still produced important discoveries, but it has also built in a structural bias. Variants, effect sizes and genetic patterns identified in one population do not always transfer cleanly to another. The frequency of risk-associated variants can differ. Their interaction with other genetic or environmental factors can differ too. And predictive models built from one ancestry group can lose accuracy in another.
In practical terms, this means a genetic risk score may look highly useful in one population and much less reliable in another. It also means that underrepresented populations may be left behind twice: first because fewer discoveries are made in them, and second because the tools eventually applied to them were developed using someone else’s genetic baseline.
Diversity improves the science itself, not just its optics
One of the strongest messages from the supplied literature is that diversity in biobanks is not just an ethical correction. It is a scientific advantage.
The Global Biobank Meta-analysis Initiative showed that data from 23 biobanks across four continents can be harmonised to increase power for genome-wide association studies and improve risk prediction. That matters because it shows that population diversity does not have to fragment the science. Analysed well, it can strengthen it.
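To see why pooling biobanks can increase statistical power, here is a minimal sketch of a standard inverse-variance-weighted fixed-effect meta-analysis. It is an illustration under stated assumptions, not a description of the initiative's actual pipeline: the function name `fixed_effect_meta` and the effect sizes and standard errors are hypothetical.

```python
import math

def fixed_effect_meta(estimates):
    """Combine (effect, standard_error) pairs with inverse-variance weights."""
    # Each study is weighted by 1 / SE^2, so more precise studies count more.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    combined_effect = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    # The combined standard error shrinks as studies are added.
    combined_se = math.sqrt(1.0 / total_weight)
    return combined_effect, combined_se

# Hypothetical per-biobank estimates for one variant: (effect size, standard error)
biobank_estimates = [(0.10, 0.20), (0.10, 0.20)]
effect, se = fixed_effect_meta(biobank_estimates)
print(f"combined effect = {effect:.3f}, combined SE = {se:.3f}")
```

In this toy case the combined standard error is smaller than either biobank's on its own, which is the sense in which harmonised data can detect associations that a single cohort might miss.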
The broader the ancestry mix in a dataset, the greater the chance of identifying disease-associated variants, validating findings across populations and avoiding overly narrow conclusions drawn from one group alone.
Put simply, diversity is not just about inclusion. It improves the quality of the genetic map medicine is trying to build.
The practical value is already showing up in risk prediction
This is especially clear in polygenic risk scores, which combine the effects of many variants to estimate susceptibility to disease.
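In its simplest form, a polygenic risk score is a weighted sum: each variant's estimated effect size multiplied by the number of risk alleles a person carries. The sketch below illustrates that arithmetic only. The variant names, the weights and the function `polygenic_risk_score` are hypothetical, and real scores involve many more variants and additional modelling steps.

```python
def polygenic_risk_score(weights, dosages):
    """Sum of effect size x risk-allele count (0, 1 or 2) over all scored variants."""
    # Variants missing from a person's data are treated as zero risk alleles here,
    # which is a simplifying assumption for this illustration.
    return sum(weight * dosages.get(variant, 0) for variant, weight in weights.items())

# Hypothetical effect sizes, as would be estimated in a training population
weights = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": -0.125}

# Hypothetical individual: number of risk alleles carried at each variant
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(weights, person)
print(score)  # 0.5*2 + 0.25*1 + (-0.125)*0 = 1.25
```

Because the weights come from whichever population the score was trained on, the same arithmetic can be less informative for a person whose ancestry was poorly represented in that training data.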
One of the biggest problems with earlier risk scores was uneven performance. Many worked reasonably well in people of European ancestry and much less well in others. That was not a mysterious technical flaw. It was the predictable outcome of building tools from narrow data.
The evidence supplied suggests that multi-ancestry modelling is beginning to correct some of that problem. A coronary artery disease polygenic risk score built across five ancestries improved prediction across multiple ancestry groups compared with earlier versions.
That is an important proof of concept. It shows that diversity in biobanks does not simply broaden theoretical knowledge. It can produce tools that are more robust and more broadly useful.
Alzheimer’s genetics shows how large the gap still is
Broader genomic reviews, including in Alzheimer’s disease, make the same point from another angle. Underrepresentation of non-European populations remains a major limitation in current genetic knowledge.
This matters especially in complex diseases, where risk depends not on one mutation but on many variants interacting with age, environment, vascular health and other biological factors. If the reference population is too narrow, the whole model of risk becomes incomplete.
In diseases like Alzheimer’s, cardiovascular disease and diabetes, population diversity is not simply about adding more names to a database. It is about describing the true architecture of risk more accurately.
This is also an equity issue
There is a technical story here, but there is also an equity story.
If precision medicine is developed mainly from European ancestry datasets, it risks reproducing and even deepening existing health inequities. That can happen in several ways: less accurate risk scores in underrepresented groups, poorer interpretation of variants outside well-studied populations, and reduced clinical confidence in tools that are being applied beyond the populations in which they were built.
In the UK, this issue has particular relevance. A healthcare system serving people from diverse African, Asian, Middle Eastern, European and mixed ancestry backgrounds cannot assume that tools developed from narrow genomic reference populations will perform equally well across the board.
That makes diversity in biobanks more than a methodological preference. It becomes part of what determines whether genomic medicine works fairly in real healthcare systems.
More diversity does not mean the problem is solved
It is also important not to overstate the progress.
Even large collaborative biobanks are often still skewed toward European ancestry. Representation has improved, but it remains incomplete. That means newer tools may be better than older ones without yet being fully equitable.
There is also a broader translational limit. Better genetic association discovery does not automatically become clinical benefit. Identifying more variants linked to disease is an important step, but turning those findings into prevention, diagnosis or treatment decisions takes time, validation and implementation work.
Polygenic risk scores, too, can still perform unevenly across ancestry groups even after multi-ancestry improvements. So diversity makes genomic tools better, but it does not make them instantly flawless.
What about treatment response?
The headline also refers to treatment response, and that is where the evidence needs to be handled more carefully.
The supplied articles support disease risk prediction more directly than treatment response. That does not mean diversity is irrelevant to treatment; it almost certainly matters there as well. But based on the references provided, the strongest case is for improved discovery of disease-linked variants and better cross-population risk modelling, rather than direct proof of better treatment prediction.
To make stronger claims about treatment response, the evidence would ideally include pharmacogenomic studies or direct clinical outcome data. Without that, the fairest conclusion is that diverse biobanks strengthen the foundation on which more personalised treatment approaches may later be built, but the treatment-response side is less directly demonstrated here.
What changes from here
If this trend continues, the impact could be substantial. More diverse biobanks do not just add volume. They change what science is able to see. They help uncover variants that might otherwise remain hidden, test whether findings hold across populations, and build predictive tools that are less fragile outside a European ancestry context.
They also change how future research can be designed. Instead of building tools first and discovering later that they work poorly in large parts of the population, researchers can build the evidence base from the start with enough diversity to make the resulting science more generalisable.
That may be the most important shift of all: moving from a precision-medicine model shaped by a limited subset of humanity to one that more closely reflects the patients it aims to serve.
The most balanced takeaway
The available evidence strongly supports the idea that more diverse biobanks improve genetic discovery and make risk prediction more useful across populations. International collaborations have already shown that combining data across continents increases the power of genomic studies and helps produce tools that work more broadly than those built mainly from European ancestry data.
But caution still matters. Diversity gains remain incomplete, improved discovery does not automatically create immediate clinical benefit, and the case for treatment response is less directly demonstrated by the supplied research.
Even so, the overall direction is clear. If precision medicine is going to deserve the word “precision,” it cannot be precise for only some people. That is why making biobanks more diverse is no longer a side issue in genetics. It is becoming a basic requirement for better science and fairer medicine.