Population ageing poses a significant challenge to societies worldwide. A better understanding of the molecular changes that occur with ageing is needed in order to discover new treatments for age-associated diseases, which range from arthrosis to cardiovascular disease and dementia. A major obstacle is the difficulty of accurately describing age-associated changes in cells, tissues and organs, and of determining the efficacy of interventions tested in trials that target ageing-associated damage and dysfunction.
To address the need for biomarkers of ageing, we are developing machine learning algorithms that can find patterns in large and complex datasets, such as histological images of pathological samples from individuals with and without specific diseases. These patterns may allow us to determine the underlying molecular changes that occur with ageing and to identify how we can intervene in the ageing process.
Routinely described pathologies are an untapped resource in the effort to characterize human ageing and discover new mechanisms of ageing and biomarkers. Our research takes advantage of pathology reports written by skilled physicians describing the micro-anatomy of more than 30 million tissue samples dating back to the 1970s. The key challenge in harnessing the rich information found in clinical narrative text is the unstructured nature of natural language. By applying natural language processing methods to clinical text, structured parameters describing the tissue architecture can be extracted into a form amenable to computational analysis.
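As a rough illustration of what turning narrative text into structured parameters can look like, the following is a minimal rule-based sketch in Python. It is not the pipeline used in this research: the vocabulary lists, the `Finding` record and the `extract_findings` function are all hypothetical, and a realistic system would need a far larger terminology, proper negation handling and probably a trained language model.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, deliberately tiny vocabularies chosen for illustration only.
FINDINGS = ("fibrosis", "atrophy", "inflammation")
SEVERITIES = ("mild", "moderate", "severe")
NEGATIONS = ("no", "without", "absent")


@dataclass
class Finding:
    term: str                # the histological finding that was mentioned
    severity: Optional[str]  # severity qualifier, if one precedes the term
    negated: bool            # True if a negation word precedes the term


def extract_findings(report: str) -> List[Finding]:
    """Extract structured findings from free-text report sentences."""
    results = []
    # Split into sentences so that qualifiers do not leak across sentences.
    for sentence in re.split(r"[.;]\s*", report.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for i, token in enumerate(tokens):
            if token not in FINDINGS:
                continue
            # Look at up to three words before the finding for qualifiers.
            window = tokens[max(0, i - 3):i]
            severity = next(
                (w for w in reversed(window) if w in SEVERITIES), None
            )
            negated = any(w in NEGATIONS for w in window)
            results.append(Finding(token, severity, negated))
    return results


if __name__ == "__main__":
    text = "Moderate interstitial fibrosis. No inflammation. Mild tubular atrophy."
    for finding in extract_findings(text):
        print(finding)
```

Each report is reduced to a list of records with the same fields, which can then be tabulated and compared across individuals and age groups.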
This rich dataset enables us to characterise ageing at the micro-anatomical level in very large numbers of individuals. Furthermore, coupled with additional registry data such as pharmaceutical drug use, this novel dataset allows us to describe histological phenotypes in population cohorts of specific interest, leading to hypotheses that can be further validated in model organisms.