Artificial intelligence finds disease-related genes

13 March 2020, by Horst Buchwald

Stockholm, 13.3.2020

An artificial neural network can reveal patterns in huge amounts of gene expression data and discover groups of disease-related genes. This is shown in a new study led by researchers at Linköping University and published in Nature Communications. The scientists hope that the method can eventually be applied in precision medicine and individualised treatment.
When you use social media, the platform often suggests people you might want to add as friends. The suggestion is based on the fact that you and the other person have contacts in common, which indicates that you may know each other. Similarly, scientists create maps of biological networks based on how different proteins or genes interact.
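The friend-suggestion analogy can be sketched in a few lines of Python. This is only an illustration of the shared-contacts idea, not the method used in the study: the function name `shared_neighbour_scores` and the node names G1 to G5 are hypothetical placeholders, not real genes or proteins.

```python
from itertools import combinations

def shared_neighbour_scores(edges):
    """Count shared neighbours for every pair of nodes that is not yet directly linked."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    scores = {}
    for a, b in combinations(sorted(neighbours), 2):
        if b in neighbours[a]:
            continue  # already linked, nothing to suggest
        shared = neighbours[a] & neighbours[b]
        if shared:
            scores[(a, b)] = len(shared)
    return scores

# Hypothetical toy interaction list (placeholder names, not real data).
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G4"), ("G3", "G4"), ("G4", "G5")]
scores = shared_neighbour_scores(toy_edges)

# Print candidate links, strongest first.
for pair, count in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, count)
```

Pairs with more shared neighbours are the stronger candidates for a link, in the same way that two people with many common contacts are more likely to know each other.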
The researchers behind a new study have used artificial intelligence, AI, to investigate whether it is possible to discover biological networks through deep learning, in which entities known as "artificial neural networks" are trained on experimental data. Since artificial neural networks are excellent at learning how to find patterns in enormous amounts of complex data, they are used in applications such as image recognition, but so far they have rarely been used in biological research.
"For the first time, we have used deep learning to find disease-related genes. This is a very powerful method in the analysis of large amounts of biological information, or 'big data'," says Sanjiv Dwivedi, postdoctoral researcher at the Department of Physics, Chemistry and Biology (IFM) at Linköping University.
The scientists used a large database with information on the expression patterns of 20,000 genes in a large number of people. The information was "unsorted" in the sense that the researchers did not tell the artificial neural network which gene expression patterns came from people with diseases and which came from healthy people. The AI model was then trained to find patterns of gene expression.
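To make "training without labels" concrete, here is a minimal sketch in plain Python. The article does not spell out the model's architecture, so the choice of a small linear autoencoder (a network trained to reconstruct its own input through a narrow hidden layer) is an assumption made for illustration. The toy data, layer sizes, learning rate and the names `train`, `enc` and `dec` are likewise hypothetical.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def mean_loss(data, enc, dec):
    """Average squared reconstruction error over all samples."""
    total = 0.0
    for x in data:
        x_hat = matvec(dec, matvec(enc, x))
        total += 0.5 * sum((p - q) ** 2 for p, q in zip(x_hat, x))
    return total / len(data)

def train(data, n_hidden, epochs=100, lr=0.05, seed=0):
    """Train a linear autoencoder by stochastic gradient descent. No labels are used."""
    rng = random.Random(seed)
    n_in = len(data[0])
    enc = [[rng.uniform(-0.3, 0.3) for _ in range(n_in)] for _ in range(n_hidden)]
    dec = [[rng.uniform(-0.3, 0.3) for _ in range(n_hidden)] for _ in range(n_in)]
    initial = mean_loss(data, enc, dec)
    for _ in range(epochs):
        for x in data:
            h = matvec(enc, x)                                  # compressed code
            err = [p - q for p, q in zip(matvec(dec, h), x)]    # reconstruction error
            # Error propagated back to the hidden layer (uses dec before its update).
            grad_h = [sum(dec[i][j] * err[i] for i in range(n_in))
                      for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    dec[i][j] -= lr * err[i] * h[j]
            for j in range(n_hidden):
                for k in range(n_in):
                    enc[j][k] -= lr * grad_h[j] * x[k]
    return enc, dec, initial, mean_loss(data, enc, dec)

# Toy "expression" data: four features driven by two hidden factors (made up).
rng = random.Random(1)
data = []
for _ in range(30):
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data.append([a, a, b, b])

enc, dec, loss_before, loss_after = train(data, n_hidden=2)
print(loss_before, loss_after)
```

The only training signal is how well each sample is reconstructed. Nothing tells the model which samples are "healthy" or "diseased", which is the sense in which the data in the study were unsorted.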
One of the challenges of machine learning is that it is not possible to see exactly how an artificial neural network solves a task. AI is sometimes described as a "black box": we only see the information we put in the box and the result it produces. We cannot see the steps in between.
Artificial neural networks consist of several layers in which information is processed mathematically. Data enters through the input layer, and the output layer delivers the result of the processing performed by the system. Between these two layers there are several hidden layers in which calculations are performed.
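The layered structure can be illustrated with a short forward pass in plain Python. This is a sketch under stated assumptions: the layer sizes, the random weights and the tanh activation are chosen for illustration only and are not taken from the study.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Pass x through every layer and return the activations of each layer, input first."""
    activations = [x]
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        activations.append(x)
    return activations

rng = random.Random(0)
sizes = [6, 4, 3, 2]  # input layer, two hidden layers, output layer (illustrative)
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

acts = forward([0.1, 0.4, -0.2, 0.9, 0.0, -0.7], layers)
for depth, activation in enumerate(acts):
    print(depth, activation)
```

Normally only the first and last entries of `acts` are looked at. Keeping the intermediate entries as well is what makes it possible to inspect what the hidden layers have computed.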
When the scientists trained the artificial neural network, they wondered whether it would be possible to lift the lid of the black box, so to speak, and understand how it works. Is the design of the neural network similar to that of the known biological networks?
"When we analysed our neural network, it turned out that the first hidden layer represented interactions between different proteins to a large extent. Deeper in the model, on the third layer, we instead found groups of different cell types. It is extremely interesting that this type of biologically relevant grouping is generated automatically, given that our network was trained on unclassified gene expression data," says Mika Gustafsson, senior lecturer at the IFM, who led the study.
The scientists then investigated whether their model of gene expression could be used to determine which gene expression patterns are associated with a disease and which are normal. They confirmed that the model finds relevant patterns that correspond well with the biological mechanisms in the body. Since the model was trained with unclassified data, it is possible that the artificial neural network has found completely new patterns. The researchers now want to investigate whether such previously unknown patterns are relevant from a biological point of view.
"We believe that the key to progress in this field lies in understanding the neural network. This may teach us new things about biological relationships, for example about diseases in which many factors interact. And we believe that our method provides models that are easier to generalise and that can be used for many different types of biological information," says Mika Gustafsson.

Mika Gustafsson hopes that the close cooperation with medical researchers will enable him to apply the method developed in the study to precision medicine. For example, it might be possible to determine which patient groups should receive a certain type of drug or identify the patients most affected.
The study was financially supported by the Swedish Foundation for Strategic Research (SSF) and the Swedish Research Council.
