Y-h. Taguchi, Professor at the Department of Physics, Chuo University in Japan, provides comments on drug repositioning without the gene expression of disease cells treated with various drugs
Finding new drugs to treat diseases is always tricky. For example, although we have an effective mRNA vaccine for COVID-19, no effective drugs to treat COVID-19 have yet been identified. Since the difficulty stems from the need for experimental evaluation, a computational method would be preferable. But how? We have one proposal for this at the Department of Physics, Chuo University in Japan.
In our previous Open Access Government article (1), we introduced how to use the gene expression of cells treated with several drugs for drug repositioning. The problem with the strategy described in that piece is that it can be used only for diseases for which such gene expression is available. In the study described in the previous article, we analyzed gene expression profiles of cancer cell lines. Thus, the only drug repositioning candidates we could identify were those for cancers.
This is a little bit problematic. If we would like to perform drug repositioning for other diseases, we need to prepare cell lines derived from cells of the target diseases. Since cancer cells are abnormal in some sense, it is not difficult to establish immortalized cancer cell lines. Thus, we can easily maintain cancer cell lines in a Petri dish, but this is not the case for other cells. In addition, it is often challenging to collect cells from target diseases, and it is time-consuming to treat them with various drugs.
Drug repositioning without the gene expression of disease cells
In this article, we would like to introduce another method (2) by which we can perform drug repositioning without the gene expression of disease cells treated with various drugs. To apply this method, we need two kinds of gene expression profiles: gene expression of disease cells and gene expression of cells treated with multiple drugs. The point is that the cells treated with various drugs do not have to be those of the target disease but can be any kind of cells. These two kinds of gene expression can then be integrated with tensor decomposition (3,4).
In the actual study, the gene expression of cells treated with various drugs was taken from a model animal, the rat, using the DrugMatrix database, which stores a vast number of gene expression profiles of multiple tissues treated with various drugs. Measurements were taken not just once but at several time points after the treatment. On the other hand, the gene expression of disease was typically taken from human cells. The only issue we must consider is that these two expression profiles must be taken from the same tissue. If we would like to perform drug repositioning for heart disease, the tissue from which gene expression is taken should be the heart, too. Then we multiply these two expression profiles to obtain tensors like:
(healthy control vs patients) x (various drugs) x (time points) x (genes)
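As a rough illustration of how such a tensor can be formed, the sketch below multiplies two expression tables gene by gene with NumPy. The array names, sizes, and random values are hypothetical stand-ins, not the DrugMatrix or patient data, and the sketch assumes both tables have already been matched to the same set of genes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
n_genes, n_drugs, n_times, n_samples = 50, 4, 3, 6

# Hypothetical stand-ins for real data; both are indexed by the same genes.
# drug_expr[g, d, t]: expression of gene g in tissue treated with drug d at time t
# disease_expr[g, s]: expression of gene g in sample s (healthy controls and patients)
drug_expr = rng.normal(size=(n_genes, n_drugs, n_times))
disease_expr = rng.normal(size=(n_genes, n_samples))

# Multiply the two profiles gene by gene:
# tensor[s, d, t, g] = disease_expr[g, s] * drug_expr[g, d, t]
tensor = np.einsum("gs,gdt->sdtg", disease_expr, drug_expr)

# Axes: (healthy control vs patients) x (various drugs) x (time points) x (genes)
print(tensor.shape)
```

The gene index is the only one shared by the two tables, which is why the drug-treated cells do not need to come from the disease itself.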
Since a tensor can have more indices than the two of a matrix, applying tensor decomposition to the above tensor gives us vectors that express “healthy control vs patients”, “various drugs”, “time points”, and “genes”. First, we try to find which vectors coincide with the distinction between healthy controls and patients and which vectors represent time dependence.
Then, after identifying these vectors, we can find which drug vectors and gene vectors are associated with these “healthy control vs patients” vectors and “time points” vectors. Finally, we identify the drugs and genes that contribute most to these drug and gene vectors, respectively.
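The selection steps just described can be sketched on toy data. The example below assumes a higher-order singular value decomposition (HOSVD), one common form of tensor decomposition, computed from plain SVDs of the unfolded tensor. The planted signal (five genes that differ between patients and controls and respond over time to one drug), the helper names, and the fixed top-5 cutoff are all illustrative assumptions, not the selection rule of the cited study.

```python
import numpy as np

def mode_vectors(tensor, mode):
    """Left singular vectors of the tensor unfolded along `mode` (HOSVD factor)."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u  # columns are the singular vectors of this mode

def best_match(u, pattern):
    """Index of the column of `u` that projects most strongly onto the centered pattern."""
    centered = pattern - pattern.mean()
    return int(np.argmax(np.abs(u.T @ centered)))

# Toy data (hypothetical): genes 0-4 differ between patients and controls
# and change over time only under drug 2.
rng = np.random.default_rng(1)
n_genes, n_drugs, n_times = 40, 4, 3
is_patient = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
times = np.arange(n_times, dtype=float)

disease_expr = 0.3 * rng.normal(size=(n_genes, is_patient.size))
drug_expr = 0.3 * rng.normal(size=(n_genes, n_drugs, n_times))
disease_expr[:5] += 6.0 * (is_patient - 0.5)
drug_expr[:5, 2, :] += 3.0 * (times - times.mean())

# Axes: samples x drugs x time points x genes
tensor = np.einsum("gs,gdt->sdtg", disease_expr, drug_expr)
u_sample, u_drug, u_time, u_gene = (mode_vectors(tensor, m) for m in range(4))

# Step 1: find the sample vector separating controls from patients
# and the time vector showing time dependence.
l1 = best_match(u_sample, is_patient)
l3 = best_match(u_time, times)

# Step 2: among drug and gene vectors, pick the pair most strongly
# associated with those two vectors (largest core-tensor entry).
core_slice = np.einsum(
    "sdtg,s,db,t,ge->be",
    tensor, u_sample[:, l1], u_drug, u_time[:, l3], u_gene,
    optimize=True,
)
l2, l4 = np.unravel_index(np.abs(core_slice).argmax(), core_slice.shape)

# Step 3: drugs and genes with the largest contributions to those vectors.
top_drug = int(np.argmax(np.abs(u_drug[:, l2])))
top_genes = np.argsort(-np.abs(u_gene[:, l4]))[:5]
print("candidate drug:", top_drug)
print("candidate genes:", sorted(top_genes.tolist()))
```

Because every step is a matrix or tensor operation, no labels are used to fit a model; the class labels and time stamps serve only to decide which of the resulting vectors to inspect.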
Using mathematical procedures for these processes
Although this process sounds very complicated, it can be easily performed with purely mathematical procedures. Moreover, we do not need any special or complex computer software since tensor decomposition is a standard method invented long ago.
We applied this procedure to six diseases: heart failure, post-traumatic stress disorder (PTSD), acute lymphocytic leukaemia (ALL), diabetes, renal carcinoma, and cirrhosis. Identified genes are compared with genes whose expression is known to be altered when drug-target genes are knocked out.
Using statistical tests, we found that the identified genes overlap significantly with genes whose expression is known to be altered when drug target genes are knocked out, for the five tested diseases other than ALL. Thus, our method is useful for performing drug repositioning.
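The article does not specify which statistical tests were used. As one plausible example of how such an overlap can be assessed, the sketch below computes a one-sided hypergeometric p-value (equivalent to a one-sided Fisher's exact test) using only the Python standard library. The function name and all gene counts are hypothetical.

```python
from math import comb

def overlap_p_value(n_universe, n_selected, n_reference, n_overlap):
    """Probability of seeing at least `n_overlap` shared genes by chance.

    n_universe:  total number of genes considered
    n_selected:  number of genes identified by the method
    n_reference: number of genes altered by knockout of drug-target genes
    n_overlap:   number of genes present in both sets
    """
    total = comb(n_universe, n_selected)
    upper = min(n_selected, n_reference)
    tail = sum(
        comb(n_reference, k) * comb(n_universe - n_reference, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 1000 genes in total, 50 selected,
# 100 in the reference set, 20 shared between the two.
p = overlap_p_value(1000, 50, 100, 20)
print(f"p = {p:.3g}")
```

With these made-up counts, about five shared genes would be expected by chance, so an overlap of 20 yields a very small p-value.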
A focus on cirrhosis
One might wonder if our method is only helpful for performing statistical analyses and not valuable for identifying a single drug to be tested for the actual disease. To rule out this possibility, we focused on one specific disease, cirrhosis, since no effective drugs were known for this disease. Our analyses identified two genes, CYPOR and HNF4A, and one drug, bezafibrate, as the most promising genes and drug for cirrhosis.
We evaluated the binding energy between the two corresponding proteins and bezafibrate and found it to be large enough. In fact, bezafibrate was once tested for cirrhosis, and its efficacy toward cirrhosis was reported, although it has never been released as an effective drug. Thus, we could identify an experimentally validated drug simply by in silico evaluation.
We could not experimentally test the drugs identified for the other five diseases, but they should be very promising since the one identified for cirrhosis was validated experimentally.
Drug repositioning approach
In addition, we recently released a computational package to perform this analysis, TDbasedUFEadv (5), in Bioconductor, a repository of applications used in bioinformatics. Thus, anyone interested in our method can now test it. This should be a starting point for drug repositioning with our method, and we expect that many new drugs will turn out to be effective against diseases through our drug repositioning approach.
References
- Y-h. Taguchi, The link between gene expression and machine learning, Open Access Government 38(1) 296-297 April (2023) https://doi.org/10.56367/OAG-038-10651
- Y-h. Taguchi, Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets. Sci Rep 7, 13733 (2017). https://doi.org/10.1038/s41598-017-13003-0
- Y-h. Taguchi, Unsupervised feature extraction applied to bioinformatics, Research Outreach, No. 115, pp. 154-157, (2020), https://doi.org/10.32907/ro-115-154157
- Y-h. Taguchi, Unsupervised Feature Extraction Applied to Bioinformatics: A PCA Based and TD Based Approach, Springer International, (2020).
- Y-h. Taguchi, Turki Turki, TDbasedUFE and TDbasedUFEadv: bioconductor packages to perform tensor decomposition based unsupervised feature extraction bioRxiv 2023.05.14.540687; doi: https://doi.org/10.1101/2023.05.14.540687
This work is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International.