Improving topic modelling for Prediction of Drug Indication and Side effects

Mrs.D. Mohanapriya, Dr.R. Beena

Abstract

Text mining is widely used in systems biology because it can reveal hidden relationships between drugs, genes, and diseases in large volumes of data. Improved Predict Drug Indications and Side Effects using Topic Modelling and Natural Language Processing (IPISTON) is a text mining technique for predicting drug phenotypes and side effects. In IPISTON, Latent Dirichlet Allocation (LDA) models the topics from the sentences in the collected data. A drug-topic probability matrix is constructed from the topics and the Gene Regulation Score (GRS), and it is given as input, along with a syntactic distance measure, to Conditional Random Field (CRF) and Bidirectional Long Short-Term Memory-CRF (BILSTM-CRF) classifiers to predict drug-phenotype and drug-side effect relationships. In this paper, Enhanced Topic Modelling-IPISTON (ETP-IPISTON) is proposed to improve topic modelling and thereby the prediction of drug-phenotype and drug-side effect associations. A logistic LDA is introduced for topic modelling; it can handle a wide variety of data modalities. Logistic LDA drops the generative portion of LDA while keeping the factorization of the conditional distribution over latent variables. It generates a gene vector and a latent vector for every gene, which are given as input to the cells of the BILSTM-CRF for topic modelling. Within the BILSTM-CRF, logistic LDA reduces the computational cost of extracting topics from a large corpus. A drug-topic probability matrix is constructed from the topics modelled by logistic LDA-BILSTM-CRF and the GRS, and it is used along with the syntactic distance measure in CRF, BILSTM-CRF, Naïve Bayes, Classification and Regression Tree (CART), and logistic regression classifiers to predict drug-phenotype and drug-side effect relationships.
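
The abstract does not state how the topics and the GRS are combined into the drug-topic probability matrix. The following is only a minimal sketch of one plausible reading: each sentence that mentions a drug contributes its topic distribution, scaled by a non-negative weight standing in for a GRS-derived value, and each drug's row is then normalised to sum to one. The function name `drug_topic_matrix`, the record layout, and the toy values are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def drug_topic_matrix(records):
    """Build a row-normalised drug-topic matrix (illustrative only).

    records: iterable of (drug, topic_probs, weight) tuples, where
      drug        is a drug identifier,
      topic_probs is the topic distribution of one sentence mentioning it,
      weight      is a non-negative stand-in for a GRS-derived weight
                  (hypothetical; the paper's formula is not given).
    Returns a dict mapping each drug to a list of topic probabilities.
    """
    sums = {}
    for drug, topic_probs, weight in records:
        row = sums.setdefault(drug, [0.0] * len(topic_probs))
        for k, p in enumerate(topic_probs):
            row[k] += weight * p

    matrix = {}
    for drug, row in sums.items():
        total = sum(row)
        if total > 0:
            matrix[drug] = [v / total for v in row]
        else:
            # All weights were zero: fall back to a uniform distribution.
            matrix[drug] = [1.0 / len(row)] * len(row)
    return matrix


# Toy input: two topics, two placeholder drugs.
toy_records = [
    ("drug_a", [0.8, 0.2], 1.0),
    ("drug_a", [0.2, 0.8], 3.0),
    ("drug_b", [0.5, 0.5], 2.0),
]
toy_matrix = drug_topic_matrix(toy_records)
```

In this sketch the more heavily weighted sentence pulls the row for `drug_a` toward the second topic. Rows of such a matrix could then serve as per-drug features for a downstream classifier, alongside whatever syntactic distance measure is in use.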

How to Cite
Mrs.D. Mohanapriya, Dr.R. Beena. (2021). Improving topic modelling for Prediction of Drug Indication and Side effects. Annals of the Romanian Society for Cell Biology, 11542–11558. Retrieved from https://annalsofrscb.ro/index.php/journal/article/view/3974