PreDTIs: prediction of drug–target interactions based on multiple feature information using gradient boosting framework with data balancing and feature selection techniques

Mahmud, S M Hasan

Please use this identifier to cite or link to this item: http://dspace.aiub.edu:8080/jspui/handle/123456789/2402

Full metadata record

DC Field	Value	Language
dc.contributor.author	Mahmud, S M Hasan	-
dc.date.accessioned	2024-09-22T08:29:02Z	-
dc.date.available	2024-09-22T08:29:02Z	-
dc.date.issued	2021-09-22	-
dc.identifier.citation	S M Hasan Mahmud, Wenyu Chen, Yongsheng Liu, Md Abdul Awal, Kawsar Ahmed, Md Habibur Rahman, Mohammad Ali Moni, PreDTIs: prediction of drug–target interactions based on multiple feature information using gradient boosting framework with data balancing and feature selection techniques, Briefings in Bioinformatics, Volume 22, Issue 5, September 2021, bbab046, https://doi.org/10.1093/bib/bbab046	en_US
dc.identifier.issn	1477-4054	-
dc.identifier.uri	http://dspace.aiub.edu:8080/jspui/handle/123456789/2402	-
dc.description.abstract	Discovering drug–target (protein) interactions (DTIs) is of great significance for researching and developing novel drugs, with substantial benefits for the pharmaceutical industry and patients. However, identifying DTIs with wet-lab experimental methods is generally expensive and time-consuming. Various machine learning-based methods have therefore been developed for this purpose, but many unknown interactions remain to be discovered. Furthermore, data imbalance and high feature dimensionality are critical challenges in drug–target datasets; both can degrade classifier performance, and neither has been adequately addressed yet. This paper proposes a novel drug–target interaction prediction method called PreDTIs. First, feature vectors are extracted from the protein sequence using the pseudo-position-specific scoring matrix (PsePSSM), dipeptide composition (DC) and pseudo amino acid composition (PseAAC), and the drug is encoded with MACCS substructure fingerprints. In addition, we propose the FastUS algorithm to handle the class imbalance problem and develop the MoIFS algorithm to remove irrelevant and redundant features and obtain an optimal feature set. Finally, the balanced, optimal features are passed to the LightGBM classifier to identify DTIs, and 5-fold cross-validation (CV) is applied to evaluate the prediction ability of the proposed method. The prediction results indicate that the proposed model, PreDTIs, is significantly superior to other existing methods in predicting DTIs, and that the model could be used to discover new drugs for unknown disorders or infections, for example for coronavirus disease 2019 (COVID-19), using existing drug compounds and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) protein sequences.	en_US
dc.language.iso	en	en_US
dc.publisher	Oxford University Press	en_US
dc.relation.ispartofseries	22;	-
dc.subject	Bioinformatics	en_US
dc.title	PreDTIs: prediction of drug–target interactions based on multiple feature information using gradient boosting framework with data balancing and feature selection techniques	en_US
dc.type	Article	en_US
Appears in Collections:	Publications: Journals
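
The abstract names dipeptide composition (DC) as one of the protein sequence descriptors. As a rough illustration of what such a descriptor looks like, the sketch below computes the commonly used 400-dimensional DC vector (the relative frequency of each ordered pair of adjacent amino acids). It is not taken from the PreDTIs code: the function name `dipeptide_composition`, the handling of non-standard residues (pairs containing them are skipped) and the normalisation by the number of counted pairs are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered dipeptides, in a fixed order that defines the feature layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a protein sequence.

    Hypothetical helper for illustration only. Pairs that contain a
    non-standard residue (for example 'X') are skipped, and frequencies are
    normalised by the number of pairs actually counted.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    counted = 0
    # Slide a window of width 2 over the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            counted += 1
    if counted == 0:
        # Sequence too short or made only of non-standard residues.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / counted for d in DIPEPTIDES]


# Usage: one fixed-length feature vector per protein, whatever its length.
features = dipeptide_composition("MKVLAAGIVGLLLAQ")
```

Because the vector has a fixed length regardless of sequence length, it can be concatenated with other descriptors (such as the drug fingerprint bits) before class balancing, feature selection and classification.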

Files in This Item:

File	Description	Size	Format
Dspace.docx		4.66 MB	Microsoft Word XML
