Computational Biomedicine (CBM, Online ISSN: 3107-3131) is a peer-reviewed, open access journal published quarterly and owned by Science Exploration Press. The journal covers a wide range of topics, including molecular medicine, simulation, modeling techniques, imaging methods, and information technology. Our mission is to encourage scientists to publish their experimental and theoretical findings in a detailed open-access format. We invite submissions across various article types, including Research Articles, Review Articles, Editorials, Case Reports, Letters to the Editor, Perspectives, and Commentaries.
Articles
Bias Correction using content adaptation for medical image translation
Aims: Medical image translation is widely used for data augmentation and cross-domain adaptation in clinical image analysis. However, the nature of medical imaging makes it challenging to collect high-quality samples for the training of translation models. Because medical images are costly and access to them is limited, distribution bias is commonly observed between the source and target samples, which ultimately causes the models to mismatch the target domain.
Methods: To promote medical image translation, a bias correction method, named content adaptation, has been proposed in this study to align the training samples in the data space. Based on the invariant medical topological structure, paired samples are constructed from weakly paired and unpaired data using content adaptation to correct the distribution bias and promote the image-to-image translation.
Results: Experiments on retinal fundus image translation and COVID-19 CT synthesis demonstrate that the proposed method effectively suppresses structural hallucination and improves both visual quality and quantitative performance. Consistent gains are observed across multiple backbone models under different supervision settings. The results suggest that explicit anatomical alignment provides an effective and model-agnostic way to mitigate distribution bias in medical image translation. By bridging weakly paired data with paired translation paradigms, the proposed approach enhances structural fidelity without requiring strong supervision.
Conclusion: This work presents a topology-guided content adaptation strategy that improves robustness and reduces hallucination in medical image translation. The proposed framework is general and can be readily integrated into existing translation models, offering a practical solution for data-scarce medical imaging scenarios.
Huiyan Lin, ... Heng Li
DOI: https://doi.org/10.70401/cbm.2026.0011 - February 14, 2026
A bi-directional LSTM architecture enhanced with channel attention for seizure prediction
Aims: Neural networks capable of capturing temporal dependencies in electroencephalogram (EEG) signals hold considerable potential for seizure prediction by modeling the progressive evolution of preictal EEG changes. However, redundant or less informative temporal features may obscure critical preictal patterns, limiting seizure prediction performance. To address this, we developed a neural architecture that effectively leverages informative temporal features to enhance seizure prediction capability.
Methods: We designed a bidirectional long short-term memory (BiLSTM) network enhanced with a channel attention mechanism, termed Attention-BiLSTM, which adaptively emphasizes informative temporal features while reducing information redundancy. We further analyze the model’s attention weights and feature distributions to provide interpretable insights into its decision-making process.
Results: Evaluation on the CHB-MIT scalp EEG dataset demonstrates that Attention-BiLSTM achieves significant performance improvements over the baseline BiLSTM, with an average accuracy of 94.77%, sensitivity of 94.58%, specificity of 94.97%, and an area under the curve of 98.38%. Furthermore, visualization results indicate that the proposed model progressively enhances feature discriminability and directs attention to the most relevant temporal features for seizure prediction.
Conclusion: The proposed Attention-BiLSTM achieves improved performance and interpretability, offering valuable insights to support future development of scalable and generalizable seizure prediction systems.
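The accuracy, sensitivity, and specificity quoted in the Results are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are typically computed, assuming preictal segments are treated as the positive class; the function name and the counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn count positive (assumed: preictal) segments predicted right/wrong,
    tn/fp count negative (assumed: interictal) segments predicted right/wrong.
    """
    total = tp + fp + tn + fn
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not the paper's data).
acc, sens, spec = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"accuracy={acc:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters for seizure prediction because a model can score high accuracy on imbalanced data while still missing most preictal segments.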
Haiqing Yu, ... Dong Ming
DOI: https://doi.org/10.70401/cbm.2026.0010 - February 05, 2026
PCMCI-SVM: A model identifying diagnostic biomarkers for autism spectrum disorder through causal network analysis
Aims: Accurately identifying diagnostic biomarkers for Autism Spectrum Disorder (ASD) is crucial for enabling early diagnosis and timely intervention. Brain causal networks, which outline causal relationships and information transmission pathways among different brain regions, hold significant diagnostic value for ASD. Identifying ASD-relevant causal relationships between brain regions is critical for ASD diagnosis and for elucidating its pathophysiology. This study proposes a new model called the Peter and Clark Momentary Conditional Independence (PCMCI)-Support Vector Machine (SVM) model, which can identify diagnostic biomarkers for ASD.
Methods: The brain blood oxygen level-dependent signals of 116 brain regions from 167 participants, consisting of 72 participants with ASD and 95 Typical Controls (TCs), were used to derive brain causal networks by detecting inter-regional causal relationships using five causal discovery algorithms: the PCMCI algorithm, spectral dynamic causal modeling, Granger causal analysis, Transfer Entropy, and Liang-Kleeman information flow causal analysis. The brain causal relationships were then fed into SVM, Random Forest (RF), and K-Nearest Neighbors (KNN) classifiers, respectively, to classify individuals with ASD and TCs and thereby obtain a model suitable for identifying diagnostic biomarkers for ASD through causal network analysis.
Results: Experimental results demonstrate that the PCMCI-SVM model achieves an accuracy of 91.29% in ASD identification and outperforms the other four causal discovery algorithms, as well as the RF and KNN classifiers. Moreover, our analysis indicates that the right thalamus and right middle temporal gyrus are potential diagnostic biomarkers for ASD. Additionally, the causal relationship between [left inferior parietal→right insula] was found to be associated with the dorsal attention network, while [right cuneus→left cuneus] was associated with the visual network, suggesting that disruptions in these causal relationships may impair the functional integrity of their respective subnetworks.
Conclusion: Our findings show that cognitive processes and brain connectivity are largely influenced by causal interactions between different brain regions. These potential diagnostic biomarkers not only offer insights into the neurofunctional mechanisms of ASD but also hold promise for improving diagnostic accuracy for ASD.
Hao Wang, ... Yanrui Ding
DOI: https://doi.org/10.70401/cbm.2026.0009 - February 04, 2026
This article belongs to the Special Issue AI for Biomedicine: Models, Applications, and Challenges
iCDG-MOHGAT: Identification of cancer driver gene using multi-omics data and heterogeneous graph attention network
Aims: Driver mutations are crucial factors in the occurrence and development of cancer. Identifying cancer-related driver genes is of great significance for understanding the mechanisms of cancer initiation, prevention, and treatment. With the continuous accumulation of cancer data, how to effectively utilize these data for the identification of cancer driver genes has become a major challenge in the field of cancer biology.
Methods: We propose a novel computational model called iCDG-MOHGAT. This model integrates multi-omics pan-cancer data (such as mutations, DNA methylation, etc.), multi-dimensional gene networks, and disease semantic similarity networks to identify cancer driver genes. We first construct multi-dimensional gene networks using various types of gene correlation information (protein-protein interaction, gene sequence similarity, etc.) and establish disease semantic similarity networks for relevant cancers. Due to the complexity of node and edge types, we utilize a heterogeneous graph attention network to learn and extract features from the multi-dimensional gene networks and disease semantic similarity networks. We also incorporate a fusion learning module to effectively integrate features from different dimensions. Finally, we optimize the random forest classifier using the sparrow algorithm for the task of predicting cancer driver genes.
Results: Experimental results demonstrate that iCDG-MOHGAT outperforms many state-of-the-art models in terms of AUPR and AUROC. In the final prediction results, 91% of the predicted new driver genes are supported by at least one piece of evidence of being cancer genes. In the laboratory, this model can serve as an effective tool for identifying cancer driver genes.
Conclusion: We have introduced a novel computational model named iCDG-MOHGAT, which precisely identifies cancer driver genes by integrating multi-omics pan-cancer data and intricate multidimensional gene networks, coupled with disease semantic similarity networks. Experimental results demonstrate that iCDG-MOHGAT outperforms many state-of-the-art models in terms of AUPR and AUROC. In the final prediction results, 91% of the predicted genes have supporting evidence. In the laboratory, this model can serve as an effective tool for identifying cancer driver genes.
Lin Yuan, Jiawang Zhao
DOI: https://doi.org/10.70401/cbm.2026.0008 - February 03, 2026
Drug-target affinity prediction based on multi-source information and graph convolutional network
Aims: Drug-target affinity (DTA) prediction is crucial for drug discovery and repositioning. However, existing deep learning-based methods often overlook the synergy between the topological structure of DTA networks and the multimodal features of drugs and targets themselves.
Methods: This study proposes MIGDTA, a DTA prediction method based on multi-source information and a graph convolutional network (GCN), which enhances prediction accuracy by integrating local features with global interaction information. MIGDTA first constructs a drug molecular graph, a target protein graph, and a DTA network, while computing molecular fingerprints and protein descriptors. Subsequently, it employs a graph isomorphism network to learn graph features, a GCN to capture network features, and a multilayer perceptron to encode biological features. Then, it refines heterogeneous network and graph features iteratively through the GCN, and finally concatenates the fused features with biological features for affinity prediction.
Results: Comparative experiments on benchmark datasets demonstrate that MIGDTA significantly outperforms existing methods. On the Davis dataset, compared to the best baseline method, MIGDTA reduces mean squared error (MSE) to 0.185, increases CI by 0.006, and improves […] by 5%. Similar enhancements were observed on the KIBA dataset, where MIGDTA achieves an MSE of 0.130, along with 0.002 and 1% gains in CI and […], respectively.
Conclusion: Feature ablation studies verify the core role of graph features in modeling local structures and network features in capturing global topology, along with the supplementary importance of biological features. Comparative analyses of feature integration approaches confirm the effectiveness of the feature refinement module in fusing multimodal features and enhancing model discriminability.
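Two of the metrics named in the Results, MSE and CI, can be illustrated with a short sketch. It assumes that CI denotes the concordance index, as is common in DTA benchmarks; the abstract itself does not expand the abbreviation. The function names and affinity values below are hypothetical and are not taken from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinity are skipped; tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical affinities for illustration only (not the paper's data).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.8, 7.4, 7.0]
print(mean_squared_error(measured, predicted))
print(concordance_index(measured, predicted))
```

MSE penalizes the size of each error, while the concordance index only checks whether drug-target pairs are ranked in the right order, so the two are usually reported together.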
Xiujuan Lei, ... Yuchen Zhang
DOI: https://doi.org/10.70401/cbm.2026.0007 - January 19, 2026
A comprehensive review on neuropeptides: databases and computational tools
Neuropeptides are crucial signaling molecules that regulate diverse physiological processes spanning growth, social behavior, learning, memory, metabolism, homeostasis, reproduction, and neural differentiation across both nervous and peripheral systems. Dysregulation of neuropeptide signaling is closely linked to various pathological conditions, such as neurological disorders, metabolic diseases, cardiovascular conditions, and even cancer, positioning neuropeptides as potential therapeutic agents or targets for intervention. In recent years, research into neuropeptides has accelerated, with vast amounts of data continuously accumulating in multiple databases. However, the study of neuropeptides is often impeded by the need for extensive and time-consuming experimental investigations. As a result, computational tools have become essential for the rapid, large-scale identification of neuropeptides. This review systematically discusses neuropeptide-related databases and computational tools. These databases organize extensive data on neuropeptide sequences, structures, and functions. Among these, NeuroPep2.0, with 11,417 neuropeptide entries, is currently the most widely used dataset for neuropeptide prediction. Additionally, this review explores the application of computational approaches in neuropeptide prediction. While early methods predominantly relied on homologous sequence alignment and biochemical feature statistics, recent advances in machine learning have significantly enhanced prediction accuracy and efficiency. Tools such as NeuroPred-PLM and DeepNeuropePred, developed by our research group using protein language models, have substantially improved prediction performance.
In conclusion, this review provides a comprehensive overview of current neuropeptide databases and computational tools, offering researchers a thorough survey of available resources and analytical methods, and emphasizing the necessity of continuous optimization to advance neuropeptide research and its therapeutic applications.
Wei Xu, ... Yan Wang
DOI: https://doi.org/10.70401/cbm.2025.0001 - April 10, 2025
MediHerb: A multi-modal enhanced framework for disease inference via herbal knowledge
Aims: Development of robust and effective methods for uncovering herb interactions and constructing herb–disease associations requires the integration of diverse biological and medical information. A key challenge in Traditional Chinese Medicine (TCM) research is to robustly uncover herb interactions and construct reliable herb–disease associations. This task requires handling the inherently high-dimensional, multi-label, and cross-domain nature of prescription data. Existing approaches provide limited representation capacity and insufficient integration of biomedical knowledge, restricting their ability to capture the complex semantics underlying the relationships between herbs and diseases.
Methods: To address the limitations of existing approaches, we propose MediHerb, a multi-modal enhanced framework for disease inference via herbal knowledge. MediHerb unifies five complementary modalities: molecular sequences, fingerprints, physicochemical properties, graphical prescription representations, and the description of TCM prescriptions into a shared latent space. An attention-based fusion mechanism aligns the semantics across molecular, herbal, and diagnostic levels, enabling multi-granularity reasoning. To further promote accessibility, a lightweight graphical interface has been developed to support interaction with both the model and open datasets.
Results: Experimental results on benchmark datasets demonstrate that MediHerb substantially outperforms existing baselines in herb–disease inference. Beyond predictive accuracy, the learned embeddings and model attention patterns reveal meaningful biological and pharmacological insights, confirming that MediHerb captures the mechanistic underpinnings of herb–disease associations.
Conclusion: MediHerb highlights the potential of knowledge-enhanced multi-modal fusion to bridge molecular, herbal, and clinical semantics, offering a more interpretable and holistic approach to understanding TCM prescriptions.
Xiaoyi Liu, ... Jijun Tang
DOI: https://doi.org/10.70401/cbm.2025.0003 - November 25, 2025
Computational approach to pulmonary delivery of therapeutical RNAs
Targeted delivery of RNA-based therapeutics to the lungs remains a substantial challenge due to the unique anatomy of lung tissue and its complex immune barriers. In recent years, the convergence of physiologically based pharmacokinetic (PBPK) models, quantitative systems pharmacology approaches, and machine learning algorithms has led to the development of computational medicine frameworks, providing intelligent tools for addressing the aforementioned challenges in efficient pulmonary delivery. By integrating experimental data with predictive computational models, these approaches have advanced the development of RNA therapies for pulmonary diseases. The deep integration of multimodal data is expected to further accelerate drug discovery and clinical translation. In this review, we systematically summarize these computational approaches in designing and optimizing pulmonary RNA delivery systems. We particularly highlight the mechanism-based rational design of RNA therapies through simulations and predictions of biodistribution, cellular targeting, and intracellular transport processes.
Xianan Li, ... Pu Chen
DOI: https://doi.org/10.70401/cbm.2025.0004 - November 28, 2025
SpanAttNet: A hybrid SpanConv SPDConv architecture with residual self attention for viral protein subcellular localization
Aims: The subcellular localization of viral proteins can give insight into virus replication, immune evasion, and the development of therapeutic targets. Traditional experimental methods for determining localization are time-consuming and costly to perform, which calls for robust computational approaches. In this paper, we propose a computational method for identifying the subcellular localization of viral proteins.
Methods: To improve feature extraction for viral protein subcellular localization, a novel hybrid deep learning architecture, SpanAttNet, was proposed by incorporating span-based convolution with spatial pyramid dilated convolution and a residual self-attention mechanism. Three commonly used sequence descriptors, AAC, PseAAC, and DDE, each combined with PCA for feature dimension reduction, were systematically used to benchmark SpanAttNet.
Results: Among the individual descriptors, the best performance was yielded by PseAAC (accuracy 93.95%, MCC 91.18% at ρ = 0.8 PCA reduction), while optimal performance from DDE was at minimum reduction (accuracy 87.00% at ρ = 0.2). Moreover, ensemble feature fusion across the various descriptors elevated SpanAttNet to its top performance, reaching an MCC of 93.79% and an F1-score of 92.91%, hence achieving the best balance between sensitivity and specificity. Compared to state-of-the-art models, SpanAttNet managed to consistently match or surpass predictive accuracy, demonstrating strong generalizability.
Conclusion: We establish SpanAttNet as a robust and biologically informed predictor for viral protein subcellular localization, with strong potential for extension to multi-label classification and broader proteomic applications.
Grace-Mercure Bakanina Kissanga, ... Hao Lin
DOI: https://doi.org/10.70401/cbm.2025.0006 - December 31, 2025
A bi-directional LSTM architecture enhanced with channel attention for seizure prediction
-
Aims: Neural networks capable of capturing temporal dependencies in electroencephalogram (EEG) signals hold considerable potential for seizure prediction by modeling the progressive evolution of preictal EEG changes. However, redundant or less ...
MoreAims: Neural networks capable of capturing temporal dependencies in electroencephalogram (EEG) signals hold considerable potential for seizure prediction by modeling the progressive evolution of preictal EEG changes. However, redundant or less informative temporal features may obscure critical preictal patterns, limiting seizure prediction performance. To address this, we developed a neural architecture that effectively leverages informative temporal features to enhance seizure prediction capability.
Methods: We designed a bidirectional long short-term memory (BiLSTM) network enhanced with a channel attention mechanism, termed Attention-BiLSTM, which adaptively emphasizes informative temporal features while reducing information redundancy. We further analyze the model’s attention weights and feature distributions to provide interpretable insights into its decision-making process.
Results: Evaluation on the CHB-MIT scalp EEG dataset demonstrates that Attention-BiLSTM achieves significant performance improvements over the baseline BiLSTM, with an average accuracy of 94.77%, sensitivity of 94.58%, specificity of 94.97%, and an area under the curve of 98.38%. Furthermore, visualization results indicate that the proposed model progressively enhances feature discriminability and directs attention to the most relevant temporal features for seizure prediction.
Conclusion: The proposed Attention-BiLSTM achieves improved performance and interpretability, offering valuable insights to support future development of scalable and generalizable seizure prediction systems.
Less -
Haiqing Yu, ... Dong Ming
-
DOI: https://doi.org/10.70401/cbm.2026.0010 - February 05, 2026
A comprehensive review on neuropeptides: databases and computational tools
-
Neuropeptides are crucial signaling molecules that regulate diverse physiological processes spanning growth, social behavior, learning, memory, metabolism, homeostasis, reproduction, and neural differentiation across both nervous and peripheral ...
MoreNeuropeptides are crucial signaling molecules that regulate diverse physiological processes spanning growth, social behavior, learning, memory, metabolism, homeostasis, reproduction, and neural differentiation across both nervous and peripheral systems. Dysregulation of neuropeptides signaling is closely linked to various pathological conditions, such as neurological disorders, metabolic diseases, cardiovascular conditions, and even cancer, positioning them as potential therapeutic agents or targets for intervention. In recent years, research into neuropeptides has accelerated, with vast amounts of data continuously accumulating in multiple databases. However, the study of neuropeptides is often impeded by the need for extensive and time-consuming experimental investigations. As a result, computational tools have become essential for the rapid, large-scale identification of neuropeptides. This review systematically discusses neuropeptide-related databases and computational tools. These databases organize extensive data on neuropeptide sequences, structures, and functions. Among these, NeuroPep2.0, with 11,417 neuropeptide entries, is currently the most widely used dataset for neuropeptide prediction. Additionally, this review explores the application of computational approaches in neuropeptide prediction. While early methods predominantly relied on homologous sequence alignment and biochemical feature statistics, recent advances in machine learning have significantly enhanced prediction accuracy and efficiency. Tools such as NeuroPred-PLM and DeepNeuropePred, developed by our research group using protein language models, have substantially improved prediction performance. 
In conclusion, this review provides a comprehensive overview of current neuropeptide databases and computational tools, offering researchers a thorough survey of available resources and analytical methods, and emphasizing the necessity of continuous optimization to advance neuropeptide research and its therapeutic applications.
Less -
Wei Xu, ... Yan Wang
-
DOI: https://doi.org/10.70401/cbm.2025.0001 - April 10, 2025
MediHerb: A multi-modal enhanced framework for disease inference via herbal knowledge
-
Aims: Development of robust and effective methods for uncovering herb interactions and constructing herb–disease associations requires the integration of diverse biological and medical information. A key challenge in Traditional Chinese Medicine ...
MoreAims: Development of robust and effective methods for uncovering herb interactions and constructing herb–disease associations requires the integration of diverse biological and medical information. A key challenge in Traditional Chinese Medicine (TCM) research is to robustly uncover herb interactions and construct reliable herb–disease associations. This task requires handling the inherently high-dimensional, multi-label, and cross-domain nature of prescription data. Existing approaches provide limited representation capacity and insufficient integration of biomedical knowledge, restricting their ability to capture the complex semantics underlying the relationships between herbs and diseases.
Methods: To address the limitations of existing approaches, we propose MediHerb, a multi-modal enhanced framework for disease inference via herbal knowledge. MediHerb unifies five complementary modalities: molecular sequences, fingerprints, physicochemical properties, graphical prescription representations, and the description of TCM prescriptions into a shared latent space. An attention-based fusion mechanism aligns the semantics across molecular, herbal, and diagnostic levels, enabling multi-granularity reasoning. To further promote accessibility, a lightweight graphical interface has been developed to support interaction with both the model and open datasets.
Results: Experimental results on benchmark datasets demonstrate that MediHerb substantially outperforms existing baselines in herb–disease inference. Beyond predictive accuracy, the learned embeddings and model attention patterns reveal meaningful biological and pharmacological insights, confirming that MediHerb captures the mechanistic underpinnings of herb–disease associations.
Conclusion: MediHerb highlights the potential of knowledge-enhanced multi-modal fusion to bridge molecular, herbal, and clinical semantics, offering a more interpretable and holistic approach to understanding TCM prescriptions.
Less -
Xiaoyi Liu, ... Jijun Tang
-
DOI: https://doi.org/10.70401/cbm.2025.0003 - November 25, 2025
iCDG-MOHGAT: Identification of cancer driver gene using multi-omics data and heterogeneous graph attention network
-
Aims: Driver mutations are crucial factors in the occurrence and development of cancer. Identifying cancer-related driver genes is of great significance for understanding the mechanisms of cancer initiation, prevention, and treatment. With the ...
MoreAims: Driver mutations are crucial factors in the occurrence and development of cancer. Identifying cancer-related driver genes is of great significance for understanding the mechanisms of cancer initiation, prevention, and treatment. With the continuous accumulation of cancer data, how to effectively utilize these data for the identification of cancer driver genes has become a major challenge in the field of cancer biology.
Methods: We propose a novel computational model called iCDG-MOHGAT. This model integrates multi-omics pan-cancer data (such as mutations, DNA methylation, etc.), multi-dimensional gene networks, and disease semantic similarity networks to identify cancer driver genes. We first construct multi-dimensional gene networks using various types of gene correlation information (protein-protein interaction, gene sequence similarity, etc.) and establish disease semantic similarity networks for relevant cancers. Due to the complexity of node and edge types, we utilize a heterogeneous graph attention network to learn and extract features from the multi-dimensional gene networks and disease semantic similarity networks. We also incorporate a fusion learning module to effectively integrate features from different dimensions. Finally, we optimize the random forest classifier using the sparrow algorithm for the task of predicting cancer driver genes.
Results: Experimental results demonstrate that iCDG-MOHGAT outperforms many state-of-the-art models in terms of AUPR and AUROC. In the final prediction results, 91% of the predicted new driver genes have at least one supporting evidence of being cancer genes. In the laboratory, this model can serve as an effective tool for identifying cancer driver genes.
Conclusion: We have introduced a novel computational model named iCDG-MOHGAT, which precisely identifies cancer driver genes by integrating multi-omics pan-cancer data and intricate multidimensional gene networks, coupled with disease semantic similarity networks. Experimental results demonstrate that iCDG-MOHGAT outperforms many state-of-the-art models in terms of AUPR and AUROC. In the final prediction results, 91% of the predicted genes have supporting evidence. In the laboratory, this model can serve as an effective tool for identifying cancer driver genes.
Less -
Lin Yuan, Jiawang Zhao
-
DOI: https://doi.org/10.70401/cbm.2026.0008 - February 03, 2026
Drug-target affinity prediction based on multi-source information and graph convolutional network
-
Aims: Drug-target affinity (DTA) prediction is crucial for drug discovery and repositioning. However, existing deep learning-based methods often overlook the synergy between the topological structure of DTA networks and the multimodal features ...
MoreAims: Drug-target affinity (DTA) prediction is crucial for drug discovery and repositioning. However, existing deep learning-based methods often overlook the synergy between the topological structure of DTA networks and the multimodal features of drugs and targets themselves.
Methods: This study proposes a new method, MIGDTA, a DTA prediction method based on multi-source information and graph convolutional network (GCN), which enhances prediction accuracy by integrating local features with global interaction information. MIGDTA first constructs a drug molecular graph, a target protein graph, and a DTA network, while computing molecular fingerprints and protein descriptors. Subsequently, it employs a graph isomorphism network to learn graph features, a GCN to capture network features, and a multilayer prceptron to encode biological features. Then, it refines heterogeneous network and graph features iteratively through the GCN, and finally concatenates the fused features with biological features for affinity prediction.
Results: Comparative experiments on benchmark datasets demonstrate that MIGDTA significantly outperforms existing methods. On the Davis dataset, compared to the best baseline method, MIGDTA reduces mean squared error (MSE) to 0.185, increases the concordance index (CI) by 0.006, and improves $r_m^2$ by 5%. Similar enhancements were observed on the KIBA dataset, where MIGDTA achieves an MSE of 0.130, along with gains of 0.002 and 1% in CI and $r_m^2$, respectively.
Conclusion: Feature ablation studies verify the core role of graph features in modeling local structures and network features in capturing global topology, along with the supplementary importance of biological features. Comparative analyses of feature integration approaches confirm the effectiveness of the feature refinement module in fusing multimodal features and enhancing model discriminability.
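For readers unfamiliar with the evaluation metrics named in the Results above, the following is a minimal sketch of how MSE and CI are commonly computed for affinity regression. It is not the authors' evaluation code. The function names and the toy values are illustrative assumptions, and the tie handling (tied predictions count as half-concordant) follows one common convention that may differ from the paper's.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with identical true affinity are skipped; pairs with identical
    predictions contribute 0.5 (an assumed, commonly used convention).
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # not comparable: no true ordering to recover
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    return concordant / comparable if comparable else float("nan")


# Toy affinities (hypothetical values, not taken from Davis or KIBA).
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.1, 6.0, 7.5, 8.0]
print(mean_squared_error(measured, predicted))
print(concordance_index(measured, predicted))
```

A lower MSE indicates smaller numerical error, while a CI closer to 1 indicates that the model ranks drug-target pairs in the correct order, which is why the two metrics are usually reported together.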
Xiujuan Lei, ... Yuchen Zhang
-
DOI: https://doi.org/10.70401/cbm.2026.0007 - January 19, 2026
SpanAttNet: A hybrid SpanConv SPDConv architecture with residual self attention for viral protein subcellular localization
-
Aims: The subcellular localization of viral proteins can give insight into virus replication, immune evasion, and the development of therapeutic targets. Traditional experimental methods for determining localization are time-consuming and costly, which calls for robust computational approaches. In this paper, we propose a computational method for identifying the subcellular localization of viral proteins.
Methods: To improve feature extraction for viral protein subcellular localization, we propose SpanAttNet, a novel hybrid deep learning architecture that combines span-based convolution with spatial pyramid dilated convolution and a residual self-attention mechanism. Three commonly used sequence descriptors, AAC, PseAAC, and DDE, each combined with PCA for feature dimension reduction, were systematically used to benchmark SpanAttNet.
Results: Among the individual descriptors, PseAAC yielded the best performance (accuracy 93.95%, MCC 91.18% at ρ = 0.8 PCA reduction), while DDE performed best at minimum reduction (accuracy 87.00% at ρ = 0.2). Moreover, ensemble feature fusion across the descriptors raised SpanAttNet to its top performance, with an MCC of 93.79% and an F1-score of 92.91%, achieving the best balance between sensitivity and specificity. Compared with state-of-the-art models, SpanAttNet consistently matched or surpassed their predictive accuracy, demonstrating strong generalizability.
Conclusion: We establish SpanAttNet as a robust and biologically informed predictor for viral protein subcellular localization, with strong potential for extension to multi-label classification and broader proteomic applications.
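As an illustration of the simplest of the three sequence descriptors named in the Methods above, the following is a minimal sketch of amino acid composition (AAC), which maps a protein sequence to a 20-dimensional vector of residue frequencies. It is not the authors' feature pipeline. The function name, the residue ordering, and the choice to ignore non-standard residues are assumptions made for this example, and the PCA step described in the abstract is not shown.

```python
# The 20 standard amino acids in alphabetical one-letter order
# (the ordering is an assumption; any fixed order works).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Characters outside the 20 standard residues (for example 'X') are
    ignored, which is an assumed convention for this sketch.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


# Hypothetical short peptide, used only to show the output shape.
features = amino_acid_composition("MKTAYIAKQR")
print(len(features))
print(dict(zip(AMINO_ACIDS, features)))
```

Because AAC discards residue order, descriptors such as PseAAC and DDE are typically used alongside it to recover some sequence-order information.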
Grace-Mercure Bakanina Kissanga, ... Hao Lin
-
DOI: https://doi.org/10.70401/cbm.2025.0006 - December 31, 2025
Special Issues
AI for Biomedicine: Models, Applications, and Challenges
-
Submission Deadline: 31 Dec 2025
-
Published articles: 1


