Multi-View Deep Learning for Multifunctional Enzyme Prediction Fusing Sequence and Multi-Label Embedding Information
First published: 2023-04-28
Abstract: Multifunctional enzymes contribute to the survival and evolution of organisms through different functions and forms, so understanding their functions is essential. Traditional machine learning methods are now widely used for enzyme function classification, but most of them address only single-function enzymes, and the few existing multifunctional enzyme classifiers can predict only the first level of the Enzyme Commission (EC) number. To address these limitations, this paper proposes a multi-view deep learning method for multifunctional enzyme prediction that fuses sequence information with multi-label embedding information. A hybrid network consisting of a convolutional neural network (CNN) with an attention mechanism and a bidirectional long short-term memory network (BiLSTM) learns deep features from the enzyme sequence. In parallel, an EC class correlation graph is built for the classification model at each level of the EC number, a graph convolutional network (GCN) embeds the correlated EC class labels, and the resulting label embeddings guide the feature learning process. Finally, a multi-label classifier predicts the classes of multifunctional enzymes. Experimental results show that the method achieves a subset accuracy of 75.75% and a Macro_F1 of 90.41% at the fourth level of the EC number, and that it improves on existing methods at all four EC levels for multifunctional enzymes.
Keywords: deep learning; multifunctional enzyme classification; multi-view learning; graph convolutional network; multi-label classification
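The abstract reports two multi-label metrics, subset accuracy and Macro_F1. The sketch below shows how these are commonly computed on binary label matrices; it is an illustration under assumptions, not the authors' evaluation code. The function names, the toy data, and the convention of scoring a label with no true or predicted positives as 0 are all assumptions made here.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label vector matches the true one exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (label-based macro average)."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no positives at all scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Hypothetical toy data: 4 enzymes, 3 candidate EC classes (1 = class assigned).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

print(subset_accuracy(y_true, y_pred))  # 3 of 4 rows match exactly
print(macro_f1(y_true, y_pred))
```

Subset accuracy is strict: the third toy enzyme misses one of its two classes and so counts as fully wrong, whereas Macro_F1 still credits the class that was predicted correctly. This may explain why the two reported figures differ by a wide margin.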