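Advances in integration algorithms for heterogeneous protein-protein interaction data
WANG Wen-xin1, CHEN Yu-guang1, SHI Tie-liu2*
(1School of Life Sciences, Shanghai University, Shanghai 200444; 2Shanghai Information Center for Life Sciences, Chinese Academy of Sciences, Shanghai 200031)

Abstract: Protein-protein interactions play a central role in biological processes and in the execution of cellular functions. The application of high-throughput technologies, together with advances in computational prediction methods, has produced a large-scale increase in protein-protein interaction data from both direct and indirect sources. Systematically integrating these data and extracting useful information from them is a challenge, and this has prompted the development of many integration algorithms. This paper reviews eight methods for integrating protein-protein interaction data sources: voting, support vector machine, naive Bayes, logistic regression, decision tree, random forest, random-forest-based k-nearest-neighbor, and mixture of feature experts.
Keywords: protein-protein interaction; data integration; binary classifier
CLC number: Q51; Q811.4; Q816  Document code: A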

Advances in algorithms applied on various protein-protein interaction data sources integration
WANG Wen-xin1, CHEN Yu-guang1, SHI Tie-liu2*
(1School of Life Sciences, Shanghai University, Shanghai 200444, China; 2Shanghai Information Center for Life Sciences, Chinese Academy of Sciences, Shanghai 200031, China)

Abstract: Protein-protein interactions are crucial for all biological processes and fundamental to virtually every aspect of cellular function. Developments in high-throughput experimental techniques and in silico prediction methods have greatly increased the volume of direct and indirect protein-protein interaction data. Systematically integrating these data and extracting meaningful information from them is a real challenge, and many computational approaches have emerged for this purpose. This review presents recent advances in the application of those approaches to integrating protein-protein interaction data sources, including voting, support vector machine, naive Bayes, logistic regression, decision tree, random forest (RF), RF-based k-nearest-neighbor and mixture of feature experts.
Key words: protein-protein interaction; data integration; binary classifier
