• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2010, Vol. 32 ›› Issue (9): 148-151.doi: 10.3969/j.issn.1007130X.2010.

• Articles •

Prediction of Escherichia coli Promoters Based on Multiple Feature Scales

杨乌日吐1,林昊2   

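  1. (1. Graduate Affairs Department, Inner Mongolia University, Hohhot, Inner Mongolia 010021; 2. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054)
  • Received: 2010-03-13 Revised: 2010-06-10 Online: 2010-09-02 Published: 2010-09-02
  • About the authors: YANG Wuritu (born 1981), male, from Fuxin, Liaoning; Ph.D., lecturer; research interest: bioinformatics. LIN Hao, Ph.D., associate professor; research interest: bioinformatics.
  • Funding:

    Inner Mongolia University Youth Science Foundation project (ND0810); University of Electronic Science and Technology of China research start-up fund project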

Prediction of the E.coliK12 Promoter Based on MultiFeature Selection

YANG Wuritu1,LIN Hao2   

  1. (1. Graduate Affairs Department, Inner Mongolia University, Hohhot 010021; 2. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China)
  • Received: 2010-03-13 Revised: 2010-06-10 Online: 2010-09-02 Published: 2010-09-02

Summary:

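This paper studies the prediction of 741 experimentally verified Escherichia coli Sigma70 promoter sequences. First, based on the interaction between RNA polymerase and DNA, a position scoring function is used to measure the conserved sites in the sequences. Then, according to the sequence characteristics of promoters, a dispersion index is used to measure the information content of the different bases in the sequences. Finally, multivariate nonlinear discriminant analysis is used to predict E. coli promoters. The 10-fold cross-validation results show an overall prediction accuracy above 85%. Comparison with other algorithms shows that the algorithm developed here predicts E. coli promoters better.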

Keywords: E. coli promoter, position-correlation scoring function, diversity index, modified Mahalanobis discriminant
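The first two steps described in the abstract (scoring conserved positions, then measuring information content with a diversity index) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the PCSF formula, so a plain position weight matrix with log-odds scores stands in for it, and the diversity measure uses the commonly cited form D(X) = N ln N − Σ nᵢ ln nᵢ together with the increment of diversity ID(X, Y) = D(X+Y) − D(X) − D(Y). All function names, the pseudocount, the uniform background, and the choice of dinucleotide counts are assumptions made for illustration.

```python
import itertools
import math
from collections import Counter

BASES = "ACGT"

def position_weight_matrix(seqs, pseudocount=1.0):
    """Per-position log-odds of each base against a uniform background.

    Stand-in for the paper's PCSF (assumption): one column per position.
    """
    total = len(seqs) + pseudocount * len(BASES)
    pwm = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        pwm.append({b: math.log2(((counts[b] + pseudocount) / total) / 0.25)
                    for b in BASES})
    return pwm

def position_score(pwm, seq):
    """Sum of the log-odds of the observed base at every position."""
    return sum(col[base] for col, base in zip(pwm, seq))

def diversity(counts):
    """Measure of diversity D(X) = N ln N - sum(n_i ln n_i)."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return n * math.log(n) - sum(c * math.log(c) for c in counts if c > 0)

def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small values mean similar sources."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)

def kmer_counts(seq, k=2):
    """Counts of all 4**k k-mers in a fixed order (k=2 is an assumption)."""
    kmers = ["".join(p) for p in itertools.product(BASES, repeat=k)]
    seen = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [seen[m] for m in kmers]

# Toy -10 box-like hexamers, invented for illustration only.
toy_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
toy_pwm = position_weight_matrix(toy_sites)
```

In such a scheme each candidate sequence would yield one feature per measure (for example `position_score(toy_pwm, window)` and `increment_of_diversity(kmer_counts(window), reference_counts)`, where `reference_counts` is a hypothetical pooled count vector from known promoters), and those features would then be passed to the discriminant step.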

Abstract:

This paper studies promoter prediction using 741 experimentally confirmed Escherichia coli Sigma70 promoters. First, based on the interaction between RNA polymerase (RNAp) and DNA elements, the position-correlation score function (PCSF) algorithm is used to measure the conserved sites in promoter sequences. Next, according to the characteristics of promoters, a diversity index is applied to measure the information content in different regions. Finally, a modified Mahalanobis discriminant is proposed to perform the prediction. In 10-fold cross-validation, the overall accuracy exceeds 85%. Comparison with other methods shows that the proposed method recognizes Escherichia coli promoters with high accuracy.

Key words: E. coli promoter; position correlation score function; diversity index; modified Mahalanobis discriminant
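The final step, classification with a Mahalanobis-type discriminant evaluated by 10-fold cross-validation, can be sketched as follows. This is a hedged illustration rather than the paper's method: the abstract does not state the exact form of the modification, so the sketch assumes the quadratic form ξ(x) = δ²(x, negative) − δ²(x, positive) + ln(|Σ_neg| / |Σ_pos|), with ξ > 0 read as "promoter". The two-feature restriction, the interleaved fold assignment, all function names, and the synthetic Gaussian data are assumptions chosen to keep the example self-contained.

```python
import math
import random

def mean_and_cov(points):
    """Mean vector and sample covariance (sxx, sxy, syy) of 2-D features."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def discriminant(x, pos_model, neg_model):
    """Assumed form: distance to negatives minus distance to positives,
    plus the log ratio of the covariance determinants."""
    (mp, cp), (mn, cn) = pos_model, neg_model
    det_p = cp[0] * cp[2] - cp[1] ** 2
    det_n = cn[0] * cn[2] - cn[1] ** 2
    return (mahalanobis_sq(x, mn, cn) - mahalanobis_sq(x, mp, cp)
            + math.log(det_n / det_p))

def k_fold_accuracy(pos_pts, neg_pts, k=10):
    """Overall accuracy of k-fold cross-validation with interleaved folds."""
    correct = total = 0
    for fold in range(k):
        train_p = [p for i, p in enumerate(pos_pts) if i % k != fold]
        train_n = [p for i, p in enumerate(neg_pts) if i % k != fold]
        test_p = [p for i, p in enumerate(pos_pts) if i % k == fold]
        test_n = [p for i, p in enumerate(neg_pts) if i % k == fold]
        pos_model, neg_model = mean_and_cov(train_p), mean_and_cov(train_n)
        correct += sum(discriminant(x, pos_model, neg_model) > 0 for x in test_p)
        correct += sum(discriminant(x, pos_model, neg_model) <= 0 for x in test_n)
        total += len(test_p) + len(test_n)
    return correct / total

# Synthetic two-feature data, NOT the paper's data: two Gaussian clouds.
rng = random.Random(0)
toy_pos = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(50)]
toy_neg = [(rng.gauss(-2, 1), rng.gauss(-2, 1)) for _ in range(50)]
toy_accuracy = k_fold_accuracy(toy_pos, toy_neg, k=10)
```

The log-determinant term matters when the two classes have different spreads; with equal covariances it vanishes and the rule reduces to choosing the class with the smaller Mahalanobis distance.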