Molecular function prediction using neighborhood features

Research output: Contribution to journal › Article › peer-review

65 Scopus citations

Abstract

The recent advent of high-throughput methods has generated large amounts of gene interaction data, allowing the construction of genome-wide networks. A significant number of genes in such networks remain uncharacterized, and predicting their molecular function remains a major challenge. A number of existing techniques assume that genes with similar functions are topologically close in the network. Our hypothesis is that genes with similar functions exhibit similar annotation patterns in their neighborhoods, regardless of the distance between them in the interaction network. We thus predict molecular functions of uncharacterized genes by comparing their functional neighborhoods to those of genes of known function. We propose a two-phase approach. First, we extract functional neighborhood features of a gene using Random Walks with Restarts. We then employ a k-nearest-neighbor (KNN) classifier to predict the function of uncharacterized genes based on the computed neighborhood features. We perform leave-one-out validation experiments on two S. cerevisiae interaction networks and show significant improvements over previous techniques. Our technique provides a natural control of the trade-off between accuracy and coverage of prediction. We further propose and evaluate prediction in sparsely annotated genomes by exploiting features from well-annotated genomes.
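
The two-phase idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the restart probability, the exact feature definition (random-walk mass landing on genes annotated with each function), the Euclidean distance, the value of k, and the toy network are all assumptions chosen for illustration. The toy network has three disconnected star-shaped components, so the query gene (6) is unreachable from the gene whose neighborhood it resembles (0).

```python
import numpy as np

def rwr(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Random walk with restart from one seed; returns visit probabilities."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = adj / col_sums                     # column-stochastic transition matrix
    e = np.zeros(adj.shape[0])
    e[seed] = 1.0
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def neighborhood_features(adj, annotations, restart=0.3):
    """Row i: random-walk mass from gene i that lands on genes of each function."""
    n = adj.shape[0]
    feats = np.zeros((n, annotations.shape[1]))
    for i in range(n):
        p = rwr(adj, i, restart)
        p[i] = 0.0                         # ignore the gene's own annotation
        feats[i] = p @ annotations
    return feats

def knn_predict(feats, annotations, query, known, k=1):
    """Average the annotations of the k known genes with the closest features."""
    known = np.asarray(known)
    dists = np.linalg.norm(feats[known] - feats[query], axis=1)
    nearest = known[np.argsort(dists)[:k]]
    return annotations[nearest].mean(axis=0)

# Hypothetical toy network: hubs 0, 3, 6, each with two leaf neighbors.
edges = [(0, 1), (0, 2), (3, 4), (3, 5), (6, 7), (6, 8)]
adj = np.zeros((9, 9))
for a, b in edges:
    adj[a, b] = adj[b, a] = 1.0

# Columns are two made-up functions f0 and f1; gene 6 is uncharacterized.
annotations = np.zeros((9, 2))
annotations[[1, 2, 7, 8, 3], 0] = 1.0      # f0
annotations[[0, 4, 5], 1] = 1.0            # f1

feats = neighborhood_features(adj, annotations)
known = [g for g in range(9) if g != 6]
votes = knn_predict(feats, annotations, query=6, known=known, k=1)
predicted = int(votes.argmax())            # index of the predicted function
```

In this toy case, gene 6 has the same neighborhood pattern as gene 0 (a hub surrounded by f0-annotated genes), so its nearest neighbor in feature space is gene 0 and it inherits f1, even though no path connects the two. A method based purely on topological closeness would have no way to link them.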

Original language: English
Article number: 5313794
Pages (from-to): 208-217
Number of pages: 10
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 7
Issue number: 2
State: Published - 2010

Keywords

  • Classification
  • Feature extraction
  • Functional interaction network
  • Gene function prediction
