E-book, English, 46 pages
Clark: Information-Theoretic Evaluation for Computational Biomedical Ontologies
2014
ISBN: 978-3-319-04138-4
Publisher: Springer Nature Switzerland
Format: PDF
Copy protection: 1 - PDF Watermark
Series: SpringerBriefs in Computer Science
Developing effective methods for predicting ontological annotations is an important goal in computational biology, yet evaluating their performance is difficult because of the structure of biomedical ontologies and the incomplete annotation of genes. This work proposes an information-theoretic framework for evaluating computational protein function prediction. A Bayesian network, structured according to the underlying ontology, models the prior probability of a protein's function. Two concepts, misinformation and remaining uncertainty, are then defined; they can be seen as analogs of precision and recall. Finally, semantic distance is proposed as a single statistic for ranking classification models. The approach is evaluated by analyzing three protein function predictors that assign Gene Ontology terms. The work addresses several weaknesses of current metrics and provides insights into the performance of protein function prediction tools.
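The summary above names remaining uncertainty, misinformation, and semantic distance without giving formulas. The following is only a minimal sketch of how such quantities might be computed, under several assumptions that are not stated in this listing: the information content of a term is taken as the negative base-2 logarithm of its conditional probability given its parents, remaining uncertainty sums that value over true terms missing from the prediction, misinformation sums it over predicted terms absent from the truth, and semantic distance combines the two as a k-norm. The toy ontology, the probabilities, and all function names are hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to the set of its parent terms.
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a", "b"},
}

# Assumed conditional probabilities Pr(term | its parents are annotated).
# Illustrative values only, not taken from any data set.
COND_PROB = {"root": 1.0, "a": 0.5, "b": 0.25, "c": 0.5}


def information_accretion(term):
    """Information (in bits) added by a term once its parents are known."""
    return -math.log2(COND_PROB[term])


def propagate(terms):
    """Close a set of terms under the ancestor relation of the ontology."""
    closed, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS[term])
    return closed


def remaining_uncertainty(true_terms, predicted_terms):
    """Information in the true annotation that the prediction misses."""
    missed = propagate(true_terms) - propagate(predicted_terms)
    return sum(information_accretion(t) for t in missed)


def misinformation(true_terms, predicted_terms):
    """Information in the prediction that is not in the true annotation."""
    wrong = propagate(predicted_terms) - propagate(true_terms)
    return sum(information_accretion(t) for t in wrong)


def semantic_distance(true_terms, predicted_terms, k=2):
    """Combine both quantities into a single statistic (assumed k-norm)."""
    ru = remaining_uncertainty(true_terms, predicted_terms)
    mi = misinformation(true_terms, predicted_terms)
    return (ru ** k + mi ** k) ** (1 / k)
```

In this toy setting, predicting only `a` when the true annotation is `c` leaves remaining uncertainty but adds no misinformation, whereas predicting `b` when the truth is `a` incurs both; smaller semantic distance would indicate a better prediction.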
Further Information & Material
Preface
Contents
1 Introduction
  1.1 Background
  1.2 Protein Function Prediction Scenarios
  1.3 State of the Art Methods
  References
2 Methods
  2.1 Calculating the Joint Probability of a Graph
    2.1.1 Calculating the Information Content of a Graph
    2.1.2 Comparing Two Annotation Graphs
    2.1.3 Measuring the Quality of Function Prediction
    2.1.4 Weighted Metrics
    2.1.5 Semantic Distance
    2.1.6 Precision and Recall
    2.1.7 Supplementary Evaluation Metrics
    2.1.8 Additional Topological Metrics
  2.2 Confusion Matrix Interpretation of ru and mi
  2.3 Annotation Models
    2.3.1 The Naïve Model
    2.3.2 The BLAST Model
    2.3.3 The GOtcha Model
  References
3 Experiments and Results
  3.1 Average Information Content of a Protein
  3.2 Comparative Examples of Calculating Information Content
  3.3 Two-Dimensional Plots
  3.4 Comparisons of Single Statistics
  References
4 Discussion
  References
Index




