Frishman / Valencia Modern Genome Annotation
1st edition, 2009
ISBN: 978-3-211-75123-7
Publisher: Springer Wien
Format: PDF
Copy protection: 1 - PDF Watermark
The Biosapiens Network
E-book, English, 490 pages
This book presents an accurate description of current scientific developments in bioinformatics and its computational implementation, drawn from the research of the BioSapiens Network of Excellence. Bioinformatics is essential for annotating the structure and function of genes and proteins, for analysing complete genomes, and for molecular biology and biochemistry.
The book includes an overview of bioinformatics and the full spectrum of genome annotation approaches: genome analysis and gene prediction; gene regulation analysis and expression; genome variation and QTL analysis; large-scale protein annotation of function and structure; annotation and prediction of protein interactions; and the organization and annotation of molecular networks and biochemical pathways. Also covered are a technical framework for organizing and representing genome data using the Distributed Annotation System (DAS) technology, and work on the annotation of two large genomic data sets: HIV/HCV viral genomes and the splicing alternatives potentially encoded in 1% of the human genome.
Target audience
Research
Authors/Editors
Further information & material
BIOSAPIENS: A European Network of Excellence to develop genome annotation resources.- BIOSAPIENS: A European Network of Excellence to develop genome annotation resources.- Gene definition.- State of the art in eukaryotic gene prediction.- Quality control of gene predictions.- Gene regulation and expression.- Evaluating the prediction of cis-acting regulatory elements in genome sequences.- A biophysical approach to large-scale protein-DNA binding data.- From gene expression profiling to gene regulation.- Annotation and genetics.- Annotation, genetics and transcriptomics.- Functional annotation of proteins.- Resources for functional annotation.- Annotating bacterial genomes.- Data mining in genome annotation.- Modern genome annotation: the BioSapiens network.- Structure to function.- Harvesting the information from a family of proteins.- Protein structure prediction.- Structure prediction of globular proteins.- The state of the art of membrane protein structure prediction: from sequence to 3D structure.- Protein-protein complexes, pathways and networks.- Computational analysis of metabolic networks.- Protein-protein interactions: analysis and prediction.- Infrastructure for distributed protein annotation.- Infrastructure for distributed protein annotation.- Applications.- Viral bioinformatics.- Alternative splicing in the ENCODE protein complement.
CHAPTER 3 Annotation, genetics and transcriptomics (p. 123-124)
R. Mott
Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
1 Introduction
This chapter discusses how to combine genome annotations of the type described elsewhere in this book with genetic and functional genomics data to find the genes associated with a phenotype, and in particular with a complex disease. This problem is of fundamental importance: the promise that understanding the molecular basis of common diseases would lead to effective treatments helped motivate and fund the Human Genome Project.
Complex diseases such as cancer, diabetes, cardiovascular disease and depression are defined as conditions with multiple causes, both genetic (due to mutations in the genome) and environmental (everything else). By contrast, a Mendelian disease is caused by mutations in a single gene, with minimal environmental contribution. With a few exceptions such as cystic fibrosis in Caucasians and sickle-cell anaemia in parts of equatorial Africa, most Mendelian diseases are rare and do not impose a major health care burden on society. Most common diseases are complex. The exceptions are those caused by infectious agents, as with HIV/AIDS and tuberculosis, and even in these cases there is a genetic contribution to resistance to infection.
In general, most complex diseases have a significant genetic component, which we can estimate by examining the co-prevalence of a disease in genetically identical (monozygotic) twins compared with non-identical (dizygotic) twins, who share on average only 50% of their DNA by descent. Because the average effect due to shared environment should be the same in the two groups, any excess in co-prevalence among monozygotic twins is likely to be genetic. Thus it is possible to estimate the extent of the genetic contribution to a disease without identifying the causative genes and polymorphisms (Mather and Jinks 1982).
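The twin comparison above can be made concrete with a small calculation. The sketch below uses Falconer's formula, a classical approximation that is not named in the text and is brought in here purely as an illustration. It assumes that the similarity within twin pairs is summarized as a correlation (for a disease, a correlation in underlying liability), that monozygotic pairs share all additive genetic effects and dizygotic pairs half of them, and that shared environment acts equally on both groups. The function name and the input values are hypothetical.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> dict:
    """Split trait variance into three shares from twin-pair correlations.

    Assumed model (not taken from the text):
        r_mz = h2 + c2        monozygotic pairs share all additive genetic effects
        r_dz = h2 / 2 + c2    dizygotic pairs share half of them on average
    where h2 is the additive genetic share and c2 the shared-environment share.
    """
    for name, r in (("r_mz", r_mz), ("r_dz", r_dz)):
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"{name} must lie in [-1, 1], got {r}")

    h2 = 2.0 * (r_mz - r_dz)   # excess similarity of identical twins, doubled
    c2 = 2.0 * r_dz - r_mz     # similarity not explained by shared genes
    e2 = 1.0 - r_mz            # what even identical twins do not share
    return {"h2": h2, "c2": c2, "e2": e2}


if __name__ == "__main__":
    # Hypothetical correlations, chosen only to illustrate the arithmetic.
    estimates = falconer_heritability(r_mz=0.6, r_dz=0.4)
    for component, value in estimates.items():
        print(f"{component} = {value:.2f}")
```

With the illustrative inputs 0.6 and 0.4, the genetic share comes out at about 0.4, shared environment at about 0.2 and the remainder at about 0.4. Estimates outside the range 0 to 1 are possible with real data and indicate that the simple model does not fit, which is one reason fuller variance-component models are used in practice.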
The ultimate aim of gene annotation is to describe the function of every segment of the genome, including protein-coding genes as well as micro-RNAs, transcription-factor binding sites and other cryptic functional elements. In addition, we want to annotate the functional consequence of every polymorphism observed in a population. If we had a perfectly annotated genome, we could predict which genes are relevant to each disease, and there would be no need for further work. In practice, however, we have only begun to scratch the surface of the annotation problem, and we will need to integrate data from multiple sources in order to make progress.
Before going further it is important to clarify what is meant by the phrase “gene function”. This turns out to be a surprisingly difficult concept, depending on the context in which the question is being asked. Gene function may be defined at a number of levels. For example, for protein-coding genes, it is important to know in which tissues and at which developmental stages the protein is expressed, and in which splice variants or isoforms. Next, the interactants of the protein are important, as they define the pathways in which the protein functions. Finally, we wish to understand the consequences of perturbations to the gene's DNA sequence, as these may give rise to genetic disease.