Book, English, 348 pages, Dimensions (W × H): 161 mm × 240 mm, Weight: 690 g
ISBN: 978-0-8493-2801-5
Publisher: CRC Press
Covering theory, algorithms, and methodologies, as well as data mining technologies, Data Mining for Bioinformatics provides a comprehensive discussion of data-intensive computations used in data mining with applications in bioinformatics. It supplies a broad, yet in-depth, overview of the application domains of data mining for bioinformatics to help readers from both biology and computer science backgrounds gain an enhanced understanding of this cross-disciplinary field.
The book offers authoritative coverage of data mining techniques, technologies, and frameworks used for storing, analyzing, and extracting knowledge from large databases in the bioinformatics domains, including genomics and proteomics. It begins by describing the evolution of bioinformatics and highlighting the challenges that can be addressed using data mining techniques. Introducing the various data mining techniques that can be employed in biological databases, the text is organized into four sections:
- Supplies a complete overview of the evolution of the field and its intersection with computational learning
- Describes the role of data mining in analyzing large biological databases, explaining the breadth of the feature selection and feature extraction techniques that data mining has to offer
- Focuses on concepts of unsupervised learning using clustering techniques and their application to large biological data
- Covers supervised learning using classification techniques most commonly used in bioinformatics—addressing the need for validation and benchmarking of inferences derived using either clustering or classification
The book describes the various biological databases prominently referred to in bioinformatics and includes a detailed list of the applications of advanced clustering algorithms used in bioinformatics. Highlighting the challenges encountered during the application of classification on biological databases, it considers systems of both single and ensemble classifiers and shares effort-saving tips for model selection and performance estimation strategies.
Target audience
Bioinformatics software engineers and developers; bioinformaticians; bioinformatics scientists and support specialists; data mining analysts and architects; data mining engineers and support specialists; faculty; postdoctoral researchers
Further information and material
Introduction to Bioinformatics
Introduction
Transcription and Translation; The Central Dogma of Molecular Biology
The Human Genome Project
Beyond the Human Genome Project; Sequencing Technology; Dideoxy Sequencing; Cyclic Array Sequencing; Sequencing by Hybridization; Microelectrophoresis; Mass Spectrometry; Nanopore Sequencing; Next-Generation Sequencing; Challenges of Handling NGS Data; Sequence Variation Studies; Kinds of Genomic Variations; SNP Characterization; Functional Genomics; Splicing and Alternative Splicing; Microarray-Based Functional Genomics; Comparative Genomics; Functional Annotation; Function Prediction Aspects
Conclusion
References
Biological Databases and Integration
Introduction: Scientific Work Flows and Knowledge Discovery
Biological Data Storage and Analysis; Challenges of Biological Data; Classification of Bioscience Databases; Primary versus Secondary Databases; Deep versus Broad Databases; Point Solution versus General Solution Databases; Gene Expression Omnibus (GEO) Database; The Protein Data Bank (PDB)
The Curse of Dimensionality
Data Cleaning; Problems of Data Cleaning; Challenges of Handling Evolving Databases; Problems Associated with Single-Source Techniques; Problems Associated with Multisource Integration; Data Argumentation: Cleaning at the Schema Level; Knowledge-Based Framework: Cleaning at the Instance Level; Data Integration; Ensembl; Sequence Retrieval System (SRS); IBM’s DiscoveryLink; Wrappers: Customizable Database Software; Data Warehousing: Data Management with Query Optimization; Data Integration in the PDB
Conclusion
References
Knowledge Discovery in Databases
Introduction
Analysis of Data Using Large Databases; Distance Metrics; Data Cleaning and Data Preprocessing
Challenges in Data Cleaning; Models of Data Cleaning; Proximity-Based Techniques; Parametric Methods; Nonparametric Methods; Semiparametric Methods




