Book, English, 736 pages, with 1 CD-ROM, format (W × H): 161 mm × 240 mm, weight: 1285 g
ISBN: 978-0-470-17081-6
Publisher: Wiley
Wiley Series in Bioinformatics: Computational Techniques and Engineering
Yi Pan and Albert Y. Zomaya, Series Editors
Wide coverage of traditional unsupervised and supervised methods and newer contemporary approaches, helping researchers keep pace with the rapid growth of classification methods in DNA microarray studies
The proliferation of classification methods in DNA microarray studies has resulted in a body of information scattered across the literature, conference proceedings, and elsewhere. This book unites many of these classification methods in a single volume. In addition to traditional statistical methods, it covers newer machine-learning approaches such as fuzzy methods, artificial neural networks, evolutionary-based genetic algorithms, support vector machines, swarm intelligence involving particle swarm optimization, and more.
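For orientation, a minimal sketch of the kind of supervised workflow the book addresses (here, a support vector machine classifying synthetic expression profiles) is given below; the data, dimensions, and parameters are illustrative assumptions, not material from the book or its CD-ROM.

```python
# Illustrative sketch only: a support vector machine classifying synthetic
# "expression profiles" (samples x genes). Data and parameters are invented
# for demonstration and are not taken from the book or its CD-ROM.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# 60 samples x 500 genes; class 1 has a shifted mean in the first 20 genes
X = rng.normal(size=(60, 500))
y = np.repeat([0, 1], 30)
X[y == 1, :20] += 1.5

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize each gene, then fit a linear-kernel SVM
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="linear", C=1.0).fit(scaler.transform(X_train), y_train)

print("test accuracy:", clf.score(scaler.transform(X_test), y_test))
```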
Classification Analysis of DNA Microarrays provides highly detailed pseudo-code, rich graphical programming features, and ready-to-run source code. Along with primary methods covering traditional and contemporary classification, it offers supplementary tools and data-preparation routines for standardization and fuzzification; dimension reduction via crisp and fuzzy c-means, PCA, and nonlinear manifold learning; computational linguistics via text analytics and n-gram analysis; recursive feature extraction during ANN training; kernel-based methods; and ensemble classifier fusion.
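Likewise, the standardization and PCA-based dimension reduction mentioned above can be sketched in a few lines; the array dimensions and number of retained components below are arbitrary assumptions chosen only for illustration.

```python
# Illustrative sketch of feature standardization followed by PCA dimension
# reduction on a synthetic expression matrix (samples x genes); dimensions
# and the number of retained components are arbitrary assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 1000))            # 40 microarrays, 1000 genes

X_std = StandardScaler().fit_transform(X)  # z-score each gene
pca = PCA(n_components=5)
scores = pca.fit_transform(X_std)          # 40 x 5 principal-component scores

print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
```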
This powerful new resource:
- Provides information on the use of classification analysis for DNA microarrays used for large-scale high-throughput transcriptional studies
- Serves as a historical repository of general-use supervised classification methods as well as newer contemporary methods
- Brings the reader quickly up to speed on the various classification methods by implementing the programming pseudo-code and source code provided in the book
- Describes implementation methods that help shorten discovery times
Classification Analysis of DNA Microarrays is useful for professionals and graduate students in computer science, bioinformatics, biostatistics, systems biology, and many related fields.
Further Information & Material
Preface xix
Abbreviations xxiii
1 Introduction 1
1.1 Class Discovery 2
1.2 Dimensional Reduction 4
1.3 Class Prediction 4
1.4 Classification Rules of Thumb 5
1.5 DNA Microarray Datasets Used 9
References 11
Part I Class Discovery 13
2 Crisp K-Means Cluster Analysis 15
2.1 Introduction 15
2.2 Algorithm 16
2.3 Implementation 18
2.4 Distance Metrics 20
2.5 Cluster Validity 24
2.5.1 Davies–Bouldin Index 25
2.5.2 Dunn’s Index 25
2.5.3 Intracluster Distance 26
2.5.4 Intercluster Distance 27
2.5.5 Silhouette Index 30
2.5.6 Hubert’s Statistic 31
2.5.7 Randomization Tests for Optimal Value of K 31
2.6 V-Fold Cross-Validation 35
2.7 Cluster Initialization 37
2.7.1 K Randomly Selected Microarrays 37
2.7.2 K Random Partitions 40
2.7.3 Prototype Splitting 41
2.8 Cluster Outliers 44
2.9 Summary 44
References 45
3 Fuzzy K-Means Cluster Analysis 47
3.1 Introduction 47
3.2 Fuzzy K-Means Algorithm 47
3.3 Implementation 49
3.4 Summary 54
References 54
4 Self-Organizing Maps 57
4.1 Introduction 57
4.2 Algorithm 57
4.2.1 Feature Transformation and Reference Vector Initialization 59
4.2.2 Learning 60
4.2.3 Conscience 61
4.3 Implementation 63
4.3.1 Feature Transformation and Reference Vector Initialization 63
4.3.2 Reference Vector Weight Learning 66
4.4 Cluster Visualization 67
4.4.1 Crisp K-Means Cluster Analysis 67
4.4.2 Adjacency Matrix Method 68
4.4.3 Cluster Connectivity Method 69
4.4.4 Hue–Saturation–Value (HSV) Color Normalization 69
4.5 Unified Distance Matrix (U Matrix) 71
4.6 Component Map 71
4.7 Map Quality 73
4.8 Nonlinear Dimension Reduction 75
References 79
5 Unsupervised Neural Gas 81
5.1 Introduction 81
5.2 Algorithm 82
5.3 Implementation 82
5.3.1 Feature Transformation and Prototype Initialization 82
5.3.2 Prototype Learning 83
5.4 Nonlinear Dimension Reduction 85
5.5 Summary 87
References 88
6 Hierarchical Cluster Analysis 91
6.1 Introduction 91
6.2 Methods 91
6.2.1 General Programming Methods 91
6.2.2 Step 1: Cluster-Analyzing Arrays as Objects with Genes as Attributes 92
6.2.3 Step 2: Cluster-Analyzing Genes as Objects with Arrays as Attributes 94
6.3 Algorithm 96
6.4 Implementation 96
6.4.1 Heatmap Color Control 96
6.4.2 User Choices for Clustering Arrays and Genes 97
6.4.3 Distance Matrices and Agglomeration Sequences 98
6.4.4 Drawing Dendrograms and Heatmaps 104
References 105
7 Model-Based Clustering 107
7.1 Introduction 107
7.2 Algorithm 110
7.3 Implementation 111
7.4 Summary 116
References 117
8 Text Mining: Document Clustering 119
8.1 Introduction 119
8.2 Duo-Mining 119
8.3 Streams and Documents 120
8.4 Lexical Analysis 120
8.4.1 Automatic Indexing 120
8.4.2 Removing Stopwords 121
8.5 Stemming 121
8.6 Term Weighting 121
8.7 Concept Vectors 124
8.8 Main Terms Representing Concept Vectors 124
8.9 Algorithm 125
8.10 Preprocessing 127
8.11 Summary 137
References 137
9 Text Mining: N-Gram Analysis 139
9.1 Introduction 139
9.2 Algorithm 140
9.3 Implementation 141
9.4 Summary 154
References 156
Part II Dimension Reduction 159
10 Principal Components Analysis 161
10.1 Introduction 161
10.2 Multivariate Statistical Theory 161
10.2.1 Matrix Definitions 162
10.2.2 Principal Component Solution of R 163
10.2.3 Extraction of Principal Components 164
10.2.4 Varimax Orthogonal Rotation of Components 166
10.2.5 Principal Component Score Coefficients 168
10.2.6 Principal Component Scores 169
10.3 Algorithm 170
10.4 When to Use Loadings and PC Scores 170
10.5 Implementation 171
10.5.1 Correlation Matrix R 171
10.5.2 Eigenanalysis of Correlation Matrix R 172
10.5.3 Determination of Loadings and Varimax Rotation 174
10.5.4 Calculating Principal Component (PC) Scores 176
10.6 Rules of Thumb for PCA 182
10.7 Summary 186
References 187
11 Nonlinear Manifold Learning 189
11.1 Introduction 189
11.2 Correlation-Based PCA 190
11.3 Kernel PCA 191
11.4 Diffusion Maps 192
11.5 Laplacian Eigenmaps 192
11.6 Local Linear Embedding 193
11.7 Locality Preserving Projections 194
11.8 Sammon Mapping 195
11.9 NLML Prior to Classification Analysis 195
11.10 Classification Results 197
11.11 Summary 200
References 203
Part III Class Prediction 205
12 Feature Selection 207
12.1 Introduction 207
12.2 Filtering versus Wrapping 208
12.3 Data 209
12.3.1 Numbers 209
12.3.2 Responses 209
12.3.3 Measurement Scales 210
12.3.4 Variables 211
12.4 Data Arrangement 211
12.5 Filtering 213
12.5.1 Continuous Features 213
12.5.2 Best Rank Filters 219
12.5.3 Randomization Tests 236
12.5.4 Multitesting Problem 237
12.5.5 Filtering Qualitative Features 242
12.5.6 Multiclass Gini Diversity Index 246
12.5.7 Class Comparison Techniques 247
12.5.8 Generation of Nonredundant Gene List 250
12.6 Selection Methods 254
12.6.1 Greedy Plus Takeaway (Greedy PTA) 254
12.6.2 Best Ranked Genes 258
12.7 Multicollinearity 259
12.8 Summary 270
References 270
13 Classifier Performance 273
13.1 Introduction