E-book, English, 558 pages
Celebi / Aydin: Unsupervised Learning Algorithms
1st edition, 2016
ISBN: 978-3-319-24211-8
Publisher: Springer Nature Switzerland
Format: PDF
Copy protection: 1 - PDF Watermark
This book summarizes the state of the art in unsupervised learning. The contributors discuss how, with the proliferation of massive amounts of unlabeled data, unsupervised learning algorithms, which can automatically discover interesting and useful patterns in such data, have gained popularity among researchers and practitioners. The authors outline how these algorithms have found numerous applications, including pattern recognition, market basket analysis, web mining, social network analysis, information retrieval, recommender systems, market research, intrusion detection, and fraud detection. They show how the difficulty of developing theoretically sound approaches that are amenable to objective evaluation has resulted in the proposal of numerous unsupervised learning algorithms over the past half-century. The intended audience includes researchers and practitioners who are increasingly using unsupervised learning algorithms to analyze their data. Topics of interest include anomaly detection, clustering, feature extraction, and applications of unsupervised learning. Each chapter is contributed by a leading expert in the field.
Further Information & Material
Preface
Contents
Anomaly Detection for Data with Spatial Attributes
    1 Introduction
    2 Problem Definition and Taxonomy of Techniques
        2.1 Problem Definition
        2.2 Taxonomy of Techniques
    3 Object Anomalies: Techniques for Outlier Detection
        3.1 General Outlier Detection
            3.1.1 General Framework for Outlier Detection
            3.1.2 LOF
            3.1.3 Adapting for Spatial Outlier Detection
        3.2 Spatial Outlier Detection
            3.2.1 SLOM
    4 Region Anomalies: Global
        4.1 Statistical Approaches: Spatial Scan Statistics
            4.1.1 ULS Scan Statistic
            4.1.2 Other Extensions
        4.2 Mining Approaches
            4.2.1 Bump Hunting
    5 Region Anomalies: Local
        5.1 Localized Homogeneous Anomalies
        5.2 Image Segmentation
    6 Region Anomalies: Grouping
        6.1 Clustering for Spatial Data
            6.1.1 HAC-A
            6.1.2 Clustering Ensuring Spatial Convexity
        6.2 Clustering-Based Anomaly Detection
            6.2.1 Adapting for Anomaly Detection on Spatial Data
    7 Discussion
    8 Directions for Future Work
    9 Conclusions
    References
Anomaly Ranking in a High Dimensional Space: The Unsupervised TreeRank Algorithm
    1 Introduction
    2 Anomaly Ranking: Background and Preliminaries
        2.1 A Scoring Approach to Anomaly Ranking
        2.2 Measuring Scoring Accuracy: The Mass-Volume Curve
    3 Turning Anomaly Ranking into Bipartite Ranking
        3.1 Bipartite Ranking and ROC Analysis
        3.2 A Bipartite View of Anomaly Ranking
        3.3 Extending Bipartite Methods via Uniform Sampling
    4 The Unsupervised TreeRank Algorithm
        4.1 Anomaly Ranking Trees
        4.2 The Algorithm: Growing the Anomaly Ranking Tree
        4.3 Pruning the Anomaly Ranking Tree: Model Selection
    5 Numerical Experiments
    6 Conclusion
    References
Genetic Algorithms for Subset Selection in Model-Based Clustering
    1 Introduction
    2 Model-Based Clustering
        2.1 Finite Mixture Modelling
        2.2 BIC as a Criterion for Model Selection
    3 Subset Selection in Model-Based Clustering
        3.1 The Proposed Approach
        3.2 Models for No Clustering
    4 Genetic Algorithms
        4.1 GAs for Subset Selection in Model-Based Clustering
            4.1.1 Genetic Coding Scheme
            4.1.2 Generation of a Population of Models
            4.1.3 Fitness Function to Evaluate the Model Clustering
            4.1.4 Genetic Operators
        4.2 Computational Issues
        4.3 Random-Key GAs to Select a Fixed Size Subset
    5 Data Examples
        5.1 Birds, Planes and Cars
        5.2 Italian Wines
    6 Conclusions
    References
Clustering Evaluation in High-Dimensional Data
    1 Introduction
    2 Basic Notation
    3 Problems in Analyzing High-Dimensional Data
        3.1 Distance Concentration
        3.2 Hubness: The Long Tail of Relevance and the Central Tendencies of Hubs
    4 Clustering Techniques for High-Dimensional Data
    5 Clustering Quality Indexes: An Overview
    6 Clustering Quality Indexes: Existing Surveys
    7 Clustering Evaluation in Many Dimensions
        7.1 Experimental Protocol
        7.2 Sensitivity to Increasing Dimensionality
            7.2.1 Sensitivity of the Average Quality Assessment
            7.2.2 Stability in Quality Assessment
        7.3 Quantifying the Influence of Hubs
    8 Perspectives and Future Directions
    References
Combinatorial Optimization Approaches for Data Clustering
    1 Introduction
    2 Applications
    3 Problem Definition and Distance Measures Definition
        3.1 Euclidean Distance
        3.2 Pearson's Correlation Coefficient
        3.3 City-Block or Manhattan
        3.4 Cosine or Uncentered Correlation
    4 Mathematical Formulations of the Problem
        4.1 Minimize (Maximize) the Within (Between)-Clusters Sum of Squares
            4.1.1 Cardinality of Each Cluster A Priori Known
            4.1.2 Bipartition of the Patterns
        4.2 Optimizing the Within Clusters Distance
    5 A Review of the Most Popular Clustering Techniques
        5.1 Hierarchical Clustering Algorithms
        5.2 Partitioning Clustering Algorithms
            5.2.1 Squared Error Algorithms and the k-Means/k-Medoid Algorithms
            5.2.2 Graph-Theoretic Algorithms
            5.2.3 Mixture-Resolving Algorithms
        5.3 Efficient Metaheuristic Approaches
        5.4 Encoding
        5.5 Decoding
    6 Concluding Remarks
    References
Kernel Spectral Clustering and Applications
    1 Introduction
    2 Notation
    3 Kernel Spectral Clustering (KSC)
        3.1 Mathematical Formulation
            3.1.1 Training Problem
            3.1.2 Generalization
            3.1.3 Model Selection
        3.2 Soft Kernel Spectral Clustering
        3.3 Hierarchical Clustering
            3.3.1 Approach 1
            3.3.2 Approach 2
        3.4 Sparse Clustering Models
            3.4.1 Incomplete Cholesky Decomposition
            3.4.2 Using Additional Penalty Terms
    4 Applications
        4.1 Image Segmentation
        4.2 Scientific Journal Clustering
        4.3 Power Load Clustering
        4.4 Big Data
    5 Conclusions
    References
Uni- and Multi-Dimensional Clustering Via Bayesian Networks
    1 Introduction
    2 Uni-Dimensional Clustering
        2.1 Known Structure
            2.1.1 Naive Bayes
            2.1.2 Expectation Model Averaging
            2.1.3 Expectation Model Averaging: Tree Augmented Naive Bayes
        2.2 Unknown Structure
            2.2.1 Extended Naive Bayes
            2.2.2 Recursive Bayesian Multinets
    3 Multi-Dimensional Clustering
        3.1 Latent Tree Models
            3.1.1 Known Structure
            3.1.2 Unknown Structure
        3.2 Cluster Variables Novelty
    4 Our Approach
    5 Preliminary Results
    6 Conclusion and Summary
    References
A Radial Basis Function Neural Network Training Mechanism for Pattern Classification Tasks
    1 Introduction
    2 RBF Neural Network
    3 Particle Swarm Optimization
    4 RBF Network Training Algorithm
        4.1 Extraction of the Multidimensional Fuzzy Subspaces
        4.2 Estimation of the Network's Basis Function Parameters
        4.3 Discriminant Analysis and PSO Implementation
    5 Evaluation Experiments
        5.1 WDBC Data Set
        5.2 Wine Data Set
        5.3 Pima Indians Diabetes Data Set
    6 Conclusion
    References
A Survey of Constrained Clustering
    1 Introduction
    2 Unsupervised Clustering
        2.1 Minimum Sum-of-Squares Clustering
            2.1.1 K-Means Algorithm
        2.2 Agglomerative Hierarchical Clustering
        2.3 COBWEB
    3 Constrained Clustering
        3.1 Constrained Clustering with Labeled Data
            3.1.1 Search Based Methods
            3.1.2 Distance Based Methods
        3.2 Constrained Clustering with Instance-Level Constraints
            3.2.1 Search Based Methods
            3.2.2 Distance Based Methods
            3.2.3 Search and Distance Based Methods
        3.3 Constrained Clustering with Cluster-Level Constraints
        3.4 Feasibility Issues
        3.5 Related Studies
    4 Conclusion
    References
An Overview of the Use of Clustering for Data Privacy
    1 Introduction
    2 Clustering to Define Masking Methods
        2.1 Clustering in Microaggregation
        2.2 Clustering for Graphs: Microaggregation and k-Anonymity
        2.3 Attacks on Microaggregation
        2.4 Fuzzy Clustering for Microaggregation
        2.5 Clustering for Masking Data Streams
        2.6 Masking Very Large Data Sets
        2.7 Masking Through Semantic Clustering
        2.8 Clustering in Other Masking Methods
    3 Clustering to Measure Information Loss
    4 Conclusion
    References
Nonlinear Clustering: Methods and Applications
    1 Introduction
    2 COLL for Kernel-Based Clustering
        2.1 Problem Formulation
        2.2 Batch Kernel k-Means and Issues
        2.3 Conscience On-Line Learning
            2.3.1 The COLL Model
            2.3.2 The Computation of COLL
            2.3.3 Computational Complexity
        2.4 Experiments and Applications
    3 Multi-exemplar Affinity Propagation
        3.1 Affinity Propagation
        3.2 Multi-exemplar Affinity Propagation
            3.2.1 The Model
            3.2.2 Optimization
        3.3 Experiments and Applications
    4 Graph-Based Multi-prototype Competitive Learning
        4.1 Graph-Based Initial Clustering
        4.2 Multi-prototype Competitive Learning
        4.3 Fast GMPCL
            4.3.1 Inner Product Based Computation
            4.3.2 FGMPCL in High Dimension
        4.4 Experiments and Applications
    5 Position Regularized Support Vector Clustering
        5.1 Background
            5.1.1 Support Vector Domain Description
            5.1.2 Support Vector Clustering
        5.2 Position Regularized Support Vector Clustering
        5.3 Experiments and Applications
    6 Conclusion and Discussion
    References
Swarm Intelligence-Based Clustering Algorithms: A Survey
    1 Introduction
    2 The Clustering Problem
    3 Overview of the Swarm Intelligence-Based Approaches
        3.1 Particle Swarm Optimization
        3.2 Ant Colony Optimization
        3.3 Ant-Based Sorting
        3.4 Other Swarm Intelligence-Based Metaheuristics
    4 Classification of the Swarm Intelligence-Based Algorithms for Clustering
        4.1 Data Point-to-Cluster Assignment
        4.2 Cluster Representatives
        4.3 Direct Point-Agent Matching
        4.4 Search Agent
    5 Discussion
        5.1 Agent Representation Versus SI-Based Clustering Algorithms
        5.2 Agent Representation Versus Challenging Issues in Clustering
    6 Conclusion
    Appendix
    References
Extending Kmeans-Type Algorithms by Integrating Intra-cluster Compactness and Inter-cluster Separation
    1 Introduction
    2 Related Work
        2.1 No Weighting Kmeans-Type Algorithm
            2.1.1 No Weighting Kmeans-Type Algorithm Without Inter-cluster Separation
            2.1.2 No Weighting Kmeans-Type Algorithm with Inter-cluster Separation
        2.2 Vector Weighting Kmeans-Type Algorithm
            2.2.1 Vector Weighting Kmeans-Type Algorithm Without Inter-cluster Separation
            2.2.2 Vector Weighting Kmeans-Type Algorithm with Inter-cluster Separation
        2.3 Matrix Weighting Kmeans-Type Algorithm
            2.3.1 Matrix Weighting Kmeans-Type Algorithm Without Inter-cluster Separation
            2.3.2 Matrix Weighting Kmeans-Type Algorithm with Inter-cluster Separation
        2.4 Summary of the Existing Kmeans-Type Algorithms
        2.5 Characteristics of Our Extending Kmeans-Type Algorithms
    3 The Extending Model of Kmeans-Type Algorithm
        3.1 Motivation
        3.2 Extension of Basic Kmeans (E-kmeans)
        3.3 Extension of Wkmeans (E-Wkmeans)
        3.4 Extension of AWA (E-AWA)
        3.5 Relationship Among Algorithms
        3.6 Computational Complexity
    4 Experiments
        4.1 Experimental Setup
        4.2 Synthetic Data Set
            4.2.1 Parametric Study
            4.2.2 Results and Analysis
            4.2.3 Feature Selection
            4.2.4 Convergence Speed
        4.3 Real-Life Data Set
            4.3.1 Parametric Study
            4.3.2 Results and Analysis
            4.3.3 Convergence Speed
    5 Discussion
    6 Conclusion and Future Work
    References
A Fuzzy-Soft Competitive Learning Approach for Grayscale Image Compression
    1 Introduction
    2 Related Work
        2.1 The Batch Learning Vector Quantization
        2.2 The Fuzzy Learning Vector Quantization Algorithm
    3 The Proposed Vector Quantization Approach
        3.1 Fuzzy-Set-Based Competitive Learning
        3.2 Codeword Migration Process
    4 Experimental Study
        4.1 Study of the Behavior of the Distortion Measure and the PSNR
        4.2 Computational Demands
        4.3 Study of the Migration Strategy
        4.4 Literature Comparison
    5 Conclusions
    References
Unsupervised Learning in Genome Informatics
    1 Introduction
    2 Unsupervised Learning for DNA
        2.1 DNA Motif Discovery and Search
            2.1.1 Representation (DNA Motif Model)
            2.1.2 Learning (Motif Discovery)
            2.1.3 Prediction (Motif Search)
        2.2 Genome-Wide DNA-Binding Pattern Discovery
    3 Unsupervised Learning for Inferring microRNA Regulatory Network
        3.1 PicTar
        3.2 A Probabilistic Approach to Explore Human miRNA Target Repertoire by Integrating miRNA-Overexpression Data and Sequence Information
            3.2.1 Bayesian Mixture Model
            3.2.2 Variational Bayesian Expectation Maximization
            3.2.3 TargetScore
        3.3 Network-Based Methods to Detect miRNA Regulatory Modules
        3.4 GroupMiR: Inferring miRNA and mRNA Group Memberships with Indian Buffet Process
        3.5 SNMNMF: Sparse Network-Regularized Multiple Nonnegative Matrix Factorization
        3.6 Mirsynergy: Detecting Synergistic miRNA Regulatory Modules by Overlapping Neighborhood Expansion
            3.6.1 Two-Stage Clustering
    References
The Application of LSA to the Evaluation of Questionnaire Responses
    1 Introduction
    2 Essays for Evaluation
        2.1 Open-Ended Responses Provide Unique Evaluation Leverage
        2.2 The Problem with Essays: Human Raters Don't Scale
            2.2.1 Expense
            2.2.2 Language Dependencies
            2.2.3 Consistency Issues
        2.3 Automated Scoring Is Needed
            2.3.1 Creating Automated Methods Requires Learning Systems
    3 LSA as an Unsupervised Learning System
        3.1 Brief History of LSA
        3.2 Mathematical Background
            3.2.1 Parsing: Turning Words into Numbers
            3.2.2 Singular Value Decomposition
            3.2.3 Query and Analysis Processing
        3.3 LSA Learns Language
            3.3.1 Unsupervised Learning
            3.3.2 The LSA Model of Learning
            3.3.3 Evidence of the Model
        3.4 LSA Applications
    4 Methodology
        4.1 Objective
        4.2 The Base Interpretive Space
            4.2.1 Corpus Size
            4.2.2 Relevant and Distributed Content
            4.2.3 Term Coverage
        4.3 Evaluation Algorithms
            4.3.1 Target Based Scoring
            4.3.2 Near Neighbor Scoring
            4.3.3 Additive Analysis
        4.4 Feedback Selection
    5 Case Study NICHD Project
        5.1 Background: Driver Training
            5.1.1 Open-Ended Responses to Scenario Prompts
            5.1.2 Provide Feedback Suggestions for Improvement
        5.2 Construction of the Background Space
        5.3 Establish Target and Feedback Items
            5.3.1 Human Input: The SME
            5.3.2 Human Selected Feedback Items
        5.4 Feedback Selection Method
        5.5 Results
    6 Conclusion
    References
Mining Evolving Patterns in Dynamic Relational Networks
    1 Introduction
    2 Definitions and Notation
    3 Mining the Evolution of Conserved Relational States
        3.1 Evolving Induced Relational State
        3.2 Finding Evolving Induced Relational States
            3.2.1 Step 1: Mining of Induced Relational States
            3.2.2 Step 2: Mining of Maximal Evolution Paths
    4 Mining the Coevolving Relational Motifs
        4.1 Coevolving Relational Motifs
            4.1.1 CRM Embedding/Occurrence
            4.1.2 CRM Constraints
        4.2 Finding Coevolving Relational Motifs
            4.2.1 CRM Representation
            4.2.2 Mining Anchors
            4.2.3 CRM Enumeration
            4.2.4 CRMminer Algorithm
            4.2.5 Algorithm Completeness
            4.2.6 Search Space Pruning
    5 Mining the Coevolving Induced Relational Motifs
        5.1 Coevolving Induced Relational Motifs
        5.2 Coevolving Induced Relational Motif Mining
            5.2.1 Mining Anchors
            5.2.2 CIRM Enumeration
            5.2.3 CIRMminer Algorithm
    6 Qualitative Analysis and Applications
        6.1 Analysis of a Trade Network
        6.2 Analysis of a Co-authorship Network
        6.3 Analysis of a Multivariate Time-Series Dataset
    7 Conclusion and Future Research Directions
    References
Probabilistically Grounded Unsupervised Training of Neural Networks
    1 Introduction
    2 Unsupervised Estimation of Probability Density Functions
        2.1 Estimating pdfs via Constrained RBFs
        2.2 Estimating pdfs via Multilayer Perceptrons
    3 From pdf Estimation to Online Neural Clustering
    4 Maximum-Likelihood Modeling of Sequences
        4.1 Motivation: Beyond Hidden Markov Models and Recurrent ANNs
        4.2 Unsupervised ML Learning in Generative ANN/HMM Hybrids
    5 Conclusions
    References