Locarek-Junge / Weihs: Classification as a Tool for Research

Proceedings of the 11th IFCS Biennial Conference and 33rd Annual Conference of the Gesellschaft für Klassifikation e.V., Dresden, March 13-18, 2009
1st edition, 2010
ISBN: 978-3-642-10745-0
Publisher: Springer
Format: PDF
Copy protection: PDF watermark


E-book, English, 823 pages

Series: Studies in Classification, Data Analysis, and Knowledge Organization




Clustering and Classification, Data Analysis, Data Handling and Business Intelligence are research areas at the intersection of statistics, mathematics, computer science, and artificial intelligence. They cover general methods and techniques that can be applied to a vast range of applications in business and economics, marketing and finance, engineering, linguistics, archaeology, musicology, biology, and medical science. This volume contains the revised versions of selected papers presented during the 11th Biennial IFCS Conference and 33rd Annual Conference of the German Classification Society (Gesellschaft für Klassifikation, GfKl). The conference was organized in cooperation with the International Federation of Classification Societies (IFCS) and hosted by Dresden University of Technology, Germany, in March 2009.




1;Preface;6
2;Contents;14
3;Contributors;24
4;Part I (Semi-) Plenary Presentations;38
4.1;Hierarchical Clustering with Performance Guarantees;39
4.1.1;1 Introduction;39
4.1.2;2 A Replacement for k-d Trees;40
4.1.2.1;2.1 The Curse of Dimension for Spatial Data Structures;40
4.1.2.2;2.2 Low Dimensional Manifolds and Intrinsic Dimension;42
4.1.2.3;2.3 Random Projection Trees;43
4.1.3;3 A Replacement for Complete Linkage;44
4.1.3.1;3.1 An Existence Problem for Hierarchical Clustering;44
4.1.3.2;3.2 Approximation Algorithms for Clustering;45
4.1.3.3;3.3 Farthest-First Traversal;46
4.1.3.4;3.4 A Hierarchical Clustering Algorithm;47
4.1.4;References;50
4.2;Alignment Free String Distances for Phylogeny;51
4.2.1;1 Introduction;51
4.2.2;2 Four Alignment Free Distances;52
4.2.2.1;2.1 The MSM Distance;52
4.2.2.2;2.2 The k-word distance;53
4.2.2.3;2.3 The ACS distance;54
4.2.2.4;2.4 A Compression Distance;55
4.2.3;3 Simulations;56
4.2.3.1;3.1 A Simple Evolutionary Model;56
4.2.3.2;3.2 The Simulation Process;57
4.2.3.3;3.3 Simulation Results;57
4.2.4;4 Conclusion;59
4.2.5;References;59
4.3;Data Quality Dependent Decision Making in Pattern Classification;61
4.3.1;1 Introduction;61
4.3.2;2 Theoretical Framework;63
4.3.2.1;2.1 Problem Formulation;63
4.3.2.2;2.2 Quality-Based Fusion;66
4.3.2.3;2.3 Data Quality Assessment;68
4.3.3;3 An Illustration of the Benefits of the Quality Based Fusion;69
4.3.4;4 Conclusions;71
4.3.5;References;71
4.4;Clustering Proteins and Reconstructing Evolutionary Events;73
4.4.1;1 Introduction: Clustering and Knowledge Feedback;73
4.4.2;2 Clustering Using the Data Recovery Approach;75
4.4.2.1;2.1 Additive Clustering and One-by-One Iterative Extraction;75
4.4.2.2;2.2 One Cluster Clustering;76
4.4.2.2.1;2.2.1 Pre-specified Intensity;76
4.4.2.2.2;2.2.2 Optimal Intensity;77
4.4.3;3 Proteome Knowledge in Determining Similarity Shift;77
4.4.3.1;3.1 Protein Families and Evolutionary Tree;77
4.4.3.2;3.2 Utilizing Knowledge of Proteome;79
4.4.4;4 Advancing Genome Knowledge;82
4.4.4.1;4.1 Reconstructed Histories of HPFs;82
4.4.4.2;4.2 Derived Ancestors of Herpes Proteins;82
4.4.5;5 Conclusion;83
4.4.6;References;84
4.5;Microarray Dimension Reduction Based on Maximizing Mantel Correlation Coefficients Using a Genetic Algorithm Search Strategy;85
4.5.1;1 Introduction;85
4.5.2;2 Methods;87
4.5.3;3 Results;90
4.5.4;4 Discussion;95
4.5.5;References;95
5;Part II Classification and Data Analysis;97
5.1;Multiparameter Hierarchical Clustering Methods;98
5.1.1;1 Introduction;98
5.1.2;2 Notation and Terminology;101
5.1.3;3 Two Parameter Hierarchical Clustering: A Characterization Theorem;101
5.1.4;4 Metric Stability of C;104
5.1.5;References;105
5.2;Unsupervised Sparsification of Similarity Graphs;106
5.2.1;1 Introduction and Related Work;106
5.2.2;2 Sparsification;108
5.2.2.1;2.1 Existing Approaches;109
5.2.2.2;2.2 An Object-specific, Unsupervised Approach to Sparsification;110
5.2.3;3 Evaluation;111
5.2.4;4 Conclusion;113
5.2.5;References;113
5.3;Simultaneous Clustering and Dimensionality Reduction Using Variational Bayesian Mixture Model;115
5.3.1;1 Introduction;115
5.3.2;2 Exponential Family and e-PCA;116
5.3.3;3 Constrained Mixture Model;117
5.3.4;4 Variational Bayes Method;118
5.3.4.1;4.1 Optimal q2() for Fixed q1(Zn);119
5.3.4.2;4.2 Optimal q1(Zn) for Fixed q2();119
5.3.4.3;4.3 Laplace Approximation;120
5.3.5;5 Dimensionality Reduction;120
5.3.6;6 Experiments;121
5.3.7;7 Discussion and Conclusion;122
5.3.8;References;122
5.4;A Partitioning Method for the Clustering of Categorical Variables;124
5.4.1;1 Introduction;124
5.4.2;2 A Center-Based Partitioning Method for the Clustering of Categorical Variables;125
5.4.2.1;2.1 Definition of the Latent Variable;126
5.4.2.2;2.2 The Center-Based Clustering Algorithm;127
5.4.3;3 Applications;128
5.4.3.1;3.1 Simulation Study;128
5.4.3.2;3.2 Real Data Application;129
5.4.4;4 Concluding Remarks;131
5.4.5;References;132
5.5;Treed Gaussian Process Models for Classification;133
5.5.1;1 Introduction and Background;133
5.5.1.1;1.1 Gaussian Processes for Regression and Classification;134
5.5.2;2 Treed Gaussian Processes;135
5.5.2.1;2.1 TGP for Regression;135
5.5.2.2;2.2 TGP for Classification;136
5.5.3;3 Illustrations and Empirical Results;138
5.5.3.1;3.1 2d Exponential Data;138
5.5.3.2;3.2 Classification TGP on Real Data;139
5.5.4;4 Conclusion;140
5.5.5;References;140
5.6;Ridgeline Plot and Clusterwise Stability as Tools for Merging Gaussian Mixture Components;141
5.6.1;1 Introduction;141
5.6.2;2 The Ridgeline Method;143
5.6.3;3 A Method Based on Misclassification Probabilities;144
5.6.4;4 Bootstrap Stability Assessment;145
5.6.5;5 Real Data Example: Clustering Melody Contours;146
5.6.6;6 Conclusion;148
5.6.7;References;148
5.7;Clustering with Confidence: A Low-Dimensional Binning Approach;149
5.7.1;1 Introduction;149
5.7.2;2 Cluster Trees: Piecewise Constant Density Estimates;150
5.7.3;3 Clustering with Confidence;152
5.7.3.1;3.1 Bootstrap Confidence Sets for Level Sets;153
5.7.3.2;3.2 Constructing the Cluster Tree;153
5.7.4;4 Example: "Automatic Gating" in Flow Cytometry;155
5.7.5;5 Summary and Future Work;156
5.7.6;References;156
5.8;Local Classification of Discrete Variables by Latent Class Models;158
5.8.1;1 Introduction;158
5.8.2;2 Mixtures Versus Common Components;159
5.8.3;3 Latent Class Analysis;159
5.8.3.1;3.1 Estimation;160
5.8.3.2;3.2 Model Selection;161
5.8.4;4 Local Classification of Discrete Variables;161
5.8.4.1;4.1 Class Conditional Mixtures;161
5.8.4.2;4.2 Common Components;162
5.8.4.2.1;4.2.1 Classification Capability;163
5.8.5;5 Application;163
5.8.5.1;5.1 Simulation Study;163
5.8.5.2;5.2 SNP Data;165
5.8.6;6 Conclusion;165
5.8.7;References;165
5.9;A Comparative Study on Discrete Discriminant Analysis through a Hierarchical Coupling Approach;167
5.9.1;1 Introduction;167
5.9.2;2 Combining Models in Biclass Problems;168
5.9.3;3 The Hierarchical Coupling Model (HIERM);169
5.9.4;4 Comparison of the HIERM Model with Other Models, Using Similarity Coefficients for Binary Data;170
5.9.4.1;4.1 Similarity Coefficients for Binary Data;170
5.9.5;5 Numerical Experiments;170
5.9.6;6 Conclusions;174
5.9.7;References;175
5.10;A Comparative Study of Several Parametric and Semiparametric Approaches for Time Series Classification;176
5.10.1;1 Introduction;176
5.10.2;2 Some Dissimilarity Measures Between Time Series;177
5.10.3;3 Simulation Study;179
5.10.3.1;3.1 Classification of Time Series as Stationary or Non-Stationary;179
5.10.3.2;3.2 Clustering of ARMA Time Series;180
5.10.3.3;3.3 Clustering of Non-Linear Time Series;182
5.10.4;4 Concluding Remarks;183
5.10.5;References;183
5.11;Finite Dimensional Representation of Functional Data with Applications;185
5.11.1;1 Introduction;185
5.11.2;2 Representing Functional Data in a Reproducing Kernel Hilbert Space;186
5.11.2.1;2.1 Functional Data Projections onto the Eigenfunctions Space;187
5.11.3;3 Experiments;189
5.11.3.1;3.1 RKHS Projections Versus PCA Projections;189
5.11.3.2;3.2 Classification Example;191
5.11.4;4 Conclusions;192
5.11.5;References;192
5.12;Clustering Spatio-Functional Data: A Model Based Approach;194
5.12.1;1 Introduction and Problematic;194
5.12.2;2 The Spatio-Functional Data;195
5.12.3;3 Dynamic Clustering Algorithm;196
5.12.4;4 Dynamic Clustering for Spatio-Functional Data;196
5.12.5;5 Analysis of a Real Dataset: Sea Temperature of the Italian Coast;198
5.12.6;6 Conclusion and Future Research;200
5.12.7;References;202
5.13;Use of Mixture Models in Multiple Hypothesis Testing with Applications in Bioinformatics;203
5.13.1;1 Introduction;203
5.13.2;2 Modelling of Z-Scores;205
5.13.3;3 Example: Breast Cancer Data;207
5.13.4;4 Empirical Null;207
5.13.5;5 Simulation Study;208
5.13.6;References;209
5.14;Finding Groups in Ordinal Data: An Examination of Some Clustering Procedures;211
5.14.1;1 Introduction;211
5.14.2;2 Clustering Procedures for Ordinal Data;212
5.14.3;3 Simulation Experiment Characteristics;213
5.14.4;4 Discussion on Simulation Results;215
5.14.5;5 Limitations;217
5.14.6;References;218
5.15;An Application of One-mode Three-way Overlapping Cluster Analysis;219
5.15.1;1 Introduction;219
5.15.2;2 Overlapping Cluster Analysis Models;220
5.15.3;3 Improvement of the Algorithm;221
5.15.4;4 An Application;222
5.15.5;5 Discussion and Conclusion;225
5.15.6;References;226
5.16;Evaluation of Clustering Results: The Trade-off Bias-Variability;227
5.16.1;1 Desirable Properties of a Clustering Solution;227
5.16.2;2 Evaluating Stability;228
5.16.2.1;2.1 Indices of Agreement Between Partitions;228
5.16.2.2;2.2 Cross-validation;229
5.16.3;3 The Proposed Approach;229
5.16.4;4 Data Analysis;230
5.16.5;5 Discussion and Perspectives;232
5.16.6;References;233
5.17;Cluster Structured Multivariate Probability Distribution with Uniform Marginals;235
5.17.1;1 Introduction;235
5.17.2;2 The 3-Dimensional Case;235
5.17.2.1;2.1 Description of Proposed Probability Distribution;236
5.17.2.1.1;2.1.1 Marginal Distributions;236
5.17.2.1.2;2.1.2 Raw and Central Moments;237
5.17.2.1.3;2.1.3 The Characteristic Function;239
5.17.3;3 Cluster Structured 3-Dimensional Distribution with Uniform Marginals;239
5.17.4;4 The n-Dimensional Case;241
5.17.5;Reference;242
5.18;Analysis of Diversity-Accuracy Relations in Cluster Ensemble;243
5.18.1;1 Introduction;243
5.18.2;2 Diversity Measures;244
5.18.3;3 Numerical Experiments and Results;245
5.18.4;4 Summary;248
5.18.5;References;250
5.19;Linear Discriminant Analysis with more Variables than Observations: A not so Naive Approach;252
5.19.1;1 Introduction;252
5.19.2;2 A Not so Naive Linear Discriminant Rule;253
5.19.3;3 Asymptotic Properties;255
5.19.4;4 Simulation Study;257
5.19.5;5 Conclusions and Perspectives;259
5.19.6;References;259
5.20;Fast Hierarchical Clustering from the Baire Distance;260
5.20.1;1 Introduction;260
5.20.2;2 Longest Common Prefix or Baire Distance;261
5.20.2.1;2.1 Ultrametric Baire Space and Distance;261
5.20.3;3 Application to Chemoinformatics;261
5.20.3.1;3.1 Dimensionality Reduction by Random Projection;262
5.20.3.2;3.2 Chemoinformatics Data Clustering;262
5.20.4;4 Application to Astronomy;263
5.20.4.1;4.1 Clustering SDSS Data Based on a Baire Distance;263
5.20.4.2;4.2 Baire and K-means Cluster Comparison;265
5.20.5;5 Conclusions;266
5.20.6;References;268
5.21;The Trend Vector Model: Identification and Estimation in SAS;269
5.21.1;1 Introduction;269
5.21.2;2 Example;271
5.21.3;3 Identification Problems;272
5.21.3.1;3.1 De Rooij's Solution;273
5.21.3.2;3.2 Simpler Solution;273
5.21.4;4 Estimation with SAS proc nlmixed;274
5.21.5;5 Conclusion;275
5.21.6;References;276
5.22;Discrete Beta-Type Models;277
5.22.1;1 Introduction;277
5.22.2;2 A Re-parameterized Discrete Beta Distribution;278
5.22.3;3 Smoothing by Discrete Beta Kernels;280
5.22.3.1;3.1 Choosing the Smoothing Parameter h;281
5.22.4;4 Application to a Real Data Set;282
5.22.5;5 Concluding Remarks;284
5.22.6;References;284
5.23;The R Package DAKS: Basic Functions and Complex Algorithms in Knowledge Space Theory;286
5.23.1;1 Introduction;286
5.23.2;2 Basics of KST and IITA;287
5.23.3;3 The R Package DAKS;289
5.23.4;4 Conclusion;293
5.23.5;References;293
5.24;Methods for the Analysis of Skew-Symmetry in Asymmetric Multidimensional Scaling;294
5.24.1;1 Introduction;294
5.24.2;2 Scalar Product-like Models (Two-Way Case);295
5.24.3;3 Scalar Product-like Models (Three-Way Case);298
5.24.4;4 Distance-like Models;299
5.24.5;5 Conclusions;300
5.24.6;References;300
5.25;Canonical Correspondence Analysis in Social Science Research;302
5.25.1;1 Introduction;302
5.25.2;2 Canonical Correspondence Analysis;303
5.25.3;3 Constraining by a Single Categorical Variable;304
5.25.4;4 Constraints for Dealing with Missing Responses;307
5.25.5;5 Discussion;309
5.25.6;References;309
5.26;Exploring Data Through Archetypes;310
5.26.1;1 Introduction;310
5.26.2;2 Elements of Archetypal Analysis;311
5.26.3;3 Elements of Spreadplot Design;314
5.26.4;4 The Proposed Exploratory Data Analysis Strategy;314
5.26.4.1;4.1 Deriving and Analyzing Archetypes by Varying m;315
5.26.4.2;4.2 Representing Data in the Spaces Spanned by the Archetypes;317
5.26.4.3;4.3 Exploring the Peripheries of the Data Scatter;318
5.26.5;5 Concluding Remarks;319
5.26.6;References;320
5.27;Exploring Sensitive Topics: Sensitivity, Jeopardy, and Cheating;322
5.27.1;1 Introduction;322
5.27.2;2 Randomized Response;323
5.27.3;3 Exploring Sensitivity;324
5.27.4;4 The Sensitivity Level;326
5.27.5;5 Conclusions;327
5.27.6;References;328
5.28;Sampling the Join of Streams;329
5.28.1;1 Introduction;329
5.28.2;2 General Framework;330
5.28.3;3 Four Algorithms for Sampling the Join of Streams;331
5.28.3.1;3.1 Reservoir Sampling;331
5.28.3.2;3.2 Weighted Reservoir Sampling;331
5.28.3.3;3.3 Deterministic Reservoir Sampling;332
5.28.3.4;3.4 Active Reservoir Sampling;334
5.28.4;4 Experimental Results;334
5.28.5;5 Conclusion and Future Works;335
5.28.6;References;336
5.29;The R Package fechner for Fechnerian Scaling;337
5.29.1;1 Introduction;337
5.29.2;2 Theory of FS;338
5.29.3;3 The R Package fechner;340
5.29.4;4 Example;341
5.29.5;5 Conclusion;344
5.29.6;References;344
5.30;Asymptotic Behaviour in Symbolic Markov Chains;345
5.30.1;1 Introduction;345
5.30.2;2 Symbolic Variables;346
5.30.3;3 Markov Chains;346
5.30.3.1;3.1 The Markov Property;346
5.30.4;4 The CK Property for Symbolic Stochastic Processes;347
5.30.4.1;4.1 Multivalued Categorical Variable;347
5.30.4.2;4.2 Interval Valued Variable;347
5.30.5;5 Stationary Distribution in Discrete Time;348
5.30.5.1;5.1 Single Categorical Variable;348
5.30.5.2;5.2 Multivalued Categorical Variable;349
5.30.5.3;5.3 Single Valued Quantitative Variable (Continuous Variable);349
5.30.5.4;5.4 Particular Case: The Random Walk;349
5.30.5.5;5.5 Interval Valued Variable;350
5.30.5.6;5.6 Particular Case;350
5.30.5.7;5.7 Random Walk in Discrete Time;351
5.30.5.8;5.8 Non Independent Random Walk;351
5.30.6;6 Conclusion;352
5.30.6.1;6.1 Future Work;352
5.30.7;References;352
5.31;An Interactive Graphical System for Visualizing Data Quality – Tableplot Graphics;353
5.31.1;1 Introduction;353
5.31.2;2 Visualization Design;355
5.31.3;3 Interactivity in Tableplot;356
5.31.4;4 Tableplot for Visualizing Data Quality;356
5.31.5;5 Comparison with Other Plots;359
5.31.6;6 Software;360
5.31.7;7 Conclusion;361
5.31.8;References;361
5.32;Symbolic Multidimensional Scaling Versus Noisy Variables and Outliers;362
5.32.1;1 Introduction;362
5.32.2;2 Symbolic Data;363
5.32.3;3 Symbolic Multidimensional Scaling Methods;364
5.32.4;4 The Models;366
5.32.5;5 Results of Simulations;367
5.32.6;6 Final Remarks;368
5.32.7;References;369
5.33;Principal Components Analysis for Trapezoidal Fuzzy Numbers;371
5.33.1;1 Introduction;372
5.33.2;2 The Method: PCA-TF;372
5.33.3;3 Tests of the Performance of the PCA-TF Method;377
5.33.4;4 Conclusions;380
5.33.5;References;380
5.34;Factor Selection in Observational Studies – An Application of Nonlinear Factor Selection to Propensity Scores;381
5.34.1;1 Introduction;381
5.34.2;2 Theoretical Framework;383
5.34.3;3 Factor Selection for Propensity Score Modelling;384
5.34.3.1;3.1 Non-linear Factor Selection;384
5.34.3.2;3.2 Factor Selection for the Propensity Score Model;385
5.34.4;4 Example;386
5.34.5;References;388
5.35;Nonlinear Mapping Using a Hybrid of PARAMAP and Isomap Approaches;390
5.35.1;1 Introduction;390
5.35.2;2 PARAMAP and Isomap Algorithms;391
5.35.2.1;2.1 The PARAMAP Algorithm;392
5.35.2.2;2.2 The Isomap Algorithm;392
5.35.2.3;2.3 Evaluation of the Mapping Results;393
5.35.3;3 PARAMAP-Isomap Hybrid Approach;394
5.35.3.1;3.1 Isomap Preprocessing Step;394
5.35.3.2;3.2 Mapping the Holdout Points;395
5.35.4;4 Results on the Experimental Configurations;395
5.35.4.1;4.1 Sphere with 62 Regularly Spaced Points;395
5.35.4.2;4.2 Sphere with 1,000 Points;397
5.35.4.3;4.3 Swiss Roll with 1,000 Points;398
5.35.5;5 Conclusion;398
5.35.6;References;399
5.36;Dimensionality Reduction Techniques for Streaming Time Series: A New Symbolic Approach;400
5.36.1;1 Introduction;400
5.36.2;2 Related Works;401
5.36.3;3 A New Symbolic Strategy for Streaming Time Series Dimensionality Reduction;402
5.36.3.1;3.1 Training Step;403
5.36.3.2;3.2 Online Representation;404
5.36.3.3;3.3 A Feasible Representation for Bivariate Streaming Time Series;405
5.36.3.4;3.4 Time Series Approximation;405
5.36.4;4 Experimental Evaluation;406
5.36.5;5 Conclusions and Perspectives;406
5.36.6;References;407
5.37;A Bayesian Semiparametric Generalized Linear Model with Random Effects Using Dirichlet Process Priors;409
5.37.1;1 Introduction;409
5.37.2;2 Finite Mixture GLM with Random Effects;410
5.37.3;3 Representation of the Dirichlet Process Mixture Model;410
5.37.4;4 Algorithm: Blocked Gibbs Sampler;412
5.37.5;5 Simulation Studies;413
5.37.5.1;5.1 Simulation Study 2;415
5.37.6;6 Conclusion;415
5.37.7;References;416
5.38;Exact Confidence Intervals for Odds Ratios with Algebraic Statistics;417
5.38.1;1 Introduction;417
5.38.2;2 Traditional Confidence Intervals for the Odds Ratio;418
5.38.3;3 Algebraic Confidence Interval;419
5.38.4;4 Simulation Study;422
5.38.5;5 Example;423
5.38.6;6 Discussion;424
5.38.7;References;424
5.39;The CHIC Analysis Software v1.0;426
5.39.1;1 Introduction;426
5.39.2;2 Data Entry and Data Management;427
5.39.3;3 Simple Correspondence Analysis;428
5.39.4;4 Multiple Correspondence Analysis;429
5.39.5;5 Visualization Options;430
5.39.6;6 Ward Clustering as a Complementary Method;432
5.39.7;7 Summary;432
5.39.8;References;432
6;Part III Applications;434
6.1;Clustering the Roman Heaven: Uncovering the Religious Structures in the Roman Province Germania Superior;435
6.1.1;1 Introduction;435
6.1.2;2 Data and Methods;436
6.1.3;3 Results and Interpretations;438
6.1.4;4 Validation of the Results;440
6.1.5;5 Conclusion;441
6.1.6;References;442
6.2;Geochemical and Statistical Investigation of Roman Stamped Tiles of the Legio XXI Rapax;443
6.2.1;1 Introduction;443
6.2.2;2 The Roman Stamped Tiles Investigated by Giacomini;444
6.2.3;3 Preparation of Data Coming from Different Laboratories;444
6.2.4;4 Hierarchical Cluster Analysis by Ward's Method;446
6.2.5;5 Validation of Cluster Analysis Results;446
6.2.6;6 Interpretation of the Geochemical Clusters;449
6.2.7;References;450
6.3;Land Cover Classification by Multisource Remote Sensing: Comparing Classifiers for Spatial Data;451
6.3.1;1 Introduction;451
6.3.2;2 Benchmarking Classifiers for Multisource Rock Glacier Detection;453
6.3.2.1;2.1 Materials and Methods;453
6.3.2.2;2.2 Results;454
6.3.3;3 Discussion;455
6.3.3.1;3.1 State-of-the-Art Classifiers for Land Cover Mapping;455
6.3.3.2;3.2 High-Dimensional Problems in Remote Sensing;456
6.3.3.3;3.3 Spatial Error Estimation;456
6.3.3.4;3.4 Indirect Classification in Remote Sensing;457
6.3.4;4 Conclusions;457
6.3.5;References;458
6.4;Are there Clusters of Communities with the Same Dynamic Behaviour?;460
6.4.1;1 Introduction;460
6.4.2;2 Data for German Community Dynamics;461
6.4.3;3 Visualization and Clustering of Similar Dynamics;463
6.4.4;4 Explaining Patterns of Multidimensional Dynamics;464
6.4.5;5 Transition to Knowledge and Spatial Abstraction;464
6.4.6;6 Discussion;465
6.4.7;7 Conclusion;466
6.4.8;References;467
6.5;Land Cover Detection with Unsupervised Clustering and Hierarchical Partitioning;469
6.5.1;1 Introduction;469
6.5.2;2 Processing Flow;470
6.5.3;3 Hierarchical Segmentation;472
6.5.3.1;3.1 Transition Regions and Image Masking;472
6.5.4;4 Unsupervised Clustering;473
6.5.5;5 Classification and Cluster Separability;474
6.5.6;6 Concluding Remarks and Perspectives;476
6.5.7;References;476
6.6;Using Advanced Regression Models for Determining Optimal Soil Heterogeneity Indicators;477
6.6.1;1 Introduction;477
6.6.1.1;1.1 Research Target;478
6.6.1.2;1.2 Article Structure;478
6.6.2;2 Data Description;479
6.6.3;3 Advanced Regression Techniques;479
6.6.3.1;3.1 Introduction to Regression Techniques;480
6.6.3.2;3.2 Neural Networks;480
6.6.3.3;3.3 Regression Tree;481
6.6.3.4;3.4 Support Vector Regression;481
6.6.3.5;3.5 Linear Regression and Naive Estimator;482
6.6.3.6;3.6 Model Parameter Estimation;482
6.6.4;4 Regression Results;482
6.6.5;5 Conclusion;483
6.6.5.1;5.1 Future Work;484
6.6.6;References;484
6.7;Local Analysis of SNP Data;486
6.7.1;1 Introduction;486
6.7.2;2 Methods;487
6.7.2.1;2.1 Associative Classification;487
6.7.2.2;2.2 Localised Logistic Regression;489
6.7.3;3 Data;490
6.7.4;4 Results;491
6.7.5;5 Summary and Discussion;492
6.7.6;References;493
6.8;Airborne Particulate Matter and Adverse Health Events: Robust Estimation of Timescale Effects;494
6.8.1;1 Introduction;494
6.8.2;2 Materials and Methods;495
6.8.2.1;2.1 Data and Statistical Approach to Estimation of Associations at Different Timescales;495
6.8.2.2;2.2 Fourier Decomposition;497
6.8.2.3;2.3 Singular Spectrum Analysis;497
6.8.3;3 Results and Discussion;499
6.8.4;References;501
6.9;Identification of Specific Genomic Regions Responsible for the Invasivity of Neisseria Meningitidis;503
6.9.1;1 Introduction;503
6.9.2;2 Neisseria Meningitidis and the FrpB Proteins;504
6.9.3;3 Algorithm for Detection of Genomic Regions Responsible for Disease;505
6.9.4;4 Results and Discussion;507
6.9.5;5 Conclusion;510
6.9.6;References;510
6.10;Classification of ABC Transporters Using Community Detection;512
6.10.1;1 Introduction;512
6.10.2;2 Materials and Methods;513
6.10.2.1;2.1 Data Sources;513
6.10.2.2;2.2 Methods;514
6.10.2.2.1;2.2.1 Identification and Filtering of Isorthologous Links;514
6.10.2.2.2;2.2.2 Identification of Isortholog Groups by Community Detection;514
6.10.2.2.3;2.2.3 Validation;515
6.10.3;3 Results;517
6.10.3.1;3.1 General Results on ABC System Classification;517
6.10.3.2;3.2 Results on Pentose-Related Importer Subfamily;517
6.10.4;4 Conclusion;518
6.10.5;References;519
6.11;Estimation of the Number of Sustained Viral Responders by Interferon Therapy Using Random Numbers with a Logistic Model;520
6.11.1;1 Introduction;520
6.11.2;2 Subjects and Model Assumptions;521
6.11.3;3 Methods;521
6.11.4;4 Results;522
6.11.5;5 Discussion;524
6.11.6;References;526
6.12;Virtual High Throughput Screening Using Machine Learning Methods;528
6.12.1;1 Introduction;528
6.12.2;2 Data Description;529
6.12.3;3 Prediction of Experimental HTS Results Using Machine Learning Methods;530
6.12.3.1;3.1 Sampling Strategy;530
6.12.3.2;3.2 Machine Learning Methods;530
6.12.3.3;3.3 Comparison of Molecular and Atomic Descriptors;531
6.12.3.4;3.4 Results and Discussion;532
6.12.4;4 Conclusion and Future Developments;534
6.12.5;References;535
6.13;Network Analysis of Works on Clustering and Classification from Web of Science;536
6.13.1;1 Introduction;536
6.13.2;2 Networks from WoS;537
6.13.3;3 Analyses of Records from JoC;538
6.13.3.1;3.1 Collaboration Network;540
6.13.3.2;3.2 Citation Network Analysis;541
6.13.3.3;3.3 Citations Between Authors;544
6.13.3.3.1;3.3.1 Line Islands [10,400] – Authors Citations;544
6.13.4;4 Conclusion;546
6.13.5;References;547
6.14;Recommending in Social Tagging Systems Based on Kernelized Multiway Analysis;548
6.14.1;1 Introduction;548
6.14.2;2 Related Work;549
6.14.3;3 Tensors and Tucker Decomposition;550
6.14.4;4 Recommendation Based on Tucker Decomposition;551
6.14.5;5 Smoothing with Kernel Functions;552
6.14.6;6 Experimental Results;553
6.14.7;7 Conclusions;555
6.14.8;References;555
6.15;Dynamic Population Segmentation in Online Market Monitoring;556
6.15.1;1 Introduction;556
6.15.2;2 Related Work;557
6.15.3;3 Sensor Binning Based on Price Dynamics;558
6.15.3.1;3.1 Harvest Adaptation;558
6.15.3.2;3.2 Dynamic Population Segmentation;560
6.15.4;4 Harvest Balancing;561
6.15.4.1;4.1 Feasible Harvest Schedules;561
6.15.4.2;4.2 Harvest Schedule Tuning;562
6.15.5;5 Summary;562
6.15.6;References;563
6.16;Gaining `Consumer Insights' from Influential Actors in Weblog Networks;564
6.16.1;1 Introduction;564
6.16.2;2 Methods for Analyzing the Blogosphere;565
6.16.2.1;2.1 SNA Measures and Ego Networks;565
6.16.2.2;2.2 Netnography;566
6.16.3;3 Empirical Study: Mobile Communication;567
6.16.3.1;3.1 Data Description;567
6.16.3.2;3.2 SNA Analysis;568
6.16.3.3;3.3 Netnographical Analysis;569
6.16.4;4 Conclusions and Future Work;570
6.16.5;References;571
6.17;Visualising a Text with a Tree Cloud;572
6.17.1;1 Introduction;572
6.17.2;2 Constructing a Tree Cloud;573
6.17.2.1;2.1 Building the List of Frequent Terms;573
6.17.2.2;2.2 Building the Distance Matrix;574
6.17.2.3;2.3 Building the Tree;574
6.17.2.4;2.4 Building the Tree Cloud;575
6.17.3;3 Evaluating the Quality of a Tree Cloud;575
6.17.3.1;3.1 Stability and Robustness;577
6.17.3.2;3.2 Arboricity;577
6.17.3.3;3.3 Distance Comparison on the Obama Corpus;577
6.17.3.4;3.4 Robustness to Parameter Variations;578
6.17.4;4 Conclusion;578
6.17.5;References;579
6.18;A Tree Kernel Based on Classification and Citation Data to Analyse Patent Documents;581
6.18.1;1 Introduction;581
6.18.2;2 European Classification System ECLA;582
6.18.3;3 Patent Citations;583
6.18.4;4 Tree Kernel;583
6.18.5;5 Experiment;586
6.18.6;6 Conclusions;587
6.18.7;References;588
6.19;A New SNA Centrality Measure Quantifying the Distance to the Nearest Center;589
6.19.1;1 Introduction;589
6.19.2;2 Methodology;590
6.19.3;3 Data and Data Preparation;591
6.19.4;4 Results;593
6.19.4.1;4.1 R-Devel Network;593
6.19.4.2;4.2 R-Help Network;593
6.19.4.3;4.3 Empirical Evidence of the Usefulness of the WDNC;593
6.19.4.4;4.4 WDNC Compared to Formal R Organization;595
6.19.5;5 Conclusion and Discussion;596
6.19.6;References;596
6.20;Mining Innovative Ideas to Support New Product Research and Development;597
6.20.1;1 Introduction;597
6.20.2;2 Rationale Behind Mining Innovative Ideas;598
6.20.3;3 Process of Mining Innovative Ideas;599
6.20.4;4 Acquisition of Ideas;600
6.20.5;5 Acquisition of Technological Context Information;600
6.20.6;6 Relationship among Scientific Categories;600
6.20.7;7 Classification of Ideas;601
6.20.8;8 Results and Evaluation;602
6.20.9;9 Outlook;603
6.20.10;References;604
6.21;The Basis of Credit Scoring: On the Definition of Credit Default Events;605
6.21.1;1 Introduction;605
6.21.2;2 Data Set of Individual Payment Histories;606
6.21.3;3 A Payment-Pattern Approach to the Identification of Credit Default Events;606
6.21.3.1;3.1 The Patterns of Payment;607
6.21.3.2;3.2 Measurement of Profitability;608
6.21.3.3;3.3 Application to the Empirical Data Set;609
6.21.4;4 Indicators of Individual Payment Performance;610
6.21.4.1;4.1 Description of Indicators;610
6.21.4.2;4.2 Application to the Empirical Data Set;611
6.21.5;5 Discussion;612
6.21.6;References;612
6.22;Forecasting Candlesticks Time Series with Locally Weighted Learning Methods;613
6.22.1;1 Introduction;613
6.22.2;2 Locally Weighted Learning Methods for Candlestick Time Series;615
6.22.2.1;2.1 k-NN for Candlestick Time Series;615
6.22.2.1.1;2.1.1 Determination of the k Nearest Neighbors;615
6.22.2.1.2;2.1.2 Generation of the Forecast;616
6.22.3;3 Forecasting the S&P500 Candlestick Time Series;617
6.22.3.1;3.1 Removing the Trend from Candlestick Time Series;617
6.22.3.1.1;3.1.1 Differencing the Intervals;617
6.22.3.1.2;3.1.2 Differencing the Candlestick Using the Previous Close Value;618
6.22.3.2;3.2 The One-Step Ahead Forecasting Experiment;618
6.22.4;4 Future Work;620
6.22.5;References;620
6.23;An Analysis of Alternative Methods for Measuring Long-Run Performance: An Application to Share Repurchase Announcements;622
6.23.1;1 Introduction;622
6.23.2;2 Methodologies for Measuring Long-Run Performance;623
6.23.2.1;2.1 BHAR, Fama-French Alphas and Cross-Sectional Regressions;623
6.23.2.2;2.2 Calendar Time Portfolio Approach;624
6.23.2.3;2.3 Generalized Calendar Time Approach;625
6.23.3;3 Data and Empirical Results;625
6.23.3.1;3.1 BHAR, Fama-French Alphas and Cross-Sectional Regressions;626
6.23.3.2;3.2 Calendar Time Portfolio Approach;627
6.23.3.3;3.3 Generalized Calendar Time Approach;628
6.23.4;4 Conclusion;629
6.23.5;References;629
6.24;Knowledge Discovery in Stock Market Data;630
6.24.1;1 Introduction;630
6.24.2;2 Daily Returns on Stocks;631
6.24.3;3 Knowledge Discovery in Market Activities;633
6.24.4;4 Types of Market States;634
6.24.5;5 Discussion;635
6.24.6;6 Conclusion;636
6.24.7;References;636
6.25;The Asia Financial Crises and Exchange Rates: Had there been Volatility Shifts for Asian Currencies?;638
6.25.1;1 Introduction;638
6.25.2;2 Model and Bayesian Inference;639
6.25.2.1;2.1 The Volatility Model;639
6.25.2.2;2.2 The Prior and Posterior Distribution;641
6.25.2.3;2.3 Gibbs Sampling;642
6.25.3;3 Empirical Analysis;642
6.25.3.1;3.1 Model Choice;642
6.25.3.2;3.2 Thailand;643
6.25.3.3;3.3 The Philippines;645
6.25.3.4;3.4 Indonesia;645
6.25.3.5;3.5 South Korea;645
6.25.4;4 Conclusions;646
6.25.5;References;646
6.26;The Pricing of Risky Securities in a Fuzzy Least Square Regression Model;647
6.26.1;1 The Capital Asset Pricing Model;647
6.26.2;2 A Fuzzy Least Squares Regression Approach;649
6.26.3;3 Case Study;651
6.26.4;4 Final Remarks;653
6.26.5;References;654
6.27;Classification of the Indo-European Languages Using a Phylogenetic Network Approach;655
6.27.1;1 Introduction;655
6.27.2;2 Description of the Dyen Database;656
6.27.3;3 Materials and Methods;657
6.27.4;4 Results and Discussion;659
6.27.5;5 Conclusion;661
6.27.6;References;662
6.28;Parsing as Classification;664
6.28.1;1 Introduction;664
6.28.2;2 WCDG;666
6.28.3;3 MSTParser;667
6.28.4;4 MaltParser;668
6.28.5;5 Parser Combination;669
6.28.6;6 Conclusion;670
6.28.7;References;670
6.29;Comparing the Stability of Clustering Results of Dialect Data Based on Several Distance Matrices;672
6.29.1;1 Introduction;672
6.29.2;2 The Compound Matrix;673
6.29.3;3 Why Use these Statistical Measures for Linguistic Data?;676
6.29.4;4 Comparing Hierarchical Cluster Results;676
6.29.4.1;4.1 Cluster Stability Results;676
6.29.4.2;4.2 Interpretation of the Dialect clusters;677
6.29.5;References;679
6.30;Marketing and Regional Sales: Evaluation of Expenditure Strategies by Spatial Sales Response Functions;680
6.30.1;1 Introduction;680
6.30.2;2 Cross Sectional Sales Response Models;681
6.30.2.1;2.1 The Basic CSSR Model;681
6.30.2.2;2.2 Bayesian Inference by MCMC for CSSR Models;682
6.30.3;3 A Spatial Auto-Regressive Extension to CSSR Models;684
6.30.3.1;3.1 Spatial Lags;684
6.30.3.2;3.2 The CSSR-SAR Model;685
6.30.4;4 Empirical Test;687
6.30.5;5 Conclusions and Outlook;687
6.30.6;References;688
6.31;A Demand Learning Data Based Approach to Optimize Revenues of a Retail Chain;689
6.31.1;1 Introduction;689
6.31.2;2 Model Description;690
6.31.3;3 Numerical Study;693
6.31.4;4 Summary and Future Directions;695
6.31.5;References;696
6.32;Missing Values and the Consistency Problem Concerning AHP Data;698
6.32.1;1 Introduction;698
6.32.2;2 Consistency Adjustment Approaches;700
6.32.2.1;2.1 Manual Consistency Adjustment Approaches;700
6.32.2.2;2.2 Automated Consistency Adjustment Approaches;701
6.32.2.2.1;2.2.1 Automated Expert-Choice Method (AEM);701
6.32.2.2.2;2.2.2 Iterative Eigenvalue Improvement Method (IEM);701
6.32.2.2.3;2.2.3 Genetic Adjustment Method (GAM);701
6.32.3;3 Comparison of Automated Consistency Adjustment Approaches;702
6.32.3.1;3.1 Common Performance Measures;702
6.32.3.2;3.2 Results;703
6.32.4;4 Outlook;705
6.32.5;References;705
6.33;Monte Carlo Methods in the Assessment of New Products: A Comparison of Different Approaches;706
6.33.1;1 Introduction;706
6.33.2;2 Assessment Methods of NPD;707
6.33.3;3 Real Options in the Assessment of NPD;709
6.33.4;4 Monte Carlo Simulation in Assessment of NPD;710
6.33.5;5 Conclusions and Outlook;712
6.33.6;References;713
6.34;Preference Analysis and Product Design in Markets for Elderly People: A Comparison of Methods and Approaches;714
6.34.1;1 Introduction;714
6.34.2;2 New Approach for Elderly People;716
6.34.2.1;2.1 Application of the New Approach for Elderly People;717
6.34.2.2;2.2 Comparison of Results;719
6.34.2.3;2.3 Field Predictability Test;719
6.34.3;3 Discussion and Outlook;720
6.34.4;References;720
6.35;Usefulness of A Priori Information about Customers for Market Research: An Analysis for Personalisation Aspects in Retailing;722
6.35.1;1 Introduction;722
6.35.2;2 Personalisation Aspects in Retailing;723
6.35.3;3 Preference Estimation in Market Research;723
6.35.4;4 Empirical Investigation;724
6.35.4.1;4.1 Research Object and Design;724
6.35.4.2;4.2 Results;725
6.35.5;5 Conclusion and Outlook;728
6.35.6;References;728
6.36;Importance of Consumer Preferences on the Diffusion of Complex Products and Systems;730
6.36.1;1 Motivation;730
6.36.2;2 Specifics of Complex Products and Systems;731
6.36.3;3 The Diffusion of Complex Products and Systems;732
6.36.4;4 Consumer Preferences in the Diffusion of CoPS;734
6.36.5;5 Model Behaviour;735
6.36.6;6 Summary;737
6.36.7;References;737
6.37;Household Possession of Consumer Durables on Background of some Poverty Lines;739
6.37.1;1 Introduction;739
6.37.2;2 The Method;740
6.37.3;3 Conclusions;745
6.37.4;References;746
6.38;Effect of Consumer Perceptions of Web Site Brand Personality and Web Site Brand Association on Web Site Brand Image;747
6.38.1;1 Introduction;747
6.38.2;2 Theoretical Background and Hypotheses;748
6.38.3;3 Methods;748
6.38.4;4 Results;750
6.38.5;5 Discussion;753
6.38.6;References;753
6.39;Perceptually Based Phoneme Recognition in Popular Music;755
6.39.1;1 Introduction;755
6.39.2;2 Description of the Task;756
6.39.3;3 Auditory Modelling;756
6.39.4;4 Feature Extraction;758
6.39.5;5 Classifier Tuning;759
6.39.6;6 Results and Discussion;760
6.39.7;7 Summary;762
6.39.8;References;762
6.40;SVM Based Instrument and Timbre Classification;763
6.40.1;1 Introduction;763
6.40.2;2 Feature Extraction;764
6.40.2.1;2.1 Preprocessing;765
6.40.2.2;2.2 Perceptive Linear Prediction;766
6.40.2.3;2.3 Mel Frequency Cepstral Coefficients;766
6.40.3;3 Clustering;766
6.40.4;4 Classification;767
6.40.5;5 Software;768
6.40.6;6 Results;768
6.40.7;7 Conclusion;770
6.40.8;References;770
6.41;Three-way Scaling and Clustering Approach to Musical Structural Analysis;771
6.41.1;1 Introduction;771
6.41.2;2 The Three-Way Structure Model;773
6.41.2.1;2.1 Results by INDSCAL;773
6.41.2.2;2.2 Results Using INDCLUS Presented as a Hanabi Chart;774
6.41.3;3 Conclusion and Discussion;776
6.41.4;References;778
6.42;Improving GMM Classifiers by Preliminary One-class SVM Outlier Detection: Application to Automatic Music Mood Estimation;779
6.42.1;1 Introduction;779
6.42.1.1;1.1 Mood Models;780
6.42.1.2;1.2 Mood Audio Features;780
6.42.1.3;1.3 Mood Classification;781
6.42.2;2 Proposed System;781
6.42.3;3 Outlier Detection with One-Class SVM;782
6.42.3.1;3.1 One-Class SVM;782
6.42.3.2;3.2 Estimation of Kernel Parameters;783
6.42.4;4 Evaluation;783
6.42.4.1;4.1 Dataset and Parameter Settings;783
6.42.4.2;4.2 Results;785
6.42.5;5 Conclusions;785
6.42.6;References;786
6.43;Multiobjective Optimization for Decision Support in Automated 2.5D System-in-Package Electronics Design;787
6.43.1;1 Introduction;787
6.43.2;2 Multiobjective Decision Support;789
6.43.3;3 Optimization Problems and Algorithms;790
6.43.4;4 Constructive Placement Heuristic;790
6.43.5;5 Group Constraint Concept;792
6.43.6;6 Computational Results;794
6.43.7;7 Conclusion;795
6.43.8;References;795
6.44;Multi-Objective Quality Assessment for EA Parameter Tuning;796
6.44.1;1 Introduction;796
6.44.2;2 Definition of Test Functions;797
6.44.3;3 Experiments and Results;798
6.44.3.1;3.1 Experiments;798
6.44.3.2;3.2 Results;799
6.44.3.2.1;3.2.1 Results on F10–F12;799
6.44.4;4 Conclusions and Outlook;802
6.44.5;References;803
6.45;A Novel Multi-Objective Target Value Optimization Approach;804
6.45.1;1 Introduction;804
6.45.2;2 Efficient Global Optimization (EGO);805
6.45.3;3 The New Approach;807
6.45.3.1;3.1 Exemplary Progress;808
6.45.3.2;3.2 Stopping Criterion;808
6.45.4;4 Handling of Missing Values;809
6.45.5;5 Case Study;810
6.45.6;6 Summary;811
6.45.7;References;811
6.46;Desirability-Based Multi-Criteria Optimisation of HVOF Spray Experiments;813
6.46.1;1 The Process of High Velocity Oxy-Fuel Spraying;813
6.46.2;2 Experimental Designs;815
6.46.2.1;2.1 Plackett-Burman Design;815
6.46.2.2;2.2 Fractional-Factorial 2^(5-1) Design;816
6.46.2.3;2.3 Central Composite Design;817
6.46.3;3 Multi-criteria Optimisation;818
6.46.3.1;3.1 Overlayed Contours;818
6.46.3.2;3.2 Desirabilities;818
6.46.4;4 Conclusion;820
6.46.5;References;820
7;Index;821


