E-book, English, 537 pages
Rizzi / Vichi COMPSTAT 2006 - Proceedings in Computational Statistics
1st edition 2007
ISBN: 978-3-7908-1709-6
Publisher: Physica-Verlag HD
Format: PDF
Copy protection: PDF watermark
17th Symposium Held in Rome, Italy, 2006
International Association for Statistical Computing

The International Association for Statistical Computing (IASC) is a Section of the International Statistical Institute. The objectives of the Association are to foster world-wide interest in effective statistical computing and to exchange technical knowledge through international contacts and meetings between statisticians, computing professionals, organizations, institutions, governments and the general public. The IASC organises its own conferences, the IASC World Conferences, and COMPSTAT in Europe.

The 17th Conference of ERS-IASC, the biennial meeting of the European Regional Section of the IASC, was held in Rome August 28 - September 1, 2006. This conference took place in Rome exactly 20 years after the 7th COMPSTAT symposium, which was held there in 1986. Previous COMPSTAT conferences were held in: Vienna (Austria, 1974); West Berlin (Germany, 1976); Leiden (The Netherlands, 1978); Edinburgh (UK, 1980); Toulouse (France, 1982); Prague (Czechoslovakia, 1984); Rome (Italy, 1986); Copenhagen (Denmark, 1988); Dubrovnik (Yugoslavia, 1990); Neuchâtel (Switzerland, 1992); Vienna (Austria, 1994); Barcelona (Spain, 1996); Bristol (UK, 1998); Utrecht (The Netherlands, 2000); Berlin (Germany, 2002); Prague (Czech Republic, 2004).
Authors/Editors
Further information & material
1;Preface;5
2;Contents;8
3;Part I Classification and Clustering;25
3.1;Issues of robustness and high dimensionality in cluster analysis;26
3.1.1;1 Introduction;26
3.1.2;2 Multivariate t Distribution;29
3.1.3;3 ML Estimation of Mixtures of t Components;30
3.1.4;4 Factor Analysis Model for Dimension Reduction;31
3.1.5;5 Mixtures of Normal Factor Analyzers;32
3.1.6;6 Mixtures of t Factor Analyzers;34
3.1.7;7 Discussion;36
3.1.8;References;36
3.2;Fuzzy K-medoids clustering models for fuzzy multivariate time trajectories;39
3.2.1;1 Introduction;39
3.2.2;2 Fuzzy data time arrays, fuzzy multivariate time trajectories and dissimilarity measures;40
3.2.3;3 Fuzzy K-means clustering models for fuzzy multivariate time trajectories [CD03];43
3.2.4;4 Fuzzy K-medoids clustering for fuzzy multivariate time trajectories;45
3.2.5;5 Application;47
3.2.6;References;50
3.3;Bootstrap methods for measuring classification uncertainty in latent class analysis;52
3.3.1;1 Introduction;52
3.3.2;2 Measures of classification uncertainty;54
3.3.3;3 The bootstrap method;55
3.3.4;4 Bootstrapping LC models;56
3.3.5;5 Applications;57
3.3.6;6 Discussion;60
3.3.7;References;61
3.4;A robust linear grouping algorithm;63
3.4.1;1 Introduction;63
3.4.2;2 Linear Grouping Algorithm;64
3.4.3;3 Robust Linear Grouping Algorithm;65
3.4.4;4 Examples;67
3.4.5;5 Discussion;70
3.4.6;References;72
3.5;Computing and using the deviance with classification trees;74
3.5.1;1 Introduction;74
3.5.2;2 Tree induction principle: an illustrative example;75
3.5.3;3 Validating the tree descriptive ability;77
3.5.4;4 Computational aspects;82
3.5.5;5 Conclusion;84
3.5.6;References;84
3.6;Estimation procedures for the false discovery rate: a systematic comparison for microarray data;86
3.6.1;1 Introduction;86
3.6.2;2 The testing problem;87
3.6.3;3 The false discovery rate;88
3.6.4;4 Estimation procedures;89
3.6.5;5 The data sets;92
3.6.6;6 Outline of the comparative study;95
3.6.7;7 Results and conclusions;96
3.6.8;Acknowledgment;98
3.6.9;References;98
3.7;A unifying model for biclustering*;99
3.7.1;1 Illustrative Example;99
3.7.2;2 Biclustering;100
3.7.3;3 A Unifying Biclustering Model;101
3.7.4;4 Data Analysis;103
3.7.5;5 Concluding Remarks;104
3.7.6;References;105
4;Part II Image Analysis and Signal Processing;107
4.1;Non-rigid image registration using mutual information;108
4.1.1;1 Introduction;108
4.1.2;2 Non-rigid registration;109
4.1.3;3 The mutual information criterion;112
4.1.4;4 Non-rigid registration using mutual information;113
4.1.5;5 Validation;116
4.1.6;References;117
4.2;Musical audio analysis using sparse representations;121
4.2.1;1 Introduction;121
4.2.2;2 Finding Sparse Representations;122
4.2.3;3 Sparse Representations for Music Transcription;125
4.2.4;4 Source Separation;128
4.2.5;5 Conclusions;130
4.2.6;Acknowledgements;130
4.2.7;References;131
4.3;Robust correspondence recognition for computer vision;134
4.3.1;1 Introduction;134
4.3.2;2 Stability and Digraph Kernels;138
4.3.3;3 Properties of Strict Sub-Kernels;142
4.3.4;4 A Simple Algorithm for Interval Orientations;144
4.3.5;5 Discussion;144
4.3.6;References;145
4.4;Blind superresolution;147
4.4.1;1 Introduction;147
4.4.2;2 Mathematical Model;150
4.4.3;3 Blind Superresolution;152
4.4.4;4 Experiments;155
4.4.5;5 Conclusions;156
4.4.6;Acknowledgment;157
4.4.7;References;157
4.5;Analysis of Music Time Series;160
4.5.1;1 Introduction;160
4.5.2;2 Model building;161
4.5.3;3 Applied models;164
4.5.4;4 Studies;166
4.5.5;5 Conclusion;171
4.5.6;References;172
5;Part III Data Visualization;173
5.1;Tying up the loose ends in simple, multiple, joint correspondence analysis;174
5.1.1;1 Introduction;174
5.1.2;2 Basic CA theory;175
5.1.3;3 Multiple and joint correspondence analysis;177
5.1.4;4 Data sets used as illustrations;177
5.1.5;5 Measuring variance and comparing different tables;178
5.1.6;6 The myth of the influential outlier;179
5.1.7;7 The scaling problem in CA;180
5.1.8;8 To rotate or not to rotate;186
5.1.9;9 Statistical significance of results;189
5.1.10;10 Loose ends in MCA and JCA;191
5.1.11;Acknowledgments;194
5.1.12;References;194
5.2;3 dimensional parallel coordinates plot and its use for variable selection;197
5.2.1;1 Introduction;197
5.2.2;2 Parallel coordinates plot and interactive operations;198
5.2.3;3 3 dimensional parallel coordinates plot;199
5.2.4;4 Implementation of 3D PCP software;203
5.2.5;5 Concluding remarks;204
5.2.6;References;204
5.3;Geospatial distribution of alcohol-related violence in Northern Virginia;206
5.3.1;1 Introduction;206
5.3.2;2 Overview of the Model;207
5.3.3;3 The Data;211
5.3.4;4 Estimating the Probabilities;212
5.3.5;5 Geospatial Visualization of Acute Outcomes;213
5.3.6;6 Conclusions;214
5.3.7;Acknowledgements;215
5.3.8;References;216
5.4;Visualization in comparative music research;217
5.4.1;1 Introduction;217
5.4.2;2 Music representations;218
5.4.3;3 Musical databases;219
5.4.4;4 Musical feature extraction;220
5.4.5;5 Data mining;220
5.4.6;6 Examples of visualization of musical collections;222
5.4.7;7 Conclusion;225
5.4.8;References;226
5.5;Exploratory modelling analysis: visualizing the value of variables;228
5.5.1;1 Introduction;228
5.5.2;2 Example — Florida 2004;229
5.5.3;3 Selection — More than just Variable Selection;231
5.5.4;4 Graphics for Variable Selection;233
5.5.5;5 Small or LARGE Datasets;236
5.5.6;6 Summary and Outlook;236
5.5.7;References;237
5.6;Density estimation from streaming data using wavelets;238
5.6.1;1 Introduction;238
5.6.2;2 Recursive Formulation;242
5.6.3;3 Discounting Old Data;243
5.6.4;4 A Case Study: Internet Header Traffic Data;245
5.6.5;References;249
6;Part IV Multivariate Analysis;250
6.1;Reducing conservatism of exact small-sample methods of inference for discrete data;251
6.1.1;1 Introduction;251
6.1.2;2 Small-Sample Inference for Discrete Distributions;253
6.1.3;3 Ways of Reducing Conservatism;255
6.1.4;4 Fuzzy Inference Using Discrete Data;259
6.1.5;5 The Mid-P Quasi-Exact Approach;260
6.1.6;Acknowledgement;264
6.1.7;References;265
6.2;Symbolic data analysis: what is it?;267
6.2.1;1 Symbolic Data;267
6.2.2;2 Structure;270
6.2.3;3 Analysis: Symbolic vis-a-vis Classical Approach;272
6.2.4;4 Conclusion;273
6.2.5;References;274
6.3;A dimensional reduction method for ordinal three-way contingency table;276
6.3.1;1 Introduction;276
6.3.2;2 Decomposing a Non Symmetric Index;277
6.3.3;3 The Partition of a Predictability Measure;279
6.3.4;4 Ordinal Three-Way Non Symmetrical Correspondence Analysis;280
6.3.5;5 Example;284
6.3.6;References;287
6.4;Operator related to a data matrix: a survey;289
6.4.1;1 The initial choices;289
6.4.2;2 Joint analysis of several data matrices (the STATIS method);293
6.4.3;3 Principal component analysis with respect to instrumental variables;295
6.4.4;4 Conclusions;298
6.4.5;Acknowledgements;299
6.4.6;References;299
6.5;Factor interval data analysis and its application;302
6.5.1;1 Introduction;302
6.5.2;2 Methodology of Interval Data and Its Possible Limitations;303
6.5.3;3 Methodology of Factor Interval Data and Its Advantages;307
6.5.4;4 Application in Chinese Stock Markets;309
6.5.5;5 Conclusion;315
6.5.6;References;315
6.6;Identifying excessively rounded or truncated data;316
6.6.1;1 Data;316
6.6.2;2 Density Models;317
6.6.3;3 Asymptotic Behavior;322
6.6.4;4 Conclusion;325
6.6.5;Acknowledgements;325
6.6.6;References;326
6.7;Statistical inference and data mining: false discoveries control;327
6.7.1;Introduction;327
6.7.2;1 Data Mining Specificities and Statistical Inference;328
6.7.3;2 Validation of Interesting Features;329
6.7.4;3 Controlling UAFWER Using the BS FD Algorithm;332
6.7.5;4 Experimentation;335
6.7.6;Conclusion and Perspectives;337
6.7.7;References;337
6.8;Is ‘Which model...?’ the right question?;339
6.8.1;1 Introduction;339
6.8.2;2 Preliminaries;340
6.8.3;3 From choice to synthesis;342
6.8.4;4 Example;347
6.8.5;5 Conclusion;350
6.8.6;References;350
6.9;Use of latent class regression models with a random intercept to remove the effects of the overall response rating level;352
6.9.1;1 Introduction;352
6.9.2;2 Description of the cracker case study;353
6.9.3;3 The LC ordinal regression model with a random intercept;354
6.9.4;4 Results obtained with the cracker data set;356
6.9.5;5 General discussion;357
6.9.6;References;360
6.10;Discrete functional data analysis;362
6.10.1;1 Introduction;362
6.10.2;2 Functional Data;363
6.10.3;3 Difference Operators;363
6.10.4;4 Detection of Relations among Differences;365
6.10.5;5 Concluding Remarks;369
6.10.6;References;369
6.11;Self organizing MAPS: understanding, measuring and reducing variability;371
6.11.1;1 Introduction;372
6.11.2;2 Several Approaches Concerning the Preservation of the Topology;373
6.11.3;3 Understanding Variability of SOM’s Neighbourhood Structure Visualizing Distances between All Classes;375
6.11.4;4 The R-map Method to Increase SOM Reliability;376
6.11.5;5 Application: Validating the Number of Units for a SOM Network;379
6.11.6;6 Conclusion;381
6.11.7;References;382
6.12;Parameterization and estimation of path models for categorical data;383
6.12.1;1 Introduction;383
6.12.2;2 Log-linear, graphical and DAG models;384
6.12.3;3 DAG models as marginal models;386
6.12.4;4 Parameterization of DAG models;386
6.12.5;5 Path models;387
6.12.6;6 Maximum likelihood estimation;388
6.12.7;7 An example;390
6.12.8;References;394
6.13;Latent class model with two latent variables for analysis of count data;395
6.13.1;1 Introduction;395
6.13.2;2 Model;396
6.13.3;3 Analysis of retail market data;397
6.13.4;References;399
7;Part V Web Based Teaching;400
7.1;Challenges concerning web data mining;401
7.1.1;1 Motivation;401
7.1.2;2 Challenges Concerning Algorithmic Aspects;405
7.1.3;3 Conclusions and Further Research;412
7.1.4;References;412
7.2;e-Learning statistics – a selective review;415
7.2.1;1 Introduction;415
7.2.2;2 Modern e-Learning Materials;416
7.2.3;3 Evaluation;423
7.2.4;4 Conclusion;424
7.2.5;References;425
7.3;Quality assurance of web based e-Learning for statistical education;427
7.3.1;1 Introduction;427
7.3.2;2 Important Features of the e-StatEdu System;429
7.3.3;3 Quality Assurance;432
7.3.4;4 Discussion;435
7.3.5;Acknowledgement;435
7.3.6;References;435
8;Part VI Algorithms;437
8.1;Genetic algorithms for building double threshold generalized autoregressive conditional heteroscedastic models of time series;438
8.1.1;1 Introduction;439
8.1.2;2 The DTGARCH Model;441
8.1.3;3 A Genetic Algorithm for DTGARCH Model Building;442
8.1.4;4 Application to Financial Data;444
8.1.5;5 Conclusions;447
8.1.6;References;448
8.2;Nonparametric evaluation of matching noise;450
8.2.1;1 Introduction and preliminaries;450
8.2.2;2 Statistical framework for matching noise;451
8.2.3;3 Matching noise for KNN distance hot-deck;453
8.2.4;4 An important special case: distance hot-deck;454
8.2.5;5 d0-Kernel hot-deck;455
8.2.6;6 A comparison among different techniques;456
8.2.7;References;457
8.3;Subset selection algorithm based on mutual information;458
8.3.1;1 Introduction;458
8.3.2;2 Estimation of mutual information using normal mixture;460
8.3.3;3 Algorithm for subset selection;461
8.3.4;4 Numerical investigation with real data set;465
8.3.5;References;465
8.4;Visiting near-optimal solutions using local search algorithms;468
8.4.1;1 Background and motivation;468
8.4.2;2 Definitions and notation;469
8.4.3;3 The β-acceptable solution probability;471
8.4.4;4 Visiting a β-acceptable solution;473
8.4.5;5 Computational results;474
8.4.6;6 Conclusions;477
8.4.7;References;478
8.5;The convergence of optimization based GARCH estimators: theory and application*;479
8.5.1;1 Introduction;479
8.5.2;2 Convergence of Optimization Based Estimators;480
8.5.3;3 Application to GARCH Model;483
8.5.4;4 Results;484
8.5.5;5 Conclusions;488
8.5.6;References;489
8.6;The stochastics of threshold accepting: analysis of an application to the uniform design problem;491
8.6.1;1 Introduction;491
8.6.2;2 Formal Framework;492
8.6.3;3 Results for Uniform Design Implementation;493
8.6.4;4 Conclusions and Outlook;498
8.6.5;References;498
9;Part VII Robustness;500
9.1;Robust classification with categorical variables;501
9.1.1;1 Introduction;501
9.1.2;2 Cluster detection through diagnostic monitoring;502
9.1.3;3 Performance of the method;505
9.1.4;4 E-government data;509
9.1.5;Acknowledgement;512
9.1.6;References;512
9.2;Multiple group linear discriminant analysis: robustness and error rate;514
9.2.1;1 Introduction;514
9.2.2;2 Estimation and Robustness;516
9.2.3;3 Optimal Error Rate for Three Groups;518
9.2.4;4 Simulations;520
9.2.5;5 Conclusions;524
9.2.6;References;524
10;Author Index;526