E-book, English, 364 pages
Skiadas: Advances in Data Analysis
1st edition, 2009
ISBN: 978-0-8176-4799-5
Publisher: Birkhäuser Boston
Format: PDF
Copy protection: 1 - PDF Watermark
Theory and Applications to Reliability and Inference, Data Mining, Bioinformatics, Lifetime Data, and Neural Networks
Series: Statistics for Industry and Technology
This unified volume is a collection of invited chapters presenting recent developments in the field of data analysis, with applications to reliability and inference, data mining, bioinformatics, lifetime data, and neural networks. The book is a useful reference for graduate students, researchers, and practitioners in statistics, mathematics, engineering, economics, social science, bioengineering, and bioscience.
Further Information & Material
1;Contents;6
2;Preface;14
3;List of Contributors;15
4;List of Tables;15
5;List of Figures;18
6;Part I Data Mining and Text Mining;23
6.1;1 Assessing the Stability of Supplementary Elements on Principal Axes Maps Through Bootstrap Resampling. Contribution to Interpretation in Textual Analysis;24
6.2;Ramón Alvarez-Esteban, Olga Valencia, and Mónica Bécue-Bertaut;24
6.2.1;1.1 Introduction;24
6.2.2;1.2 Data;25
6.2.3;1.3 Methodology;25
6.2.4;1.4 Results;26
6.2.4.1;1.4.1 CA results;26
6.2.4.2;1.4.2 Stability;27
6.2.5;1.5 Conclusion;30
6.2.6;References;31
6.3;2 A Doubly Projected Analysis for Lexical Tables;33
6.4;Simona Balbi and Michelangelo Misuraca;33
6.4.1;2.1 Introduction;33
6.4.2;2.2 Some methodological recall;34
6.4.2.1;2.2.1 Constrained principal component analysis;34
6.4.2.2;2.2.2 Principal component analysis onto a reference subspace;35
6.4.3;2.3 Basic concepts and data structure;35
6.4.4;2.4 A doubly projected analysis;36
6.4.5;2.5 The Italian academic programs: A study on skills and competences supply;36
6.4.6;References;38
6.5;3 Analysis of a Mixture of Closed and Open-Ended Questions in the Case of a Multilingual Survey;40
6.6;Mónica Bécue-Bertaut, Karmele Fernández-Aguirre, and Juan I. Modroño-Herrán;40
6.6.1;3.1 Introduction;40
6.6.2;3.2 Data and objectives;40
6.6.3;3.3 Notation;42
6.6.4;3.4 Methodology;43
6.6.4.1;3.4.1 Principle of multiple factor analysis;43
6.6.4.2;3.4.2 Integrating categorical sets in MFA;44
6.6.4.3;3.4.3 Integrating frequency tables in MFA;44
6.6.4.4;3.4.4 Extended MFA performed as a weighted PCA;44
6.6.5;3.5 Results;45
6.6.5.1;3.5.1 Clustering from closed questions only;45
6.6.5.2;3.5.2 Clustering from closed and open-ended questions;46
6.6.6;3.6 Conclusion;49
6.6.7;References;50
6.7;4 Number of Frequent Patterns in Random Databases;51
6.8;Loïck Lhote;51
6.8.1;4.1 Introduction;51
6.8.2;4.2 Model of databases;52
6.8.2.1;4.2.1 Frequent pattern mining;52
6.8.2.2;4.2.2 Model of random databases;53
6.8.3;4.3 Main results;54
6.8.3.1;4.3.1 Linear frequency threshold;54
6.8.3.2;4.3.2 Constant frequency threshold;54
6.8.3.3;4.3.3 Sketch of proofs;55
6.8.4;4.4 Dynamical databases;56
6.8.4.1;4.4.1 Dynamical sources;56
6.8.4.2;4.4.2 Main tools;57
6.8.4.3;4.4.3 Proof of Theorem 3;59
6.8.5;4.5 Improved memoryless model of databases;60
6.8.6;4.6 Experiments;60
6.8.7;4.7 Conclusion;61
6.8.8;References;62
7;Part II Information Theory and Statistical Applications;64
7.1;5 Introduction;65
7.2;Konstantinos Zografos;65
7.2.1;5.1 Introduction;65
7.2.2;References;66
7.3;6 Measures of Divergence in Model Selection;67
7.4;Alex Karagrigoriou and Kyriacos Mattheou;67
7.4.1;6.1 Introduction;67
7.4.2;6.2 Measures of divergence;68
7.4.3;6.3 Model selection criteria;69
7.4.4;6.4 The divergence information criterion;71
7.4.5;6.5 Lower bound of the MSE of prediction of DIC;74
7.4.6;6.6 Simulations;77
7.4.7;References;80
7.5;7 High Leverage Points and Outliers in Generalized Linear Models for Ordinal Data;82
7.6;M.C. Pardo;82
7.6.1;7.1 Introduction;82
7.6.2;7.2 Background and notation for GLM;83
7.6.3;7.3 The hat matrix: Properties;85
7.6.4;7.4 Outliers;88
7.6.5;7.5 Numerical example;91
7.6.6;7.6 Conclusion;94
7.6.7;References;94
7.7;8 On a Minimization Problem Involving Divergences and Its Applications;96
7.8;Athanasios P. Sachlas and Takis Papaioannou;96
7.8.1;8.1 Introduction;96
7.8.2;8.2 Minimization of divergences;97
7.8.3;8.3 Properties of divergences without probability vectors;98
7.8.4;8.4 Graduating mortality rates via divergences;102
7.8.4.1;8.4.1 Divergence-theoretic actuarial graduation;102
7.8.4.2;8.4.2 Lagrangian duality results for the power divergence;104
7.8.5;8.5 Numerical investigation;105
7.8.6;8.6 Conclusions and comments;106
7.8.7;References;108
8;Part III Asymptotic Behaviour of Stochastic Processes and Random Fields;110
8.1;9 Remarks on Stochastic Models Under Consideration;111
8.2;Ekaterina V. Bulinskaya;111
8.2.1;9.1 Introduction;111
8.2.2;9.2 Results and methods;112
8.2.3;9.3 Applications;114
8.2.4;References;117
8.3;10 New Invariance Principles for Critical Branching Process in Random Environment;119
8.4;Valeriy I. Afanasyev;119
8.4.1;10.1 Introduction;119
8.4.2;10.2 Main results;121
8.4.3;10.3 Proof of Theorem 1;123
8.4.4;10.4 Finite-dimensional distributions;126
8.4.5;10.5 Conclusion;128
8.4.6;References;129
8.5;11 Gaussian Approximation for Multichannel Queueing Systems;130
8.6;Larisa G. Afanas'eva;130
8.6.1;11.1 Introduction;130
8.6.2;11.2 Model description;131
8.6.3;11.3 The basic theorem;131
8.6.4;11.4 A limit theorem for a regenerative arrival process;135
8.6.5;11.5 Doubly stochastic Poisson process (DSPP);136
8.6.6;11.6 Conclusion;140
8.6.7;References;141
8.7;12 Stochastic Insurance Models, Their Optimality and Stability;142
8.8;Ekaterina V. Bulinskaya;142
8.8.1;12.1 Introduction;142
8.8.2;12.2 Model description;143
8.8.3;12.3 Optimal control;143
8.8.4;12.4 Sensitivity analysis;147
8.8.5;12.5 Conclusion;153
8.8.6;References;153
8.9;13 Central Limit Theorem for Random Fields and Applications;154
8.10;Alexander Bulinski;154
8.10.1;13.1 Introduction;154
8.10.2;13.2 Main results;155
8.10.3;13.3 Applications;161
8.10.4;References;163
8.11;14 A Berry–Esseen Type Estimate for Dependent Systems on Transitive Graphs;164
8.12;Alexey Shashkin;164
8.12.1;14.1 Introduction;164
8.12.2;14.2 Main result;165
8.12.3;14.3 Proof;166
8.12.4;14.4 Conclusion;169
8.12.5;References;169
8.13;15 Critical and Subcritical Branching Symmetric Random Walks on d-Dimensional Lattices;170
8.14;Elena Yarovaya;170
8.14.1;15.1 Introduction;170
8.14.2;15.2 Description of a branching random walk;171
8.14.3;15.3 Definition of criticality for branching random walks;173
8.14.4;15.4 Main equations;174
8.14.5;15.5 Asymptotic behavior of survival probabilities;175
8.14.6;15.6 Limit theorems;176
8.14.7;15.7 Proof of theorems for dimensions d=1,2 in critical and subcritical cases;177
8.14.8;15.8 Conclusions;180
8.14.9;References;181
9;Part IV Bioinformatics and Markov Chains;182
9.1;16 Finite Markov Chain Embedding for the Exact Distribution of Patterns in a Set of Random Sequences;183
9.2;Juliette Martin, Leslie Regad, Anne-Claude Camproux, and Grégory Nuel;183
9.2.1;16.1 Introduction;183
9.2.2;16.2 Methods;184
9.2.2.1;16.2.1 Notations;184
9.2.2.2;16.2.2 Pattern Markov chains;185
9.2.2.3;16.2.3 Exact computations;185
9.2.3;16.3 Data;187
9.2.3.1;16.3.1 Simulated data;187
9.2.3.2;16.3.2 Real data;187
9.2.4;16.4 Results and discussion;188
9.2.4.1;16.4.1 Simulation study;188
9.2.4.2;16.4.2 Illustrations on biological sequences;189
9.2.5;16.5 Conclusion;191
9.2.6;References;191
9.3;17 On the Convergence of the Discrete-Time Homogeneous Markov Chain;193
9.4;I. Kipouridis and G. Tsaklidis;193
9.4.1;17.1 Introduction;193
9.4.2;17.2 The homogeneous Markov chain in discrete time;194
9.4.3;17.3 The equation of the image of a hypersphere under the transformation (2.1);194
9.4.4;17.4 Representation of equation (3.6) in matrix form;197
9.4.5;17.5 Conditions for a hypersphere of $\mathbb{R}^{n-1}$ to be the image of a hypersphere under the stochastic transformation $p^T(t) = p^T(t-1)P$;202
9.4.6;References;212
10;Part V Life Table Data, Survival Analysis, and Risk in Household Insurance;213
10.1;18 Comparing the Gompertz-Type Models with a First Passage Time Density Model;214
10.2;Christos H. Skiadas and Charilaos Skiadas;214
10.2.1;18.1 Introduction;214
10.2.2;18.2 The Gompertz-type models;215
10.2.3;18.3 Application to life table and the Carey medfly data;217
10.2.4;18.4 Remarks;218
10.2.5;18.5 Conclusion;219
10.2.6;References;219
10.3;19 A Comparison of Recent Procedures in Weibull Mixture Testing;221
10.4;Karl Mosler and Lars Haferkamp;221
10.4.1;19.1 Introduction;221
10.4.2;19.2 Three approaches for testing homogeneity;222
10.4.3;19.3 Implementing MLRT and D-tests with Weibull alternatives;223
10.4.4;19.4 Comparison of power;225
10.4.5;19.5 Conclusion;227
10.4.6;References;227
10.5;20 Hierarchical Bayesian Modelling of Geographic Dependence of Risk in Household Insurance;229
10.6;László Márkus, N. Miklós Arató, and Vilmos Prokaj;229
10.6.1;20.1 Introduction;229
10.6.2;20.2 Data description, model building, and a tool for fit diagnosis;230
10.6.3;20.3 Model estimation, implementation of the MCMC algorithm;233
10.6.4;20.4 Conclusion;236
10.6.5;References;237
11;Part VI Neural Networks and Self-Organizing Maps;238
11.1;21 The FCN Framework: Development and Applications;239
11.2;Yiannis S. Boutalis, Theodoros L. Kottas, and Manolis A. Christodoulou;239
11.2.1;21.1 Introduction;239
11.2.2;21.2 Fuzzy cognitive maps;242
11.2.2.1;21.2.1 Fuzzy cognitive map representation;242
11.2.3;21.3 Existence and uniqueness of solutions in fuzzy cognitive maps;244
11.2.3.1;21.3.1 The contraction mapping principle;244
11.2.3.2;21.3.2 Exploring the results;247
11.2.3.3;21.3.3 FCM with input nodes;250
11.2.4;21.4 The fuzzy cognitive network approach;252
11.2.4.1;21.4.1 Close interaction with the real system;252
11.2.4.2;21.4.2 Weight updating procedure;252
11.2.4.3;21.4.3 Storing knowledge from previous operating conditions;253
11.2.5;21.5 Controlling a wastewater anaerobic digestion unit (Kottas et al., 2006);256
11.2.5.1;21.5.1 Control of the process using the FCN;258
11.2.5.2;21.5.2 Results;260
11.2.5.3;21.5.3 Discussion;263
11.2.6;21.6 The FCN approach in tracking the maximum power point in PV arrays (Kottas et al., 2007b);263
11.2.6.1;21.6.1 Simulation of the PV system;266
11.2.6.2;21.6.2 Control of the PV system using FCN;267
11.2.6.3;21.6.3 Discussion;269
11.2.7;21.7 Conclusions;270
11.2.8;References;270
11.3;22 On the Use of Self-Organising Maps to Analyse Spectral Data;274
11.4;Véronique Cariou and Dominique Bertrand;274
11.4.1;22.1 Introduction;274
11.4.2;22.2 Self-organising map clustering and visualisation tools;275
11.4.3;22.3 Illustrative examples;276
11.4.4;22.4 Conclusion;280
11.4.5;References;281
11.5;23 Neuro-Fuzzy Versus Traditional Models for Forecasting Wind Energy Production;282
11.6;George Atsalakis, Dimitris Nezis, and Constantinos Zopounidis;282
11.6.1;23.1 Introduction;282
11.6.2;23.2 Related research;283
11.6.3;23.3 Methodology;287
11.6.4;23.4 Model presentation;288
11.6.5;23.5 Results;290
11.6.6;23.6 Conclusion;291
11.6.7;References;292
12;Part VII Parametric and Nonparametric Statistics;295
12.1;24 Nonparametric Comparison of Several Sequential k-out-of-n Systems;296
12.2;Eric Beutner;296
12.2.1;24.1 Introduction;296
12.2.2;24.2 Preliminaries and derivation of the test statistics;297
12.2.2.1;24.2.1 Sequential order statistics: Introduction and motivation;297
12.2.2.2;24.2.2 Sequential order statistics and associated counting processes;299
12.2.3;24.3 K-sample tests for known α's;302
12.2.4;24.4 K-sample tests for unknown α's;304
12.2.5;References;308
12.3;25 Adjusting p-Values when n Is Large in the Presence of Nuisance Parameters;310
12.4;Sonia Migliorati and Andrea Ongaro;310
12.4.1;25.1 Introduction;310
12.4.2;25.2 Normal model with known variance;311
12.4.3;25.3 Normal model with unknown variance;314
12.4.4;25.4 Conclusion;319
12.4.5;25.5 Appendix;320
12.4.6;References;323
13;Part VIII Statistical Theory and Methods;324
13.1;26 Fitting Pareto II Distributions on Firm Size: Statistical Methodology and Economic Puzzles;325
13.2;Aldo Corbellini, Lisa Crosato, Piero Ganugi, and Marco Mazzoli;325
13.2.1;26.1 Introduction;325
13.2.2;26.2 Data description;326
13.2.3;26.3 Fitting the Pareto II distribution by means of the forward search;327
13.2.4;26.4 Empirical results;328
13.2.5;26.5 Economic implications;329
13.2.6;26.6 Concluding remarks;331
13.2.7;References;332
13.3;27 Application of Extreme Value Theory to Economic Capital Estimation;333
13.4;Samit Paul and Andrew Barnes;333
13.4.1;27.1 Introduction;333
13.4.2;27.2 Background mathematics;334
13.4.2.1;27.2.1 Risk measure;334
13.4.2.2;27.2.2 Extreme value theory;334
13.4.2.3;27.2.3 Estimating VaR using EVT;335
13.4.3;27.3 Threshold uncertainty;336
13.4.3.1;27.3.1 Tail-data versus accuracy tradeoff;336
13.4.3.2;27.3.2 Mean residual life plot;336
13.4.3.3;27.3.3 Fit threshold ranges;337
13.4.4;27.4 Experimental framework and results;337
13.4.4.1;27.4.1 Data;337
13.4.4.2;27.4.2 Simulation engine;337
13.4.4.3;27.4.3 Threshold selection;337
13.4.4.4;27.4.4 Bootstrap results on VaR stability;338
13.4.5;27.5 Conclusion;338
13.4.6;References;339
13.5;28 Multiresponse Robust Engineering: Industrial Experiment Parameter Estimation;341
13.6;Elena G. Koleva and Ivan N. Vuchkov;341
13.6.1;28.1 Introduction;341
13.6.2;28.2 Combined method for regression parameter estimation;343
13.6.3;28.3 Experimental designs;345
13.6.4;28.4 Experimental application;345
13.6.5;28.5 Conclusion;347
13.6.6;References;348
13.7;29 Inference for Binomial Change Point Data;349
13.8;James M. Freeman;349
13.8.1;29.1 Introduction;349
13.8.2;29.2 Analysis;350
13.8.3;29.3 Applications;352
13.8.3.1;29.3.1 Page's data;352
13.8.3.2;29.3.2 Lindisfarne Scribes' data;353
13.8.3.3;29.3.3 Club foot data;354
13.8.3.4;29.3.4 Simulated data;354
13.8.4;29.4 Conclusion;355
13.8.5;References;356
14;Index;357




