E-book, English, 345 pages
Thiago Christiano Silva / Liang Zhao: Machine Learning in Complex Networks
1st edition 2016
ISBN: 978-3-319-17290-3
Publisher: Springer Nature Switzerland
Format: PDF
Copy protection: PDF watermark
This book presents the features and advantages that complex networks offer to the machine learning domain. The first part gives an overview of complex networks and network-based machine learning, providing the necessary background material. The second part describes in detail specific complex-network-based techniques for supervised, unsupervised, and semi-supervised learning. In particular, a stochastic particle competition technique for both unsupervised and semi-supervised learning, built on a stochastic nonlinear dynamical system, is described in detail. A theoretical analysis is also supplied, which makes it possible to predict the behavior of the proposed technique. In addition, data reliability issues are explored in semi-supervised learning, a matter of practical importance that is rarely addressed in the literature. To validate these techniques on real problems, simulations on widely accepted benchmark databases are conducted.

The book also presents a hybrid supervised classification technique that combines low-level and high-level learning. The low-level term can be implemented by any classification technique, while the high-level term is realized by extracting features of the underlying network constructed from the input data. Thus, the former classifies test instances by their physical features, while the latter measures how well the test instances comply with the pattern formation of the data. The authors show that the high-level technique can perform classification according to the semantic meaning of the data. By combining two widely studied research areas, machine learning and complex networks, this book aims to attract broad interest from the scientific community, particularly from the computer science and engineering areas.
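The hybrid framework described above combines a low-level (feature-based) and a high-level (network-based) class membership score. The sketch below illustrates one plausible way to mix the two terms as a convex combination weighted by a compliance parameter; the function name, the inputs, and the mixing rule are illustrative assumptions, not the book's exact formulation.

```python
import numpy as np

def hybrid_membership(low_level, high_level, lam=0.5):
    """Combine low- and high-level per-class membership scores.

    low_level:  per-class scores from any conventional classifier
                (the "physical features" term).
    high_level: per-class scores measuring how well the test instance
                complies with each class's network pattern formation.
    lam:        weight in [0, 1] given to the high-level term
                (hypothetical name; lam = 0 reduces to the
                low-level classifier alone).
    """
    low = np.asarray(low_level, dtype=float)
    high = np.asarray(high_level, dtype=float)
    combined = (1.0 - lam) * low + lam * high
    return combined / combined.sum()  # normalize to a distribution

# Example: the low-level classifier favors class 0, but the instance
# conforms better to the network pattern of class 1; with equal
# weighting, the high-level term tips the decision to class 1.
scores = hybrid_membership([0.7, 0.3], [0.2, 0.8], lam=0.5)
predicted = int(np.argmax(scores))
```

With `lam=0.5` the combined scores are `[0.45, 0.55]`, so the prediction flips to class 1 even though the low-level term alone would have chosen class 0, illustrating how pattern compliance can override purely physical features.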
Authors/Editors
Further information & material
1;Preface;8
1.1;Acknowledgments;9
2;Contents;10
3;Biography;16
4;List of Symbols;18
5;1 Introduction;20
5.1;1.1 General Background;20
5.2;1.2 Focus of This Book;23
5.3;1.3 Organization of the Remainder of the Book;29
5.4;References;30
6;2 Complex Networks;33
6.1;2.1 Basic Concepts of Graphs;33
6.1.1;2.1.1 Graph Definitions;34
6.1.2;2.1.2 Connectivity;38
6.1.3;2.1.3 Paths and Cycles;42
6.1.4;2.1.4 Subgraphs;44
6.1.5;2.1.5 Trees and Forests;46
6.1.6;2.1.6 Graph Representation;47
6.2;2.2 Complex Network Models;49
6.2.1;2.2.1 Random Networks;50
6.2.2;2.2.2 Small-World Networks;52
6.2.3;2.2.3 Scale-Free Networks;53
6.2.4;2.2.4 Random Clustered Networks;55
6.2.5;2.2.5 Core-Periphery Networks;55
6.3;2.3 Complex Network Measures;58
6.3.1;2.3.1 Degree and Degree-Correlation Measures;58
6.3.2;2.3.2 Distance and Path Measures;61
6.3.3;2.3.3 Structural Measures;62
6.3.4;2.3.4 Centrality Measures;65
6.3.4.1;2.3.4.1 Distance-Based Centrality Measures;65
6.3.4.2;2.3.4.2 Path- and Walk-Based Centrality Measures;66
6.3.4.3;2.3.4.3 Vitality;68
6.3.4.4;2.3.4.4 General Feedback Centrality;69
6.3.5;2.3.5 Classification of the Network Measurements;72
6.4;2.4 Dynamical Processes in Complex Networks;73
6.4.1;2.4.1 Random Walks;73
6.4.2;2.4.2 Lazy Random Walks;80
6.4.3;2.4.3 Self-Avoiding Walks;81
6.4.4;2.4.4 Tourist Walks;81
6.4.5;2.4.5 Epidemic Spreading;83
6.4.5.1;2.4.5.1 Susceptible-Infected-Recovered (SIR) Model;83
6.4.5.2;2.4.5.2 Susceptible-Infected-Susceptible (SIS) Model;84
6.4.5.3;2.4.5.3 Epidemic Spreading in Complex Networks;84
6.5;2.5 Chapter Remarks;85
6.6;References;85
7;3 Machine Learning;89
7.1;3.1 Overview of Machine Learning;89
7.2;3.2 Supervised Learning;92
7.2.1;3.2.1 Mathematical Formalization and Fundamental Assumptions;92
7.2.2;3.2.2 Overview of the Techniques;95
7.3;3.3 Unsupervised Learning;96
7.3.1;3.3.1 Mathematical Formalization and Fundamental Assumptions;96
7.3.2;3.3.2 Overview of the Techniques;98
7.4;3.4 Semi-Supervised Learning;100
7.4.1;3.4.1 Motivations;100
7.4.2;3.4.2 Mathematical Formalization and Fundamental Assumptions;101
7.4.3;3.4.3 Overview of the Techniques;103
7.5;3.5 Overview of Network-Based Machine Learning;104
7.6;3.6 Chapter Remarks;105
7.7;References;106
8;4 Network Construction Techniques;110
8.1;4.1 Introduction;110
8.2;4.2 Similarity and Dissimilarity Functions;113
8.2.1;4.2.1 Formal Definitions;113
8.2.2;4.2.2 Examples of Vector-Based Similarity Functions;115
8.2.2.1;4.2.2.1 Numerical Data;116
8.2.2.2;4.2.2.2 Categorical Data;119
8.3;4.3 Transforming Vector-Based Data into Networks;121
8.3.1;4.3.1 Analysis of k-Nearest Neighbors and ε-Radius Networks;123
8.3.2;4.3.2 Combination of k-Nearest Neighbors and ε-Radius Network Formation Techniques;125
8.3.3;4.3.3 b-Matching Networks;126
8.3.4;4.3.4 Linear Neighborhood Networks;127
8.3.5;4.3.5 Relaxed Linear Neighborhood Networks;129
8.3.6;4.3.6 Network Formation Using Clustering Heuristics;131
8.3.7;4.3.7 Network Formation Using Overlapping Histogram Segments;132
8.3.8;4.3.8 More Advanced Network Formation Techniques;136
8.4;4.4 Transforming Time Series Data into Networks;138
8.4.1;4.4.1 Cycle Networks;141
8.4.2;4.4.2 Correlation Networks;142
8.4.3;4.4.3 Recurrence Networks;143
8.4.4;4.4.4 Transition Networks;143
8.5;4.5 Classification of Network Formation Techniques;144
8.6;4.6 Challenges in Transforming Unstructured Data to Networked Data;145
8.7;4.7 Chapter Remarks;147
8.8;References;147
9;5 Network-Based Supervised Learning;150
9.1;5.1 Introduction;150
9.2;5.2 Representative Network-Based Supervised Learning Techniques;152
9.2.1;5.2.1 Classification Using k-Associated Graphs;153
9.2.2;5.2.2 Network Learning Toolkit (NetKit);154
9.2.3;5.2.3 Classification Using Ease of Access Heuristic;155
9.2.3.1;5.2.3.1 Training Phase;156
9.2.3.2;5.2.3.2 Classification Phase;156
9.3;5.3 Chapter Remarks;157
9.4;References;158
10;6 Network-Based Unsupervised Learning;159
10.1;6.1 Introduction;159
10.2;6.2 Community Detection;162
10.2.1;6.2.1 Relevant Concepts and Motivations;162
10.2.2;6.2.2 Mathematical Formalization and Fundamental Assumptions;164
10.2.3;6.2.3 Overview of the State-of-the-Art Techniques;166
10.2.4;6.2.4 Community Detection Benchmarks;166
10.3;6.3 Representative Network-Based Unsupervised Learning Techniques;167
10.3.1;6.3.1 Betweenness;168
10.3.2;6.3.2 Modularity Maximization;169
10.3.2.1;6.3.2.1 Clauset et al. Algorithm;170
10.3.2.2;6.3.2.2 Louvain Algorithm;172
10.3.3;6.3.3 Spectral Bisection Method;173
10.3.4;6.3.4 Community Detection Using Particle Competition;175
10.3.5;6.3.5 Chameleon;177
10.3.6;6.3.6 Community Detection by Space Transformation and Swarm Dynamics;179
10.3.7;6.3.7 Synchronization Methods;183
10.3.8;6.3.8 Finding Overlapping Communities;185
10.3.8.1;6.3.8.1 Clique Percolation;186
10.3.8.2;6.3.8.2 Bayesian Nonnegative Matrix Factorization Algorithm;187
10.3.8.3;6.3.8.3 Fuzzy Partition Algorithm;188
10.3.9;6.3.9 Network Embedding and Dimension Reduction;190
10.4;6.4 Chapter Remarks;192
10.5;References;193
11;7 Network-Based Semi-Supervised Learning;197
11.1;7.1 Introduction;197
11.2;7.2 Network-Based Semi-Supervised Learning Assumptions;199
11.3;7.3 Representative Network-Based Semi-Supervised Learning Techniques;201
11.3.1;7.3.1 Maximum Flow and Minimum Cut;202
11.3.2;7.3.2 Gaussian Field and Harmonic Function;203
11.3.3;7.3.3 Tikhonov Regularization Framework;205
11.3.4;7.3.4 Local and Global Consistency;206
11.3.5;7.3.5 Adsorption;207
11.3.6;7.3.6 Semi-Supervised Modularity Method;210
11.3.7;7.3.7 Interaction Forces;213
11.3.8;7.3.8 Discriminative Walks (D-Walks);214
11.4;7.4 Chapter Remarks;218
11.5;References;219
12;8 Case Study of Network-Based Supervised Learning: High-Level Data Classification;222
12.1;8.1 A Quick Overview of the Chapter;223
12.2;8.2 Motivation;224
12.3;8.3 Model Description;227
12.3.1;8.3.1 Fundamental Ideas Behind the Model;227
12.3.1.1;8.3.1.1 Training Phase;227
12.3.1.2;8.3.1.2 Classification Phase;229
12.3.2;8.3.2 Derivation of the Hybrid Classification Framework;231
12.4;8.4 Possible Ways of Composing High-Level Classifiers;234
12.4.1;8.4.1 High-Level Classification Using a Mixture of Complex Network Measures;234
12.4.1.1;8.4.1.1 First Network Measure: Assortativity;235
12.4.1.2;8.4.1.2 Second Network Measure: Clustering Coefficient;235
12.4.1.3;8.4.1.3 Third Network Measure: Average Degree or Connectivity;236
12.4.2;8.4.2 High-Level Classification Using Tourist Walks;237
12.4.2.1;8.4.2.1 Calculating the Variational Descriptors Ti(y)(μ) and Ci(y)(μ);239
12.4.2.2;8.4.2.2 Determining the Intra-Modulation Parameters wintra(y)(μ);240
12.4.2.3;8.4.2.3 Determining the Inter-Modulation Parameters winter(y)(μ);240
12.5;8.5 Numerical Analysis of the High-Level Classification;241
12.5.1;8.5.1 An Illustrative Example;241
12.5.2;8.5.2 Parameter Sensitivity Analysis;242
12.5.2.1;8.5.2.1 Influence of the Parameters Related to the Network Formation;243
12.5.2.2;8.5.2.2 Influence of the Critical Memory Length;243
12.6;8.6 Application: Handwritten Digits Recognition;246
12.6.1;8.6.1 Motivation;246
12.6.2;8.6.2 Description of the MNIST Data Set;247
12.6.3;8.6.3 A Suitable Similarity Measure for Images;247
12.6.4;8.6.4 Configurations of the Low-Level Classification Techniques;248
12.6.5;8.6.5 Experimental Results;248
12.6.6;8.6.6 Illustrative Examples: High-Level Classification vs. Low-Level Classification;249
12.7;8.7 Chapter Remarks;253
12.8;References;253
13;9 Case Study of Network-Based Unsupervised Learning: Stochastic Competitive Learning in Networks;256
13.1;9.1 A Quick Overview of the Chapter;256
13.2;9.2 Description of the Stochastic Competitive Model;257
13.2.1;9.2.1 Intuition of the Model;258
13.2.2;9.2.2 Derivation of the Transition Matrix;259
13.2.3;9.2.3 Definition of the Stochastic Nonlinear Dynamical System;267
13.2.4;9.2.4 Method for Estimating the Number of Communities;269
13.2.5;9.2.5 Method for Detecting Overlapping Structures;270
13.2.6;9.2.6 Parameter Sensitivity Analysis;270
13.2.6.1;9.2.6.1 Impact of the Parameter λ;270
13.2.6.2;9.2.6.2 Impact of the Parameter Δ;272
13.2.6.3;9.2.6.3 Impact of the Parameter K;273
13.2.7;9.2.7 Convergence Analysis;274
13.3;9.3 Theoretical Analysis of the Model;278
13.3.1;9.3.1 Mathematical Analysis;278
13.3.1.1;9.3.1.1 Discovering the Factor Pp(t+1);279
13.3.1.2;9.3.1.2 Discovering the Factor PN(t+1);279
13.3.1.3;9.3.1.3 Discovering the Factor PE(t+1);280
13.3.1.4;9.3.1.4 Discovering the Factor PS(t+1);281
13.3.1.5;9.3.1.5 The Transition Probability Function;281
13.3.1.6;9.3.1.6 Discovering the Distribution P(N(t));282
13.3.1.7;9.3.1.7 Discovering the Distribution of the Domination Level Matrix P(N(t));286
13.3.2;9.3.2 Linking the Particle Competition Model and the Classical Multiple Independent Random Walks System;289
13.3.3;9.3.3 A Numerical Example;291
13.4;9.4 Numerical Analysis of the Detection of Overlapping Vertices and Communities;295
13.4.1;9.4.1 Zachary's Karate Club Network;295
13.4.2;9.4.2 Dolphin Social Network;297
13.4.3;9.4.3 Les Misérables Novel Network;298
13.5;9.5 Application: Handwritten Digits and Letters Clustering;299
13.5.1;9.5.1 Brief Information of the Handwritten Digits and Letters Data Sets;299
13.5.2;9.5.2 Determining the Optimal Number of Particles and Clusters;300
13.5.3;9.5.3 Handwritten Data Clustering;300
13.6;9.6 Chapter Remarks;303
13.7;References;304
14;10 Case Study of Network-Based Semi-Supervised Learning: Stochastic Competitive-Cooperative Learning in Networks;306
14.1;10.1 A Quick Overview of the Chapter;306
14.2;10.2 Description of the Stochastic Competitive-Cooperative Model;307
14.2.1;10.2.1 Differences of the Semi-Supervised and the Unsupervised Versions;308
14.2.2;10.2.2 Familiarizing with the Semi-Supervised Environment;310
14.2.3;10.2.3 Deriving the Modified Competitive Transition Matrix;310
14.2.4;10.2.4 Modified Initial Conditions of the System;311
14.3;10.3 Theoretical Analysis of the Model;313
14.3.1;10.3.1 Mathematical Analysis;313
14.3.2;10.3.2 A Numerical Example;316
14.4;10.4 Numerical Analysis of the Model;319
14.4.1;10.4.1 Simulation on a Synthetic Data Set;319
14.4.2;10.4.2 Simulations on Real-World Data Sets;321
14.5;10.5 Application: Detection and Prevention of Error Propagation in Imperfect Learning;323
14.5.1;10.5.1 Motivation;323
14.5.2;10.5.2 Detecting Imperfect Training Data;324
14.5.3;10.5.3 Preventing Label Propagation from Imperfect Training Data;326
14.5.4;10.5.4 Definition of the Modified Learning System to Withstand Imperfect Data;328
14.5.5;10.5.5 Parameter Sensitivity Analysis;328
14.5.5.1;10.5.5.1 Impact of ?;329
14.5.5.2;10.5.5.2 Impact of ?;331
14.5.6;10.5.6 Computer Simulations;332
14.5.6.1;10.5.6.1 Synthetic Data Sets;332
14.5.6.2;10.5.6.2 Real-World Data Sets;334
14.6;10.6 Chapter Remarks;335
14.7;References;336
15;Index;337