Zhou / Li / Yang | Advances in Knowledge Discovery and Data Mining | E-Book | www.sack.de

E-book, English, 1186 pages

Series: Lecture Notes in Artificial Intelligence

Zhou / Li / Yang Advances in Knowledge Discovery and Data Mining

11th Pacific-Asia Conference, PAKDD 2007, Nanjing, China, May 22-25, 2007, Proceedings
2007
ISBN: 978-3-540-71701-0
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark




This book constitutes the refereed proceedings of the 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007, held in Nanjing, China, May 2007. It covers new ideas, original research results and practical development experiences from all KDD-related areas including data mining, machine learning, data warehousing, data visualization, automatic scientific discovery, knowledge acquisition and knowledge-based systems.


Target audience


Research

Further information & material


Keynote Speeches.- Research Frontiers in Advanced Data Mining Technologies and Applications.- Finding the Real Patterns.- Class Noise vs Attribute Noise: Their Impacts, Detection and Cleansing.- Multi-modal and Multi-granular Learning.

Regular Papers.- Hierarchical Density-Based Clustering of Categorical Data and a Simplification.- Multi-represented Classification Based on Confidence Estimation.- Selecting a Reduced Set for Building Sparse Support Vector Regression in the Primal.- Mining Frequent Itemsets from Uncertain Data.- QC4 - A Clustering Evaluation Method.- Semantic Feature Selection for Object Discovery in High-Resolution Remote Sensing Imagery.- Deriving Private Information from Arbitrarily Projected Data.- Consistency Based Attribute Reduction.- A Hybrid Command Sequence Model for Anomaly Detection.- ?-Algorithm: Structured Workflow Process Mining Through Amalgamating Temporal Workcases.- Multiscale BiLinear Recurrent Neural Network for Prediction of MPEG Video Traffic.- An Effective Multi-level Algorithm Based on Ant Colony Optimization for Bisecting Graph.- A Unifying Method for Outlier and Change Detection from Data Streams Based on Local Polynomial Fitting.- Simultaneous Tuning of Hyperparameter and Parameter for Support Vector Machines.- Entropy Regularization, Automatic Model Selection, and Unsupervised Image Segmentation.- A Timing Analysis Model for Ontology Evolutions Based on Distributed Environments.- An Optimum Random Forest Model for Prediction of Genetic Susceptibility to Complex Diseases.- Feature Based Techniques for Auto-Detection of Novel Email Worms.- Multiresolution-Based BiLinear Recurrent Neural Network.- Query Expansion Using a Collection Dependent Probabilistic Latent Semantic Thesaurus.- Scaling Up Semi-supervised Learning: An Efficient and Effective LLGC Variant.- A Machine Learning Approach to Detecting Instantaneous Cognitive States from fMRI Data.- Discovering Correlated Items in Data Streams.- Incremental Clustering in Geography and Optimization Spaces.- Estimation of Class Membership Probabilities in the Document Classification.- A Hybrid Multi-group Privacy-Preserving Approach for Building Decision Trees.- A Constrained Clustering Approach to Duplicate Detection Among Relational Data.- Understanding Research Field Evolving and Trend with Dynamic Bayesian Networks.- Embedding New Data Points for Manifold Learning Via Coordinate Propagation.- Spectral Clustering Based Null Space Linear Discriminant Analysis (SNLDA).- On a New Class of Framelet Kernels for Support Vector Regression and Regularization Networks.- A Clustering Algorithm Based on Mechanics.- DLDA/QR: A Robust Direct LDA Algorithm for Face Recognition and Its Theoretical Foundation.- gPrune: A Constraint Pushing Framework for Graph Pattern Mining.

Short Papers.- Modeling Anticipatory Event Transitions.- A Modified Relationship Based Clustering Framework for Density Based Clustering and Outlier Filtering on High Dimensional Datasets.- A Region-Based Skin Color Detection Algorithm.- Supportive Utility of Irrelevant Features in Data Preprocessing.- Incremental Mining of Sequential Patterns Using Prefix Tree.- A Multiple Kernel Support Vector Machine Scheme for Simultaneous Feature Selection and Rule-Based Classification.- Combining Supervised and Semi-supervised Classifier for Personalized Spam Filtering.- Qualitative Simulation and Reasoning with Feature Reduction Based on Boundary Conditional Entropy of Knowledge.- A Hybrid Incremental Clustering Method-Combining Support Vector Machine and Enhanced Clustering by Committee Clustering Algorithm.- CCRM: An Effective Algorithm for Mining Commodity Information from Threaded Chinese Customer Reviews.- A Rough Set Approach to Classifying Web Page Without Negative Examples.- Evolution and Maintenance of Frequent Pattern Space When Transactions Are Removed.- Establishing Semantic Relationship in Inter-query Learning for Content-Based Image Retrieval Systems.- Density-Sensitive Evolutionary Clustering.- Reducing Overfitting in Predicting Intrinsically Unstructured Proteins.- Temporal Relations Extraction in Mining Hepatitis Data.- Supervised Learning Approach to Optimize Ranking Function for Chinese FAQ-Finder.- Combining Convolution Kernels Defined on Heterogeneous Sub-structures.- Privacy-Preserving Sequential Pattern Release.- Mining Concept Associations for Knowledge Discovery Through Concept Chain Queries.- Capability Enhancement of Probabilistic Neural Network for the Design of Breakwater Armor Blocks.- Named Entity Recognition Using Acyclic Weighted Digraphs: A Semi-supervised Statistical Method.- Contrast Set Mining Through Subgroup Discovery Applied to Brain Ischaemina Data.- Intelligent Sequential Mining Via Alignment: Optimization Techniques for Very Large DB.- A Hybrid Prediction Method Combining RBF Neural Network and FAR Model.- An Advanced Fuzzy C-Mean Algorithm for Regional Clustering of Interconnected Systems.- Centroid Neural Network with Bhattacharyya Kernel for GPDF Data Clustering.- Concept Interconnection Based on Many-Valued Context Analysis.- Text Classification for Thai Medicinal Web Pages.- A Fast Algorithm for Finding Correlation Clusters in Noise Data.- Application of Discrimination Degree for Attributes Reduction in Concept Lattice.- A Language and a Visual Interface to Specify Complex Spatial Patterns.- Clustering Ensembles Based on Normalized Edges.- Quantum-Inspired Immune Clonal Multiobjective Optimization Algorithm.- Phase Space Reconstruction Based Classification of Power Disturbances Using Support Vector Machines.- Mining the Impact Factors of Threads and Participators on Usenet Using Link Analysis.- Weighted Rough Set Learning: Towards a Subjective Approach.- Multiple Self-Splitting and Merging Competitive Learning Algorithm.- A Novel Relative Space Based Gene Feature Extraction and Cancer Recognition.- Experiments on Kernel Tree Support Vector Machines for Text Categorization.- A New Approach for Similarity Queries of Biological Sequences in Databases.- Anomaly Intrusion Detection Based on Dynamic Cluster Updating.- Efficiently Mining Closed Constrained Frequent Ordered Subtrees by Using Border Information.- Approximate Trace of Grid-Based Clusters over High Dimensional Data Streams.- BRIM: An Efficient Boundary Points Detecting Algorithm.- Syntactic Impact on Sentence Similarity Measure in Archive-Based QA System.- Semi-structure Mining Method for Text Mining with a Chunk-Based Dependency Structure.- Principal Curves with Feature Continuity.- Kernel-Based Linear Neighborhood Propagation for Semantic Video Annotation.- Learning Bayesian Networks with Combination of MRMR Criterion and EMI Method.- A Cooperative Coevolution Algorithm of RBFNN for Classification.- ANGEL: A New Effective and Efficient Hybrid Clustering Technique for Large Databases.- Exploring Group Moving Pattern for an Energy-Constrained Object Tracking Sensor Network.- ProMail: Using Progressive Email Social Network for Spam Detection.- Multidimensional Decision Support Indicator (mDSI) for Time Series Stock Trend Prediction.- A Novel Support Vector Machine Ensemble Based on Subtractive Clustering Analysis.- Keyword Extraction Based on PageRank.- Finding the Optimal Feature Representations for Bayesian Network Learning.- Feature Extraction and Classification of Tumor Based on Wavelet Package and Support Vector Machines.- Resource Allocation and Scheduling Problem Based on Genetic Algorithm and Ant Colony Optimization.- Image Classification and Segmentation for Densely Packed Aggregates.- Mining Temporal Co-orientation Pattern from Spatio-temporal Databases.- Incremental Learning of Support Vector Machines by Classifier Combining.- Clustering Zebrafish Genes Based on Frequent-Itemsets and Frequency Levels.- A Practical Method for Approximate Subsequence Search in DNA Databases.- An Information Retrieval Model Based on Semantics.- AttributeNets: An Incremental Learning Method for Interpretable Classification.- Mining Personalization Interest and Navigation Patterns on Portal.- Cross-Lingual Document Clustering.- Grammar Guided Genetic Programming for Flexible Neural Trees Optimization.- A New Initialization Method for Clustering Categorical Data.- L0-Constrained Regression for Data Mining.- Application of Hybrid Pattern Recognition for Discriminating Paddy Seeds of Different Storage Periods Based on Vis/NIRS.- Density-Based Data Clustering Algorithms for Lower Dimensions Using Space-Filling Curves.- Transformation-Based GMM with Improved Cluster Algorithm for Speaker Identification.- Using Social Annotations to Smooth the Language Model for IR.- Affection Factor Optimization in Data Field Clustering.- A New Algorithm for Minimum Attribute Reduction Based on Binary Particle Swarm Optimization with Vaccination.- Graph Nodes Clustering Based on the Commute-Time Kernel.- Identifying Synchronous and Asynchronous Co-regulations from Time Series Gene Expression Data.- A Parallel Algorithm for Learning Bayesian Networks.- Incorporating Prior Domain Knowledge into a Kernel Based Feature Selection Algorithm.- Geo-spatial Clustering with Non-spatial Attributes and Geographic Non-overlapping Constraint: A Penalized Spatial Distance Measure.- GBKII: An Imputation Method for Missing Values.- An Effective Gene Selection Method Based on Relevance Analysis and Discernibility Matrix.- Towards Comprehensive Privacy Protection in Data Clustering.- A Novel Spatial Clustering with Obstacles Constraints Based on Particle Swarm Optimization and K-Medoids.- Online Rare Events Detection.- Structural Learning About Independence Graphs from Multiple Databases.- An Effective Method For Calculating Natural Adjacency Relation in Spatial Database.- K-Centers Algorithm for Clustering Mixed Type Data.- Proposion and Analysis of a TCP Feature of P2P Traffic.


Research Frontiers in Advanced Data Mining Technologies and Applications (p. 25)
Data mining, as the confluence of multiple intertwined disciplines, including statistics, machine learning, pattern recognition, database systems, information retrieval, World-Wide Web, and many application domains, has achieved great progress in the past decade [1]. Similar to many research fields, data mining has two general directions: theoretical foundations and advanced technologies and applications.

Here we focus on advanced technologies and applications in data mining and discuss some recent progress in this direction. Note that some popular research topics, such as privacy-preserving data mining, are not covered for lack of space and time. Our discussion is organized into nine themes, and we briefly outline the current status and research problems in each theme.

1 Pattern Mining, Pattern Usage, and Pattern Understanding
Frequent pattern mining has been a focused theme in data mining research for over a decade. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structural pattern mining, correlation mining, associative classification, and frequent-pattern-based clustering, as well as their broad applications.

Recently, studies have proceeded to scalable methods for mining colossal patterns where the size of the patterns could be rather large so that the step-by-step growth using an Apriori-like approach does not work, methods for pattern compression, extraction of high-quality top-k patterns, and understanding patterns by context analysis and generation of semantic annotations.
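To make the remark about Apriori-like growth concrete, the following is a minimal sketch of level-wise frequent itemset mining, not the method of any paper in this volume. The function name `apriori` and its parameters are illustrative assumptions. Since every non-empty subset of a frequent pattern is itself frequent, a single pattern of size n forces a level-wise miner to enumerate 2^n - 1 itemsets, which is why this style of growth breaks down for colossal patterns.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise (Apriori-like) frequent itemset mining.

    Hypothetical helper for illustration: returns a dict mapping each
    frequent itemset (a frozenset) to its absolute support count.
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets with exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = {c for c in candidates if support(c) >= min_support}
    return frequent


if __name__ == "__main__":
    # One pattern of size 10, seen twice: all of its non-empty subsets
    # are frequent, so the miner reports 2**10 - 1 itemsets.
    n = 10
    result = apriori([set(range(n))] * 2, min_support=2)
    print(len(result), 2 ** n - 1)
```

Doubling n squares the number of itemsets reported, so the cost grows exponentially with pattern length even though only one maximal pattern is present.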

Moreover, frequent patterns have been used for effective classification by top-k rule generation for long patterns and discriminative frequent pattern analysis. Frequent patterns have also been used for clustering of high-dimensional biological data. Scalable methods for mining long, approximate, compressed, and sophisticated patterns for advanced applications, such as biological sequences and networks, and the exploration of mined patterns for classification, clustering, correlation analysis, and pattern understanding will still be interesting topics in research.

2 Information Network Analysis

Google’s PageRank algorithm has started a revolution in Internet search. However, since information network analysis covers many additional aspects and needs scalable and effective methods, the systematic study of this domain has just started, with many interesting issues to be explored. Information network analysis has broad applications, covering social and biological network analysis, computer network intrusion detection, software program analysis, terrorist network discovery, and Web analysis.
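The text names PageRank without describing it. As a rough illustration only, the sketch below shows the standard textbook power iteration on a small link graph. The function name `pagerank`, the damping value 0.85, and the uniform treatment of nodes without out-links are assumptions made here, not details taken from the book or from any production system.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Textbook PageRank by power iteration (illustrative sketch).

    links: dict mapping each node to a list of nodes it links to.
    Returns a dict mapping each node to its rank; the ranks sum to 1.
    """
    nodes = list(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Assumption: rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                # Each node passes its rank on in equal shares to its targets.
                new[v] += damping * rank[u] / len(targets)
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Two pages link to "c", so "c" ends up with the highest rank.
    ranks = pagerank({"a": ["c"], "b": ["c"], "c": []})
    print(max(ranks, key=ranks.get))
```

Each iteration touches every link once, which is one reason the method scales to large graphs.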

One interesting direction is to treat information networks as graphs and further develop graph mining methods. Recent progress in graph mining and its associated structural pattern-based classification and clustering, graph indexing, and similarity search will play an important role in information network analysis.

Moreover, since information networks often form huge, multidimensional heterogeneous graphs, mining noisy, approximate, and heterogeneous subgraphs based on different applications for the construction of application-specific networks with sophisticated structures will help information network analysis substantially. The discovery of the power law distribution of information networks and the rules on density evolution of information networks will help develop effective algorithms for network analysis.


