E-book, English, 292 pages
Series: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series
Spangler Accelerating Discovery
Publication year: 2015
ISBN: 978-1-4822-3914-0
Publisher: Taylor & Francis
Format: PDF
Copy protection: Adobe DRM
Mining Unstructured Information for Hypothesis Generation
Unstructured Mining Approaches to Solve Complex Scientific Problems
As the volume of scientific data and literature increases exponentially, scientists need more powerful tools and methods to process and synthesize information and to formulate new hypotheses that are most likely to be both true and important. Accelerating Discovery: Mining Unstructured Information for Hypothesis Generation describes a novel approach to scientific research that uses unstructured data analysis as a generative tool for new hypotheses.
The author develops a systematic process for leveraging heterogeneous structured and unstructured data sources, data mining, and computational architectures to make the discovery process faster and more effective. This process accelerates human creativity by allowing scientists and inventors to more readily analyze and comprehend the space of possibilities, compare alternatives, and discover entirely new approaches.
Encompassing systematic and practical perspectives, the book provides the necessary motivation and strategies as well as a heterogeneous set of comprehensive, illustrative examples. It reveals the importance of heterogeneous data analytics in aiding scientific discoveries and furthers data science as a discipline.
Subject areas
- Mathematics | Computer Science > IT | Computer Science > Computer Science > Theoretical Computer Science
- Economics and Business > Economics > General Economics > Economic Statistics, Demography
- Economics and Business > Business Administration > Business Mathematics and Statistics
- Mathematics | Computer Science > IT | Computer Science > Data / Databases > Data Mining
Further information & material
Introduction
Why Accelerate Discovery?
Scott Spangler and Ying Chen
THE PROBLEM OF SYNTHESIS
THE PROBLEM OF FORMULATION
WHAT WOULD DARWIN DO?
THE POTENTIAL FOR ACCELERATED DISCOVERY: USING COMPUTERS TO MAP THE KNOWLEDGE SPACE
WHY ACCELERATE DISCOVERY: THE BUSINESS PERSPECTIVE
COMPUTATIONAL TOOLS THAT ENABLE ACCELERATED DISCOVERY
ACCELERATED DISCOVERY FROM A SYSTEM PERSPECTIVE
ACCELERATED DISCOVERY FROM A DATA PERSPECTIVE
ACCELERATED DISCOVERY IN THE ORGANIZATION
CHALLENGE (AND OPPORTUNITY) OF ACCELERATED DISCOVERY
Form and Function
THE PROCESS OF ACCELERATED DISCOVERY
CONCLUSION
Exploring Content to Find Entities
SEARCHING FOR RELEVANT CONTENT
HOW MUCH DATA IS ENOUGH? WHAT IS TOO MUCH?
HOW COMPUTERS READ DOCUMENTS
EXTRACTING FEATURES
FEATURE SPACES: DOCUMENTS AS VECTORS
CLUSTERING
DOMAIN CONCEPT REFINEMENT
MODELING APPROACHES
DICTIONARIES AND NORMALIZATION
COHESION AND DISTINCTNESS
SINGLE AND MULTIMEMBERSHIP TAXONOMIES
SUBCLASSING AREAS OF INTEREST
GENERATING NEW QUERIES TO FIND ADDITIONAL RELEVANT CONTENT
VALIDATION
SUMMARY
Organization
DOMAIN-SPECIFIC ONTOLOGIES AND DICTIONARIES
SIMILARITY TREES
USING SIMILARITY TREES TO INTERACT WITH DOMAIN EXPERTS
SCATTER-PLOT VISUALIZATIONS
USING SCATTER PLOTS TO FIND OVERLAPS BETWEEN NEARBY ENTITIES OF DIFFERENT TYPES
DISCOVERY THROUGH VISUALIZATION OF TYPE SPACE
Relationships
WHAT DO RELATIONSHIPS LOOK LIKE?
HOW CAN WE DETECT RELATIONSHIPS?
REGULAR EXPRESSION PATTERNS FOR EXTRACTING RELATIONSHIPS
NATURAL LANGUAGE PARSING
COMPLEX RELATIONSHIPS
EXAMPLE: P53 PHOSPHORYLATION EVENTS
PUTTING IT ALL TOGETHER
EXAMPLE: DRUG/TARGET/DISEASE RELATIONSHIP
NETWORKS
CONCLUSION
Inference
CO-OCCURRENCE TABLES
CO-OCCURRENCE NETWORKS
RELATIONSHIP SUMMARIZATION GRAPHS
HOMOGENEOUS RELATIONSHIP NETWORKS
HETEROGENEOUS RELATIONSHIP NETWORK