Book, English, 121 pages, format (W × H): 155 mm × 235 mm, weight: 2488 g
Towards an Intelligent, Fuzzy Based, Multimodal, Two-Stage Speech Enhancement System
Series: SpringerBriefs in Cognitive Computation
ISBN: 978-3-319-13508-3
Publisher: Springer International Publishing
This book presents a summary of the cognitively inspired basis of multimodal speech enhancement, covering the relationship between the audio and visual modalities in speech as well as recent research into audiovisual speech correlation. It also discusses a number of audiovisual speech filtering approaches that make use of this relationship. A novel multimodal speech enhancement system, which uses both visual and audio information to filter speech, is presented, and the book explores extending this system with fuzzy logic to demonstrate an initial implementation of an autonomous, adaptive, and context-aware multimodal system. The work also discusses the challenges of testing such a system and the limitations of many current audiovisual speech corpora, and outlines a suitable approach to developing a corpus designed to test this novel, cognitively inspired speech filtering system.
Target audience
Research
Authors/Eds.
Subject areas
- Mathematics | Computer Science > IT | Computer Science > Computer Science > Artificial Intelligence > Speech Recognition, Speech Processing
- Engineering Sciences > Other Technologies | Applied Engineering > Signal Processing, Image Processing, Scanning
- Mathematics | Computer Science > IT | Computer Science > Computer Science > Audio Signal Processing
Further information & material
Introduction.- Audio and visual speech relationship.- The research context.- A two stage multimodal speech enhancement system.- Experiments, results, and analysis.- Towards fuzzy logic based multimodal speech filtering.- Evaluation of fuzzy logic proof of concept.- Conclusions and future work.