E-book, English, 839 pages
Karpov / Potapova Speech and Computer
1st edition 2021
ISBN: 978-3-030-87802-3
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF Watermark
23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021, Proceedings
Series: Lecture Notes in Artificial Intelligence
Target audience
Research
Further information and material
Text-Independent Speaker Verification Employing CNN-LSTM-TDNN Hybrid Networks
End-to-End Voice Spoofing Detection Employing Time Delay Neural Networks and Higher Order Statistics
Assessing Velar Gestures Timing in European Portuguese Nasal Vowels with RT-MRI Data
Designing and Deploying an Interaction Modality for Articulatory-Based Audiovisual Speech Synthesis
Kurdish Spoken Dialect Recognition Using X-vector Speaker Embedding
An ASR-based Tutor for Learning to Read: How to Optimize Feedback to First Graders
Velocity Differences Between Velum Raising and Lowering Movements
Pragmatic Markers of Russian Everyday Speech: Invariants in Dialogue and Monologue
Language Adaptation for Speaker Recognition Systems using Contrastive Learning
Evaluating X-vector-based Speaker Anonymization Under White-box Assessment
Improved Prosodic Clustering for Multispeaker and Speaker-Independent Phoneme-Level Prosody Control
Initial Experiments on Question Answering from the Intrinsic Structure of Oral History Archives
Imagined, Intended, and Spoken Speech Envelope Synthesis from Neuromagnetic Signals
What Causes Phonetic Reduction in Russian Speech: New Evidence from Machine Learning Algorithms
Toxic Comment Classification Service in Social Network
Deep Learning based Engagement Recognition in Highly Imbalanced Data
Intraspeaker Variability of a Professional Lecturer: Ageing, Genre, Pragmatics vs. Voice Acting (Case Study)
An Ensemble Approach for the Diagnosis of COVID-19 from Speech and Cough Sounds
Where are We in Semantic Concept Extraction for Spoken Language Understanding?
Learning Mizo Tones from F0 Contours using 1D-CNN
OCR Improvements for Images of Multi-Page Historical Documents
X-Bridge: Image-to-Image Translation with Reconstruction Capabilities
Who is Selling to Whom - Feature Evaluation for Multi-block Classification in Invoice Information Extraction
Multimodal Corpus Analysis of Autoblog 2020: Lecture Videos in Machine Learning
Text and Synthetic Data for Domain Adaptation in End-to-End Speech Recognition
Speaker-invariant Speech-To-Intent Classification for Low-Resource Languages
Speaker-Dependent Visual Command Recognition in Vehicle Cabin: Methodology and Evaluation
Optimised Code-Switched Language Model Data Augmentation in Four Under-Resourced South African Languages
Synthesis Speech based Data Augmentation for Low Resource Children ASR
End-to-End Russian Speech Recognition Models with Multi-Head Attention
Word-level Style Control for Expressive, Non-attentive Speech Synthesis
Perceiving Speech Aggression with and without Textual Context on Twitter Social Network Site
Assessing Speaker Interpolation in Neural Text-to-Speech
A Mobile Application for Detection of Amyotrophic Lateral Sclerosis via Voice Analysis
Child's Emotional Speech Classification by Human across Two Languages: Russian & Tamil
Analysis of Dialogues of Typically Developing Children, Children with Down Syndrome and ASD using Machine Learning Methods
Speaker Adaptation with Continuous Vocoder-based DNN-TTS
Automatic Recognition of the Psychoneurological State of Children: Autism Spectrum Disorders, Down Syndrome, Typical Development
Study on Acoustic Model Personalization in a Context of Collaborative Learning Constrained by Privacy Preservation
USC: An Open-Source Uzbek Speech Corpus and Initial Speech Recognition Experiments
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English
Dialog Speech Sentiment Classification for Imbalanced Datasets
Explicit Control of the Level of Expressiveness in DNN-based Speech Synthesis by Embedding Interpolation
Experimental Analysis of Expert and Quantitative Estimates of Syllable Recordings in the Process of Speech Rehabilitation
Methods for Using Class Based N-gram Language Models in the Kaldi Toolkit
Spectral Root Features for Replay Spoof Detection in Voice Assistants
Influence of the Aggressive Internet Environment on Cognitive Personality Disorders (in Relation to the Russian Young Generation of Users)
Media Content vs Nature Stimuli Influence on Human Brain Activity
Can Your Eyes Tell Us Why You Hesitate? Comparing Reading Aloud in Russian as L1 and Japanese as L2
Recognition of Heavily Accented and Emotional Speech of English and Czech Holocaust Survivors using Various DNN Architectures
Assessing Speaker-Independent Character Information for Acted Voices
Influence of Speaker Pre-training on Character Voice Representation
Opinion Classification via Word and Emoji Embedding Models with LSTM
An Equal Data Setting for Attention-based Encoder-Decoder and HMM/DNN Models: a Case Study in Finnish ASR
Speaker-aware Training of Speech Emotion Classifier with Speaker Recognition
Neural Network Recognition of Russian Noun and Adjective Cases in the Google Books Ngram Corpus
Is it a Filler or a Pause? A Quantitative Analysis of Filled Pauses in Hebrew
Modified Group Delay Function using Different Spectral Smoothing Techniques for Voice Liveness Detection
Complex Rhythm Adjustments in Multilingual Code-Switching across Mandarin, English and Russian
Increasing the Precision of Dysarthric Speech Intelligibility and Severity Level Estimate
Articulation During Voice Disguise: a Pilot Study
Improvement of Speaker Number Estimation by Applying an Overlapped Speech Detector
Mind Your Tweet: Abusive Tweet Detection
Speaker Authorization for Air Traffic Control Security
Prosodic Changes with Age: a Longitudinal Study on a Famous European Portuguese Native Speaker
Automatic Selection of the Most Characterizing Features for Detecting COPD in Speech
Multilingual Training Set Selection for ASR in Under-Resourced Malian Languages
Human and Transformer-Based Prosodic Phrasing in Two Speech Genres
Learning Efficient Representations for Keyword Spotting with Triplet Loss
Regularized Forward-Backward Decoder for Attention Models
Induced Local Attention for Transformer Models in Speech Recognition
Applying EEND Diarization to Telephone Recordings from a Call Center
Acoustic Characteristics of Speech Entrainment in Dialogues in Similar Phonetic Sequences
Predicting Biometric Error Behaviour from Speaker Embeddings and a Fast Score Normalization Scheme