E-book, English, Volume 13141, 641 pages
Þór Jónsson / Gurrin / Tran: MultiMedia Modeling
1st edition, 2022
ISBN: 978-3-030-98358-1
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF Watermark
28th International Conference, MMM 2022, Phu Quoc, Vietnam, June 6–10, 2022, Proceedings, Part I
Series: Lecture Notes in Computer Science
Target audience
Research
Authors/Editors
Further Information & Material
BEST PAPER SESSION
Real-time detection of tiny objects based on a weighted bi-directional FPN
Multi-Modal Fusion Network for Rumor Detection with Texts and Images
PF-VTON: Toward High-Quality Parser-Free Virtual Try-On Network
MF-GAN: Multi-conditional fusion Generative Adversarial Network for Text-to-Image Synthesis

APPLICATIONS 1
Learning to classify weather conditions from single images without labels
Learning Image Representation via Attribute-aware Attention Networks for Fashion Classification
Toward Detail-Oriented Image-Based Virtual Try-On with Arbitrary Poses
Parallel DBSCAN-Martingale estimation of the number of concepts for automatic satellite image clustering

MULTIMEDIA APPLICATIONS - PERSPECTIVES, TOOLS & APPLICATIONS (Special Session) & BRAVE NEW IDEAS
AI for the Media Industry: Application Potential and Automation Level
Color the Word: Leveraging Web Images for Machine Translation of Untranslatable Words

ACTIVITIES & EVENTS
MGMP: Multimodal Graph Message Propagation Network for Event Detection
Pose-Enhanced Relation Feature for Action Recognition in Still Images
Prostate Segmentation of Ultrasound Images based on Interpretable-guided Mathematical Model
Spatiotemporal Perturbation Based Dynamic Consistency for Semi-Supervised Temporal Action Detection

MULTIMEDIA DATASETS FOR REPEATABLE EXPERIMENTATION (Special Session)
A Task Category Space for User-Centric Comparative Multimedia Search Evaluations
GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval
LLQA - Lifelog Question Answering Dataset

LEARNING
Category-sensitive Incremental Learning For Image-based 3D Shape Reconstruction
AdaConfigure: Reinforcement Learning-based Adaptive Configuration for Video Analytics Services
Mining Minority-class Examples With Uncertainty Estimates
Conditional Context-aware Feature Alignment for Domain Adaptive Detection Transformer

MULTIMEDIA for MEDICAL APPLICATIONS (Special Session)
Human activity recognition with IMU and vital signs feature fusion
On Identifying Pareidolia Phenomenon by Emulating Patient Behavior
Using Explainable AI to Identify Differences between Clinical and Experimental Pain Detection Models Based on Facial Expressions

APPLICATIONS 2
Double Granularity Relation Network with Self-Criticism for Occluded Person Re-Identification
A Complementary Fusion Strategy for RGB-D Face Recognition
Multi-scale Cross-modal Transformer Network for RGB-D Object Detection
Joint Re-Detection and Re-Identification for Multi-Object Tracking

MULTIMEDIA ANALYTICS for CONTEXTUAL HUMAN UNDERSTANDING (Special Session)
An Investigation into Keystroke Dynamics and Heart Rate Variability as Indicators of Stress
Fall detection using multimodal data
Prediction of Blood Glucose using Contextual LifeLog Data
Multimodal Embedding for Lifelog Retrieval

APPLICATIONS 3
A Multiple Positives Enhanced NCE Loss for Image-Text Retrieval
SAM: Self Attention Mechanism for Scene Text Recognition based on Swin Transformer
JVCSR: Video Compressive Sensing Reconstruction with Joint In-loop Reference Enhancement and Out-loop Super-resolution
Point Cloud Upsampling via a Coarse-to-fine Network

IMAGE ANALYTICS
Arbitrary Style Transfer With Adaptive Channel Network
Fast Single Image Dehazing Using Morphological Reconstruction and Saturation Compensation
One-Stage Image Inpainting with Hybrid Attention
Real-time FPGA Design for OMP Targeting 8K Image Reconstruction

SPEECH & MUSIC
Time-Frequency Attention For Speech Emotion Recognition With Squeeze-and-Excitation Blocks
Speech Intelligibility Enhancement by Non-Parallel Speech Style Conversion Using CWT and iMetricGAN Based CycleGAN
A-Muze-Net: Music Generation by Composing the Harmony based on the Generated Melody

MULTIMODAL ANALYTICS
Bi-attention modal separation network for multimodal video fusion
Combining Knowledge and Multi-modal Fusion for Meme Classification
Non-Uniform Attention Network for Multi-modal Sentiment Analysis
Multimodal Unsupervised Image-to-Image Translation Without Independent Style Encoder