Man-Machine Speech Communication [electronic resource] : 17th National Conference, NCMMSC 2022, Hefei, China, December 15-18, 2022, Proceedings / edited by Ling Zhenhua, Gao Jianqing, Yu Kai, Jia Jia.

Material type: Text

Language: English

Publication details: Singapore : Springer Nature Singapore : Imprint: Springer, 2023.

Edition: 1st ed. 2023

Description: XI, 332 p. 91 illus., 86 illus. in color. online resource

ISBN: 9789819924011

DDC classification: 006.37 23

Contents:

MCPN: A Multiple Cross-Perception Network for Real-Time Emotion Recognition in Conversation -- Baby Cry Recognition Based on Acoustic Segment Model -- A Multi-feature Sets Fusion Strategy with Similar Samples Removal for Snore Sound Classification -- Multi-Hypergraph Neural Networks for Emotion Recognition in Multi-Party Conversations -- Using Emoji as an Emotion Modality in Text-Based Depression Detection -- Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis -- Semantic enhancement framework for robust speech recognition -- Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model -- Predictive AutoEncoders are Context-Aware Unsupervised Anomalous Sound Detectors -- A pipelined framework with serialized output training for overlapping speech recognition -- Adversarial Training Based on Meta-Learning in Unseen Domains for Speaker Verification -- Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement -- Multiple Confidence Gates for Joint Training of SE and ASR -- Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion -- Pre-training Techniques For Improving Text-to-Speech Synthesis By Automatic Speech Recognition Based Data Enhancement -- A Time-Frequency Attention Mechanism with Subsidiary Information for Effective Speech Emotion Recognition -- Interplay between prosody and syntax-semantics: Evidence from the prosodic features of Mandarin tag questions -- Improving Fine-grained Emotion Control and Transfer with Gated Emotion Representations in Speech Synthesis -- Violence Detection through Fusing Visual Information to Auditory Scene -- Mongolian Text-to-Speech Challenge under Low-Resource Scenario for NCMMSC2022 -- VC-AUG: Voice Conversion based Data Augmentation for Text-Dependent Speaker Verification -- Transformer-based potential emotional relation mining network for emotion recognition in conversation -- FastFoley: Non-Autoregressive Foley Sound Generation Based On Visual Semantics -- Structured Hierarchical Dialogue Policy with Graph Neural Networks -- Deep Reinforcement Learning for On-line Dialogue State Tracking -- Dual Learning for Dialogue State Tracking -- Automatic Stress Annotation and Prediction For Expressive Mandarin TTS -- MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset.

Summary: This book constitutes the refereed proceedings of the 17th National Conference on Man-Machine Speech Communication, NCMMSC 2022, held in Hefei, China, on December 15-18, 2022. The 21 full papers and 7 short papers included in this book were carefully reviewed and selected from 108 submissions. The papers run from "MCPN: A Multiple Cross-Perception Network for Real-Time Emotion Recognition in Conversation" and "Baby Cry Recognition Based on Acoustic Segment Model" through "MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset".

Holdings
Item type	Current library	Call number	Materials specified	Status	Date due	Barcode	Item holds
E-Books	National Library of India Online Resource	006.37		Available		EBK000042745ENG

Total holds: 0
