Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories
Vikram Ramanarayanan,Maarten Van Segbroeck,Shrikanth S Narayanan
Vikram Ramanarayanan
How the speech production and perception systems evolved in humans remains a mystery. Previous research suggests that human auditory systems are able, and have possibly evolved, to preserve maximal information about the speaker'...
Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech
Houwei Cao,Ragini Verma,Ani Nenkova
Houwei Cao
We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional ap...
Automatic intelligibility classification of sentence-level pathological speech
Jangwon Kim,Naveen Kumar,Andreas Tsiartas et al.
Jangwon Kim et al.
Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. ...
Acoustic and Lexical Representations for Affect Prediction in Spontaneous Conversations
Houwei Cao,Arman Savran,Ragini Verma et al.
Houwei Cao et al.
In this article we investigate what representations of acoustics and word usage are most suitable for predicting dimensions of affect (AROUSAL, VALENCE, POWER and EXPECTANCY) in spontaneous interactions. Our experiments are based on the AVEC ...
Fully Automated Assessment of the Severity of Parkinson's Disease from Speech
Alireza Bayestehtashk,Meysam Asgari,Izhak Shafran et al.
Alireza Bayestehtashk et al.
For several decades now, there has been sporadic interest in automatically characterizing the speech impairment due to Parkinson's disease (PD). Most early studies were confined to quantifying a few speech features that were easy to compute...
The glottaltopogram: a method of analyzing high-speed images of the vocal folds
Gang Chen,Jody Kreiman,Abeer Alwan
Gang Chen
Laryngeal high-speed videoendoscopy is a state-of-the-art technique to examine physiological vibrational patterns of the vocal folds. With sampling rates of thousands of frames per second, high-speed videoendoscopy produces a large amount o...
Intoxicated Speech Detection: A Fusion Framework with Speaker-Normalized Hierarchical Functionals and GMM Supervectors
Daniel Bone,Ming Li,Matthew P Black et al.
Daniel Bone et al.
Segmental and suprasegmental speech signal modulations offer information about paralinguistic content such as affect, age and gender, pathology, and speaker state. Speaker state encompasses medium-term, temporary physiological phenomena inf...
Huffman scanning: using language models within fixed-grid keyboard emulation
Brian Roark,Russell Beckley,Chris Gibbons et al.
Brian Roark et al.
Individuals with severe motor impairments commonly enter text using a single binary switch and symbol scanning methods. We present a new scanning method, Huffman scanning, which uses Huffman coding to select the symbols to highlight during...
Inferring Social Nature of Conversations from Words: Experiments on a Corpus of Everyday Telephone Conversations
Anthony Stark,Izhak Shafran,Jeffrey Kaye
Anthony Stark
Language is being increasingly harnessed to not only create natural human-machine interfaces but also to infer social behaviors and interactions. In the same vein, we investigate a novel spoken language task, of inferring social relationshi...
Phrase-level speech simulation with an airway modulation model of speech production
Brad H Story
Brad H Story
Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes...