IEEE/ACM Transactions on Audio, Speech, and Language Processing: Article Index

[Formula: see text] Estimation and Voicing Detection With Cascade Architecture in Noisy Speech

Yixuan Zhang, Heming Wang, DeLiang Wang

As a fundamental problem in speech processing, pitch tracking has been studied for decades. While strong performance has been achieved on clean speech, pitch tracking in noisy speech is still challenging. Severe non-stationary noises not on...

IEEE/ACM transactions on audio, speech, and language processing. 2023:31:3760-3770. DOI:10.1109/TASLP.2023.3313427

Speech Enhancement for Cochlear Implant Recipients using Deep Complex Convolution Transformer with Frequency Transformation

Nursadul Mamun, John H L Hansen

The presence of background noise or competing talkers is one of the main communication challenges for cochlear implant (CI) users in speech understanding in naturalistic spaces. These external factors distort the time-frequency (T-F) conten...

IEEE/ACM transactions on audio, speech, and language processing. 2024:32:2616-2629. DOI:10.1109/taslp.2024.3366760

Selective Acoustic Feature Enhancement for Speech Emotion Recognition With Noisy Speech

Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela et al.

A speech emotion recognition (SER) system deployed on a real-world application can encounter speech contaminated with unconstrained background noise. To deal with this issue, a speech enhancement (SE) module can be attached to the SER syste...

IEEE/ACM transactions on audio, speech, and language processing. 2024:32:917-929. DOI:10.1109/taslp.2023.3340603

Glottal Airflow Estimation using Neck Surface Acceleration and Low-Order Kalman Smoothing

Arturo Morales, Juan I Yuz, Juan Pablo Cortés et al.

The use of non-invasive skin accelerometers placed over the extrathoracic trachea has been proposed in the literature for measuring vocal function. Glottal airflow is estimated using inverse filtering or Bayesian techniques based on a subgl...

IEEE/ACM transactions on audio, speech, and language processing. 2023:31:2055-2066. DOI:10.1109/taslp.2023.3277269

Bilateral Cochlear Implant Processing of Coding Strategies With CCi-MOBILE, an Open-Source Research Platform

Ria Ghosh, John H L Hansen

While speech understanding for cochlear implant (CI) users in quiet is relatively effective, listeners experience difficulty in identification of speaker and sound location. To assist for better residual hearing abilities and speech intelli...

IEEE/ACM transactions on audio, speech, and language processing. 2023:31:1839-1850. DOI:10.1109/taslp.2023.3267608

Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection

Jianwei Zhang, Julie Liss, Suren Jayasuriya et al.

Approximately 1.2% of the world's population has impaired voice production. As a result, automatic dysphonic voice detection has attracted considerable academic and clinical interest. However, existing methods for automated voice assessment...

IEEE/ACM transactions on audio, speech, and language processing. 2023:31:1348-1359. DOI:10.1109/taslp.2023.3261753

Attentive Training: A New Training Framework for Speech Enhancement

Ashutosh Pandey, DeLiang Wang

Dealing with speech interference in a speech enhancement system requires either speaker separation or target speaker extraction. Speaker separation has multiple output streams with arbitrary assignments while target speaker extraction requi...

IEEE/ACM transactions on audio, speech, and language processing. 2023:31:1360-1370. DOI:10.1109/taslp.2023.3260711

Low-Latency Active Noise Control Using Attentive Recurrent Network

Hao Zhang, Ashutosh Pandey, DeLiang Wang

Processing latency is a critical issue for active noise control (ANC) due to the causality constraint of ANC systems. This paper addresses low-latency ANC in the context of deep learning (i.e. deep ANC). A time-domain method using an attent...

IEEE/ACM transactions on audio, speech, and language processing. 2023:31:1114-1123. DOI:10.1109/taslp.2023.3244528

Fusing Bone-conduction and Air-conduction Sensors for Complex-Domain Speech Enhancement

Heming Wang, Xueliang Zhang, DeLiang Wang

Speech enhancement aims to improve the listening quality and intelligibility of noisy speech in adverse environments. It proves to be challenging to perform speech enhancement in very low signal-to-noise ratio (SNR) conditions. Conventional...

IEEE/ACM transactions on audio, speech, and language processing. 2022:30:3134-3143. DOI:10.1109/taslp.2022.3209943

Microscopic and Blind Prediction of Speech Intelligibility: Theory and Practice

Mahdie Karbasi, Steffen Zeiler, Dorothea Kolossa

Being able to estimate speech intelligibility without the need for listening tests would confer great benefits for a wide range of speech processing applications. Many attempts have therefore been made to introduce an objective, and ideally...

IEEE/ACM transactions on audio, speech, and language processing. 2022:30:2141-2155. DOI:10.1109/taslp.2022.3184888