IEEE Transactions on Multimedia: Article Index

Screen Detection from Egocentric Image Streams Leveraging Multi-View Vision Language Model [0.03%]

Xueshen Li, Sen Shen, Xinlong Hou et al.

Accurately monitoring the screen exposure of young children is important for research related to screen use, such as childhood obesity, physical activity, and social interaction. Most existing studies rely upon self-report or manual measure...

IEEE Transactions on Multimedia. 2026 Feb 10. DOI: 10.1109/tmm.2026.3660180

Long-Tailed Continual Learning For Visual Food Recognition [0.03%]

Jiangpeng He, Xiaoyan Zhang, Luotao Lin et al.

Deep learning-based food recognition has made significant progress in predicting food types from eating occasion images. However, two key challenges hinder real-world deployment: (1) continuously learning new food classes without forgetting...

IEEE Transactions on Multimedia. 2025 Dec 3. DOI: 10.1109/tmm.2025.3632640

Support Vector Regression-based Reduced-Reference Perceptual Quality Model for Compressed Point Clouds [0.03%]

Honglei Su, Qi Liu, Hui Yuan et al.

Video-based point cloud compression (V-PCC) is a state-of-the-art moving picture experts group (MPEG) standard for point cloud compression. V-PCC can be used to compress both static and dynamic point clouds in a lossless, near lossless, or ...

IEEE Transactions on Multimedia. 2024;26:6238-6249. DOI: 10.1109/tmm.2023.3347638

Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA [0.03%]

Ali Vosoughi, Shijian Deng, Songyang Zhang et al.

To increase the generalization capability of VQA systems, many recent studies have tried to de-bias spurious language or vision associations that shortcut the question or image to the answer. Despite these efforts, the literature fails to a...

IEEE Transactions on Multimedia. 2024;26:8609-8624. DOI: 10.1109/tmm.2024.3380259

Indoor Camera Pose Estimation from Room Layouts and Image Outer Corners [0.03%]

Xiaowei Chen, Guoliang Fan

To support indoor scene understanding, room layouts have been recently introduced that define a few typical space configurations according to junctions and boundary lines. In this paper, we study camera pose estimation from eight common roo...

IEEE Transactions on Multimedia. 2023;25:7992-8005. DOI: 10.1109/tmm.2022.3233308

Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures [0.03%]

Sangwook Park, David K Han, Mounya Elhilali

Sound event detection is an important facet of audio tagging that aims to identify sounds of interest and define both the sound category and time boundaries for each sound event in a continuous recording. With advances in deep neural networ...

IEEE Transactions on Multimedia. 2023;25:4573-4585. DOI: 10.1109/tmm.2022.3178591

Real-Time and Accurate UAV Pedestrian Detection for Social Distancing Monitoring in COVID-19 Pandemic [0.03%]

Zhenfeng Shao, Gui Cheng, Jiayi Ma et al.

Coronavirus Disease 2019 (COVID-19) is a highly infectious virus that has created a health crisis for people all over the world. Social distancing has proved to be an effective non-pharmaceutical measure to slow down the spread of COVID-19....

IEEE Transactions on Multimedia. 2021 Apr 28;24:2069-2083. DOI: 10.1109/TMM.2021.3075566

Head Motion Modeling for Human Behavior Analysis in Dyadic Interaction [0.03%]

Bo Xiao, Panayiotis Georgiou, Brian Baucom et al.

This paper presents a computational study of head motion in human interaction, notably of its role in conveying interlocutors' behavioral characteristics. Head motion is physically complex and carries rich information; current modeling appr...

IEEE Transactions on Multimedia. 2015 Jul 13;17(7):1107-1119. DOI: 10.1109/TMM.2015.2432671