Le Liang, Cheng Wang, Lefei Zhang
Object detection is a fundamental task in computer vision, aiming to localize and classify objects within images. Feature pyramid networks (FPNs) play a crucial role in modern object detectors by constructing hierarchical multi-scale featur...
Multi-modal feature alignment networks for multi-label image classification [0.03%]
Wenlan Kuang, Zhixin Li
Multi-label image classification assigns labels to multiple objects in an input image. Recent research mainly focuses on the semantic consistency between visual features and label features. However, sinc...
UniTrain: A universal iterative semi-supervised training framework for graph representation learning [0.03%]
Xinlong Chen, Jin Li, Yisong Huang et al.
Graph neural networks (GNNs) and graph transformers (GTs) perform well in graph-related tasks, but their potential is often limited in semi-supervised settings due to label scarcity. Although robust encoders and pre-training tasks enhance p...
FluidFormer: Transformer with continuous convolution for particle-based fluid simulation [0.03%]
Nianyi Wang, Shuai Zheng, Yu Chen et al.
Learning-based fluid simulation has emerged as an efficient alternative to traditional Navier-Stokes solvers. However, existing neural methods that build upon Smoothed Particle Hydrodynamics (SPH) predominantly rely on local particle intera...
Shuran Wang, Hua Chen, Heng Xiong et al.
Dynamical systems evolve over time, and predicting their behavior is difficult because of their complex spatiotemporal relationships. Although data-driven models have achieved great success in dynamical system analysis, extracting temporal d...
RWP: a robust watermarking plugin for attribution and protection in stable diffusion models [0.03%]
Zongxin Liu, Jinhong Zhang, Yunyun Dong et al.
Diffusion models have achieved remarkable success in content generation, driving the rapid development of various customized models. However, this progress also presents significant challenges in provenance tracking, including the misuse of...
Mitigating sensitive information leakage in LLMs4Code through machine unlearning [0.03%]
Shanzhi Gu, Zhaoyang Qu, Ruotong Geng et al.
Large Language Models for Code (LLMs4Code) have achieved strong performance in code generation, but recent studies reveal that they may memorize and leak sensitive information contained in training data, posing serious privacy risks. To add...
CoCoFR: Collaborative codebooks learning with soft matching strategy for blind face restoration [0.03%]
Teng Feng, Junwei Xu, Tao Huang et al.
Blind Face Restoration (BFR) has garnered considerable attention for its practical applicability in recovering high-quality (HQ) facial images from their degraded versions. Existing BFR methods primarily incorporate diverse priors to mitigate ...
Zhiyu Guo, Yang Liu, Xiang Ao et al.
Graph Transformers (GTs), as emerging foundational encoders for graph-structured data, have shown promising performance due to the integration of local graph structures with global attention mechanisms. However, the complex attention functi...
Multi-source temporal-depth fusion for robust end-to-end visual odometry [0.03%]
Sihang Zhang, Congqi Cao, Qiang Gao et al.
End-to-end visual odometry models have recently achieved localization accuracy on par with conventional techniques, while effectively reducing the occurrence of catastrophic failures. However, the relevant models cannot leverage the complet...