Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation
Dingwen Zhang, Hao Li, Diqi He et al.
In recent times, following the paradigm of DETR (DEtection TRansformer), query-based end-to-end instance segmentation (QEIS) methods have exhibited superior performance compared to CNN-based models, particularly when trained on large-scale ...
Task Augmentation-Based Meta-Learning Segmentation Method for Retinopathy
Jingtao Wang, Muhammad Mateen, Dehui Xiang et al.
Deep learning (DL) requires large amounts of labeled data, which is extremely time-consuming and labor-intensive to obtain for medical image segmentation tasks. Meta-learning focuses on developing learning strategies that enable quick adaptat...
Yahao Shi, Yanmin Wu, Chenming Wu et al.
This paper presents a 3D Gaussian Inverse Rendering (GIR) method, employing 3D Gaussian representations to effectively factorize the scene into material properties, light, and geometry. The key contributions are three-fold. We compute the n...
Chong Yu, Tao Chen, Zhongxue Gan
Taylor series expansion (TSE) is a mathematical theorem. It shows that the first few terms of a Taylor series are a good approximation of a nonlinear function in most cases. Inspired by the TSE theorem, a brand-new TSE-based vis...
Dual-Level Cross-Modality Neural Architecture Search for Guided Image Super-Resolution
Zhiwei Zhong, Xianming Liu, Junjun Jiang et al.
Guided image super-resolution (GISR) aims to reconstruct a high-resolution (HR) target image from its low-resolution (LR) counterpart with the guidance of an HR image from another modality. Existing learning-based methods typically employ sy...
Yu-Huan Wu, Shi-Chen Zhang, Yun Liu et al.
Semantic segmentation tasks naturally require high-resolution information for pixel-wise segmentation and global context information for class prediction. While existing vision transformers demonstrate promising performance, they often util...
ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer
Youwei Pang, Xiaoqi Zhao, Lihe Zhang et al.
Deep learning (DL) has advanced the field of dense prediction, while gradually dissolving the inherent barriers between different tasks. However, most existing works focus on designing architectures and constructing visual cues only for the...
SAS: A General Framework Induced by Sequence Association for Shape from Focus
Tao Yan, Yuhua Qian, Jiangfeng Zhang et al.
Shape from focus (SFF) is a technique used to estimate the depth of a scene from a sequence of multifocus images. Existing SFF methods can be categorized into two groups: traditional methods and deep learning-based methods. Traditional meth...
From FastPoseGait to GPGait++: Bridging the Past and Future for Pose-based Gait Recognition
Shibei Meng, Yang Fu, Saihui Hou et al.
Recent studies on pose-based gait recognition have underscored the potential of utilizing such fundamental data to achieve superior outcomes. Nonetheless, the development of current pose-based methods faces significant obstacles due to seve...
The Synergy Between Data and Multi-Modal Large Language Models: A Survey From Co-Development Perspective
Zhen Qin, Daoyuan Chen, Wenhao Zhang et al.
Recent years have witnessed the rapid development of large language models (LLMs). Multi-modal LLMs (MLLMs) extend modality from text to various domains, attracting widespread attention due to their diverse application scenarios. As LLMs and MLL...