Video Captioning Using Global-Local Representation

Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description. To date, state-of-the-art methods inadequately model global-local vision representation for sentence generation, leaving plenty o... ...