Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto et al.
Pretrained multilingual text encoders based on neural transformer architectures, such as multilingual BERT (mBERT) and XLM, have recently become a default paradigm for cross-lingual transfer of natural language processing models, rendering ...
Djoerd Hiemstra, Marie-Francine Moens
Vuong M Ngo, Sven Helmer, Nhien-An Le-Khac et al.
The humanities, like many other areas of society, are currently undergoing major changes in the wake of digital transformation. However, in order to make collection of digitised material in this area easily accessible, we often still lack a...
Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi
Topic Detection and Tracking (TDT) is a research initiative that aims at techniques to organize news documents in terms of news events. We propose a method that incorporates simple semantics into TDT by splitting the term space into groups ...
There's a creepy guy on the other end at Google!: engaging middle school students in a drawing activity to elicit their mental models of Google
Christie Kodama, Beth St Jean, Mega Subramaniam et al.
Although youth are increasingly going online to fulfill their needs for information, many youth struggle with information and digital literacy skills, such as the abilities to conduct a search and assess the credibility of online informatio...
Dennis Bachmann, Katarina Grolinger, Hany ElYamany et al.
Recommender systems have dramatically changed the way we consume content. Internet applications rely on these systems to help users navigate among the ever-increasing number of choices available. However, most current systems ignore the fac...
Aldo Lipani, Thomas Roelleke, Mihai Lupu et al.
Every information retrieval (IR) model embeds in its scoring function a form of term frequency (TF) quantification. The contribution of the term frequency is determined by the properties of the function of the chosen TF quantification, and ...
Travis Gagie, Aleksi Hartikainen, Kalle Karhu et al.
Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitivenes...
#nowplaying Madonna: a large-scale evaluation on estimating similarities between music artists and between movies from microblogs
Markus Schedl
Different term weighting techniques such as TF·IDF or BM25 have been used intensely for manifold text-based information retrieval tasks. Their use for modeling term profiles for named entities and subsequent calculation of simi...
The uncertain representation ranking framework for concept-based video retrieval
Robin Aly, Aiden Doherty, Djoerd Hiemstra et al.
Concept based video retrieval often relies on imperfect and uncertain concept detectors. We propose a general ranking framework to define effective and robust ranking functions, through explicitly addressing detector uncertainty. It can cop...