Wouter van Loon, Marjolein Fokkema, Botond Szabo et al.
Multi-view stacking is a framework for combining information from different views (i.e. different feature sets) describing the same set of objects. In this framework, a base-learner algorithm is trained on each view separately, and their pr...
Parsimony and parameter estimation for mixtures of multivariate leptokurtic-normal distributions
Ryan P Browne, Luca Bagnato, Antonio Punzo
Mixtures of multivariate leptokurtic-normal distributions have been recently introduced in the clustering literature based on mixtures of elliptical heavy-tailed distributions. They have the advantage of having parameters directly related t...
The role of diversity and ensemble learning in credit card fraud detection
Gian Marco Paldino, Bertrand Lebichot, Yann-Aël Le Borgne et al.
The number of daily credit card transactions is inexorably growing: the expansion of the e-commerce market and the recent constraints imposed by the Covid-19 pandemic have significantly increased the use of electronic payments. The ability to precisely d...
Tin Lok James Ng, Thomas Brendan Murphy
A probabilistic model for random hypergraphs is introduced to represent unary, binary and higher order interactions among objects in real-world problems. This model is an extension of the latent class analysis model that introduces two clus...
How many data clusters are in the Galaxy data set?: Bayesian cluster analysis in action
Bettina Grün, Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter
In model-based clustering, the Galaxy data set is often used as a benchmark data set to study the performance of different modeling approaches. Aitkin (Stat Model 1:287-304) compares maximum likelihood and Bayesian analyses of the Galaxy da...
Federico Ferraccioli, Giovanna Menardi
The nonparametric formulation of density-based clustering, known as modal clustering, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation al...
Basis expansion approaches for functional analysis of variance with repeated measures
Christian Acal, Ana M Aguilera
The methodological contribution in this paper is motivated by biomechanical studies where data characterizing human movement are waveform curves representing joint measures such as flexion angles, velocity, acceleration, and so on. In many ...
Unobserved classes and extra variables in high-dimensional discriminant analysis
Michael Fop, Pierre-Alexandre Mattei, Charles Bouveyron et al.
In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a su...
From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering
Sylvia Frühwirth-Schnatter, Gertraud Malsiner-Walli
In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner-Walli et al. (Stat Comput 26:303-324, 2016) is that of sparse finite mixtures, where the prior dist...
Asma Gul, Aris Perperoglou, Zardad Khan et al.
Combining multiple classifiers, known as ensemble methods, can substantially improve the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of ...