Bayesian Posterior Interval Calibration to Improve the Interpretability of Observational Studies
Jami J Mulgrave, David Madigan, George Hripcsak
Observational healthcare data offer the potential to estimate causal effects of medical products on a large scale. However, the confidence intervals and p-values produced by observational studies only account for random error and fail to ac...
A treeless absolutely random forest with closed-form estimators of expected proximities
Eugene Laska, Ziqiang Lin, Carole Siegel et al.
We introduce a simple variant of a Purely Random Forest, an Absolute Random Forest (ARF), for clustering. At every node, splits of units are determined by a randomly chosen feature and a random threshold drawn from a uniform distribution whos...
Data-driven Stochastic Model for Quantifying the Interplay Between Amyloid-beta and Calcium Levels in Alzheimer's Disease
Hina Shaheen, Roderick Melnik, Sundeep Singh; Alzheimer’s Disease Neuroimaging Initiative
The abnormal aggregation of extracellular amyloid-β (Aβ) in senile plaques, resulting in calcium (Ca²⁺) dyshomeostasis, is one of the primary symptoms of Alzheimer's disease (AD). Significant research efforts have been devoted in the p...
Mengque Liu, Qingzhao Zhang, Shuangge Ma
Gene-environment (G-E) interaction analysis plays a critical role in understanding and modeling complex diseases. Compared to main-effect-only analysis, it is more seriously challenged by higher dimensionality, weaker signals, and the uniqu...
Integrative Learning of Structured High-Dimensional Data from Multiple Datasets
Changgee Chang, Zongyu Dai, Jihwan Oh et al.
Integrative learning of multiple datasets has the potential to mitigate the challenge of small n and large p that is often encountered in analysis of big biomedical data such as genomics data. Detection of weak yet important signals can be ...
A Tutorial on Generative Adversarial Networks with Application to Classification of Imbalanced Data
Yuxiao Huang, Kara G Fields, Yan Ma
A challenge unique to classification model development is imbalanced data. In a binary classification problem, class imbalance occurs when one class, the minority group, contains significantly fewer samples than the other class, the majorit...
Regression-Based Bayesian Estimation and Structure Learning for Nonparanormal Graphical Models
Jami J Mulgrave, Subhashis Ghosal
A nonparanormal graphical model is a semiparametric generalization of a Gaussian graphical model for continuous variables in which it is assumed that the variables follow a Gaussian graphical model only after some unknown smooth monotone tr...
Ziqiang Lin, Eugene Laska, Carole Siegel
The quality of a cluster analysis of unlabeled units depends on the quality of the between-unit dissimilarity measures. Data-dependent dissimilarity is more objective than data-independent geometric measures such as Euclidean distance. As ...
Min Zhang, Gal Mishne, Eric C Chi
Many machine learning algorithms depend on weights that quantify row and column similarities of a data matrix. The choice of weights can dramatically impact the effectiveness of the algorithm. Nonetheless, the problem of choosing weights ha...
Bag of little bootstraps for massive and distributed longitudinal data
Xinkai Zhou, Jin J Zhou, Hua Zhou
Linear mixed models are widely used for analyzing longitudinal datasets, and the inference for variance component parameters relies on the bootstrap method. However, health systems and technology companies routinely generate massive longitu...