
Journal of the American Medical Informatics Association : JAMIA. 2022 Apr 13;29(5):918-927. doi: 10.1093/jamia/ocab267

SAT: a Surrogate-Assisted Two-wave case boosting sampling method, with application to EHR-based association studies

Xiaokang Liu 1, Jessica Chubak 2,3, Rebecca A Hubbard 1, Yong Chen 1

Author affiliations

  • 1 Department of Biostatistics, Epidemiology and Informatics, The University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA.
  • 2 Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA.
  • 3 Department of Epidemiology, University of Washington, Seattle, Washington, USA.
  • DOI: 10.1093/jamia/ocab267 PMID: 34962283

    Abstract

    Objectives: Electronic health records (EHRs) enable investigation of the association between phenotypes and risk factors. However, studies solely relying on potentially error-prone EHR-derived phenotypes (ie, surrogates) are subject to bias. Analyses of low prevalence phenotypes may also suffer from poor efficiency. Existing methods typically focus on one of these issues but seldom address both. This study aims to simultaneously address both issues by developing new sampling methods to select an optimal subsample to collect gold standard phenotypes for improving the accuracy of association estimation.

    Materials and methods: We develop a surrogate-assisted two-wave (SAT) sampling method, in which a surrogate-guided sampling (SGS) procedure and a modified optimal subsampling procedure motivated by the A-optimality criterion (OSMAC) are employed sequentially to select a subsample for outcome validation through manual chart review, subject to budget constraints. A model is then fitted to the subsample using the true phenotypes. Simulation studies and an application to an EHR dataset of breast cancer survivors are conducted to demonstrate the effectiveness of SAT.
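
The two-wave idea described above can be illustrated with a small simulation. The sketch below is not the authors' algorithm: the abstract does not specify the SGS allocation or how OSMAC is modified, so the balanced wave-1 draw across surrogate strata, the expected-residual score used in wave 2, and all names (`two_wave_sample`, `fit_weighted_logistic`, `review`) are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_weighted_logistic(X, y, w, n_iter=30, ridge=1e-6):
    """Weighted logistic regression by Newton-Raphson (X includes an intercept)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (w * (y - p))
        # Small ridge term on the Hessian for numerical stability only.
        hess = (X * (w * p * (1 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta += np.linalg.solve(hess, grad)
    return beta

def two_wave_sample(X, s, review, n1, n2, rng):
    """Pick n1 + n2 records for chart review (hypothetical sketch).

    X: (N, d) design matrix with intercept; s: binary surrogate phenotype;
    review: callable mapping indices to true phenotypes (stands in for chart review).
    Assumes both surrogate strata are non-empty.
    """
    s = np.asarray(s, dtype=int)
    N = X.shape[0]

    # Wave 1 (assumed form of surrogate-guided sampling): balanced draw
    # from surrogate-positive and surrogate-negative records.
    pos, neg = np.flatnonzero(s == 1), np.flatnonzero(s == 0)
    k_pos = min(n1 // 2, len(pos))
    k_neg = min(n1 - k_pos, len(neg))
    wave1 = np.concatenate([rng.choice(pos, k_pos, replace=False),
                            rng.choice(neg, k_neg, replace=False)])
    y1 = review(wave1)

    # Pilot fit with inverse-probability weights to undo the stratified draw.
    w1 = np.where(s[wave1] == 1, len(pos) / k_pos, len(neg) / k_neg)
    beta0 = fit_weighted_logistic(X[wave1], y1, w1)

    # Wave 2 (assumed modification of OSMAC): the true y is unknown for the
    # remaining records, so |y - p| is replaced by its expectation given the
    # surrogate, using P(y = 1 | s) estimated from wave 1.
    q_by_s = np.array([y1[s[wave1] == v].mean() for v in (0, 1)])
    rest = np.setdiff1d(np.arange(N), wave1)
    p = sigmoid(X[rest] @ beta0)
    q = q_by_s[s[rest]]
    exp_resid = q * (1 - p) + (1 - q) * p

    # A-optimality-style factor ||M^{-1} x_i||, with M the average information matrix.
    p_all = sigmoid(X @ beta0)
    M = (X * (p_all * (1 - p_all))[:, None]).T @ X / N
    lever = np.linalg.norm(np.linalg.solve(M, X[rest].T), axis=0)

    prob = exp_resid * lever
    prob = prob / prob.sum()
    # Drawn without replacement, since a chart is only reviewed once.
    wave2 = rng.choice(rest, n2, replace=False, p=prob)
    return wave1, wave2, prob

# Simulated rare phenotype with an imperfect surrogate (all values made up).
rng = np.random.default_rng(0)
N = 5000
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = rng.binomial(1, sigmoid(X @ np.array([-3.5, 0.8, -0.5])))
s = np.where(y == 1, rng.binomial(1, 0.9, N), rng.binomial(1, 0.05, N))

wave1, wave2, _ = two_wave_sample(X, s, lambda idx: y[idx], 200, 200, rng)
chosen = np.concatenate([wave1, wave2])
print("overall prevalence:", y.mean())
print("subsample prevalence:", y[chosen].mean())
```

In this toy setup the selected subsample should contain a much higher share of true cases than the full cohort, which is the case-boosting effect the method targets. A final association model fitted to the subsample would also need weights that account for both sampling waves; that step is left out here because the abstract does not describe it.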

    Results: We found that the subsample selected with the proposed method contains informative observations that effectively reduce the mean squared error of the resultant estimator of the association.

    Conclusions: The proposed approach can handle the problems caused by the rarity of cases and by misclassification of the surrogate in phenotype-absent EHR-based association studies. With a well-behaved surrogate, SAT successfully boosts the case prevalence in the subsample and improves the efficiency of estimation.

    Keywords: association study; electronic health records; error in phenotype; rare disease; sampling strategy.

    Keywords: surrogate-assisted sampling; case boosting; electronic health records; association studies

    Copyright © Journal of the American Medical Informatics Association : JAMIA.

    Related content

    Journal name: Journal of the American Medical Informatics Association

    Abbreviation: J AM MED INFORM ASSN

    ISSN: 1067-5027

    e-ISSN: 1527-974X

    IF/Quartile: 4.7/Q1
