Estimation of Conditional Standard Errors of Measurement for MLE Scores in MST
Yuanyuan J Stirn, Won-Chan Lee
This paper proposes an information-based analytic method for calculating the conditional standard error of measurement (CSEM) in multistage testing (MST) using maximum likelihood estimation. The accuracy of the proposed method was evaluated...
Misclassification Produced by Rapid-Guessing Identification Methods and Their Suitability Under Various Conditions
Santeri Holopainen, Jari Metsämuuronen, Mikko-Jussi Laakso et al.
Response Time Threshold Methods (RTTMs) are widely used to identify rapid-guessing behavior (RG) in low-stakes assessments, yet face two key challenges: (a) inevitable misclassifications due to overlapping response time distributions of eng...
From Agreement to Epistemic Alignment: A Signal Detection-Theoretic Model of Inter-Rater Reliability
Irene Gianeselli
Inter-rater reliability is commonly assessed using chance-corrected agreement coefficients such as Cohen's κ, which summarize concordance among categorical judgments without modeling the inferential processes that generate them. As a resul...
Mingfeng Xue, Xingyao Xiao, Yunting Liu et al.
Large language models (LLMs) have shown great potential in automatic scoring. However, due to model characteristics and variation in training materials and pipelines, scoring inconsistency can exist within an LLM and across LLMs when rating...
Comparing Different Approaches of (Not) Accounting for Rapid Guessing in Plausible Values Estimation
Jana Welling, Eva Zink, Timo Gnambs
Educational large-scale assessments provide information on ability differences between groups, informing policies and shaping educational decisions. However, some of these differences might partly reflect variations in test-taking motivatio...
Consistent Factor Score Regression: A Better Alternative for Uncorrected Factor Score Regression?
Jasper Bogaert, Wen Wei Loh, Yves Rosseel
Researchers in the behavioral, educational, and social sciences often aim to analyze relationships among latent variables. Structural equation modeling (SEM) is widely regarded as the gold standard for this purpose. A straightforward altern...
Empowering Expert Judgment: A Data-Driven Decision Framework for Standard Setting in High-Dimensional and Data-Scarce Assessments
Tianpeng Zheng, Zhehan Jiang, Zhichen Guo et al.
A critical methodological challenge in standard setting arises in small-sample, high-dimensional contexts where the number of items substantially exceeds the number of examinees. Under such conditions, conventional data-driven methods that ...
Evaluation of Residual-Based Fit Statistics for Item Response Theory Models in the Presence of Non-Responses
Minho Lee, Juyoung Jung
Residual-based fit statistics, which compare observed item statistics (e.g., proportions) with model-implied probabilities, are widely used to evaluate model fit, item fit, and local dependence in item response theory (IRT) models. Despite ...
Dimiter M Dimitrov, Dimitar V Atanasov
Based on previous research on conditional reliability for number-correct test scores, conditioned on levels of the logit scale in item response theory, this article deals with conditional reliability of classical-type weighted scores condit...
Collapsing Sparse Responses in Likert-Type Scale Data: Advantages and Disadvantages for Model Fit in CFA
Jin Liu, Yu Bao, Christine DiStefano et al.
Applied researchers often encounter situations where certain item response categories receive very few endorsements, resulting in sparse data. Collapsing categories may mitigate sparsity by increasing cell counts, yet the methodological con...