On the Consistency of Automatic Scoring with Large Language Models

Large language models (LLMs) have shown great potential in automatic scoring. However, due to model characteristics and variation in training materials and pipelines, scoring inconsistency can exist within an LLM and across LLMs when rating the same response m... ...