Evaluating the diagnostic reasoning of large language models in complex neuro-ophthalmological cases: a comparative analysis of GPT-o1 Pro, GPT-4o, Gemini, Grok 2 and DeepSeek
Purpose: This study aims to evaluate and compare the diagnostic reasoning of five large language models (LLMs) in complex neuro-ophthalmological cases. We assessed the performance of GPT-o1 Pro, GPT-4o, Google Gemini, Grok 2 and ... ...