Background and aim: The stage at which colorectal cancer (CRC) is diagnosed is a crucial prognostic factor. This study proposes a novel approach to aid the diagnosis of stage IV CRC: supervised machine learning applied to clinical history and laboratory values, with stage I CRC as the comparison group.
Methods: We conducted a retrospective study of patients diagnosed with stage I (n = 433) and stage IV CRC (n = 457). We applied supervised machine learning using a random forest. A decision tree was used to visualize the model and to identify the key clinical and laboratory factors that differentiate stage IV from stage I CRC.
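As an illustration of how a tree-based model arrives at a laboratory cut-off, the following is a minimal sketch of a single decision-tree split chosen by Gini impurity. The abstract does not state which split criterion or software was used, so the Gini criterion, the function names (`gini`, `best_threshold`), and the toy values are all assumptions made for illustration; the numbers are synthetic and are not study data.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of binary labels (1 = stage IV, 0 = stage I)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted_gini) for the best split 'value > threshold'.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the unsplit impurity, the threshold is NaN.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("nan"), gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        thr = (lo + hi) / 2.0
        left = [y for v, y in pairs if v <= thr]
        right = [y for v, y in pairs if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (thr, score)
    return best


# Synthetic toy marker values and labels (hypothetical, NOT study data).
marker = [1.2, 2.5, 3.0, 4.8, 6.0, 15.0, 40.0, 120.0, 300.0, 900.0]
stage_iv = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

threshold, impurity = best_threshold(marker, stage_iv)
print(f"best split: marker > {threshold} (weighted Gini = {impurity:.3f})")
```

A random forest repeats this kind of split search over many trees, each grown on a bootstrap sample with a random subset of features per split; a single tree is easier to read, which is why it is suited to visualization.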
Results: The decision tree classifier revealed that symptoms combined with laboratory values were critical predictors of stage IV CRC. Change in bowel habits was predictive of stage IV CRC in 14 of 22 patients (63%). Weight loss, constipation, and abdominal pain, in combination with different levels of carcinoembryonic antigen (CEA), were predictors of stage IV CRC. A CEA level higher than 260 was indicative of stage IV CRC in all observed patients (61 of 61). Additionally, a lower CEA level, in combination with hemoglobin, white blood cell count, and platelet count, also predicted stage IV CRC.
Conclusions: Using a machine learning-based approach, we identified symptoms and laboratory values (CEA, hemoglobin, white blood cell count, and platelet count) as crucial predictors of a stage IV CRC diagnosis. This method has the potential to facilitate the diagnosis of stage IV CRC in clinical practice, even before imaging tests are conducted.
Keywords: Colorectal cancer; Machine learning; Stage IV.
Copyright © 2025 Elsevier Ltd. All rights reserved.