Medal Prediction Models Based on LASSO Regression and Random Forest Algorithm
DOI: https://doi.org/10.62051/13xrxx95
Keywords: hierarchical clustering; LASSO regression; random forest
Abstract
Medal prediction is an important research direction in sports science and data analysis, with direct implications for resource allocation and strategic decision-making in competitive sports. This study proposes a hybrid predictive model that integrates hierarchical clustering, LASSO regression, and random forest algorithms. The model is built on a multidimensional competitiveness indicator system derived purely from competition-endogenous data, which avoids the heavy reliance on external factors seen in conventional approaches. First, feature-based indicators are used to group participating nations into three clusters through hierarchical clustering; the clusters reflect different stages of sports development and provide the basis for differentiated modeling. LASSO regression and random forest are then applied to countries at different developmental stages, yielding robust models and a systematic assessment of feature importance. Empirical results show that the model can forecast medal distributions for the 2028 Los Angeles event, with predictions that align closely with historical trends and prediction errors within 2 medals. The resulting tool offers a quantitative basis for event resource allocation and policy formulation in competitive sports systems.
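The cluster-then-model workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the four indicators are hypothetical stand-ins for the competitiveness indicator system, and the rule for choosing LASSO or random forest per cluster (by cluster size) is an assumption, since the abstract does not say which algorithm is applied to which developmental stage.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 60 hypothetical nations described by
# 4 hypothetical competition-derived indicators, plus a toy medal count.
n_nations, n_features = 60, 4
X = rng.gamma(shape=2.0, scale=1.0, size=(n_nations, n_features))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(0.0, 0.5, size=n_nations)

# Step 1: standardize the indicators and group nations into three clusters
# with agglomerative (hierarchical) clustering.
X_scaled = StandardScaler().fit_transform(X)
labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X_scaled)

# Step 2: fit one model per cluster.
# Assumption: larger clusters get a random forest, smaller ones get LASSO.
models = {}
importance = {}
predictions = np.empty(n_nations)
for cluster_id in np.unique(labels):
    mask = labels == cluster_id
    if mask.sum() >= 15:
        model = RandomForestRegressor(n_estimators=200, random_state=0)
    else:
        model = Lasso(alpha=0.1)
    model.fit(X_scaled[mask], y[mask])
    predictions[mask] = model.predict(X_scaled[mask])  # in-sample fit only
    models[int(cluster_id)] = model
    # Feature importance: impurity-based for the forest, |coefficient| for LASSO.
    if isinstance(model, RandomForestRegressor):
        importance[int(cluster_id)] = model.feature_importances_
    else:
        importance[int(cluster_id)] = np.abs(model.coef_)

for cluster_id, model in models.items():
    print(cluster_id, type(model).__name__, np.round(importance[cluster_id], 3))
```

The predictions here are in-sample and only show how the pieces fit together; a real evaluation would hold out past events and compare forecast and actual medal counts.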
Copyright (c) 2025 Transactions on Computer Science and Intelligent Systems Research

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.








