Medal Prediction Models Based on LASSO Regression and Random Forest Algorithm

Authors

  • Xinran Chen, Soochow University, Suzhou, China
  • Xuming Yan, Soochow University, Suzhou, China
  • Rongtao Zhang, Soochow University, Suzhou, China

DOI:

https://doi.org/10.62051/13xrxx95

Keywords:

hierarchical clustering; lasso regression; random forest.

Abstract

Medal prediction is an important research direction in sports science and data analysis, with direct implications for resource allocation and strategic decision-making in competitive sports. This study proposes a hybrid predictive model that integrates hierarchical clustering, LASSO regression, and random forest algorithms. By constructing a multidimensional competitiveness indicator system built purely from competition-endogenous data, the model avoids the heavy reliance on external factors that limits conventional approaches. The methodology first establishes feature-based indicators and uses hierarchical clustering to group participating nations into three clusters that reflect their respective stages of sports development, which yields a differentiated modeling framework. LASSO regression and random forest algorithms are then applied to countries at different developmental stages, providing both model robustness and a systematic exploration of feature importance. Empirical results demonstrate that the model can accurately forecast medal distributions for the 2028 Los Angeles event, with predictions aligning closely with historical trends and prediction errors confined within a margin of 2 medals. This research provides a quantifiable decision-making tool that strengthens the scientific basis for event resource allocation and policy formulation in competitive sports systems.
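
The first two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the average-linkage rule, the coordinate-descent solver, the penalty value alpha, the function names, and all toy numbers below are assumptions made only for illustration, and the random forest stage is omitted. The sketch groups hypothetical nations into three clusters by agglomerative (hierarchical) clustering and then fits a LASSO model whose L1 penalty shrinks a weak feature's coefficient to exactly zero.

```python
from math import dist


def agglomerative_clusters(points, n_clusters=3):
    """Average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between members of two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 without an intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Hypothetical competitiveness indicators for six nations (toy values).
indicators = [(0.1, 0.2), (0.2, 0.1), (5.0, 5.1), (5.2, 4.9), (10.0, 10.1), (9.9, 10.2)]
labels = agglomerative_clusters(indicators, n_clusters=3)

# Hypothetical centred features and targets: the target depends strongly on the
# first feature and only weakly on the second.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [3.05, -2.95, 2.95, -3.05]
weights = lasso_coordinate_descent(X, y, alpha=0.1)
```

In this toy setting the clustering recovers the three well-separated groups, and the LASSO fit keeps the strong feature while dropping the weak one, which is the feature-selection behaviour that makes LASSO useful for examining feature importance.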

Published

25-12-2025

How to Cite

Chen, X., Yan, X., & Zhang, R. (2025). Medal Prediction Models Based on LASSO Regression and Random Forest Algorithm. Transactions on Computer Science and Intelligent Systems Research, 11, 298-310. https://doi.org/10.62051/13xrxx95