
Integrating XGBoost and SHAP Model for Football Player Value Prediction and Characteristic Analysis
- 2025-08-22 22:58:44
Abstract
Abstract: As football becomes increasingly globalized, the global player transfer market keeps growing. However, player value, the most important factor in transfer transactions, still lacks in-depth modeling and application research. This paper takes FIFA's official player database as the research object. First, with players separated by position, Box-Cox transformation, F-Score feature selection, and related methods are used to process the features of the original data set. Second, a player value prediction model is built with XGBoost and compared with mainstream machine learning algorithms such as Random Forest, AdaBoost, GBDT, and SVR in 10-fold cross-validation experiments. The results show that the XGBoost model performs better on all three indicators: R², MAE, and RMSE. Finally, building on the value prediction model, the SHAP framework is integrated to analyze the important factors affecting player value at each position, providing decision support for scenarios such as player value evaluation, comparative value analysis, and training strategy formulation.
Key words:
Machine learning,
Player's value prediction,
Training strategy,
XGBoost algorithm,
SHAP value
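As a rough illustration of the building blocks the abstract names, the sketch below implements a fixed-lambda Box-Cox transform, the three evaluation metrics (R², MAE, RMSE), and a plain 10-fold index split in standard-library Python. It is not the authors' code: the function names, the unshuffled contiguous folds, and the hand-picked lambda are all assumptions made for the example, and the XGBoost and SHAP steps themselves are not reproduced here.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda (assumed given)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive inputs")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds (no shuffling; an assumption)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

In a comparison like the one described, each candidate model would be fitted on the training indices of every fold and scored on the held-out indices with these three metrics, and the per-fold scores would then be averaged.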
CLC Number:
TP391
Cite this article
LIAO Bin, WANG Zhi-ning, LI Min, SUN Rui-na. Integrating XGBoost and SHAP Model for Football Player Value Prediction and Characteristic Analysis[J]. Computer Science, 2022, 49(12): 195-204. https://doi.org/10.11896/jsjkx.210600029
Link to this article:
https://www.jsjkx.com/CN/10.11896/jsjkx.210600029
https://www.jsjkx.com/CN/Y2022/V49/I12/195
