SoccerML

Data Analysis & Prediction System, 2021

Tech Stack

Python, Pandas, NumPy, XGBoost, PyTorch, Scikit-learn, SQLAlchemy, PostgreSQL, Jupyter Notebook, Boruta

Role

Machine Learning Engineer

Team Size

2

Duration

September 2021 - June 2022

Project Overview

SoccerML is a football match result prediction system. It integrates data from multiple sources (betting odds, historical records, real-time indicators, and others) and trains several machine learning models to predict whether a match ends in a home win, a draw, or an away win. The models include SVM, XGBoost, and neural networks, combined with Poisson-distribution goal modeling and statistical analysis to improve prediction accuracy.

Highlights

  • Multi-model ensemble prediction system (SVM, XGBoost, Neural Networks)
  • Poisson distribution-based goal prediction model
  • Feature engineering and automatic feature selection (Boruta algorithm)
  • Real-time odds data processing and analysis
  • Model architecture refined iteratively across multiple versions
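
To illustrate the Poisson-based goal prediction idea, here is a minimal sketch, not the project's actual code. It assumes home and away goals are independent Poisson variables. The function names, the expected-goal rates, and the `max_goals` cutoff are all hypothetical choices made for the example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected goal count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(home_rate: float, away_rate: float, max_goals: int = 10):
    """Return (p_home_win, p_draw, p_away_win) from two goal rates.

    Assumption: home and away goals are independent Poisson variables.
    Scorelines above max_goals are ignored, so the result is renormalized.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    total = home_win + draw + away_win
    return home_win / total, draw / total, away_win / total


if __name__ == "__main__":
    # Hypothetical expected goals: 1.6 for the home side, 1.1 for the away side.
    print(outcome_probabilities(1.6, 1.1))
```

In practice the two rates would be estimated from historical data, for example from team attack and defense strengths. That estimation step is outside this sketch.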

Challenges

  • Handling high uncertainty in football match results
  • Integrating multi-source heterogeneous data and extracting effective features
  • Balancing model complexity and prediction accuracy
  • Responding to real-time changes in odds data
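
One common first step when working with odds data is converting decimal odds into implied probabilities and removing the bookmaker margin. The sketch below shows that conversion under stated assumptions: the odds are decimal (European) odds, the margin is removed by simple proportional normalization, and the function names are hypothetical. It is not taken from the SoccerML codebase.

```python
def bookmaker_margin(home_odds: float, draw_odds: float, away_odds: float) -> float:
    """Return the overround: how far the raw implied probabilities exceed 1."""
    return 1.0 / home_odds + 1.0 / draw_odds + 1.0 / away_odds - 1.0


def implied_probabilities(home_odds: float, draw_odds: float, away_odds: float):
    """Convert decimal odds to probabilities that sum to 1.

    Assumption: the margin is spread proportionally across the three outcomes.
    """
    odds = (home_odds, draw_odds, away_odds)
    if any(o <= 1.0 for o in odds):
        raise ValueError("decimal odds must be greater than 1.0")
    raw = [1.0 / o for o in odds]
    total = sum(raw)
    return tuple(r / total for r in raw)


if __name__ == "__main__":
    # Hypothetical odds for home win, draw, away win.
    print(implied_probabilities(1.9, 3.5, 4.2))
    print(bookmaker_margin(1.9, 3.5, 4.2))
```

Recomputing these values each time a new odds snapshot arrives gives features that can be compared across bookmakers and over time.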

Solutions

  • Designed a modular data processing workflow
  • Implemented a multi-model voting system to improve prediction robustness
  • Used the Boruta and RFE (recursive feature elimination) algorithms for feature selection
  • Developed an automated data collection and model training workflow
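
The multi-model voting idea can be sketched as follows. The page does not say whether the system uses hard (majority) or soft (probability-averaging) voting, so both are shown. All names, weights, and probability values are hypothetical, and each model is assumed to output a (home, draw, away) probability triple.

```python
from collections import Counter

OUTCOMES = ("home", "draw", "away")


def soft_vote(model_probs, weights=None):
    """Average the per-model probabilities and pick the most likely outcome.

    model_probs: list of (p_home, p_draw, p_away) tuples, one per model.
    weights: optional per-model weights (assumed equal if omitted).
    """
    if not model_probs:
        raise ValueError("at least one model prediction is required")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("weights and model_probs must have the same length")
    total_weight = sum(weights)
    averaged = tuple(
        sum(w * p[i] for w, p in zip(weights, model_probs)) / total_weight
        for i in range(len(OUTCOMES))
    )
    best = max(range(len(OUTCOMES)), key=lambda i: averaged[i])
    return OUTCOMES[best], averaged


def hard_vote(model_probs):
    """Each model votes for its top outcome; the most common vote wins.

    Ties go to the outcome that was voted for first.
    """
    picks = [OUTCOMES[max(range(len(OUTCOMES)), key=lambda i: p[i])] for p in model_probs]
    return Counter(picks).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical outputs from three models (e.g. SVM, XGBoost, neural network).
    predictions = [(0.50, 0.30, 0.20), (0.40, 0.35, 0.25), (0.20, 0.50, 0.30)]
    print(soft_vote(predictions))
    print(hard_vote(predictions))
```

The two schemes can disagree: with the sample values, two of the three models favor a home win, while the averaged probabilities slightly favor a draw. Soft voting keeps each model's confidence, which may matter when outcomes are close.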

References

  • Deep Generative Multi-Agent Imitation Model as a Computational Benchmark for Evaluating Human Performance in Complex Interactive Tasks: A Soccer Case Study
  • Prediction of football match results with Machine Learning (arXiv)
  • Machine Learning in Football Betting Prediction (IEEE)
  • Investigating the efficiency of the Asian handicap (Journal of Sports Economics)
  • Bayesian modelling of football outcomes: Using the Skellam's distribution (International Journal of Forecasting)
SoccerML | Klauden