A world-first Tennis Hackathon, launched to coincide with the 2018 Australian Open, drew a record number of competitors from around the world, all seeking a way to automate the calling of forced and unforced errors in professional tennis.
Tennis Australia’s Game Insight Group (GIG) teamed up with leading Silicon Valley-based crowd-sourcing platform CrowdANALYTIX to launch the single biggest release of tracking data in the history of tennis.
Tennis Australia Head of Innovation Dr Machar Reid said the results of the competition were an exciting step forward in the quest for the full automation of outcomes in tennis.
“The hackathon winner managed to achieve a solution with an overall accuracy of 95 percent, which, in simple terms, means that a computer could accurately predict the outcome of a point 95 times out of 100.
“This is a significant step towards the real-time automation of context in tennis, something that would significantly increase the ability of tennis to bridge the data and analytics gap to other major international sports,” Dr Reid continued.
Dr Stephanie Kovalchik, a Research Fellow at Victoria University and Tennis Australia Senior Data Scientist, was impressed by the standard of the competitors.
“The hackathon was designed to attract machine learning scientists from around the world and enlist their help in solving a tennis-specific data problem. It was the first time such a competition had been held for the sport, and by holding it alongside the Australian Open we were able to maximise interest around the world,” Dr Kovalchik said.
“The winning solution is a high-quality tool that could be the first major step toward automating point calling in tennis. The competition showed us the potential value of data in tennis, and what amazing things can result when tennis aficionados with a gift for data analysis are given access to that data.”
CrowdANALYTIX Principal Data Scientist Mohan Singh called the Tennis Hackathon a great success.
“The AO to AI competition was among the top three contests we’ve ever hosted in terms of community engagement. It was a great success in terms of solver participation and model submissions, and the final models were varied in the approaches used,” he said.
Tennis Hackathon fast facts
- 750 data scientists and machine learners from 55 countries around the world analysed 10,000 points of Australian Open tracking data
- 223 participants hailed from India, by far the most represented country, followed by the USA (78) and Australia (51)
- 2731 solutions were submitted as part of the competition
- 90 percent of all participants competed as individuals
- The two most popular languages for solutions were R and Python
- The winning model achieved an overall accuracy of 95 percent – 98 percent for winners, 89 percent for forced errors and 95 percent for unforced errors
- A total prize pool of $US8500 was awarded as part of the competition – $US5000 for first place, $US2500 for second place and $US1000 for third place.
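The winning model’s headline numbers combine one overall accuracy with a separate figure for each of the three point outcomes. As a minimal sketch of how those two kinds of accuracy relate (using toy labels for illustration, not the actual hackathon data), the per-class figure is simply the recall restricted to points of that outcome:

```python
from collections import Counter

# Toy data: each point's true outcome and a model's prediction.
# Labels mirror the hackathon's three classes; values are illustrative only.
actual    = ["winner", "winner", "forced", "unforced", "unforced", "winner"]
predicted = ["winner", "winner", "unforced", "unforced", "unforced", "forced"]

def overall_accuracy(actual, predicted):
    # Fraction of all points whose outcome the model called correctly.
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def per_class_accuracy(actual, predicted, label):
    # Of the points that truly had this outcome, what fraction
    # did the model call correctly? (i.e. recall for one class)
    pairs = [(a, p) for a, p in zip(actual, predicted) if a == label]
    return sum(a == p for a, p in pairs) / len(pairs)

print(overall_accuracy(actual, predicted))                    # 4 of 6 correct
print(per_class_accuracy(actual, predicted, "winner"))        # 2 of 3 winners called
print(per_class_accuracy(actual, predicted, "unforced"))      # both unforced errors called
```

A 95 percent overall figure can thus sit alongside a lower 89 percent on forced errors: the overall number is a weighted blend of the per-class ones, weighted by how often each outcome occurs.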
The winning participant, American data scientist Scott Sobel, was fascinated by the challenge of developing a computer algorithm that could predict winners or errors using only the Hawk-Eye tracking data.
“Not only could it improve the efficiency and consistency of what is otherwise a manual recording of point outcomes, but the greater value of analytics is in providing data-driven insights into why – what were the top factors that characterise a winner? How can you maximise the chance of an error from your opponent? What do you need to work on as a player? It was exciting to add to the pioneering work of Tennis Australia’s Game Insight Group.”
Runner-up Nickil Maveli, a Data Scientist in Research & Development from India, had fun competing in the hackathon and was happy to have finished in second place.
“I entered the contest mainly because the dataset was tabular and small, of the order of a few thousand rows, and I wanted to experiment with different algorithms – both traditional machine learning and neural networks – for the later creation of ensembles. It was challenging to come up with a robust validation technique to evaluate this small dataset, and also to create new features based on domain knowledge.”
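Maveli’s point about validating a small tabular dataset is commonly addressed with stratified k-fold cross-validation, which keeps the class balance of the three outcomes in every fold so each fold is a fair miniature of the whole set. A hand-rolled sketch of the idea (an illustration of the general technique, not his actual solution):

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds preserve class balance --
    important when the dataset is only a few thousand rows."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    # Deal each class's indices round-robin across the k folds.
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for t in range(k):
        test = folds[t]
        train = [i for f, fold in enumerate(folds) if f != t for i in fold]
        yield train, test

# Toy labels standing in for the three point outcomes.
labels = ["winner"] * 50 + ["forced"] * 30 + ["unforced"] * 20
for train, test in stratified_kfold(labels, k=5):
    print(len(train), len(test))  # each fold: 80 train, 20 test
```

Each test fold here contains the same 50/30/20 class mix as the full dataset, so a model’s cross-validated accuracy is not distorted by an unlucky split of the rarer outcomes.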