Weather-based analytics at the SAS Hackathon

The SAS Hackathon demonstrated the growing interest in analytics models and processes that correlate sales data with weather forecasts to support data-driven decision-making. It concretely recognized weather as one of the main drivers behind human choices, especially shopping decisions.

The hackathon

On April 6 and 7, SAS organized a hackathon for university students, challenging them to develop a full data mining process using the SAS Visual Data Mining and Machine Learning tool.

At the competition, 70 students divided into 19 teams tested their skills in Statistics, Computer Science, Machine Learning and Economics.

The hackathon took place over two days of competition. The goal of the first day was to develop a complete “data mining” process using the SAS tool. The problem, which required among other things the creation of a forecasting model that relates winds and weather forecasts, is described in more detail below. The groups were then evaluated on the accuracy of the system they implemented, and the top ten groups progressed to the next stage.

The second day of the hackathon focused mainly on the principles and tools of data visualization, requiring the participants to create an effective presentation of the results obtained with the process developed during the first day. The jury was composed of a mixed group of marketing experts, managers and academics.

The problem

The competing groups were asked to create a model that would explain and predict the interest of the female audience in a given football match.

In particular, on behalf of AC-Milan, participants were asked to predict, one week before a match takes place, whether at least 15% of the tickets for a given season, match and stadium sector would be sold to women.
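The prediction target described above can be sketched as a yes/no label per season, match and sector. The snippet below is only an illustration: the record layout, field order and sample rows are hypothetical and are not the actual AC-Milan data or the SAS process used in the competition.

```python
from collections import defaultdict

# Hypothetical ticket transactions: (season, match_id, sector, buyer_gender).
# These rows are made up for illustration; they are not real sales data.
transactions = [
    ("2017-2018", "M01", "A", "F"),
    ("2017-2018", "M01", "A", "M"),
    ("2017-2018", "M01", "A", "M"),
    ("2017-2018", "M01", "A", "M"),
    ("2017-2018", "M01", "B", "M"),
    ("2017-2018", "M01", "B", "M"),
]

THRESHOLD = 0.15  # "at least 15% of tickets", as stated in the problem


def label_sectors(rows, threshold=THRESHOLD):
    """Return {(season, match, sector): True/False} for the 15% target."""
    total = defaultdict(int)
    women = defaultdict(int)
    for season, match_id, sector, gender in rows:
        key = (season, match_id, sector)
        total[key] += 1
        if gender == "F":
            women[key] += 1
    # A sector is labelled True when the share of tickets sold to women
    # reaches the threshold.
    return {key: women[key] / total[key] >= threshold for key in total}


labels = label_sectors(transactions)
```

Here `labels` maps each (season, match, sector) key to the binary outcome that a classifier would be trained to predict.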

The following data were made available for the development and training of the model:

  • Transactions for ticket purchases for AC-Milan championship matches
  • The calendar of matches with date, time, opposing team, score and day of the season
  • The league ranking with day and scores for each team
  • The weather forecasts from 2012 to 2018

The historical data set (about 500 MB) covered the seasons from 2012-2013 to the current one (2017-2018).

The resulting model showed that the percentage of tickets sold to women for a given match depends heavily on the percentage sold to women the previous week. The importance of the opposing team also plays a significant role in the prediction. Finally, interesting correlations emerged between tickets sold and weather forecasts (especially average temperature, average wind speed, and visibility).
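As a rough illustration of the two kinds of signal mentioned above, the sketch below builds a "previous share" feature and computes a Pearson correlation between two series. The function names and the sample values are hypothetical; this is not the winning team's model, and the numbers are not results from the competition.

```python
import math


def add_lag_feature(shares):
    """Pair each (match, share) with the share of the preceding match.

    `shares` must be in chronological order; the first row has no
    predecessor, so its lagged value is None.
    """
    rows = []
    previous = None
    for match_id, share in shares:
        rows.append({"match": match_id, "share": share, "prev_share": previous})
        previous = share
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up shares of tickets sold to women, in chronological order.
rows = add_lag_feature([("M01", 0.12), ("M02", 0.18), ("M03", 0.16)])
```

In practice, a series such as the women's share per match could be passed to `pearson` together with a weather variable (for example, average temperature) to check how strongly the two move together.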

The winners (with a bit of EW-Shopp taste)

Renzo Arturo Alva Principe was one of the 70 students, and his team, the Data Byters, was one of the three winners. He is a second-year student of the Master in Computer Science, the principal developer of ABSTAT, and is involved in various ways in the EW-Shopp and EUBusinessGraph projects.

The other members of the winning group are Marco Capobussi, Marco Valzelli and Luca Pieri, all of them second-year students of the Master in Computer Science at the University of Milan-Bicocca.

The Data Byters, together with the two other winners, will also have the opportunity to present their project at SAS Forum Milan on 15 May 2018.
