Partial Replication on dummy data

This notebook describes the analysis of behavioural player data performed in the 2021 study by Auer & Griffiths titled "Reasons for Gambling and Problem Gambling Among Norwegian Horse Bettors: A Real-World Study Utilizing Combining Survey Data and Behavioural Player Data". Whilst the data for the study has not been made available, the analytical method is well defined, so it is implemented here using the gamba library on alternative data.

Note that this study looks at both transaction data and survey data. As the gamba library focusses on transaction analysis, only that part of the paper is replicated here, specifically Table 4.

As always, the first step is to import the gamba library:

import gamba as gb

Next, we load the data set. As mentioned above, this data is not the same as that used in the study. If the original data becomes available it will be substituted here, but for now a chunk of the cryptocurrency transaction set available here can be used as a placeholder.

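# the call was truncated in the source; read_csv is assumed to live in gamba's data module
player_bets = gb.data.read_csv('/home/ojs/Data/osf_data/raw_data_part_4.csv', ['bet_time','payout_time','duration'])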
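As a rough illustration of why datetime columns are named at load time, the sketch below uses only the Python standard library rather than gamba or pandas. The rows are made up, the `load_bets` helper is hypothetical, and only the column names echo the call above. Values read from a CSV start out as text, so they need converting before any time arithmetic is possible.

```python
import csv
import io
from datetime import datetime

# Made-up rows for illustration; only the column names echo the call above.
raw = io.StringIO(
    "player_id,bet_time,payout_time,bet_size\n"
    "a,2021-01-01 10:00:00,2021-01-01 10:00:30,5.0\n"
    "a,2021-01-02 09:15:00,2021-01-02 09:16:00,2.5\n"
)

datetime_columns = ["bet_time", "payout_time"]


def load_bets(handle, datetime_columns):
    """Read CSV rows, converting the named columns from text to datetime."""
    rows = []
    for row in csv.DictReader(handle):
        for column in datetime_columns:
            row[column] = datetime.fromisoformat(row[column])
        rows.append(row)
    return rows


bets = load_bets(raw, datetime_columns)

# With real datetime values, time arithmetic works directly.
first_wait = bets[0]["payout_time"] - bets[0]["bet_time"]
```

Columns not named in `datetime_columns` (here `bet_size`) remain strings in this sketch; a full loader would convert those as well.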

The second parameter of the read_csv method lists the variables in the data which hold datetime values (see Pandas read_csv for details). With the player bets loaded, the next step is to create a measures table, which can be done using the create_measures_table method from the measures module. As this paper uses horse race betting data, some of its variables are context specific (online only and horseraces only). The table below lists the measures described in the paper alongside the names of those computed in this notebook in their place.

| In-Paper Text Description | Measures Module Method |
| --- | --- |
| amount of money bet | total_wagered |
| amount of money bet online | n/a (context specific derivative of total_wagered) |
| amount of money bet on Norwegian horseraces | n/a (context specific derivative of total_wagered) |
| amount of money won | net_loss (functionally identical) |
| number of playing days | frequency_raw |
| number of games played | number_of_bets (table in paper uses 'number of tickets') |
| bet conscious play | n/a (context specific derivative of total_wagered) |

To create the measures table, the measures used can be defined as a list of names of methods in the measures module. This is then passed to the create_measures_table method which computes them across the player bets.

measures_used = ['total_wagered','net_loss','frequency_raw','number_of_bets']
measures_table = gb.measures.create_measures_table(player_bets, measures_used)
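To give a feel for what such a table contains, here is a minimal standard-library sketch that computes four similarly named quantities per player. It is not gamba's implementation: the tuple layout, the `sketch_measures_table` helper, and the exact definitions (net loss as wagers minus payouts, frequency as the count of distinct betting days) are assumptions made for illustration, and the bets are made up.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical bets: (player_id, bet_time, bet_size, payout_size).
bets = [
    ("a", datetime(2021, 1, 1, 10, 0), 5.0, 0.0),
    ("a", datetime(2021, 1, 1, 18, 30), 5.0, 12.0),
    ("a", datetime(2021, 1, 3, 9, 0), 2.0, 0.0),
    ("b", datetime(2021, 1, 2, 20, 0), 10.0, 4.0),
]


def sketch_measures_table(bets):
    """Return {player_id: {measure_name: value}} for four simple measures."""
    by_player = defaultdict(list)
    for player_id, bet_time, bet_size, payout_size in bets:
        by_player[player_id].append((bet_time, bet_size, payout_size))

    table = {}
    for player_id, rows in by_player.items():
        wagered = sum(bet for _, bet, _ in rows)
        paid_out = sum(payout for _, _, payout in rows)
        table[player_id] = {
            # Assumed definitions, chosen to mirror the measure names above.
            "total_wagered": wagered,
            "net_loss": wagered - paid_out,
            "frequency_raw": len({time.date() for time, _, _ in rows}),
            "number_of_bets": len(rows),
        }
    return table


measures = sketch_measures_table(bets)
```

Each player ends up with one row of values, which is the shape a correlation analysis across players needs.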

The final step of this replication is to compute Spearman correlations between each pair of behavioural measures. As this replication only concerns the transaction data component of the study, the resulting table is much smaller than the original, but it can be expanded by adding further columns to the measures table.

spearman_coefficient_table = gb.statistics.spearmans_r(measures_table)
                total_wagered  net_loss  frequency_raw  number_of_bets
total_wagered   -
net_loss        0.17**         -
frequency_raw   0.83**         0.13**    -
number_of_bets  0.86**         0.14**    0.97**         -
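For readers curious about what a single coefficient in this table involves, the sketch below computes Spearman's rho as the Pearson correlation of rank-transformed values, with tied values given the average of their ranks. It uses only the standard library and made-up numbers; the function names are hypothetical, it is not gamba's spearmans_r, and it does not compute the significance levels marked by the asterisks.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-player values for two measures.
total_wagered = [12.0, 10.0, 50.0, 3.0, 27.0]
number_of_bets = [3, 1, 9, 2, 7]
rho = spearman(total_wagered, number_of_bets)
```

Because only ranks enter the calculation, the coefficient is unaffected by the heavy right skew that is typical of wagering amounts.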

That's it! It's not a huge output given only part of the study is in the realm of transaction analysis, but it's an extendable example of how to do this kind of analysis. To learn more about extending a measures table, follow the Advanced Tutorial.