Learn the four-step structure of all gamba research

Using the gamba library can be as simple or as complicated as you want it to be. This tutorial walks through the simplest possible analysis and introduces the four-step process behind all studies.

1. Import the gamba library

To use the gamba library's functions in your notebook or script, import it using the following line:

import gamba as gb

This imports the library and names it gb, so you can access all the methods by using gb.some_method() instead of having to write gamba.some_method() every time.

2. Load in transaction data

The next step is to load in some transaction data. This should be a CSV file with specific column headings (see the gamba.data module for details). In this tutorial we'll use a small example data set derived from cryptocurrency transactions, since they're publicly available. Load the data set using the read_csv method from the data module, which takes both the path to the file and the names of the datetime columns:

player_bets = gb.data.read_csv('/home/ojs/Data/osf_data/raw_data_part_4.csv', ['bet_time','payout_time','duration'])
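To illustrate what passing datetime column names accomplishes, here is a minimal sketch using plain pandas rather than gamba. It is not gamba's implementation of read_csv. The rows are made up, and only bet_time and payout_time are column names taken from this tutorial; player_id, bet_size and payout_size are hypothetical.

```python
import io

import pandas as pd

# Made-up transactions standing in for a CSV file on disk.
# player_id, bet_size and payout_size are hypothetical column names.
csv_text = """player_id,bet_time,payout_time,bet_size,payout_size
p1,2020-01-01 10:00:00,2020-01-01 10:00:05,1.0,0.0
p1,2020-01-03 12:30:00,2020-01-03 12:30:04,2.0,3.0
p2,2020-01-02 09:15:00,2020-01-02 09:15:03,0.5,0.0
"""

# Columns listed here are parsed into datetimes instead of being left as strings.
datetime_columns = ["bet_time", "payout_time"]
player_bets = pd.read_csv(io.StringIO(csv_text), parse_dates=datetime_columns)

# The datetime columns now support date arithmetic and grouping by day.
print(player_bets.dtypes)
```

Parsing the timestamps at load time matters because time-based measures such as duration and frequency depend on comparing dates, not strings.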

Now that the player bets are loaded into memory, they can be used to create a measures table. This is the most important part of the gamba library's design because it is the middle step between raw data and analysis: almost all of the analytical methods accept a measures table as input!

3. Create a measures table

To create a measures table like those used in previous studies, the library contains several custom methods. One of these is calculate_labrie_measures in the measures module, which creates a behavioural fingerprint of each player using measures such as duration, frequency, and number of bets:

measures_table = gb.measures.calculate_labrie_measures(player_bets, daily=False)
100%|██████████| 1694/1694 [01:05<00:00, 25.71it/s]

After the loading bar completes, the measures table is ready to use. Saving it to disk is a good idea (especially for large datasets), but we cover that in the next tutorial. With the measures table computed, the final step is analysis.
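To make the idea of a measures table concrete, the sketch below builds one row per player from a handful of made-up bets using pandas. It is not gamba's calculate_labrie_measures. The function name sketch_measures, the column names player_id, bet_size and payout_size, and the exact definitions used (duration as the inclusive number of days between first and last bet, frequency as the percentage of those days with at least one bet) are all assumptions for illustration.

```python
import pandas as pd

# Made-up bets; column names other than bet_time are hypothetical.
bets = pd.DataFrame({
    "player_id": ["p1", "p1", "p1", "p2"],
    "bet_time": pd.to_datetime([
        "2020-01-01 10:00", "2020-01-01 11:00",
        "2020-01-04 09:00", "2020-01-02 15:00",
    ]),
    "bet_size": [1.0, 2.0, 3.0, 0.5],
    "payout_size": [0.0, 3.0, 0.0, 0.0],
})


def sketch_measures(player_bets: pd.DataFrame) -> pd.DataFrame:
    """Return one row of behavioural measures per player (assumed definitions)."""
    rows = []
    for player_id, group in player_bets.groupby("player_id"):
        days = group["bet_time"].dt.normalize()  # drop the time of day
        duration = (days.max() - days.min()).days + 1  # inclusive day count
        active_days = days.nunique()
        total_wagered = group["bet_size"].sum()
        net_loss = total_wagered - group["payout_size"].sum()
        rows.append({
            "player_id": player_id,
            "duration": duration,
            "frequency": 100 * active_days / duration,
            "num_bets": len(group),
            "mean_bets_per_day": len(group) / active_days,
            "mean_bet_size": total_wagered / len(group),
            "total_wagered": total_wagered,
            "net_loss": net_loss,
            "percent_loss": 100 * net_loss / total_wagered,
        })
    return pd.DataFrame(rows).set_index("player_id")


sketch_table = sketch_measures(bets)
print(sketch_table)
```

Whatever the exact definitions, the shape is the point: many raw bets per player go in, and a single row of summary measures per player comes out.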

4. Run some analysis

For this basic tutorial, we'll simply describe the measures table, which has been the core analysis in several older studies. To do this, the statistics module contains a descriptive_table method. Calling it on the measures table gives the mean, standard deviation, and median of each measure:

                          mean          std       median
duration             24.747934    67.273759     1.000000
frequency            77.163874    35.429478   100.000000
num_bets            188.289847   887.705178    13.000000
mean_bets_per_day    25.851029    48.710538     7.000000
mean_bet_size         0.504854     2.325948     0.092696
total_wagered       169.065085  2566.957588     1.285000
net_loss              2.722660    56.550746     0.030138
percent_loss         11.172689    55.112935     5.736237
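A table of this shape can be sketched with plain pandas, which may help when reading or customising the output. This is an illustration on a made-up three-player measures table, not gamba's descriptive_table implementation.

```python
import pandas as pd

# Made-up measures for three hypothetical players.
example_measures = pd.DataFrame(
    {
        "duration": [4, 1, 10],
        "frequency": [50.0, 100.0, 30.0],
        "num_bets": [3, 1, 20],
    },
    index=["p1", "p2", "p3"],
)

# Aggregate each measure across players, then transpose so that
# measures are rows and statistics are columns.
descriptive = example_measures.agg(["mean", "std", "median"]).T
print(descriptive)
```

A large gap between the mean and the median, as seen for num_bets and total_wagered in the table above, indicates a heavily skewed measure, where a few very active players pull the mean upwards.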

That's all there is to it! This four-step template is the foundation for almost every player behaviour tracking study that has been done since the start of the field, and it invites customisation in steps 3 and 4 for more sophisticated analyses.

Important questions to think about are:

  1. which behavioural measures to use
  2. how to best describe these measures
  3. which intermediate features could be applied
  4. which statistical tests make sense for your experimental setting

In the intermediate tutorial you'll learn some more advanced techniques, and cover some simple visualisation.