Apply labels to a measures table
INFO:tensorflow:Enabling eager execution
INFO:tensorflow:Enabling v2 tensorshape
INFO:tensorflow:Enabling resource variables
INFO:tensorflow:Enabling tensor equality
INFO:tensorflow:Enabling control flow v2


A measures table can be labelled so that more advanced analytical methods can be applied, which include between-group tests, classification methods in machine learning, and more. The most used historical labelling heuristic is to split players according to the total amount they wager. The labelling module provides a way to do this for any fractional split (e.g. 80:20, 95:5...) via the label_split method.



top_split(measures_table, split_by, percentile=95, loud=False)

Labels player's behavioural measures according to their presence in the top percentile of a given measure (split_by). Label column will be 1 if in the top percentile, 0 if not. E.g. if split_by is 'total_wagered' and percentile is 95, players in the top 5% by total wagered will be labelled 1, and all others 0.


get_labelled_groups(labelled_measures_table, labelname)

Provides a simple way of splitting a labelled measures table into multiple tables each corresponding to a given label.

Descriptive Outputs

Labeled measures tables are a powerful tool in themselves, and can be used in the machine_learning module or used to describe each labeled group.



Computes the percentages of total amount wagered of groups of players by presence in the top percentages of the total amount wagered. E.g. how much does the top 5% of players by total amount wagered account for out of the total amount wagered by everyone. It currently computes this statement for the percentages {1,5,10,25,50}, and prints the statements to the console.