The gamba library contains several wrapper methods for commonly applied machine learning methods. Whilst it would be impossible to cover every machine learning method used in existing literature, the gamba library aims to provide a number of examples so that if you have a particular method in mind, some variant or similar implementation is available for you to extend.
Logistic regression techniques can map simple relationships between variables, but typically become untenable in more complex applications. As with the clustering techniques, the gamba library contains wrappers around the sklearn library's logistic regression functions, offering simple methods to run analyses without having to interact with the sklearn library directly (although you should for advanced techniques!).
SVM's are another type of machine learning algorithm but have seen limited use in existing work, but are included as templates to see how they could be used as part of an analysis using the gamba library. The three SVM methods implemented include those used in Philander's 2014 study.
Random forests are collections of decision trees which are applied en-mass to classification and regression problems. The gamba library implements both techniques in a very basic form, which follow a similar structure to other machine learning methods in the library. As with many other methods on this page, the gamba library simply contains wrappers for the scikit-learn library, which should be used directly for more advanced analyses!
Performance measures of many machine learning methods can be derived from the values they predict and the actual values of a test dataset. The compute_performance
function below is useful for quick estimates of performance, but you should refer to the measures appropriate to the specific method you're implementing for advanced analyses.
Plotting
Some machine learning outputs favour plotting over raw tabular presentation. An example of this is a hierarchical clustering dendrogram, which is much more intuitive than a cluster membership report. The gamba library contains several methods for plotting the outputs of machine learning models;
import scipy.cluster.hierarchy as sch
import scipy.ndimage.filters as snf
import matplotlib.pyplot as plt