Rob Haslinger at BDF 2015 | “Bayes_logistic: A python package for Bayesian logistic regression”


Abstract: Logistic regression is one of the most commonly used classification/probability models for binary data. Most logistic regression packages include an option (L2 regularization) for biasing fitted parameters towards zero to prevent over-fitting. From a Bayesian perspective, L2 regularization imposes a Gaussian prior with zero mean and identical variance on each fitted parameter. A fully Bayesian treatment instead allows the fitted parameters to be biased towards arbitrary values (with arbitrary uncertainty/variance) that best reflect one’s prior beliefs. Here we (MaxPoint Interactive) present a Python package, ‘bayes_logistic’, which implements fully Bayesian logistic regression under a Laplace (Gaussian) approximation to the posterior. We will explain the mathematics behind the algorithms and then present a number of interesting use cases, including online updating of logistic regression, weighting of data points, and use of the posterior variance for variable selection and for fitting sparse models without L1 regularization, which can be computationally intensive.
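To make the idea concrete, here is a minimal sketch of Bayesian logistic regression under the Laplace approximation, written directly with NumPy/SciPy rather than against the ‘bayes_logistic’ API itself (the function names below are illustrative, not the package’s). The posterior is approximated as a Gaussian centered at the MAP estimate, with precision equal to the Hessian of the negative log-posterior there; online updating then falls out naturally, since yesterday’s approximate posterior becomes today’s prior.

```python
import numpy as np
from scipy.optimize import minimize

def fit_laplace(X, y, w_prior, H_prior):
    """MAP fit of logistic regression with Gaussian prior N(w_prior, H_prior^-1).

    Returns the MAP weights and the Hessian of the negative log-posterior
    at the MAP (the posterior precision under the Laplace approximation).
    Illustrative sketch -- not the bayes_logistic package's actual API.
    """
    def neg_log_post(w):
        z = X @ w
        # Bernoulli log-likelihood, via logaddexp for numerical stability
        ll = y @ z - np.logaddexp(0.0, z).sum()
        dw = w - w_prior
        return -ll + 0.5 * dw @ H_prior @ dw

    def grad(w):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        return -(X.T @ (y - p)) + H_prior @ (w - w_prior)

    w_map = minimize(neg_log_post, w_prior, jac=grad, method="BFGS").x
    p = 1.0 / (1.0 + np.exp(-(X @ w_map)))
    # Posterior precision = likelihood Hessian + prior precision
    H_post = X.T @ (X * (p * (1 - p))[:, None]) + H_prior
    return w_map, H_post

# Synthetic binary data from known weights
rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0])
X = rng.normal(size=(500, 2))
y = (rng.random(500) < 1.0 / (1.0 + np.exp(-(X @ w_true)))).astype(float)

# L2 regularization corresponds to this special case: zero-mean prior,
# identical (here unit) precision on every parameter.
w0, H0 = np.zeros(2), np.eye(2)
w1, H1 = fit_laplace(X[:250], y[:250], w0, H0)

# Online updating: reuse the first batch's posterior as the new prior.
w2, H2 = fit_laplace(X[250:], y[250:], w1, H1)
```

Note that each update only ever touches the current batch of data, and the diagonal of the posterior precision grows with every batch — the shrinking posterior variances are what the talk proposes to exploit for variable selection.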

Bio: Rob Haslinger is a Senior Data Scientist at MaxPoint Interactive. A PhD-trained physicist, Rob previously researched complex systems at both the Santa Fe Institute and Los Alamos’s Center for Nonlinear Studies before spending ten years using statistical and machine learning techniques to research neural coding and information processing at MIT’s Department of Brain and Cognitive Sciences and Massachusetts General Hospital. Currently, at MaxPoint, Rob is leading long-term research efforts to optimize real-time bidding algorithms for online ad placement that can handle petabytes of data.