Bridging the Gap Between Data and Insight using Open-Source Tools – Nicholas Arcolano ODSC Boston 2015
Despite the proliferation of open-source tools for analysis (such as Python and R) and for visualization (such as JavaScript/D3), there often exist significant gaps between these areas, and those of...
Learning to Love Bayesian Statistics – Allen Downey ODSC Boston 2015
http://tinyurl.com/lovebayes Bayesian statistical methods provide powerful tools for answering questions and making decisions. For example, the result of Bayesian analysis is a set of values and probabilities that can be fed directly into a cost-benefit analysis, which is not possible with conventional statistics. But there are...
Data Workflows for Iteration, Collaboration, and Reproducibility – David Chudzicki ODSC Boston 2015
http://www.davidchudzicki.com/slides/odsc-2015-workflow/ For other data scientists to improve, build on, or even just trust your analysis, they need to be able to reproduce it. Even if you have shared code and data, reproducing your analysis may be difficult: which code was executed against which data in what...
Predictive Modeling Workshop – Max Kuhn ODSC Boston 2015
The workshop is an overview of creating predictive models using R. An example data set will be used to demonstrate a typical workflow: data splitting, pre-processing, model tuning and evaluation. Several R packages will be shown along with the caret package...
Making R Go Faster and Bigger – Jared Lander ODSC Boston 2015
http://bit.ly/JaredLanderPresentation The features of R that make it easy to use (dynamic typing, in-memory analysis, the interpreter engine, and the REPL) can also slow it down. Fortunately, the R Core Team has made dramatic improvements in recent years with better memory management and faster interpretation of code. We look...
Probabilistic Programming in Data Science – Thomas Wiecki ODSC Boston 2015
http://bit.ly/ThomasWieckiPresentation There exist a large number of metrics to evaluate the performance-risk trade-off of a portfolio. Although those metrics have proven to be useful tools in practice, most of them require a large amount of data and implicitly assume returns to be normally distributed. Bayesian modeling...
Recurrent Neural Networks for Text Analysis – Alec Radford ODSC Boston 2015
Recurrent Neural Networks hold great promise as general sequence learning algorithms. As such, they are a very promising tool for text analysis. However, outside of very specific use cases such as handwriting recognition and, more recently, machine translation, they...
Machine Learning for Suits – Rahul Dave ODSC Boston 2015
You will learn the basic concepts of machine learning, such as modeling, model selection, loss or profit, overfitting, and validation, in a non-mathematical way, so that you can ask for data analysis and interpret the results of a...
On Demand Analytic and Learning Environments with Jupyter – Kyle Kelley and Andrew Odewahn ODSC Boston 2015
http://bit.ly/Odewahn_KelleyPresentation The Jupyter/IPython project has been building systems that let groups of users work on a shared system within their team or lab, or across a wide web audience. There is the multi-user server JupyterHub, the temporary notebook system (tmpnb), blossoming Google Drive integration (jupyter-drive),...
Adventures in Using R to Teach Mathematics – Paul Bamberg ODSC Boston 2015
In 2014 I launched a new course, "Mathematical Foundations of Statistical Software," at the Harvard Extension School, aimed at students with a solid background in calculus. Lectures were a mixture of proofs and R scripts, all homework...