The Art of Data Science – Josh Wills ODSC Boston 2015
Keynote Presenter Bio: Josh Wills is Cloudera’s Senior Director of Data Science, working with customers and engineers to develop Hadoop-based solutions across a wide range of industries. He is the founder and VP of the Apache Crunch project for creating optimized MapReduce pipelines... Read more
Frontiers of Open Data Science Research – Ani Aghababyan ODSC Boston 2015
Keynote Presenter Bio: Ani loves writing about herself in the third person and has written this all-true bio. Ani is a Data Scientist in the Digital Platforms Group at McGraw-Hill Education. She has a diverse educational background (some say she... Read more
Machine Learning for a Pet Insurance Company – TJ Houk & David Jaw ODSC Boston 2015
As an insurance company, we receive a monthly premium from policyholders and, in return, we pay claims on veterinary bills. Insurance risk for pet health is relatively uncharted territory; identifying key patterns can affect the company in a big... Read more
Feature Engineering – David Epstein ODSC Boston 2015
One of the most important, yet often overlooked, aspects of predictive modeling is the transformation of data to create model inputs, better known as feature engineering (FE). This talk will go into the theoretical background behind FE, showing how it leverages existing data to produce... Read more
A Hybrid Approach to Data Science Project Management – Elaine Lee ODSC Boston 2015
In recent years, Data Science has evolved into its own profession in response to the proliferation of data that needed to be analyzed and made actionable, a job that could not be adequately addressed by any single one... Read more
Guest Blogger Cezary Podkul discusses Keeping Governments Accountable with Open Data Science Blog Post
Doing open data science on government financials is not easy. A lot of the info is not, well, open. The good news is that data on government spending, borrowing, pensions and the like exists, but often lies hidden in bulky PDFs that are difficult to work with. In my... Read more
On Demand Analytic and Learning Environments with Jupyter – Kyle Kelley and Andrew Odewahn ODSC Boston 2015
http://bit.ly/Odewahn_KelleyPresentation The Jupyter/IPython project has been building systems that let groups of users work on a shared system within their team or lab, or for a wide web audience. These include the multi-user server JupyterHub, the temporary notebook system (tmpnb), blossoming Google Drive integration (jupyter-drive), notebook spawning in... Read more
Machine Learning for Suits – Rahul Dave ODSC Boston 2015
You will learn the basic concepts of machine learning – such as modeling, model selection, loss or profit, overfitting, and validation – in a non-mathematical way, so that you can ask for data analysis and interpret the results of a model in the... Read more
Recurrent Neural Networks for Text Analysis – Alec Radford ODSC Boston 2015
Recurrent Neural Networks hold great promise as general sequence learning algorithms. As such, they are a very promising tool for text analysis. However, outside of very specific use cases such as handwriting recognition and, recently, machine translation, they have not seen... Read more
Probabilistic Programming in Data Science – Thomas Wiecki ODSC Boston 2015
http://bit.ly/ThomasWieckiPresentation There exist a large number of metrics to evaluate the performance-risk trade-off of a portfolio. Although those metrics have proven to be useful tools in practice, most of them require a large amount of data and implicitly assume returns to be normally distributed. Bayesian modeling is a statistical... Read more