Data Science at Dow Jones: Monetizing Data, News and Information – Juan Huerta ODSC Boston 2015
Data Science at Dow Jones: Monetizing Data, News and Information from odsc In this presentation I will describe the way Data Science supports the business of information and news at Dow Jones. Specifically, I will describe how we are introducing innovative and advanced large-scale information mining and analytic approaches... Read more
Big Data: Pig, Hive, Hadoop w/MapReduce – Gil Benghiat, Chris Bergh, Eric Estabrooks
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive from odsc The main objective of this workshop is to give the audience hands on experience with several Hadoop technologies and jump start their hadoop journey. In this workshop, you will load data and submit queries using Hadoop!... Read more
API Driven Development: How I Build Things and Why – Kenneth Reitz ODSC Boston 2015
API Driven Development from odsc An exposé on human-centered design, as related to data science and “medium data”. Examples of great API design will be showcased, as well as other end-user facing tools that can enable data scientists to share their observations with the world. Presenter Bio Kenneth Reitz... Read more
Bridging the Gap Between Data and Insight using Open-Source Tools – Nicholas Arcolano ODSC Boston 2015
Bridging the Gap Between Data and Insight using Open-Source Tools from odsc Despite the proliferation of open-source tools for analysis (such as Python and R) and those used for visualization (such as Javascript / D3), there often exist significant gaps between these areas, and those of us trying to... Read more
Vowpal Wabbit – Paul Mineiro ODSC Boston 2015
Vowpal Wabbit from odsc Vowpal Wabbit is both an open-source machine learning toolkit and an active research platform. In this talk I introduce Vowpal Wabbit, discuss some of the design decisions, and the types of problems for which VW is (or is not) a good fit. The talk includes... Read more
Monary: Really fast analysis with MongoDB and NumPy – Anna Herlihy ODSC Boston 2015
Monary from odsc “MongoDB is a scalable, flexible and easy to use way of storing large data sets. Python and NumPy provide a comprehensive toolkit for data analysis. Unfortunately they don’t work together as well as they could: the official Python driver for MongoDB, PyMongo, is inefficient at loading... Read more
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of Data-Parallel Graph Analytics (Application to Bioinformatics) – Brad Bebee ODSC Boston 2015
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of Data-Parallel Graph Analytics (Application to Bioinformatics) from odsc From social networks to protein networks to financial transactions, graphs are everywhere. Graph Analytics represent a key tool for data science to take advance of this type of network information. Many... Read more
Probabilistic Programming in Data Science – Thomas Wiecki ODSC Boston 2015
http://bit.ly/ThomasWieckiPresentation There exist a large number of metrics to evaluate the performance-risk trade-off of a portfolio. Although those metrics have proven to be useful tools in practice, most of them require a large amount of data and implicitly assume returns to be normally distributed. Bayesian modeling is a statistical... Read more
DIY Deep Learning with Caffe Workshop – Kate Saenko ODSC Boston 2015
DIY Deep Learning with Caffe Workshop from odsc Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Caffe’s expressive architecture encourages application and... Read more
High Performance Hardware for Data Analysis – Michael Pittaro ODSC Boston 2015
High Performance Hardware for Data Analysis from odsc Choosing hardware for big data analysis is difficult because of the many options and variables involved. The problem is more complicated when you need a full cluster for big data analytics. This session will cover the basic guidelines and architectural choices... Read more