Using Spark, Python, and Parquet for Loading Large Datasets – Douglas Eisenstein ODSC Boston 2015
Have you been in the situation where you’re about to start a new project and ask yourself, what’s the right tool for the job here? I’ve been in that situation many times and thought it might be useful to share...
Data Science at Dow Jones: Monetizing Data, News and Information – Juan Huerta ODSC Boston 2015
In this presentation I will describe the way Data Science supports the business of information and news at Dow Jones. Specifically, I will describe how we are introducing innovative and advanced large-scale information mining...
Big Data: Pig, Hive, Hadoop w/MapReduce – Gil Benghiat, Chris Bergh, Eric Estabrooks
The main objective of this workshop is to give the audience hands-on experience with several Hadoop technologies and jump-start their Hadoop journey. In this workshop, you will load data and submit...
API Driven Development: How I Build Things and Why – Kenneth Reitz ODSC Boston 2015
An exposé on human-centered design, as related to data science and “medium data”. Examples of great API design will be showcased, as well as other end-user-facing tools that can enable data scientists to share their observations with the world. Presenter...
Bridging the Gap Between Data and Insight using Open-Source Tools – Nicholas Arcolano ODSC Boston 2015
Despite the proliferation of open-source tools for analysis (such as Python and R) and those used for visualization (such as JavaScript / D3), there often exist significant gaps between these areas, and those of...
Vowpal Wabbit – Paul Mineiro ODSC Boston 2015
Vowpal Wabbit is both an open-source machine learning toolkit and an active research platform. In this talk I introduce Vowpal Wabbit, discuss some of the design decisions, and the types of problems for which VW is (or is not) a good fit....
Monary: Really fast analysis with MongoDB and NumPy – Anna Herlihy ODSC Boston 2015
“MongoDB is a scalable, flexible and easy to use way of storing large data sets. Python and NumPy provide a comprehensive toolkit for data analysis. Unfortunately they don’t work together as well as they could: the official Python driver for MongoDB, PyMongo, is...
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of Data-Parallel Graph Analytics (Application to Bioinformatics) – Brad Bebee ODSC Boston 2015
From social networks to protein networks to financial transactions, graphs are everywhere. Graph analytics represents a key tool for data science to take advantage of this type of...
Probabilistic Programming in Data Science – Thomas Wiecki ODSC Boston 2015
http://bit.ly/ThomasWieckiPresentation There exist a large number of metrics to evaluate the performance-risk trade-off of a portfolio. Although those metrics have proven to be useful tools in practice, most of them require a large amount of data and implicitly assume returns to be normally distributed. Bayesian modeling...
DIY Deep Learning with Caffe Workshop – Kate Saenko ODSC Boston 2015
Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Caffe’s expressive architecture...