ODSC West 2015 | Jonathan Dinu – “Hands On with D3.js: Civic Impact”

Abstract: As a Data { Scientist | Artist | Engineer } you have incredible potential to impact the world at large, but how much change you can effect often depends on how well you can communicate. This 2-hour workshop will focus on empowering civic-minded developers and data scientists to leverage a wealth of […]

ODSC West 2015 | Nicole White – “Fundamentals of Neo4j”

Abstract: This workshop will establish the necessary skills for getting up and running with Neo4j. We assume no prior knowledge of graph databases. We’ll first cover the differences between graph databases and relational databases, highlighting specific use cases where a graph offers advantages over tables. The bulk of the workshop is then centered around learning […]

Workflows in Python: Pipeline & GridSearchCV

Abstract: This workshop will motivate and demonstrate pipelines and grid search cross-validation in sklearn as tools for building a robust, flexible and well-organized workflow in data science projects. Most data science projects are characterized by a proliferation of options: what machine learning algorithms to use, how to tune their parameters, whether to do feature transforms like […]

ODSC West 2015 | Wes McKinney – “Ibis: Scalable Python Analytics on Hadoop and SQL Engines”

Abstract: While Python is a de facto language for modern data engineering and data science, Python development has been confined to local data processing, thereby limiting its users to smaller data sets. Bio: Wes McKinney is the main author of pandas, the popular open source Python library for data analysis. He is an active speaker and participant […]

Juliet Hougland – PySpark Best Practices

Abstract: PySpark (a component of Spark that allows users to write their code in Python) has grabbed the attention of Python programmers who analyze and process data for a living. The appeal is obvious: you don’t need to learn a new language, and you still have access to modules (e.g., pandas, nltk, statsmodels, etc.) that you are familiar […]

Richard Socher – Deep Learning for the Enterprise

Abstract: Deep Learning has revolutionized several industries with its state-of-the-art results in speech recognition, image classification and natural language understanding. I will introduce an easy-to-use enterprise solution based on MetaMind’s platform for a wide range of tasks that can now be automated. Examples include image classification for e-commerce and consumer […]

Scott Draves – Polyglot Beaker Notebook

Abstract: The Beaker Notebook is a new open source tool for collaborative data science. Beaker has an innovative UI and unique architecture to make it easier for novices to get started, and enable experts to work faster. Like IPython, Beaker uses a notebook-based UI metaphor, but Beaker was designed to be polyglot from the ground […]

Michael Li – Data Driven Hiring of Data Scientists

Abstract: Hiring, even for data scientists, is often not very data-driven. At The Data Incubator, we run a fellowship to train and place data scientists in [the medical and pharmaceutical / financial] industry that regularly receives over 3000 applications per session. To cope, we have to rely on robust analytics and machine-learning […]

Fidan Boylu & Muxi Li – Cortana Analytics

Abstract: In this tutorial, you will create an end-to-end predictive model based on the extensive library of machine learning algorithms included in Microsoft Azure Machine Learning Studio with its R and Python language extensibility. You will deploy and consume your models and use them for making predictions over data. You will be walking through the […]