Jupyter, Zeppelin, Beaker: The Rise of the Notebooks
Standard software development practices for web, SaaS, and industrial environments tend to focus on maintainability, code quality, robustness, and performance. Scientific programming in data science is more concerned with exploration, experimentation, making demos, collaborating, and sharing results. It is this very need for experiments, explorations, and collaborations that is... Read more
Intro to Text mining using R
Abstract: Attendees will learn the foundations of text mining approaches as well as the basic text mining scripting functions used in R. The audience will learn what text mining is, then perform primary text mining tasks such as keyword scanning, dendrogram creation, and word cloud creation. Later participants will be able... Read more
Riding on Large Data with Scikit-learn
What’s a Large Data Set? A data set is said to be large when it exceeds 20% of the available RAM on a single machine. For a standard MacBook Pro with 8 GB of RAM, that corresponds to a meager dataset of roughly 2 GB, a size that is becoming more and more... Read more
Saul Diez-Guerra at ODSC Boston 2015
What We Learned While Teaching Python and Data Science: Pedagogy and lessons learned from teaching online introductory Python and Data Science courses. This is how we approached the matter, what we learned, and where we want to go next. Presenter Bio: Saul Diez-Guerra works as Engineering Lead at... Read more
Scikit-Learn for Easy Machine Learning: the Vision, the Tool, and the Project, from Gael Varoquaux. Scikit-learn is a popular machine learning tool. What can it do for you? Why would you want to use it? What can you... Read more
Lynn Root at ODSC Boston 2015
Metric-Driven Development: See the Forest for the Trees. At Spotify, my team struggled to be awesome. We had a very loose understanding of what product/service our squad was responsible for, and even less so of the expectations our internal and external customers had for those services. Other than “does... Read more
Wes McKinney at ODSC Boston 2015
DataFrames: The Extended Cut, from odsc. This talk will give an overview of data frame libraries and toolkits across most languages and systems in use for data science and analytics today. We’ll highlight strengths, weaknesses, and opportunities for community work. Presenter Bio: Wes McKinney... Read more
Big Data: Pig, Hive, Hadoop w/MapReduce – Gil Benghiat, Chris Bergh, Eric Estabrooks
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive, from odsc. The main objective of this workshop is to give the audience hands-on experience with several Hadoop technologies and jump-start their Hadoop journey. In this workshop, you will load data and submit queries using Hadoop!... Read more
Using Open Source Solutions in Sports Business Operations – Matthew Wills ODSC Boston 2015
Using Open Source Solutions in Sports Business Operations, from odsc. This presentation will give an overview of how the Grizzlies apply R to their sales and marketing business operations. From basic data manipulation to statistical modeling and enhanced visualization, the Grizzlies utilize R as a tool that efficiently positions... Read more