Handwritten digits recognition using TensorFlow with Python
Technology has progressed remarkably over the last 10 years. Every corner of the world is using the latest technologies to improve existing products while also conducting extensive research into inventing products that make the world a better place to live... Read more
RIDE – A New Data Science IDE for Python and R
The data science world is split into two parts: the (i)Python and the R community. Both groups offer a plethora of tools and libraries that enrich our work as data scientists. Interestingly, many of the offerings are complementary, such that professional data scientists should know both... Read more
Hello all and welcome to the second of the series – NLP with NLTK. The first of the series can be found here, in case you missed it. In this article we will talk about basic NLP concepts and use NLTK to implement them. Contents: Corpus... Read more
Dealing with arrays which are bigger than memory – an introduction to biggus
I often deal with huge gridded datasets which either stretch or exceed the limits of my computer’s memory. In the past I’ve implemented a couple of workarounds to help me handle this data and extract meaningful analyses from it. One of the most intuitive... Read more
New notebooks for Think Stats
Getting ready to teach Data Science in the spring, I am going back through Think Stats and updating the Jupyter notebooks. When I am done, each chapter will have a notebook that shows the examples from the book along with some small exercises, with more substantial... Read more
Streaming Video Analysis in Python
This was originally posted on the Silicon Valley Data Science blog by Matthew Rubashkin, Data Engineer at SVDS, and Colin Higgins, Data Scientist at Vevo. At SVDS we have analyzed Caltrain delays in an effort to use real-time, publicly available data to improve Caltrain arrival predictions.... Read more
Dropout with Theano
Almost everyone working with Deep Learning will have heard at least a little about Dropout. Albeit a simple concept (introduced a couple of years ago), which sounds like a pretty obvious way of model averaging, further resulting in a more generalized and regularized Neural Net; still when you... Read more
How the Logistic Regression Model Works in Machine Learning
In this article, we are going to learn how the logistic regression model works in machine learning. The logistic regression model is one member of the supervised classification algorithm family. The building block concepts of logistic regression can be helpful in deep learning while building the... Read more
Maps and sets can have quadratic-time performance
Swift is a new programming language launched by Apple slightly over two years ago. Like C and C++, it offers ahead-of-time compilation to native code but with many modern features. It is available on Linux and macOS. Like C++, Swift comes complete with its own... Read more
An Introduction to Object Oriented Data Science in Python
A lot of focus in the data science community is on reducing the complexity and time involved in data gathering, cleaning, and organization. This article discusses how object-oriented design techniques from software engineering can be used to reduce coding overhead and create robust, reusable data... Read more