What is knyfe?

Knyfe is a Python utility for rapid exploration of datasets. Use it when you have a dataset and want to get a feel for how it is composed, run some simple tests on it, or prepare it for further processing. The great thing about knyfe is that you don’t have to know […]

Classifying segmented strokes as characters – Part 3 of an XKCD font saga

In part two of my XKCD font saga I was able to separate strokes from the XKCD handwriting dataset into many smaller images. I also handled the easier cases of merging some of the strokes back together – I particularly focussed on “dotty” or “liney” type glyphs, such as i, !, % and =. Now […]

Segment, extract, and combine features of an image with SciPy and scikit-image – Part 2 of an XKCD font saga

In part one of the XKCD font saga I gave some background on the XKCD handwriting dataset, and took an initial look at image segmentation in order to extract the individual strokes from the scanned image. In this installment, I will apply the technique from part one and attempt to merge together strokes to […]

Python as a way of thinking

This article contains supporting material for this blog post at Scientific American. The thesis of the post is that modern programming languages (like Python) are qualitatively different from the first generation (like FORTRAN and C), in ways that make them effective tools for teaching, learning, exploring, and thinking. I presented a longer version of this argument […]

Streaming Video Analysis in Python

This was originally posted on the Silicon Valley Data Science blog by authors Matthew Rubashkin, Data Engineer at SVDS, and Colin Higgins, Data Scientist at Vevo. At SVDS we have analyzed Caltrain delays in an effort to use real-time, publicly available data to improve Caltrain arrival predictions. However, the station-arrival time data from Caltrain was not […]

Faster deep learning with GPUs and Theano

Originally posted by Manojit Nandi, Data Scientist at STEALTHbits Technologies, on the Domino data science blog. Domino recently added support for GPU instances. To celebrate this release, I will show you how to configure the Python library Theano to use the GPU for computation, and how to build and train neural networks in Python. Using the GPU, I’ll […]

Dropout with Theano

Almost everyone working with Deep Learning will have heard at least a little about Dropout. It is a simple concept (introduced a couple of years ago) that sounds like a fairly obvious way to do model averaging, resulting in a more generalized and regularized neural net. Still, when you actually get into the nitty-gritty details of implementing it […]

How the Multinomial Logistic Regression Model Works

Among supervised classification algorithms, the logistic regression model is usually the first one to try. The algorithm is itself divided into categories, based purely on the number of target classes. If the logistic regression model is used for binary classification problems, it’s known as the […]

How the Logistic Regression Model Works in Machine Learning

In this article, we are going to learn how the logistic regression model works in machine learning. The logistic regression model is one member of the supervised classification algorithm family. The building-block concepts of logistic regression can also be helpful in deep learning when building neural networks. The logistic regression classifier is more like a […]