TensorFlow Clusters: Questions and Code
One way to think about TensorFlow is as a framework for distributed computing. I’ve suggested that TensorFlow is a distributed virtual machine. As such, it offers a lot of flexibility. TensorFlow also suggests some conventions that make writing programs for distributed computation tractable. When is there... Read more
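As a rough illustration of what "TensorFlow as a distributed virtual machine" looks like in practice, here is a minimal sketch, assuming the TensorFlow 1.x tf.train API and made-up host addresses; it is my own example, not code from the post.

```python
# My own minimal sketch (not code from the post) of a TensorFlow 1.x cluster
# definition. Host addresses, job names, and the variable are hypothetical.
import tensorflow as tf

# Describe the cluster: one parameter server and two workers.
cluster = tf.train.ClusterSpec({
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# Each process in the cluster starts a server for its own job name and task index.
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Ops can then be pinned to particular tasks with tf.device.
with tf.device("/job:ps/task:0"):
    weights = tf.Variable(tf.zeros([10]), name="weights")
```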
Technical preview: Native GPU programming with CUDAnative.jl
After 2 years of slow but steady development, we would like to announce the first preview release of native GPU programming capabilities for Julia. You can now write your CUDA kernels in Julia, albeit with some restrictions, making it possible to use Julia’s high-level language features... Read more
Faster deep learning with GPUs and Theano
Originally posted by Manojit Nandi, Data Scientist at STEALTHbits Technologies, on the Domino data science blog. Domino recently added support for GPU instances. To celebrate this release, I will show you how to: Configure the Python library Theano to use the GPU for computation. Build and... Read more
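For flavor, here is a sketch of the usual "is Theano actually on the GPU?" check; it is my own illustration rather than the post's code, and the launch command and file name are assumptions.

```python
# My own sketch (not the post's code): compile exp() over a large shared array
# and inspect which ops end up in the compiled graph. Typically launched with
# something like:
#   THEANO_FLAGS='device=gpu,floatx=float32' python check_gpu.py
import numpy
import theano
import theano.tensor as T

rng = numpy.random.RandomState(22)
vlen = 10 * 30 * 768  # enough elements to make the GPU worthwhile

x = theano.shared(numpy.asarray(rng.rand(vlen), theano.config.floatX))
f = theano.function([], T.exp(x))

f()  # runs on whichever device Theano was configured to use
print([type(node.op) for node in f.maker.fgraph.toposort()])
```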
Intro to Caret: Pre-Processing
Editor’s note: This is the third in a series of posts on the caret package, covering: Creating Dummy Variables; Zero- and Near Zero-Variance Predictors; Identifying Correlated Predictors; Linear Dependencies; The preProcess Function; Centering and Scaling; Imputation; Transforming Predictors; Putting It All Together; Class Distance Calculations. caret includes several... Read more
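caret itself is an R package; as a rough Python analogue of a few of the preprocessing steps listed above (my own illustration with made-up data, not caret or the post's code), dummy variables, imputation, and centering/scaling look roughly like this in pandas and scikit-learn.

```python
# A rough Python analogue (not caret, not the post's code) of three of the
# preprocessing steps above. The DataFrame and column names are hypothetical.
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "age": [23, 45, None, 31],
    "income": [40000, 85000, 62000, None],
    "city": ["NYC", "SF", "NYC", "LA"],
})

df = pd.get_dummies(df, columns=["city"])  # dummy variables
numeric = ["age", "income"]
df[numeric] = SimpleImputer(strategy="median").fit_transform(df[numeric])  # imputation
df[numeric] = StandardScaler().fit_transform(df[numeric])  # centering and scaling
```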
Implementing a CNN for Text Classification in TensorFlow
The full code is available on Github. In this post we will implement a model similar to Kim Yoon’s Convolutional Neural Networks for Sentence Classification. The model presented in the paper achieves good classification performance across a range of text classification tasks (like Sentiment Analysis) and... Read more
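As a taste of the architecture, here is a compressed sketch of its core in TensorFlow 1.x: embed token ids, convolve one filter size over the embedded sentence, and max-pool over time. The sizes are made up and this is my own illustration, not the post's full implementation.

```python
# My own compressed sketch (hypothetical sizes, not the post's code) of the
# single-filter-size core: embedding -> convolution -> max-pooling over time.
import tensorflow as tf

vocab_size, embed_dim = 10000, 128
seq_len, num_filters, filter_size = 56, 100, 3

input_x = tf.placeholder(tf.int32, [None, seq_len], name="input_x")

embedding = tf.Variable(tf.random_uniform([vocab_size, embed_dim], -1.0, 1.0))
embedded = tf.nn.embedding_lookup(embedding, input_x)   # [batch, seq, embed]
embedded = tf.expand_dims(embedded, -1)                 # add a channel dimension

filt = tf.Variable(tf.truncated_normal(
    [filter_size, embed_dim, 1, num_filters], stddev=0.1))
conv = tf.nn.conv2d(embedded, filt, strides=[1, 1, 1, 1], padding="VALID")
pooled = tf.nn.max_pool(
    tf.nn.relu(conv),
    ksize=[1, seq_len - filter_size + 1, 1, 1],
    strides=[1, 1, 1, 1], padding="VALID")              # [batch, 1, 1, num_filters]
features = tf.reshape(pooled, [-1, num_filters])        # fed to a softmax layer
```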
Google’s TensorFlow framework spread like wildfire upon its release. The slew of tutorials and extensions made an already robust ecosystem even more so. Recently, Google released an extension of its own: SyntaxNet, a TensorFlow-based syntactic parser for Natural Language Understanding. SyntaxNet uses neural... Read more
Jupyter, Zeppelin, Beaker: The Rise of the Notebooks
Standard software development practices for web, SaaS, and industrial environments tend to focus on maintainability, code quality, robustness, and performance. Scientific programming in data science is more concerned with exploration, experimentation, making demos, collaborating, and sharing results. It is this very need for experiments, explorations, and... Read more
Riding on Large Data with Scikit-learn
What’s a Large Data Set? A data set is said to be large when it exceeds 20% of the available RAM for a single machine. For a standard MacBook Pro with 8 GB of RAM, that corresponds to a meager 2 GB dataset, a size that is becoming... Read more
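One common way to work with data that exceeds that threshold is out-of-core learning: stream the file in chunks and update a model incrementally. The sketch below is my own assumption about the kind of technique such a post covers, not code from the article, and the file name and columns are hypothetical.

```python
# My own sketch of out-of-core learning with scikit-learn (an assumption about
# the technique, not the article's code). "big.csv" and its "label" column are
# hypothetical.
import pandas as pd
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
classes = [0, 1]  # partial_fit needs the full set of labels up front

for chunk in pd.read_csv("big.csv", chunksize=100000):
    X = chunk.drop("label", axis=1).values
    y = chunk["label"].values
    clf.partial_fit(X, y, classes=classes)  # update the model one chunk at a time
```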