Why Blockchain Will Improve Your Big Data
The rise of cloud storage has helped companies collect and manage massive amounts of data. Data comes from corporate systems, Internet of Things objects and unstructured sources like online forums. New analytics tools like Hadoop help companies make sense of that data. Yet simply having data... Read more
Data Visualization – Part 1
Introduction to Data Visualization – Theory, R & ggplot2. The topic of data visualization is very popular in the data science community. The market for visualization products is valued at $4 billion and is projected to reach $7 billion by the end of 2022 according... Read more
Making sense of facts, numbers, and measurements is a form of art – the art of data visualization. There is a load of data in the sea of noise. To turn your numbers into knowledge, your job is not only to separate noise from the data, but... Read more
Why Machine Learning Is A Metaphor For Life
Seriously. Hear me out on this. The more I learn about ML, the more similarities I see between life and machine learning concepts. Specifically, let’s think about neural networks. Let’s think of a neural net that has a bunch of input nodes... Read more
Custom Level Coding in vtreat
One of the services that the R package vtreat provides is level coding (what we sometimes call impact coding): converting the levels of a categorical variable to a meaningful and concise single numeric variable, rather than coding them as indicator variables (AKA “one-hot encoding”). Level coding can be computationally and statistically preferable to... Read more
This post is the first of a two-part series in which we apply NLP techniques to analyze articles about big data, data science, and AI. If you are tired of the hassles of web scraping, then this post might be just for you. I occasionally web scrape news... Read more
Today’s Weak AI Lacks Intelligence
While Deep Learning and other ANN-based methods of machine learning have produced some amazing capabilities over the past decade, they still leave me wanting more intelligence than they can deliver. The “point neuron” used in ANNs is based on an understanding of neuroscience we had back... Read more
The number of letters in the word for each number
Just for fun, I generated these graphs of the number of letters in the word for each number. I really spent about 10 minutes on this (ok… possibly also another 40 minutes tweaking the plots): More languages!! I love how Spanish has a few super compact words: “cien mil”... Read more
Which Gender Is More Likely To Trust Artificial Intelligence
Many people are very skeptical of the government’s adoption of AI to take over management of its citizen services, but which gender is more comfortable with this decision? The answer to that question, by way of surveys, is men. More men than women feel more comfortable... Read more
Firing on All Cylinders: The 2017 Big Data Landscape, part 2
A walk through the 2017 Data Ecosystem Landscape. INFRASTRUCTURE: A lot of themes from last year have continued to play out, such as the ever-increasing importance of streaming, with Spark reigning supreme for now and interesting contenders such as Flink emerging. In addition, a few interesting themes have kept... Read more