General Tips for Web Scraping with Python
The great majority of the projects about machine learning or data analysis I write about here on Bigish-Data have an initial step of scraping data from websites. And since I get a bunch of contact emails asking me to give them either the data I’ve scraped myself, or... Read more
The ODSC team was delighted to present the second Outstanding Data Science Project Award to ‘Pandas’ at ODSC West on November 3rd. Why ODSC gives these awards… Most data scientists/developers use an open source language, tool, software or platform daily. All of these resources... Read more
The Incredible Growth of Python
We recently explored how wealthy countries (those defined as high-income by the World Bank) tend to visit a different set of technologies than the rest of the world. Among the largest differences we saw was in the programming language Python. When we focus on high-income countries, the growth of Python... Read more
Thomas originally posted this article here at http://twiecki.github.io. We recently released PyMC3 3.1 after the first stable 3.0 release in January 2017. You can update either via pip install pymc3 or via conda install -c conda-forge pymc3. A lot is happening in PyMC3-land. One thing I am particularly proud of is the developer... Read more
2 Ways to Implement Multinomial Logistic Regression in Python
Logistic regression is one of the most popular supervised classification algorithms. It is mostly used for solving binary classification problems. A common myth holds that logistic regression is only useful for binary classification problems, which is not true. The logistic regression algorithm can also be used to solve the multi-classification... Read more
What is knyfe?
Knyfe is a Python utility for rapid exploration of datasets. Use it when you have some kind of dataset and you want to get a feel for how it is composed, run some simple tests on it, or prepare it for further processing. The great thing... Read more
Another batch of Think Stats notebooks
Getting ready to teach Data Science in the spring, I am going back through Think Stats and updating the Jupyter notebooks.  When I am done, each chapter will have a notebook that shows the examples from the book along with some small exercises, with more substantial... Read more
Regular Expression & Treemaps to Visualize Emergency Department Visits
It’s been a while since my last post on some TB WHO data. A lot has happened since then, including the opportunity to attend the Open Data Science Conference (ODSC) East held in Boston, MA. Over a two-day period I had the opportunity to listen to a... Read more
Python as a way of thinking
This article contains supporting material for this blog post at Scientific American.  The thesis of the post is that modern programming languages (like Python) are qualitatively different from the first generation (like FORTRAN and C), in ways that make them effective tools for teaching, learning, exploring, and... Read more