Craft Minimal Bug Reports
Following up on a post on supporting users in open source, this post lists some suggestions on how to ask a maintainer to help you with a problem. You don’t have to follow these suggestions. They are optional. They make it more likely that a project maintainer will spend time helping... Read more
This work is supported by Continuum Analytics, and the Data Driven Discovery Initiative from the Moore Foundation. This blogpost is about experimental software. The project may change or be abandoned without warning. You should not depend on anything within this blogpost. This week I built a small streaming library for Python. This was... Read more
What Programming Languages Are Used Most on Weekends?
Note: Cross-posted with the Stack Overflow blog. Check out the code for this analysis on Kaggle. For me, the weekends are mostly about spending time with my family, reading for leisure, and working on the open-source projects I am involved in. These weekend projects overlap with the work that I do... Read more
This blogpost is about topic modeling using data from this blog, opendatascience.com. From this, combined with the most visited articles of the year, we will generate the most popular topics of 2017. Last year, we did something similar with popular articles streamed through Twitter, using Non-Negative Matrix Factorization to determine topics, article... Read more
On Taking Things Too Seriously: Holiday Edition
For some reason Atlanta got a pretty significant amount of snow yesterday, and because of that I’ve been mostly stuck at home. When faced with that kind of time on hand, sometimes I spend too much time on things that don’t really matter all that much. Recently, I’ve been... Read more
Ripyr: Sampled Metrics on Datasets Using Python’s Asyncio
Today I’d like to introduce a little Python library I’ve toyed around with here and there for the past year or so, ripyr. Originally it was written just as an excuse to try out some newer features in modern Python: asyncio and type hinting. The whole package is type... Read more
On Machine Learning and Programming Languages
This article was co-written by Mike Innes (Julia Computing), David Barber (UCL), Tim Besard (UGent), James Bradbury (Salesforce Research), Valentin Churavy (MIT), Simon Danisch (MIT), Alan Edelman (MIT), Stefan Karpinski (Julia Computing), Jon Malmaud (MIT), Jarrett Revels (MIT), Viral Shah (Julia Computing), Pontus Stenetorp (UCL) and Deniz Yuret (Koç... Read more
Web Scraping with Python — Part Two — Library overview of requests, urllib2, BeautifulSoup, lxml, Scrapy, and more!
Welcome to part 2 of the Big-Ish Data general web scraping writeups! I wrote the first one a little bit ago, got some good feedback, and figured I should take some time to go through some of the many Python libraries that you can use for scraping, talk about them... Read more
General Tips for Web Scraping with Python
The great majority of the projects about machine learning or data analysis I write about here on Bigish-Data have an initial step of scraping data from websites. And since I get a bunch of contact emails asking me to give them either the data I’ve scraped myself, or help with getting... Read more
The ODSC team was delighted to present the second Outstanding Data Science Project Award to ‘Pandas’ at ODSC West on November 3rd. Why ODSC gives these awards… Most data scientists/developers use an open source language, tool, software or platform daily. All of these resources are available because their... Read more