Unsupervised Learning: Evaluating Clusters
Modeling | Clustering | Unsupervised Learning | posted by Daniel Gutierrez, ODSC, November 23, 2018
K-means clustering is a partitioning approach to unsupervised statistical learning. It differs from agglomerative approaches such as hierarchical clustering: a partitioning approach starts with all data points and tries to divide them into a fixed number of clusters. K-means is applied to a set of quantitative...
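To make the partitioning idea concrete, here is a minimal sketch of k-means using Lloyd's algorithm, a common heuristic for it. It is an illustration rather than the code from the post: the function name `kmeans`, the `iterations` parameter, the toy points, and the choice to seed centroids with the first k points are all assumptions made to keep the example short and deterministic.

```python
import math


def kmeans(points, k, iterations=100):
    """Partition points into k clusters with Lloyd's algorithm."""
    # Assumption for this sketch: seed centroids with the first k points so
    # the run is deterministic. Practical code usually seeds randomly.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
        # Stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Two well-separated groups of 2-D points (toy data, not from the post).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
# labels -> [0, 0, 0, 1, 1, 1]
```

The number of clusters is fixed up front through `k`, which is the key difference from agglomerative methods that build clusters by merging points step by step.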
K-Means Clustering Applied to GIS Data
Tools | Clustering | Machine Learning | posted by Spencer Norris, ODSC, October 11, 2018
Here, we use k-means clustering with GIS data. GIS can be intimidating to data scientists who haven’t tried it before, especially when it comes to analytics. On its face, mapmaking seems like a huge undertaking. Plus, esoteric lingo and strange data file encodings can create a significant...
Assessment Metrics for Clustering Algorithms
Modeling | Clustering | Machine Learning | posted by Spencer Norris, ODSC, October 10, 2018
Assessing the quality of your model is one of the most important considerations when deploying any machine learning algorithm. For supervised learning problems, this is easy. There are already labels for every example, so the practitioner can test the model’s performance on a reserved evaluation set....
How to Analyze Articles About Data Science Using Data Science
Modeling | Statistics | Clustering | posted by George McIntire, ODSC, December 5, 2017
In a previous post, we demonstrated how to use the Python 3 library Newspaper to painlessly extract data from news articles. Using Newspaper, I was able to extract text from over 1,000 articles about topics including, but not limited to, Data Science, Artificial Intelligence, and Big Data....