Supercharge Your Pandas Code with Apache Spark
Editor’s Note: Itai Yaffe and Daniel Haviv are speakers for ODSC East 2022. Be sure to check out their talk, “A bamboo of Pandas: crossing Pandas’ single-machine barrier with Apache Spark,” there! Pandas is a fast and powerful open-source data analysis and manipulation framework written in... Read more
Major Updates to the Most Popular Data Science Frameworks in 2019
This time last year we brought you a detailed report of all the important updates for popular data science (machine learning and deep learning) frameworks throughout 2018. The developers of these frameworks continue to innovate at an accelerated rate. Data scientists demand more powerful tools in... Read more
Deep Learning Frameworks You Need to Know in 2020
Deep learning networks have a mind-boggling ability to learn, so training these models requires massive computing power and intense amounts of data. You’ll need a framework to make that development easier. Deep learning requires massive processing power and lots of data. Because it uses unstructured, often non-text... Read more
Teaching pivot / un-pivot
Co-written by John Mount and Nina Zumel. Introduction: In teaching thinking in terms of coordinatized data we find the hardest operations to teach are joins and pivot. One thing we commented on is that moving data values into columns, or into a “thin” or entity/attribute/value form... Read more
Versatile Spark – Streaming
Distributed computing is the fuel for large-scale processing in modern data pipelines. Hadoop and its open-source competitors tool this system together. In recent years, rival Apache Spark gained favor due to its versatility. As preference for Apache grows, the software diversifies and its applications increase.... Read more