Top 9 Most Essential Python Libraries For Beginners
People worldwide know Python as the most used programming language to date. Major tech companies like Google, Amazon, Meta, Instagram, and Uber use Python for various applications. From web development to machine learning projects, Python is an essential tool in a data scientist’s kit. Many understand... Read more
What to Consider When Building Data Pipelines
In 2021 we watched Fivetran raise $565 million, Airbyte $150 million, Matillion $100 million, and Rivery $16 million, while Informatica went public. All of these companies have some piece of their business connected to data pipelines, also sometimes referred to as ETL, ELT, E(t)LT, and CDC. For... Read more
Testing Within the Shift-Left Philosophy
Traditionally, application testing was carried out during the last stages of the software development life cycle, that is, after the application had been completed and handed to the security teams. If an application did not satisfy quality standards, did not function properly, or otherwise failed... Read more
What Statistical Test Should I Do?
Being a teaching assistant in statistics for students with diverse backgrounds, I have the chance to see what is globally not well understood by students. I have realized that it is usually not a problem for students to do a specific statistical test when they are told which... Read more
How is Data Collection Used in the Justice System?
There’s no question that the world is becoming increasingly reliant on data and the criminal justice system is no exception. The justice system in the United States has used various data types and forms of data collection for years. For example, police departments, states, and the... Read more
Paving the Road to Facial Classification Accuracy
Facial classification is one of the most promising and controversial machine learning use cases. The technology has considerable potential in areas like security, but it also carries substantial privacy and bias concerns. Relying on facial recognition models that aren’t accurate can lead to severe consequences. Facial... Read more
How to Install R and RStudio
R is nothing more than a programming language. At the time of writing, it is (one of) the leading languages in statistics, although not the only programming language used by statisticians. In order to use R,... Read more
Why Is Python the Language of Choice for Data Scientists?
Python has grown to become one of the most popular and well-liked programming languages in the world, used by millions of developers since its creation in 1991. For data scientists in particular, Python has a strong, long-time base of developers. Why is Python the language of... Read more
PyCharm vs. VSCode: Which Is the Better Python IDE?
Python debuted in 1991, making it older than many of the people who use it. In the intervening years, coders have turned it into one of the most popular programming languages ever conceived. The reasons for Python’s perennial popularity come down to three major features.... Read more
Supercharge Your Pandas Code with Apache Spark
Editor’s Note: Itai Yaffe and Daniel Haviv are speakers for ODSC East 2022. Be sure to check out their talk, “A bamboo of Pandas: crossing Pandas’ single-machine barrier with Apache Spark,” there! Pandas is a fast and powerful open-source data analysis and manipulation framework written in... Read more