3 Easy Tricks to Create New Columns in Python Pandas
In data processing and cleaning, we often need to create new columns based on values in existing columns. In this blog, I explain how to create new columns derived from existing columns with 3 simple methods. · Use a lambda function with the apply() method · Use the numpy.select() method... Read more
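As a rough illustration of the two methods named in the teaser above, here is a minimal sketch. The DataFrame, its column names (`name`, `score`, `passed`, `band`), and the thresholds are all hypothetical and are not taken from the linked article.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; not from the linked article.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "score": [85, 62, 40]})

# Method 1: apply() with a lambda derives a new column value by value.
df["passed"] = df["score"].apply(lambda s: "yes" if s >= 50 else "no")

# Method 2: numpy.select() picks the choice for the first condition that is
# True in each row, and falls back to `default` when none match.
conditions = [df["score"] >= 80, df["score"] >= 50]
choices = ["high", "medium"]
df["band"] = np.select(conditions, choices, default="low")

print(df)
```

apply() with a lambda is convenient for a single simple rule, while numpy.select() tends to read better once there are several ordered conditions.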
Tips and Tricks in RStudio and R Markdown
If you have the chance to work with an experienced programmer, you may be amazed by how fast she can write code. In this article, I share some tips and shortcuts you can use in RStudio and R Markdown to speed up the writing of your... Read more
Is Groovy a Viable Language for Data Science Applications? 5 Pros and Cons
Choosing the right programming language can make a remarkable difference in data science applications. While the industry standards are Python and R, some data scientists have branched off to use others they prefer. One such possible alternative is the Groovy programming language. Apache Groovy is an... Read more
Data science teams are multidisciplinary, and each has its own skills and technologies of choice. Some use SAS; others may have analytical assets already built in Python or R. Let’s just say each team is unique. As part of our Continuous Integration/Continuous Delivery with monthly releases,... Read more
How to Import an Excel File in RStudio
As we have seen in this article on how to install R and RStudio, R is useful for many kinds of computational tasks and statistical analyses. However, it would not be so powerful and useful without the possibility to import datasets into R. As you will most... Read more
6 Trending Python Machine Learning Packages on PyPI
Python is the most popular programming language for data science, and its packages, frameworks, and libraries are pulled by the millions each month. Month-to-month, Python packages reflect growing trends in the field of data science; as NLP is talked about more often, so will we see more packages... Read more
Systems built with software can be fragile. While the software is highly predictable, the runtime context can provide unexpected inputs and situations. Devices fail, networks are unreliable, mere anarchy is loosed on our application. We need to have a way to work around the spectrum of... Read more
5 Reasons to Learn Python in 2021
You never hear about data science without hearing about Python as well, and for good reason as it’s the most common language for data scientists. In fact, 69% of data scientists report using Python, compared to 24% using R. This doesn’t mean Python is superior in... Read more
Develop and Deploy a Machine Learning Pipeline in 45 Minutes with Ploomber
It’s standard industry practice to prototype Machine Learning pipelines in Jupyter notebooks, refactor them into Python modules and then deploy using production tools such as Airflow or Kubernetes. However, this process slows down development as it requires significant changes to the code. Ploomber enables a leaner... Read more
Decoupling Complex Systems with Event Driven Python Programming
We often think about events as ordered points in time that happen one after another, often with some kind of cause-effect relationship. But, in programming, events are often understood a bit differently. They are not necessarily “things that happen.” Events in programming are more often understood... Read more
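To give a feel for the decoupling idea in the title above, here is a minimal publish/subscribe sketch. The `EventBus` class, the `order_placed` event name, and the handlers are hypothetical illustrations and are not taken from the linked article.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical in-process event bus: publishers and subscribers only share event names."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Handler) -> None:
        # Register a handler to be called whenever event_name is published.
        self._handlers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        # Call every handler registered for this event, in subscription order.
        for handler in self._handlers.get(event_name, []):
            handler(payload)


bus = EventBus()
log: List[str] = []

# Two independent subscribers; neither knows about the other or the publisher.
bus.subscribe("order_placed", lambda p: log.append(f"email sent for order {p['id']}"))
bus.subscribe("order_placed", lambda p: log.append(f"stock reserved for order {p['id']}"))

bus.publish("order_placed", {"id": 7})
print(log)
```

The publisher only emits a named event, so new subscribers can be added without changing the code that publishes it.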