pyjanitor 0.3 Released!

A new release of pyjanitor is out!

I have added two new features:

  1. Concatenating multiple columns into a single column, with the values separated by a delimiter.
  2. Deconcatenating a single column into multiple columns, splitting on a delimiter.

Both of these tasks come up frequently in data preparation.

For example, concatenating a few columns together oftentimes lets us create a unique index based on sample properties.

On the other hand, deconcatenating a column into multiple columns can be useful when our index is used to store metadata. (This really shouldn’t be happening, but… sometimes that’s just how the world works right now…)

Here’s an example of how it works:
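As a minimal sketch of the two operations, the snippet below uses plain pandas rather than pyjanitor's own methods, so it makes no assumptions about pyjanitor's function names or signatures. The DataFrame, its column names, and the `-` delimiter are all made up for illustration.

```python
import pandas as pd

# Hypothetical sample metadata; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "strain": ["A", "A", "B"],
        "replicate": [1, 2, 1],
        "timepoint": ["0h", "6h", "0h"],
    }
)

sep = "-"  # assumed delimiter

# Concatenate: join the values of several columns into one delimited column.
# Values are cast to strings first so that numeric columns can be joined too.
columns = ["strain", "replicate", "timepoint"]
df["sample_id"] = df[columns].astype(str).agg(sep.join, axis=1)

# Deconcatenate: split the delimited column back into separate columns.
# Note that every resulting column holds strings, even for numeric inputs.
parts = df["sample_id"].str.split(sep, expand=True)
parts.columns = ["strain_2", "replicate_2", "timepoint_2"]
df = df.join(parts)

print(df)
```

The concatenated `sample_id` column is unique per row here, so it could serve as an index. Splitting only round-trips cleanly when the delimiter never appears inside the original values.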

To install pyjanitor, grab it from PyPI:

$ pip install pyjanitor

The conda-forge build will be coming soon!


 

Eric Ma

Eric is an Investigator in the Scientific Data Analysis team at the Novartis Institutes for Biomedical Research, where he solves biological problems using machine learning. He obtained his Doctor of Science (ScD) from the Department of Biological Engineering, MIT, and was an Insight Health Data Fellow in the summer of 2017. He has taught Network Analysis at a variety of data science venues, including PyCon USA, SciPy, PyData and ODSC, and has also co-developed the Python Network Analysis curriculum on DataCamp. As an open source contributor, he has made contributions to PyMC3, matplotlib and bokeh. He has also led the development of the graph visualization package nxviz and the data cleaning package pyjanitor (a Python port of the R package janitor).
