Financial Data Modeling with RAPIDS
A financial dataset is challenging in many ways. The data is usually anonymized to protect customers’ privacy. Sometimes even the column names of the tabular data are encoded, which can prevent feature engineering using domain knowledge. Financial regulations and laws often require the models to be interpretable, like logistic... Read more
gQuant — GPU-Accelerated examples for Quantitative Analyst Tasks
gQuant Background: Our prior blog gave a high-level overview of examples in the gQuant repository using GPU-accelerated Python. Here we will dive more deeply into the technical details. The examples in gQuant are built on top of NVIDIA’s RAPIDS framework and feature fast data access provided by cuDF dataframes residing in high... Read more
RAPIDS 0.8: Same Community New Freedoms
RAPIDS released 0.8 a few weeks back. And afterwards, like most Americans, we took off for the 4th of July holiday. Over that break, I reflected on the purpose of RAPIDS. Speed is great, building a strong community is awesome, but the true power of RAPIDS is in the enablement... Read more
Nightly News: CI produces latest packages
“Release code early and often” is a software engineering philosophy that RAPIDS takes to heart. We try to release about every six weeks or so, partly to keep up the pace of feature development, but also so RAPIDS users don’t get stuck on older versions of our software for too long.... Read more
When Less is More: A Brief Story About Feature Engineering with XGBoost
I played a minor role launching RAPIDS on Google Dataproc by refining a model that predicts taxi fare in New York City. The geographic locations of passenger pick-ups and drop-offs were columns in the data. These are recorded as longitude and latitude measurements, with precision to many decimal places. One of the... Read more
RAPIDS cuGraph
The Data Scientist has a collection of techniques within their proverbial toolbox. Data engineering, statistical analysis, and machine learning are among the most commonly known. However, there are numerous cases where the focus of the analysis is on the relationship between data elements. In those cases, the data is... Read more
The Rise of Notebooks Extended
I recently had the privilege of presenting a workshop at the AI + Education Curiosity Conference 2019. There, I demonstrated to educators, school district staff, researchers, and students how RAPIDS software enables students to learn and iteratively practice data science using full datasets all within classroom time constraints. Compared to current methods and workarounds,... Read more
Run RAPIDS on Google Colab — For Free
Google Colab is a hosted Jupyter Notebook-like service which has long offered free access to GPU instances. Recently, Colab got even sweeter. The GPUs powering Colab were upgraded to NVIDIA's new T4 GPUs. This upgrade unlocks new software packages, which means you can now experiment with RAPIDS on Colab for free! Check out... Read more
Using RAPIDS with PyTorch
In this post we take a look at how to use cuDF, the RAPIDS dataframe library, to do some of the preprocessing steps required to get the mortgage data in a format that PyTorch can process so that we can explore the performance of deep learning on tabular data and... Read more