Optuna: An Automatic Hyperparameter Optimization Framework
Preferred Networks has released a beta version of an open-source, automatic hyperparameter optimization framework called Optuna. In this blog, we will introduce the motivation behind the development of Optuna as well as its features. [Related Article:... Read more
Some Details on Running xgboost
While reading Dr. Nina Zumel’s excellent note on bias in common ensemble methods, I ran the examples to see the effects she described (and I think it is very important that she is establishing the issue, prior to discussing mitigation). ... Read more
Hierarchical Bayesian Models in R
Hierarchical approaches to statistical modeling are integral to a data scientist’s skill set because hierarchical data is incredibly common. In this article, we’ll cover the advantages of employing hierarchical Bayesian models and work through an exercise building one in R. If you’re unfamiliar with Bayesian modeling, I recommend... Read more
Financial Data Modeling with RAPIDS
A financial dataset is challenging in many ways. The data is usually anonymized to protect customers’ privacy. Sometimes even the column names of the tabular data are encoded, which can prevent feature engineering using domain knowledge. Financial regulations and laws often require the models to be interpretable, like logistic... Read more
What is MLPerf?
AI might be a buzzword, but the hype is outpacing the tools for benchmarking it. Up to this point, assessing the performance of ML software has been difficult. You couldn’t just measure it objectively against other types of frameworks. Now, a collection of tech companies has released MLPerf, a consistent way... Read more
gQuant — GPU-Accelerated examples for Quantitative Analyst Tasks
gQuant Background: Our prior blog gave a high-level overview of examples in the gQuant repository using GPU-accelerated Python. Here we will dive more deeply into the technical details. The examples in gQuant are built on top of NVIDIA’s RAPIDS framework and feature fast data access provided by cuDF dataframes residing in high... Read more
RAPIDS 0.8: Same Community New Freedoms
RAPIDS released 0.8 a few weeks back. And afterwards, like most Americans, we took off for the 4th of July holiday. Over that break, I reflected on the purpose of RAPIDS. Speed is great, building a strong community is awesome, but the true power of RAPIDS is in the enablement... Read more
Factors in R
The factor is a foundational data type in R. Factors are generally used to represent categorical variables, which may be intrinsically unordered (nominal) or ordered (ordinal). While the underlying data is often character, factors can be built on numerics as well. Factor variables are stored as integers pointing to unique values of underlying... Read more
Deep Learning in R with Keras
The primary professional hat I wear is as a data science consultant working with machine learning in a variety of problem domains. Due to my academic past in computer science and applied statistics, my development environment of choice today is typically R. Lately, however, Python is taking the lead... Read more
Introduction to IBM Assistant
IBM Assistant is a chatbot service that many companies are deploying on their websites or portals. IBM Watson provides cloud services, one of which lets you build chatbots and deploy them either on a website or as a window application. In this blog, our... Read more