Editor’s note: Andrea and Jennifer are both speaking at ODSC West 2021. Be sure to check out their talk, “Practical Reinforcement Learning for Data Scientists,” there!
Reinforcement learning is a promising area of intense research within artificial intelligence. It addresses the problem of automatically learning to solve problems optimally over time. In other words, reinforcement learning studies how an agent can interact with its environment to learn a policy that maximizes the expected cumulative reward for a task.
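The "expected cumulative reward" mentioned above is usually computed with a discount factor that weights near-term rewards more heavily than distant ones. A minimal sketch in Python (the function name and values are illustrative, not from any particular library):

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards, each discounted by gamma per time step."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

# Three rewards of 1.0 with gamma=0.9: 1 + 0.9 + 0.81, roughly 2.71
print(discounted_return([1.0, 1.0, 1.0], gamma=0.9))
```

An agent's goal is to choose actions so that this quantity, averaged over possible futures, is as large as possible.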
As we study reinforcement learning, we become open to new ideas. We learn that a world with static inputs and static outputs rarely exists; even problems that look static are, in reality, often dynamic over time. For example, imagine you want to determine which drug is most effective at treating a cancer compared with all others. The question may seem static, but over time new drugs appear on the market and work differently for different cancers. Because reinforcement learning algorithms learn over time and adapt to new data, they are powerful tools for predicting the next-best move in many industries.
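The drug-selection scenario above can be framed as an explore-and-exploit loop: keep updating an estimate of each option's value as new outcomes arrive, usually choose the current best, but occasionally try alternatives. A minimal epsilon-greedy sketch (all names here are illustrative, not from the tutorial's code):

```python
import random

def epsilon_greedy_choice(estimates, epsilon=0.1):
    """With probability epsilon pick a random option, else the current best."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda i: estimates[i])

def update_estimate(estimates, counts, choice, outcome):
    """Incremental running average of observed outcomes for one option."""
    counts[choice] += 1
    estimates[choice] += (outcome - estimates[choice]) / counts[choice]
```

Because the estimates keep updating, a newly introduced option can overtake an old favorite as evidence accumulates, which is exactly the dynamic behavior a static analysis misses.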
Recently, reinforcement learning has experienced growth and interest due to use-case results in robotics, finance, autonomous driving, business, education, energy, and healthcare, to name a few. In the past, reinforcement learning projects faced limits of scalability. However, two frameworks, PyTorch and the distributed computing framework Ray, have made reinforcement learning much more efficient and scalable. We find that these platforms are great supplements to classical reinforcement learning libraries such as OpenAI's Gym.
Our course discusses how reinforcement learning applies to financial data modeling, prediction, and forecasting. Data scientists have an opportunity to predict market volatility using Optiver's Kaggle competition data and open-source financial data. Modeling markets is a forecasting problem to which one could apply reinforcement learning. Large financial datasets include the US Funds dataset from Yahoo Finance, among others.
Our tutorial focuses on the problem space of financial markets. During the discussion portion, we provide a gentle introduction to canonical Q-learning with an example of solving the cart-pole problem. In this problem, a pole is attached by an un-actuated joint to a cart that moves along a frictionless track. The system applies a force of +1 or -1 to the cart, and the goal is to keep the pole (acting like an inverted pendulum) from falling over. We use this simple example to demonstrate the Q-learning approach and the problem space. Next, we move on to more sophisticated reinforcement learning, focusing on a technique called Deep Q-Learning. Deep Q-Learning evolved from Q-learning, the classic reinforcement learning technique that catalogs and maximizes rewards while learning the best policy.
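To make the Q-learning idea concrete, here is a rough self-contained sketch of the tabular version on a toy five-state corridor (our own illustration, not the cart-pole environment or the tutorial's code; all names and hyperparameters are assumptions). The table entry Q(s, a) is nudged toward the observed reward plus the discounted value of the best next action:

```python
import random
from collections import defaultdict

def q_learning(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.3, n_states=5):
    """Tabular Q-learning on a corridor: start at state 0, reward 1 at the end.

    Actions are -1 (left) and +1 (right); the classic update rule is
    Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    Q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(episodes):
        state = 0
        while state < n_states - 1:  # rightmost state is terminal
            if random.random() < epsilon:
                action = random.choice([-1, 1])        # explore
            else:
                action = max([-1, 1], key=lambda a: Q[(state, a)])  # exploit
            next_state = max(0, state + action)
            reward = 1.0 if next_state == n_states - 1 else 0.0
            best_next = max(Q[(next_state, -1)], Q[(next_state, 1)])
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
    return Q

Q = q_learning()
# After training, moving right should look better than moving left,
# e.g. Q[(0, 1)] should exceed Q[(0, -1)].
```

Deep Q-Learning replaces this lookup table with a neural network that approximates Q(s, a), which is what makes the approach workable for continuous state spaces like the cart-pole's positions and velocities.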
Our ODSC West tutorial outlines how reinforcement learning applies to a finance example: predicting the next best move (i.e., the best purchases and sales of stocks) to maximize gains. The tutorial teaches both deep learning and distributed computing skills, and it provides a better understanding of reinforcement learning, its uses, and hands-on applications.
Article by Jennifer D. Davis, PhD, and Andrea Lowe, PhD, of Domino Data Lab.