Exploring the Deep Learning Framework PyTorch
Tags: Conferences, Deep Learning, Modeling, PyTorch, West 2018
Posted by Nathaniel Jermain, July 26, 2019
There are a variety of open-source deep learning frameworks to choose from, including Keras, TensorFlow, Caffe2, and MXNet, among others. At ODSC West in 2018, Stephanie Kim, a developer at Algorithmia, gave a great talk introducing the deep learning framework PyTorch. Primarily developed by Facebook, PyTorch provides a suite of machine learning tools in Python that offers some considerable advantages over libraries like TensorFlow.
PyTorch can be used for a range of tasks, from building neural networks to fitting decision trees, because it works alongside extensible Python libraries such as Scikit-Learn, which also makes it easy to get started. Earlier versions of PyTorch did not support rapid deployment of models across a variety of environments, but in March 2018 PyTorch added a Caffe2 backend, substantially streamlining the deployment process. Importantly, the platform has gained substantial popularity and an established community, whose support can be integral to solving usage problems.
A key feature of PyTorch is its use of dynamic computational graphs. A computation graph records the order of the computations defined by the model structure, for example the layers of a neural network. Backpropagation applies the chain rule backwards through that order to compute the gradient of the loss with respect to each weight and bias, and an optimizer then uses those gradients to move the weights and biases toward better values.
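To make the chain-rule step concrete, here is a minimal sketch using PyTorch's autograd on a single weight and bias. The input, target, and learning rate are illustrative values chosen for this example, not figures from the talk.

```python
import torch

# Leaf tensors with requires_grad=True are the parameters autograd tracks.
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(0.5, requires_grad=True)

x = torch.tensor(3.0)        # input (illustrative value)
target = torch.tensor(10.0)  # desired output (illustrative value)

# Forward pass: each operation is recorded as a node in the computation graph.
y = w * x + b                # y = 6.5
loss = (y - target) ** 2     # loss = 12.25

# Backward pass: the chain rule is applied from the loss back to w and b.
loss.backward()

# By hand: dloss/dy = 2 * (y - target) = -7, dy/dw = x = 3, dy/db = 1.
print(w.grad)  # dloss/dw = -7 * 3 = -21
print(b.grad)  # dloss/db = -7 * 1 = -7

# One gradient-descent step (the learning rate is an illustrative choice).
lr = 0.01
with torch.no_grad():
    w -= lr * w.grad
    b -= lr * b.grad
```

The gradients only say which direction reduces the loss. Choosing how far to move in that direction is the job of the update rule, shown here as plain gradient descent.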
The dynamic computational graphs employed by PyTorch allow flexibility in the order of computations, because a new graph is built for each iteration of an epoch. This allows a model to change structure during training, which can be advantageous for recurrent neural networks, for example. PyTorch then employs reverse-mode automatic differentiation, working backwards through the computational graph to calculate derivatives. Some libraries, like TensorFlow, rely on static computational graphs by default, so the model structure stays the same for every iteration. While TensorFlow is less flexible across batch iterations, the nodes of a static graph are independent of one another, which can allow for parallel computation. One can then distribute the computational workload to improve efficiency and employ both CPUs and GPUs simultaneously. Regarding GPU usage, PyTorch lets users choose the device at each stage of a model, for example running the embedding on the CPU but computing the rest of the neural network on the GPU, which reduces GPU cost.
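The sketch below illustrates both ideas on a toy recurrent model: the graph is rebuilt for every sequence, so its depth follows the sequence length, and the embedding lookup stays on the CPU while the recurrent cell runs on a GPU when one is available. The layer sizes, the `run_sequence` helper, and the random token data are hypothetical choices made for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The embedding table stays on the CPU; the recurrent cell and output
# layer are moved to the GPU when one is available.
embedding = nn.Embedding(num_embeddings=100, embedding_dim=8)
cell = nn.RNNCell(input_size=8, hidden_size=16).to(device)
head = nn.Linear(16, 1).to(device)

params = (list(embedding.parameters())
          + list(cell.parameters())
          + list(head.parameters()))
optimizer = torch.optim.SGD(params, lr=0.01)


def run_sequence(tokens):
    """Hypothetical helper: process one token sequence of any length."""
    vectors = embedding(tokens).to(device)  # look up on CPU, then move
    h = torch.zeros(1, 16, device=device)
    # Ordinary Python loop: the graph recorded for this call has one
    # cell application per token, so its depth follows the input length.
    for v in vectors:
        h = cell(v.unsqueeze(0), h)
    return head(h)


# Three random sequences of different lengths (illustrative data).
sequences = [torch.randint(0, 100, (n,)) for n in (3, 7, 5)]
target = torch.zeros(1, 1, device=device)

losses = []
for tokens in sequences:
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(run_sequence(tokens), target)
    loss.backward()   # reverse-mode autodiff over the graph just built
    optimizer.step()
    losses.append(loss.item())
    print(len(tokens), losses[-1])
```

Because the loop is plain Python, no padding or fixed unrolling length is needed here. Gradients still flow back through the `.to(device)` copy to the CPU-resident embedding.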
PyTorch's imperative programming style means the code runs exactly as the developer wrote it. This allows developers to debug quickly, because each line is executed in sequence and a failure produces an error message tied to the line that caused it. Additionally, developers can set breakpoints or print intermediate results as they go. In contrast, TensorFlow employs symbolic programming, where a model is first compiled into a function and only called at the end. While symbolic programming does not allow each line to be run and inspected in sequence, it can be more computationally efficient than imperative programming.
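As a small illustration of what this looks like in practice, the sketch below inspects an intermediate result immediately after it is computed and then triggers a deliberate shape mismatch. The layer sizes and tensors are hypothetical values chosen for this example.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 3)    # expects inputs with 4 features
good = torch.randn(2, 4)
bad = torch.randn(2, 5)    # deliberately wrong feature count

# The result is a concrete tensor as soon as the line has run, so it can
# be printed or examined in a debugger right away.
hidden = layer(good)
print(hidden.shape)        # → torch.Size([2, 3])

# The mismatch is reported by the very call that caused it.
try:
    layer(bad)
except RuntimeError as err:
    message = str(err)
    print("failed at this call:", message)
```

In a real session one would typically drop the `try`/`except` and let the traceback point at the offending line, or place a breakpoint just before it.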
While there are many tools out there for deep learning, Stephanie Kim illustrated some key advantages of using PyTorch. In particular, PyTorch offers natural support for recurrent neural networks, which generally run faster on the platform because dynamic computational graphs can handle variable-length inputs. Another key advantage is the ease of deployment through the Caffe2 backend. While there seems to be no perfect library, PyTorch offers a competitive package for machine learning in Python.