PyTorch has quickly been gaining steam as a leading deep learning framework, with many practitioners lately choosing it over TensorFlow as their go-to framework. Ahead of his upcoming AI+ Training session, PyTorch 101, on August 24th, we spoke with Daniel Voigt Godoy, Senior Data Scientist at Deloitte, to learn more about PyTorch, why it’s popular, and reasons to learn it.
Why PyTorch in particular? What advantages does it have over other deep learning frameworks?
First, coding in PyTorch is fun 🙂 Really, there is something to it that makes it very enjoyable to write code in. Some say it is because it is very pythonic, or maybe there is something else, who knows? Hopefully, you will feel like that too!
Second, maybe there are even some unexpected benefits to your health: check Andrej Karpathy’s tweet about it! Jokes aside, PyTorch is the fastest-growing framework for developing deep learning models, and it has a huge ecosystem. That is, there are many tools and libraries developed on top of PyTorch. It is already the preferred framework in academia and is making its way into industry.
Why did you decide to do a course on PyTorch and deep learning? Is it a topic you’re passionate about?
I used to be a Keras user until, in 2019, I was attending the Nordic Probabilistic AI School, and I had to learn PyTorch to get the best out of that course. Of course, there’s a lot of transferable knowledge from one framework to another. But I bumped into some small pitfalls myself, and I organized my findings into a blog post outlining the very first steps one could take to train a simple model using PyTorch. That blog post was (and still is) wildly successful, and shortly after publishing it, I was approached by an acquisition editor from a traditional publisher in the tech scene to write a PyTorch book for them. I was flattered, but I ended up declining it because I wasn’t sure if I was able to pull it off (300+ pages in 6 months). The idea remained in my mind, though. In 2020, in lockdown mode due to the pandemic, I decided to start writing my book and, as they say, the rest is history 🙂
If someone is new to data science, why would you recommend they learn more about deep learning?
Deep learning models are being commoditized these days. You don’t need to be a researcher to successfully train (or actually fine-tune) a model. In Computer Vision, and for the past few years in Natural Language Processing as well, you can leverage pre-trained models to satisfy your own needs. Of course, you still need to understand the general mechanics behind the whole process and develop intuition, but (and this may be a somewhat controversial opinion) you don’t need to master calculus or have a PhD anymore. So, if you need to handle unstructured data (that is, images or text), deep learning has the potential to help you a lot.
What skills/tools should someone already be familiar with before getting started with PyTorch or deep learning?
Ideally, you’d have some experience with traditional Machine Learning, so you should be familiar with loss functions and evaluation metrics for regression and classification, and the reasoning behind the train-validation-test split, to name a few. If you have knowledge of object-oriented programming (OOP), even better, but that’s not a hard requirement. Finally, you should be comfortable with the PyData stack (NumPy, Matplotlib, Pandas).
More about the AI+ Training Session
Learn the basics of building a PyTorch model using a structured, incremental, first-principles approach. Find out why PyTorch is the fastest-growing Deep Learning framework and how to make use of its capabilities: autograd, dynamic computation graph, model classes, data loaders, and more. The main goal of this training is to show you how PyTorch works: we will start with a simple and familiar example in NumPy and “torch” it! By the end, you should be able to understand PyTorch’s key components and how to assemble them into a working model.
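To give a feel for what “torching” a NumPy example might look like, here is a minimal sketch. It is not taken from the training material: the synthetic data, the function names `fit_numpy` and `fit_torch`, and the hyperparameters are all assumptions made for illustration. The first function fits a simple linear regression with hand-derived gradients, and the second does the same with PyTorch’s autograd, a built-in model class, and an optimizer.

```python
import numpy as np
import torch

# Synthetic data (assumed for illustration): y = 1 + 2x + small noise
rng = np.random.default_rng(42)
x = rng.random((100, 1))
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal((100, 1))


def fit_numpy(x, y, lr=0.1, epochs=1000):
    """Full-batch gradient descent with hand-derived gradients of the MSE loss."""
    b, w = 0.0, 0.0
    for _ in range(epochs):
        error = (b + w * x) - y
        b -= lr * 2 * error.mean()        # d(MSE)/db
        w -= lr * 2 * (x * error).mean()  # d(MSE)/dw
    return float(b), float(w)


def fit_torch(x, y, lr=0.1, epochs=1000):
    """The same training loop, with autograd computing the gradients."""
    x_t = torch.as_tensor(x, dtype=torch.float32)
    y_t = torch.as_tensor(y, dtype=torch.float32)
    model = torch.nn.Linear(1, 1)  # one weight and one bias
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        loss = loss_fn(model(x_t), y_t)
        optimizer.zero_grad()  # clear gradients from the previous step
        loss.backward()        # autograd fills in .grad for each parameter
        optimizer.step()       # gradient descent update
    return model.bias.item(), model.weight.item()


if __name__ == "__main__":
    print("NumPy   (b, w):", fit_numpy(x, y))
    print("PyTorch (b, w):", fit_torch(x, y))
```

Both versions should land close to the values used to generate the data (a bias near 1 and a weight near 2). The sketch trains on the full dataset at every step; a data loader would come into play once mini-batches are introduced.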
About Daniel Voigt Godoy
Daniel Voigt Godoy has 20+ years of experience in developing solutions, programs, and models using analytical skills across different industries: software development, government, fintech, retail, and mobility. 7+ years of experience with data processing, data analysis, machine learning, and statistical tools: Python (NumPy, SciPy, pandas, scikit-learn), Spark, RStudio, MATLAB, and Statistica. Experience in stochastic simulation and agent-based modeling. Experienced programmer in SQL, Python, Java, R, PowerBuilder, PHP. Strong programming skills and eagerness to learn different languages, frameworks, and tools. Solid background in statistics, economics, capital markets, debt management, and financial instruments.