Hey guys,
In 2022, PyTorch has become noticeably more dominant than TensorFlow.
Both TensorFlow and PyTorch provide useful abstractions that reduce boilerplate code and speed up model development.
The main difference between them is that PyTorch feels more “pythonic,” with a consistent object-oriented approach, while TensorFlow offers several API styles from which you may choose.
But is TensorFlow now considered bloated and inferior?
Meta AI > Google AI ?
A recent article by Insider (paywall) goes into some depth about how Meta has, in a sense, beaten Google, with PyTorch becoming the dominant framework.
The topic is interesting to me since I’ve been seeing signs of this elsewhere in the news cycle lately.
PyTorch is an open source machine learning framework based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Meta AI. It is free and open-source software released under the Modified BSD license.
PyTorch vs. Tensorflow Overview
PyTorch and TensorFlow are far and away the two most popular deep learning frameworks today, and comparing them is an age-old debate. I’m no expert, but which framework is superior is a longstanding point of contention, with each camp having its share of fervent supporters.
Both PyTorch and TensorFlow have developed so quickly over their relatively short lifetimes that the debate landscape is ever-evolving; the conversation seems to change month by month. Outdated or incomplete information is abundant, and it further obscures the complex discussion of which framework has the upper hand in a given domain.
TensorFlow has a reputation as an industry-focused framework (Google), and PyTorch as a research-focused one (Meta AI), but these notions stem partly from outdated information, which is good to realize.
I’m not an AI researcher, but PyTorch and TensorFlow alike have unique development stories and complicated design-decision histories.
According to blog analyses on the topic, the PyTorch vs. TensorFlow debate currently comes down to three practical considerations:
Model Availability: With the domain of Deep Learning expanding every year and models becoming bigger in turn, training State-of-the-Art (SOTA) models from scratch is simply not feasible anymore. There are fortunately many SOTA models publicly available, and it is important to utilize them where possible.
Deployment Infrastructure: Training well-performing models is pointless if they can’t be put to use. Lowering time-to-deploy is paramount, especially with the growing popularity of microservice business models, and efficient deployment has the potential to make or break many businesses that center on Machine Learning.
Ecosystems: No longer is Deep Learning relegated to specific use cases in highly controlled environments. AI is injecting new power into a litany of industries, so a framework that sits within a larger ecosystem which facilitates development for mobile, local, and server applications is important. Also, the advent of specialized Machine Learning hardware, such as Google’s Edge TPU, means that successful practitioners need to work with a framework that can integrate well with this hardware.
PyTorch (from Meta AI)
PyTorch is an open source deep learning framework built to be flexible and modular for research, with the stability and support needed for production deployment. PyTorch provides a Python package for high-level features like tensor computation (like NumPy) with strong GPU acceleration and TorchScript for an easy transition between eager mode and graph mode. With the latest release of PyTorch, the framework provides graph-based execution, distributed training, mobile deployment, and quantization.
Dynamic Neural Networks
While static graphs are great for production deployment, the research process involved in developing the next great algorithm is truly dynamic. PyTorch uses a technique called reverse-mode auto-differentiation, which allows developers to modify network behavior arbitrarily with zero lag or overhead, speeding up research iterations.
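To make the idea above concrete, here is a toy sketch of reverse-mode auto-differentiation in plain Python. This is not PyTorch’s actual implementation (which works on tensors with compiled kernels), just a minimal scalar version showing why the graph is “dynamic”: each operation records itself as it runs, so gradients can flow backward through whatever graph the Python code happened to build on that forward pass.

```python
# Toy reverse-mode auto-differentiation: the concept behind PyTorch's
# autograd, reduced to scalars. Each operation records how to push
# gradients back to its inputs; backward() replays the tape in reverse.

class Value:
    """A scalar that tracks the operations applied to it."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents   # nodes this value was computed from
        self._grad_fn = None      # propagates this node's grad to parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad    # d(a+b)/da = 1
            other.grad += out.grad   # d(a+b)/db = 1
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically order the recorded graph, then apply the
        # chain rule from the output back toward the leaves.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()

# The graph is rebuilt on every forward pass, so Python control flow
# (loops, branches) can change it between iterations.
x = Value(3.0)
y = Value(4.0)
z = x * y + x        # z = x*y + x, so dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

Because nothing is compiled ahead of time, modifying the network between iterations is as cheap as editing the Python code that builds it, which is exactly the property researchers value.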
Model Availability
In the arena of model availability, PyTorch and TensorFlow diverge sharply. Both PyTorch and TensorFlow have their own official model repositories.
PyTorch Likely to Become Dominant in Research and Industry
For now, PyTorch is still the "research" framework and TensorFlow is still the "industry" framework, according to the reputation.
The majority of all papers on Papers with Code use PyTorch. I think what we are seeing in 2022 is that PyTorch is becoming the dominant one in industry as well.
These frameworks simplify the process of building accurate, large-scale, complex deep learning models. So what happens if PyTorch becomes dominant over TensorFlow?
When Google is itself adopting PyTorch internally, it’s probably game over.
To many who use them, PyTorch is just the cool kid and TensorFlow is the annoying one.
Hugging Face May Have Killed TensorFlow
Think about it, guys: Hugging Face makes it possible to incorporate trained and tuned SOTA models into your pipelines in just a few lines of code.
When we compare HuggingFace model availability for PyTorch vs TensorFlow, the results are staggering. What do you think we find?
The number of models available for use exclusively in PyTorch absolutely blows the competition out of the water. Almost 85% of models are PyTorch exclusive, and even those that are not exclusive have about a 50% chance of being available in PyTorch as well. In contrast, only about 16% of all models are available for TensorFlow, with only about 8% being TensorFlow-exclusive.
Even as AI researchers leave Meta AI, you have to give them some credit.
So why did PyTorch become more dominant? As a younger framework with stronger community momentum and a more Python-friendly design, it was never going to be a fair competition:
Younger, the new kid on the block
Meta AI built more community around it
More Python Friendly
Hugging Face and Research favored it
A ton of technical reasons too
I’m not one to argue, but spotting trends is sort of my thing, so I declare PyTorch the winner of 2022.
The proof is also on Hugging Face.
The proof now is even in Google internally, haha.
The Research Favorite Will Become the Industry Favorite
For research practitioners especially, having access to models from recently-published papers is critical. Attempting to recreate new models that you want to explore in a different framework wastes valuable time, so being able to clone a repository and immediately start experimenting means that you can focus on the important work.
Given that PyTorch is the de facto research framework, I think it’s safe to say we can expect the trend we observed on Hugging Face to continue into the research community as a whole, and our intuition bears that out. I also think this dominance spreads into industry in the latter half of 2022 and into 2023. Am I wrong?
Python’s incredible popularity in machine learning also helps PyTorch become more dominant. It’s all about having the best open-source community, sponsored by Big Tech. While it’s a touch sad for Google, it’s a natural changing of the guard for deep learning.
Thanks for reading guys! If you want to support the channel, it’s like buying a cup of coffee.