What is TorchRec?
An open source library for building state-of-the-art recommendation systems in PyTorch
Meta CEO Mark Zuckerberg introduced TorchRec, an open source library for building state-of-the-art recommendation systems in PyTorch, at the Inside the Lab event.
PyTorch is an open source machine learning framework based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab.
https://github.com/pytorch/torchrec
Pytorch introduces TorchRec, an open source library to build recommendation systems
MarkTechPost somehow neglected to mention that it is a FAIR product. Recommendation Systems (RecSys) are a big part of today’s production-ready AI. In contrast to domains like Vision and NLP, most of RecSys’ continuous discovery and development takes place behind closed doors. The field is far from democratized for academic researchers exploring these approaches or creating individualized user experiences.
To understand TorchRec, it’s important to understand what PyTorch is.
PyTorch is an open source deep learning framework built to be flexible and modular for research, with the stability and support needed for production deployment. PyTorch provides a Python package for high-level features like tensor computation (like NumPy) with strong GPU acceleration and TorchScript for an easy transition between eager mode and graph mode. With the latest release of PyTorch, the framework provides graph-based execution, distributed training, mobile deployment, and quantization.
RecSys as a field is also defined by learning models over sparse and/or sequential events, which has a lot of overlap with other AI fields. Many of the approaches, particularly those for scalability and distributed execution, are portable.
TorchRec is a new PyTorch domain library for Recommendation Systems.
This library includes standard sparsity and parallelism primitives, allowing researchers to build and deploy cutting-edge personalization models.
Recommendation Systems (RecSys) comprise a large footprint of production-deployed AI today, but you might not know it from looking at GitHub.
By mid-2020, the PyTorch team had received a lot of feedback that there was no large-scale, production-quality recommender systems package in the open-source PyTorch ecosystem.
Meta wanted to contribute its production RecSys stack as a PyTorch domain library, with a strong commitment to growing an ecosystem around it. This seemed like a good idea that would benefit researchers and companies across the RecSys domain. So, starting from Meta’s stack, the team began modularizing it and designing a fully scalable codebase that is adaptable to diverse recommendation use cases.
What does TorchRec include?
In particular, the library includes:
Modeling primitives, such as embedding bags and jagged tensors, that enable easy authoring of large, performant multi-device/multi-node models using hybrid data-parallelism and model-parallelism.
Optimized RecSys kernels powered by FBGEMM, including support for sparse and quantized operations.
A sharder which can partition embedding tables with a variety of different strategies including data-parallel, table-wise, row-wise, table-wise-row-wise, and column-wise sharding.
A planner which can automatically generate optimized sharding plans for models.
Pipelining to overlap dataloading device transfer (copy to GPU), inter-device communications (input_dist), and computation (forward, backward) for increased performance.
GPU inference support.
Common modules for RecSys, such as models and public datasets (Criteo & MovieLens).
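To make the sharding strategies in the list above more concrete, here is a minimal pure-Python sketch of how table-wise, row-wise, and column-wise sharding could split embedding tables across devices. It does not use the TorchRec API. The function names (table_wise, row_wise, column_wise, make_table) and the toy table sizes are hypothetical and chosen only for illustration.

```python
# Illustrative only: plain-Python stand-ins for embedding tables, not TorchRec code.
# A "table" is a list of rows; each row is one embedding vector (a list of floats).


def make_table(num_rows, dim):
    """Build a toy embedding table whose entries encode (row, column)."""
    return [[float(r * 10 + c) for c in range(dim)] for r in range(num_rows)]


def table_wise(tables, num_devices):
    """Place each whole table on one device (simple round-robin assignment)."""
    placement = {d: [] for d in range(num_devices)}
    for i, name in enumerate(sorted(tables)):
        placement[i % num_devices].append(name)
    return placement


def row_wise(table, num_devices):
    """Split one table into contiguous row blocks, one block per device."""
    rows_per_device = -(-len(table) // num_devices)  # ceiling division
    return [
        table[d * rows_per_device:(d + 1) * rows_per_device]
        for d in range(num_devices)
    ]


def column_wise(table, num_devices):
    """Split the embedding dimension so each device holds a slice of every row."""
    dim = len(table[0])
    cols_per_device = -(-dim // num_devices)  # ceiling division
    return [
        [row[d * cols_per_device:(d + 1) * cols_per_device] for row in table]
        for d in range(num_devices)
    ]


# Two hypothetical tables: 6 users and 4 items, each with 4-dimensional embeddings.
tables = {"user_id": make_table(6, 4), "item_id": make_table(4, 4)}

tw = table_wise(tables, num_devices=2)           # whole tables per device
rw = row_wise(tables["user_id"], num_devices=2)  # 3 rows per device
cw = column_wise(tables["user_id"], num_devices=2)  # 2 columns per device

print("table-wise placement:", tw)
print("row-wise shard sizes:", [len(shard) for shard in rw])
print("column-wise shard widths:", [len(shard[0]) for shard in cw])
```

In this sketch a data-parallel strategy would simply copy the full table to every device, so it is not shown. The real sharder and planner also weigh memory and communication costs, which this toy round-robin assignment ignores.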
Performance Scaling
TorchRec features a cutting-edge architecture for recommendation AI at scale, which powers some of Meta’s most complex models. It was used to train a 1.25 trillion parameter model that went live in January and a 3 trillion parameter model that will go live soon. This suggests that PyTorch can handle the most complex RecSys challenges in the industry.
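A rough back-of-the-envelope calculation shows why models of this size have to be split across many devices. The sketch below assumes 4 bytes per parameter (32-bit floats) and a hypothetical accelerator with 80 GB of memory. Neither figure comes from the announcement, and the estimate ignores optimizer state, activations, and any quantization.

```python
# Back-of-the-envelope sizing under stated assumptions; not a figure from Meta.

def min_devices(num_params, bytes_per_param, device_memory_bytes):
    """Return (total bytes for the weights, minimum devices needed to hold them)."""
    total_bytes = num_params * bytes_per_param
    devices = -(-total_bytes // device_memory_bytes)  # integer ceiling division
    return total_bytes, devices


PARAMS_1_25T = 1_250_000_000_000  # 1.25 trillion parameters (from the article)
PARAMS_3T = 3_000_000_000_000     # 3 trillion parameters (from the article)
FP32_BYTES = 4                    # assumption: 32-bit float weights
DEVICE_MEMORY = 80 * 10**9        # assumption: 80 GB per accelerator

total_a, devices_a = min_devices(PARAMS_1_25T, FP32_BYTES, DEVICE_MEMORY)
total_b, devices_b = min_devices(PARAMS_3T, FP32_BYTES, DEVICE_MEMORY)

print(f"1.25T params: {total_a / 10**12:.1f} TB of weights, at least {devices_a} devices")
print(f"3T params:    {total_b / 10**12:.1f} TB of weights, at least {devices_b} devices")
```

Under these assumptions the weights alone come to 5 TB and 12 TB, far more than a single device can hold, which is the problem that sharding embedding tables is meant to address.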
Meta Hopes to Allow TorchRec to Grow with Open-Source Community
Open-source and open-technology have universal benefits. Meta is seeding the PyTorch community with a state-of-the-art RecSys package, with the hope that many join in on building it forward, enabling new research and helping many companies. The team behind TorchRec plans to continue this program indefinitely, building up TorchRec to meet the needs of the RecSys community, to welcome new contributors, and to continue to power personalization at Meta.
Big Tech companies appear to be using concepts such as open source and “decentralization” to their own benefit, while also getting a form of free work from prospective developers who use their tools. Software engineering and software development are full of this kind of meddling from major technology companies, which is both good and bad for innovation.
Evolution of Open-Source Tools Continues in A.I.
Unfortunately, providing large-scale benchmarks using public datasets is problematic, since most open-source benchmarks are too small to demonstrate performance at scale. Still, TorchRec is certainly impressive:
The new library:
Provides common sparsity and parallelism primitives.
Enables researchers to build state-of-the-art personalisation models and deploy them in production.
Includes a scalable low-level modelling foundation alongside rich batteries-included modules.
The library includes optimised Recommendation Systems kernels that run on FBGEMM, a high-performance kernel library, as well as modelling primitives such as jagged tensors and embedding bags for creating multi-node models using model parallelism. The PyTorch team has released TorchRec after close to two years in testing.
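For readers unfamiliar with jagged tensors and embedding bags, the following pure-Python sketch shows the underlying idea: variable-length lists of ids are stored as one flat list of values plus per-example lengths, and an embedding bag looks up each id and pools the resulting vectors per example. This is not the TorchRec or PyTorch API. The names lengths_to_offsets and embedding_bag_sum and the toy data are hypothetical, and sum pooling is just one possible pooling choice.

```python
# Illustrative only: plain Python, not the TorchRec or PyTorch API.

# Three examples with a variable number of item ids each: [3, 1], [], [2, 2, 0].
# A jagged layout stores them without padding:
values = [3, 1, 2, 2, 0]  # all ids, concatenated
lengths = [2, 0, 3]       # how many ids belong to each example


def lengths_to_offsets(lengths):
    """Convert per-example lengths into start/end positions in `values`."""
    offsets = [0]
    for n in lengths:
        offsets.append(offsets[-1] + n)
    return offsets


def embedding_bag_sum(table, values, lengths):
    """Look up each id's row and sum the rows within each example ("bag")."""
    dim = len(table[0])
    offsets = lengths_to_offsets(lengths)
    pooled = []
    for start, end in zip(offsets, offsets[1:]):
        bag = [0.0] * dim  # an empty example pools to the zero vector
        for idx in values[start:end]:
            bag = [acc + x for acc, x in zip(bag, table[idx])]
        pooled.append(bag)
    return pooled


# A toy embedding table with 4 rows and 2 dimensions.
table = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]

offsets = lengths_to_offsets(lengths)
pooled = embedding_bag_sum(table, values, lengths)

print("offsets:", offsets)
print("pooled embeddings:", pooled)
```

The point of the jagged layout is that no padding is stored for short or empty examples, which matters when most features in a recommendation model are sparse.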
PyTorch is one of the major deep learning libraries, alongside TensorFlow and Keras. As such, the announcement of TorchRec is big news for the recommender-systems community.
The TorchRec announcement was made by Donny Greenberg, Colin Taylor, Dmytro Ivchenko, and Xing Liu of Meta AI in late February 2022.
If you enjoy my articles, please consider tipping, patronage and support, as I cannot continue to write without community support. I want to keep my articles free for the majority of my readers.
I’m hoping this Newsletter can help, inform and inspire someone out there.
Thanks for reading!