Hugging Face Course on applying Transformers in *industrial settings* 🔥🔥🔥
Sphere and Hugging Face are partnering
Hey Guys,
In an era where Transformers are “everywhere”, how do you apply them in so-called industrial settings?
Lewis Tunstall knows a thing or two about Transformers. He’s a co-author of Natural Language Processing with Transformers (the book).
So now Hugging Face is teaming up with Sphere:
This is not a sponsored post, I just found it perhaps noteworthy for some of my audience reading this Newsletter. So here is their pitch:
Transformers are changing the AI landscape as we know it. Learn how to apply these state-of-the-art models to your businesses with the Hugging Face team that has pioneered their open-source access.
Format: 5 x 2 hr live workshops (+ recordings of each)
Dates: 8am PST; Sep 19, 21, 26, 28 and Oct 3
Price: $700 per seat (expense through L&D)
Apply with your company E-mail here:
I’m paying closer attention to Hugging Face these days. The BigScience initiative and its language model BLOOM really piqued my interest.
Hugging Face and Democratization of A.I.
There’s a lot more attention on A.I. ethics in 2022 than we’ve seen previously. Concerns about misuse of LLMs inspired Cohere, OpenAI, and AI21 Labs to publish, in June 2022, a preliminary set of best practices for the responsible development and deployment of language models.
BLOOM — an acronym for BigScience Large Open-science Open-access Multilingual Language Model — is the brainchild of BigScience, a collective of more than 1,000 volunteer researchers worldwide.
While machine learning platform Hugging Face led the project, starting in 2021, contributors included Nvidia and Microsoft, with support from the French National Centre for Scientific Research (CNRS). BLOOM was built and trained on the Jean Zay supercomputer.
Medium and Substack writer Alberto Romero has a Twitter thread about it here:
As with open-source code before, and now with machine learning and A.I. models, volunteers and Big Tech employees are building new frameworks that might improve A.I. regulation and A.I. ethics, as well as the democratization of A.I. going forward.
For the course with Sphere and Hugging Face, many learning and development programs will cover the $700 expense; just ask whether it’s relevant to your job.
Platform for Machine Learning Engineers
So Hugging Face is involved in some inclusive projects and is quickly becoming a sort of “GitHub for machine learning”. The Hugging Face model card breaks down the distribution of languages used in BLOOM’s training data, with English (30.04%), Simplified Chinese (16.2%), and French (12.9%) accounting for the greatest swaths.
In practice, Hugging Face is a community and data science platform that provides tools for building, training, and deploying ML models based on open-source (OS) code and technologies. That they are contributing courses is very promising.
The above Sphere course is designed for ML Engineers and ML Researchers.
Here is the LinkedIn post by Lewis about it:
We've teamed up with Sphere to create a course that focuses on applying Transformers in *industrial settings* 🔥🔥🔥
You'll learn how to:
📚 deal with long texts
🤖 pretrain on new domains
🏎 optimize for production
🚀 and more!
The course kicks off on September 19 and runs for 3 weeks 🤓
Register here 👉: https://lnkd.in/e-ZNeyY3
It makes you wonder where Transformers will be in a decade and how A.I. will scale in different settings.
What is a Transformer again?
A transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data. It is used primarily in the fields of natural language processing and computer vision.
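To make “self-attention” a bit more concrete, here’s a toy sketch of scaled dot-product self-attention in NumPy. This is not Hugging Face’s implementation, just an illustrative single-head example with made-up dimensions: each position in the sequence scores every other position, and the output is a relevance-weighted mix of the values.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: shift by the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X has shape (seq_len, d_model); Wq/Wk/Wv project it into
    queries, keys, and values.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise relevance between positions
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

# Tiny example: a 4-token "sequence" with random embeddings and projections.
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Each row of `weights` shows how much one token “attends” to every other token; real Transformers stack many such heads and layers.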
What is NLP again?
According to Hugging Face,
NLP is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of NLP tasks is not only to understand single words individually, but to be able to understand the context of those words.
The following is a list of common NLP tasks, with some examples of each:
Classifying whole sentences: Getting the sentiment of a review, detecting if an email is spam, determining if a sentence is grammatically correct or whether two sentences are logically related or not
Classifying each word in a sentence: Identifying the grammatical components of a sentence (noun, verb, adjective), or the named entities (person, location, organization)
Generating text content: Completing a prompt with auto-generated text, filling in the blanks in a text with masked words
Extracting an answer from a text: Given a question and a context, extracting the answer to the question based on the information provided in the context
Generating a new sentence from an input text: Translating a text into another language, summarizing a text
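Several of the tasks above are one-liners with Hugging Face’s `pipeline` API. A minimal sketch (the input sentences are my own examples; each pipeline downloads a default model on first use):

```python
from transformers import pipeline

# Classifying whole sentences: sentiment of a review.
classifier = pipeline("sentiment-analysis")
print(classifier("This course looks fantastic!"))

# Classifying each word: named entity recognition (grouped into spans).
ner = pipeline("ner", grouped_entities=True)
print(ner("Lewis Tunstall works at Hugging Face."))

# Generating text content: completing a prompt.
generator = pipeline("text-generation")
print(generator("Transformers are changing", max_length=20))
```

Question answering, translation, and summarization follow the same pattern, with task names like `"question-answering"` and `"summarization"`.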
I thought it was worthwhile to mention, thanks for reading!
As for Substack, it’s early days for our A.I. community, as few newsletters covering our topics have reached mass adoption yet.
How do you see DevOps and MLops evolving together and the data science community forming on Substack?
Thanks for reading! If you want to support the channel and allow me to continue writing newsletters, feel free to get access to more content.