Hugging Face

About us

The AI community building the future.

Website
https://huggingface.co
Industry
Software Development
Company size
51-200 employees
Type
Privately Held
Founded
2016
Specialties
machine learning, natural language processing, and deep learning


Updates

  • Hugging Face reposted this

    In the Large Language Model world, there are mostly two variants of the same model:

    * base model: completes a prompt (llama-3.2-1B)
    * instruction tuned model: completes an instruction (llama-3.2-1B-Instruct)

    BASE MODEL
    prompt: i am going to
    completion: i am going to school, to learn new things.

    INSTRUCTION TUNED (INSTRUCT) MODEL
    prompt: who are you?
    completion: who are you? I am an instruction tuned model. What do you need help with?

    As you might have noticed, both models complete a given prompt. The instruction tuned model is the base model that was later fine-tuned on an instructional dataset, which makes it a really good conversational model.

    Now that we know about the two kinds of models, let's dive into how to format the prompts for each. With the base model it is really simple: you write a piece of text and tokenize it. That is it! For chat completion, the instruction tuned models need special tokens incorporated into the prompt:

    ```
    <|begin_of_text|><|start_header_id|>system<|end_header_id|>

    <|eot_id|><|start_header_id|>user<|end_header_id|>

    Who are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
    ```

    To add to the chaos, the special tokens for different models might be (and mostly are) different. It would be very tedious to consult the model card and hand-build the correct prompt for every instruct model. To solve this problem, Matthew Carrigan from the Hugging Face team introduced the "chat template". It is a Jinja template (one per model) that takes care of formatting the prompt correctly. Each (instruct) model has a tokenizer, and each tokenizer has a chat template. All you have to do is compose a list of messages (a conversation) and apply the chat template to it. To make the example more concrete, let's apply two distinct chat templates.

    For a more detailed understanding I advise reading Matthew Carrigan's blog post on chat templates, titled "Chat Templates: An End to the Silent Performance Killer" (link in comments).

    Happy chatting!
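    As a rough sketch of what "apply the chat template" looks like in code (assuming the transformers AutoTokenizer API; meta-llama/Llama-3.2-1B-Instruct is used as an example checkpoint and is gated on the Hub):

    ```python
    from transformers import AutoTokenizer

    # The tokenizer of an instruct model ships with that model's chat template.
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

    # A conversation is just a list of {"role": ..., "content": ...} dicts.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ]

    # apply_chat_template renders the Jinja template into the exact prompt string
    # the model expects, including all of its special tokens.
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,              # return the formatted string instead of token ids
        add_generation_prompt=True,  # append the header that cues the assistant's reply
    )
    print(prompt)
    ```

    Swapping in another instruct model's tokenizer produces that model's own prompt format from the same list of messages.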

  • Hugging Face reposted this

    View profile for Aymeric Roucher

    Machine Learning Engineer @ Hugging Face 🤗 | Polytechnique - Cambridge

    📜 Old-school RNNs can actually rival fancy transformers!

    Remember good old RNNs (Recurrent Neural Networks)? Well, researchers from Mila - Quebec Artificial Intelligence Institute and Borealis AI have just shown that simplified versions of decade-old RNNs can match the performance of today's transformers.

    They took a fresh look at LSTMs (from 1997!) and GRUs (from 2014) and stripped these models down to their bare essentials, creating "minLSTM" and "minGRU". The key changes:
    ❶ Removed dependencies on previous hidden states in the gates
    ❷ Dropped the tanh that had been added to restrict output range in order to avoid vanishing gradients
    ❸ Ensured outputs are time-independent in scale (not sure I understood that well either, don't worry)

    ⚡️ As a result, you can use a "parallel scan" algorithm to train these new, minimal RNNs in parallel, taking 88% more memory but also making them 200x faster than their traditional counterparts for long sequences 🔥

    The results are mind-blowing! Performance-wise, they go toe-to-toe with Transformers or Mamba. And for language modeling, they need 2.5x fewer training steps than Transformers to reach the same performance! 🚀

    🤔 Why does this matter? By showing there are simpler models with similar performance to transformers, this challenges the narrative that we need advanced architectures for better performance!

    💬 François Chollet wrote in a tweet about this paper: "The fact that there are many recent architectures coming from different directions that roughly match Transformers is proof that architectures aren't fundamentally important in the curve-fitting paradigm (aka deep learning)." And: "Curve-fitting is about embedding a dataset on a curve. The critical factor is the dataset, not the specific hard-coded bells and whistles that constrain the curve's shape."

    It's the Bitter Lesson by Richard Sutton striking again: don't try fancy thinking architectures, just scale up your model and data!

    Read the paper 👉 https://lnkd.in/eQiV_8nZ
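    To make the simplification concrete, here is a minimal, sequential-mode sketch of the minGRU recurrence described above, written in PyTorch (layer names are my own; the paper trains the same recurrence with a parallel scan rather than this Python loop):

    ```python
    import torch
    import torch.nn as nn

    class MinGRU(nn.Module):
        """Sequential sketch of minGRU: gate and candidate state depend only on
        the current input x_t, never on the previous hidden state."""

        def __init__(self, input_dim: int, hidden_dim: int):
            super().__init__()
            self.to_z = nn.Linear(input_dim, hidden_dim)  # update gate from x_t only
            self.to_h = nn.Linear(input_dim, hidden_dim)  # candidate state from x_t only (no tanh)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, input_dim)
            batch, seq_len, _ = x.shape
            h = torch.zeros(batch, self.to_h.out_features, device=x.device)
            outputs = []
            for t in range(seq_len):
                z = torch.sigmoid(self.to_z(x[:, t]))  # update gate
                h_tilde = self.to_h(x[:, t])           # candidate hidden state
                h = (1 - z) * h + z * h_tilde          # convex mix of old and new state
                outputs.append(h)
            return torch.stack(outputs, dim=1)         # (batch, seq_len, hidden_dim)

    # Usage: run a random batch through the cell
    rnn = MinGRU(input_dim=16, hidden_dim=32)
    print(rnn(torch.randn(2, 10, 16)).shape)  # torch.Size([2, 10, 32])
    ```

    Because the gate and candidate state never look at the previous hidden state, the whole sequence of updates can be computed with a parallel scan, which is where the speed-up for long sequences comes from.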

  • Hugging Face reposted this

    View profile for Lysandre Debut

    Head of Open Source at Hugging Face

    Transformers v4.45 was just released, and it introduces a change I would not have expected: modularity in modeling files.

    Transformers has always been strict about its single-file policy: a model must be defined in a single file rather than through layers of abstraction. So, what changed, and why are we seemingly moving away from the concept that made transformers what it is today, with 250+ model architectures across many modalities?

    We are responding to an issue that affects both contributors and maintainers: contributing a model to transformers is long and tedious. It often results in PRs spanning 20+ files, with thousands of lines of code. We wanted a solution that removes that constraint from contributors, significantly easing model additions from model authors and community members.

    Still, the single-file policy is at the core of Transformers. Controversial to some due to the constraints it brings with it, we know for a fact that it enabled:
    - Researchers to experiment with and tweak the modeling files,
    - Students to go through the code without jumping from abstraction to abstraction,
    - Community members to contribute models without first needing to understand the rest of the overwhelmingly large package.

    Therefore, we've worked on "Modular Transformers," an approach to designing modeling files in a modular way while maintaining the single-file policy. Contributing a model to Transformers can now be done by subclassing other models, inheriting all their attributes, methods, and forward definitions. The tool we contribute unravels that inheritance into a single file.

    The RoBERTa "Modular" modeling file above defines the base and masked LM models. This is then unraveled into a 1700+ line single-file model definition, which can be inspected, debugged, tweaked, and adapted. The modular definition spans ~30 lines of code: only the differences are now explicit. This is particularly important in the wake of LLMs, with each released model being only slightly different in terms of architecture; most of the difference lies in the data for the pretrained checkpoints.

    While the "Modular" and "Single-file" model definitions serve different purposes, they should both result in the exact same code execution. We aim for no magic, no hidden behavior: define a code path, a property, or a method in the modular file, and you'll see it reflected in the single file.

    With this now merged, we are starting to see model contributions coming in at 215 LoC for the modular file; unraveled across the generated files, the single-file definition stands at 1300+ LoC.

    Now, please come and help us break it! It's experimental and brittle, but it should drastically lower the barrier of entry for model contribution. Come and contribute your model to make it accessible to the community at large!
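    To give a flavor of the idea, here is a hedged sketch (not the exact file shipped in the library) of a modular RoBERTa definition written by subclassing BERT's classes, so that only the differences are spelled out:

    ```python
    # modular_roberta.py -- illustrative sketch of a "modular" modeling file.
    from torch import nn
    from transformers.models.bert.configuration_bert import BertConfig
    from transformers.models.bert.modeling_bert import BertEmbeddings, BertModel


    class RobertaConfig(BertConfig):
        model_type = "roberta"


    class RobertaEmbeddings(BertEmbeddings):
        # The main difference with BERT: position embeddings use a padding index.
        def __init__(self, config):
            super().__init__(config)
            self.padding_idx = config.pad_token_id
            self.position_embeddings = nn.Embedding(
                config.max_position_embeddings, config.hidden_size, padding_idx=self.padding_idx
            )


    class RobertaModel(BertModel):
        # Attention, layers, and forward are all inherited from BertModel; the
        # conversion tool "unravels" this inheritance into a standalone file.
        def __init__(self, config):
            super().__init__(config)
            self.embeddings = RobertaEmbeddings(config)


    # Quick check that the subclassed model instantiates with a tiny config.
    config = RobertaConfig(hidden_size=64, num_hidden_layers=2, num_attention_heads=2,
                           intermediate_size=128, pad_token_id=1)
    model = RobertaModel(config)
    ```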

  • Hugging Face reposted this

    View organization page for Gradio

    🎤 Voice-Restore is now LIVE on Hugging Face! 🚀 The cutting-edge model can fix background noise, reverberations, distortions, and signal loss.

    📣 VoiceRestore uses Flow-Matching Transformers for speech recording quality restoration.
    🔊 The audio restoration app is built with Gradio 5 (we are still in beta! 😎): https://lnkd.in/g9NZpK2e
    💻 Super easy to use: built on 🤗 Transformers by Jade Choghari, integrated seamlessly with Gradio for a smooth experience!
    🔧 Build the Gradio app locally: https://lnkd.in/grbSusMV

    Kudos to the author, Stanislav Kirdey, for the release!

    With Gradio 5, Python is the language for you if you want to build highly performant apps with a slick UI. Extremely simple to start using the beta release: `pip install gradio==5.0b5`

    Docs for Gradio 5 Beta: https://lnkd.in/ghJ97rRn
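    For a sense of how small such an app can be, here is a minimal Gradio sketch of an audio-in/audio-out restoration demo; restore_audio is a hypothetical stand-in for the actual VoiceRestore inference call:

    ```python
    import gradio as gr

    def restore_audio(audio_path: str) -> str:
        # Hypothetical placeholder: the real app would run the VoiceRestore model
        # on the uploaded file and return the path of the cleaned-up audio.
        return audio_path

    demo = gr.Interface(
        fn=restore_audio,
        inputs=gr.Audio(type="filepath", label="Noisy recording"),
        outputs=gr.Audio(type="filepath", label="Restored recording"),
        title="VoiceRestore demo (sketch)",
    )

    if __name__ == "__main__":
        demo.launch()
    ```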

  • Hugging Face reposted this

    View organization page for Gradio

    Now you can take audio notes and transcribe them in real time with Whisper Turbo and Gradio 5! 🤩

    ✨ A completely open-source stack for building high-performing Python apps. Build them locally or host them publicly.

    Realtime Whisper-Large-v3 Turbo with a Gradio app on Hugging Face Spaces: https://lnkd.in/gh_tgd7W

    Kudos to Nishith Jain (@kingnish24 on X) for the brilliant Gradio app 👏
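    A simplified, non-streaming sketch of the same idea (record a clip, then transcribe it), assuming the transformers ASR pipeline and the openai/whisper-large-v3-turbo checkpoint; the actual Space streams microphone chunks and updates the transcript continuously:

    ```python
    import gradio as gr
    from transformers import pipeline

    # Load Whisper Turbo through the automatic-speech-recognition pipeline.
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3-turbo")

    def transcribe(audio_path: str) -> str:
        # Transcribe the recorded clip and return the text.
        return asr(audio_path)["text"]

    demo = gr.Interface(
        fn=transcribe,
        inputs=gr.Audio(sources=["microphone"], type="filepath"),
        outputs=gr.Textbox(label="Transcript"),
    )

    if __name__ == "__main__":
        demo.launch()
    ```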

  • Hugging Face reposted this

    View organization page for Argilla

    How do you start your text classification project on the Hugging Face Hub? David Berenstein will guide you through the journey of creating a text classifier from scratch using open source tools. 🚀

    Agenda:
    - Deploy Argilla on Hugging Face Spaces
    - Configure and create an Argilla dataset
    - Use model predictions to accelerate labeling
    - Train a SetFit model

    👇🏾 Link to the event in the comments
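    As a taste of the last agenda item, here is a hedged sketch of few-shot training with SetFit (assuming setfit>=1.0; a slice of sst2 stands in for the labels you would export from Argilla after annotation):

    ```python
    from datasets import load_dataset
    from setfit import SetFitModel, Trainer

    # Small labeled dataset; SetFit is designed to work with very few examples.
    train_ds = load_dataset("sst2", split="train[:100]").rename_column("sentence", "text")

    # Fine-tune a sentence-transformers backbone plus a classification head.
    model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
    trainer = Trainer(model=model, train_dataset=train_ds)
    trainer.train()

    # Predict labels for new texts.
    print(model.predict(["I loved this movie!", "Terrible, would not recommend."]))
    ```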

  • Hugging Face reposted this

    View profile for Philipp Schmid

    Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻♂️

    Whisper Turbo is available on Hugging Face! 🚀 Model: https://lnkd.in/ez6D5Gkf Demo: https://lnkd.in/e4rRc6nD

    View profile for Philipp Schmid

    Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻♂️

    OpenAI has released new Whisper models! 👀 Yesterday, OpenAI updated their GitHub and added a new Whisper V3 Turbo model! The Turbo model is an optimized version of large-v3 that offers 8x faster transcription speed with minimal degradation in accuracy (no benchmarks yet), at roughly half the size. ⚡️

    Coming to Hugging Face soon… 🔜
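    A hedged sketch of what using it through the standard transformers ASR pipeline looks like, assuming the checkpoint lands on the Hub as openai/whisper-large-v3-turbo; "meeting_notes.wav" is a placeholder for any local audio file:

    ```python
    from transformers import pipeline

    # Whisper V3 Turbo via the automatic-speech-recognition pipeline,
    # with chunking enabled for long-form audio.
    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3-turbo",
        chunk_length_s=30,
    )

    # Transcribe a local file and print the text.
    result = asr("meeting_notes.wav", return_timestamps=True)
    print(result["text"])
    ```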



Funding

Hugging Face: 7 total rounds
Last round: Series D