What are Few-Shot, Zero-Shot & One-Shot

王白水 · November 28, 2023, 7:26am

Recommended Terms

Original Link

https://zhuanlan.zhihu.com/p/624793654

Main Text

(Experts, please point out any significant inaccuracies.)

First, let’s explain one-shot. Say the company’s access control system uses facial recognition. You provide only one photo, and the system can then recognize you from different angles; this is one-shot. One-shot can loosely be understood as adapting a model with just one piece of data. In facial recognition scenarios, one-shot is very common.
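To make the one-photo idea concrete, here is a minimal sketch. It assumes (this is not stated in the post) that the system compares embedding vectors from some pre-trained face encoder rather than retraining anything. The vectors, the `is_same_person` helper, and the 0.8 threshold are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_same_person(enrolled, probe, threshold=0.8):
    """Hypothetical check: accept the probe if it is close to the single enrolled embedding."""
    return cosine_similarity(enrolled, probe) >= threshold

# Toy embeddings (invented numbers, standing in for a face encoder's output).
enrolled = [0.9, 0.1, 0.4]               # the one photo you provided
same_person_new_angle = [0.8, 0.2, 0.5]  # same face, different angle
stranger = [0.1, 0.9, -0.3]              # someone else

print(is_same_person(enrolled, same_person_new_angle))
print(is_same_person(enrolled, stranger))
```

The point of the sketch is only that a single enrolled example can be enough when the encoder already produces comparable features.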

Zero-shot and few-shot bring us back to the NLP scenario. Training a GPT model on Wikipedia, news, and so on, then using it directly for dialogue tasks, is zero-shot. Then, on finding that it generated quite a lot of nonsense, people labeled a small amount of high-quality data and fed it in; this is few-shot.
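In the GPT-3 paper's usage, the difference comes down to how many worked examples are placed in the prompt. A rough sketch follows; the `build_prompt` helper and the "Input:/Output:" layout are just one assumed convention, not a fixed format.

```python
def build_prompt(task, query, examples=()):
    """Assemble a prompt: task description, optional worked examples, then the query.

    examples is a sequence of (input_text, expected_output) pairs:
    zero items -> zero-shot, one item -> one-shot, a handful -> few-shot.
    """
    parts = [task]
    for text, label in examples:
        parts.append(f"Input: {text}\nOutput: {label}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

task = "Classify the sentiment of the input as positive or negative."
labeled = [
    ("The food was great.", "positive"),
    ("I waited an hour and left.", "negative"),
    ("Would come again.", "positive"),
]

zero_shot = build_prompt(task, "The service was slow.")
one_shot = build_prompt(task, "The service was slow.", labeled[:1])
few_shot = build_prompt(task, "The service was slow.", labeled)

print(few_shot)
```

The model's weights are the same in all three cases; only the prompt changes.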

ChatGPT’s development went from zero-shot to few-shot. (Excerpted from Mu Shen’s paper-reading series.)

Background: before GPT-3, GPT and BERT were competitors pursuing two different routes.
GPT-2 is zero-shot. Its performance did not surpass BERT, but the authors still wanted to publish, so they made zero-shot the selling point (a methodological novelty): the model is applied to tasks with no task-specific supervised training. The paper title: Language Models are Unsupervised Multitask Learners.
GPT-3 is few-shot. Its performance beat BERT, so there was no longer any need to hunt for an academic selling point. Zero-shot is also not very cost-effective for products, so it switched to few-shot, meaning a small number of labeled examples are supplied (in the GPT-3 paper they are placed in the prompt, with no weight updates). The paper title: Language Models are Few-Shot Learners.
ChatGPT is RLHF. After GPT-3, the question became: what exactly should the shots in few-shot be (which data to label)? The answer was to bring in reinforcement learning, namely reinforcement learning from human feedback, commonly abbreviated RLHF. This is the core technology behind ChatGPT.
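For a feel of how human feedback turns into a training signal, here is a small sketch of the pairwise preference loss commonly used to train a reward model. It assumes labelers picked the better of two answers, and the reward scores below are made-up numbers, not real model outputs.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(r_chosen - r_rejected)).

    Small when the reward model scores the human-preferred answer higher,
    large when it scores the rejected answer higher.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) is the same as log(1 + exp(-d))
    return math.log1p(math.exp(-diff))

# Hypothetical reward scores for two answers to the same prompt.
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_human = preference_loss(reward_chosen=0.0, reward_rejected=2.0)

print(agrees_with_human < disagrees_with_human)
```

The trained reward model then scores the language model's outputs during the reinforcement learning stage.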

The essence of RLHF is aligning the machine’s knowledge with human knowledge. This opened up a new research direction called alignment, which many major players, including OpenAI, are now pursuing.

Note: The “alignment” here is completely different from the alignment in facial recognition.
