
A Brief Overview of GPT-3 by OpenAI

 

 

You have probably already seen some articles like "A robot wrote an entire article. Aren't you scared yet, human?"

So, who is the robot here? 

 

It's the GPT-3 model, a transformer-based language model developed by OpenAI. GPT stands for Generative Pre-trained Transformer.

OpenAI had previously released GPT-2 and other models. GPT-3, released in May 2020, is more robust than its predecessors, though architecturally there is not much difference.

GPT-3 can write articles, poems, and even working code for you, given some context. There are some limitations, which I am going to explain later in this article.

Being a language model means that, given some text, it probabilistically predicts which tokens from a known vocabulary come next in that string. It's a bit like the autocomplete on a phone keyboard: we type a word, and the keyboard suggests a word that could come next. What sets GPT-3 apart from earlier models is really not its architecture but its size, that is, the number of trainable parameters.
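
To make the autocomplete analogy concrete, here is a minimal sketch of next-token prediction. GPT-3 itself is only reachable through OpenAI's API, so the sketch uses its freely downloadable predecessor GPT-2 via the Hugging Face transformers library; the prompt text is just an illustrative example:

# Minimal next-token prediction sketch using GPT-2
# (GPT-3 is API-only; requires: pip install torch transformers)
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "A robot wrote an entire"
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# Probability distribution over the vocabulary for the next token
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}  p={prob.item():.3f}")

Running this prints the five tokens the model considers most likely to follow the prompt, together with their probabilities, which is exactly the "autocomplete" behavior described above.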

 
[Figure: trainable parameter counts of GPT-3 compared with other large language models; image source: sigmoid.com]

As you can see in the image above, GPT-3 has around 175 billion trainable parameters, far more than similar models like GPT-2 or BERT.

In general, the more trainable parameters a model has, the more data you need to train it, and GPT-3 was trained on a very large dataset. Architecturally, it isn't much different from the original transformer model, except that it is much larger; it also differs somewhat from BERT, another well-known model, developed by Google.

BERT uses the encoder half of the transformer: it was designed to take in raw text and produce embeddings that can be used in other machine learning applications down the line.

In comparison, GPT-1, GPT-2, and GPT-3 use the decoder half: they take in the text so far and generate new text, one token at a time.
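
The encoder/decoder contrast can be seen directly in code. Here is a sketch, again using the publicly available BERT and GPT-2 checkpoints from Hugging Face rather than GPT-3 itself; the input sentences are arbitrary examples:

import torch
from transformers import BertModel, BertTokenizer, GPT2LMHeadModel, GPT2Tokenizer

# BERT (encoder half): raw text in, embeddings out
bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
enc = bert_tok("GPT-3 is a language model.", return_tensors="pt")
with torch.no_grad():
    embeddings = bert(**enc).last_hidden_state  # (1, seq_len, 768)
print(embeddings.shape)  # one embedding vector per input token

# GPT-2 (decoder half): text in, more text out, one token at a time
gpt_tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt = GPT2LMHeadModel.from_pretrained("gpt2")
ids = gpt_tok("GPT-3 is a language model that", return_tensors="pt").input_ids
out = gpt.generate(ids, max_new_tokens=20, do_sample=False,
                   pad_token_id=gpt_tok.eos_token_id)
print(gpt_tok.decode(out[0]))

The encoder returns a vector per input token for downstream use, while the decoder extends the input text, which matches the division of labor described above.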

One question that comes to mind a lot: does GPT-3 have some sort of intelligence? The simple answer is no.

There is nothing in GPT-3's training that leads it to build a structured system of knowledge about the world. The task it has been trained on is simply predicting the next word, so it can produce both factually correct and incorrect sentences.

The benefit of GPT-3 is that the text it produces sounds fluent and reads as if a human might have written it, because of its grammatically correct structure.

GPT-3 uses a few-shot learning approach: to produce a specific type of text, you just give the model a few examples of that type of text in the prompt, and GPT-3 will produce more text in a similar style.
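
As a sketch of what few-shot prompting looks like in practice, the prompt below shows the model two examples of a pattern (English to French) and lets it complete a third. This assumes access to the original completion-style GPT-3 API (openai Python package before v1.0, with an API key set); the engine name and translation examples are illustrative:

import openai  # assumes openai < 1.0 and OPENAI_API_KEY set in the environment

# Two examples of the pattern, then an unfinished third for the model to complete
prompt = """English: Good morning
French: Bonjour

English: Thank you very much
French: Merci beaucoup

English: Where is the library?
French:"""

response = openai.Completion.create(
    engine="davinci",   # a GPT-3 engine from the original API (illustrative)
    prompt=prompt,
    max_tokens=16,
    temperature=0.0,
    stop=["\n"],        # stop once the answer line is finished
)
print(response.choices[0].text.strip())

Note that nothing here updates the model's weights; the "learning" happens entirely from the examples placed in the prompt.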

A few drawbacks of this model: it is predominantly an English model, with about 93% of its training data in English. It is also very expensive; as mentioned earlier, it has 175 billion trainable parameters, and the estimated cost of training it was around 12 million USD. Right now, it is available only through a closed API, and because the model is so large, the chances that you will be able to train it yourself are very low.

