GPT-3 is the third-generation predictive text model from OpenAI, and it is one of the most powerful predictive text models available today. It has been trained on a large amount of data, and it can generate very realistic and coherent text. GPT-3 has many applications, including machine translation, text generation, and question answering. It is also very efficient, and can run on a standard laptop or even a smartphone. Overall, GPT-3 is a very impressive predictive text model, and it is likely to have a major impact on the field of natural language processing.
That paragraph was written entirely by GPT-3. In order to get the excerpt, I told it to “generate the first paragraph, in four sentences, for an article about GPT-3.” Like many human writers, it had a little trouble staying within the sentence limit, but you can see that this language-producing computer model has the ability to write very fluent summarizations. I’ll take over from here to tell you a bit more about why GPT-3 exists and what it can be used for.
Machine learning (ML), a subset of artificial intelligence, is a system in which a computer is fed input and then learns on its own how to process it in human-like ways. It uses algorithms to find patterns in data and gradually improves its understanding enough to make predictions or complete a task. These tasks are usually narrow and concrete. For example, one of our interns set up an ML system that was able to beat a level of Super Mario World. If asked to do anything else, it wouldn’t know what to do. But what if an AI system could understand more, like the enormous repository of information on the internet?
OpenAI has set out to build AGI—artificial general intelligence—that benefits humanity. Its GPT-2 algorithm, released in early 2019, could produce realistic and knowledgeable language, although its output was inconsistent. For an essay contest on climate change, it was able to formulate an essay not unlike what a person might write. (Blind judges said that it lacked novelty and took too long to get to the point, but they likely had no clue that it wasn’t written by a human.)
GPT-3, introduced in June 2020, is over 100 times larger than GPT-2. This puts OpenAI much closer to its goal of artificial general intelligence, and it makes this latest model a very capable natural language processing (NLP) system.
How Does GPT-3 Work?
GPT-3 uses neural networks to process data and learn the correlations within it. A neural network is essentially a system of mathematical equations that filters information. Elements of the input data pass through layers made up of nodes, producing an output with complex “understanding.” The more layers and nodes, the more potential the neural network has to capture interesting and useful relationships between the input and the output. This is perhaps easiest to understand with an example:
Relevant variables make up the input layer. For example, inputs about a particular city (x) might include employment numbers, age demographics, and weather history. When these attributes are “passed forward” through subsequent nodes, they form mathematical functions that can be trained algorithmically to gain an understanding of the input data. The output in this example (y) might be an answer to a question about the city, such as the predicted average cost of living, or, broadly extrapolating these ideas to our discussion of GPT-3, a paragraph describing the city.
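The pass-forward idea above can be sketched in a few lines of code. This is a toy illustration, not GPT-3's actual architecture: the city features and the hand-picked weights are hypothetical, and a real network would learn its weights from data rather than have them written in by hand.

```python
import math

def relu(x):
    # A common activation function: pass positive values, zero out negatives.
    return max(0.0, x)

def dense(inputs, weights, biases, activation):
    # One layer: each node takes a weighted sum of all inputs, adds a bias,
    # and applies an activation function.
    return [activation(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hypothetical inputs for a city: employment rate, median age, avg. temperature.
x = [0.95, 34.0, 18.5]

# Toy, hand-picked weights; training would adjust these to fit real data.
hidden = dense(x, [[0.2, 0.01, 0.03], [-0.1, 0.02, 0.05]], [0.1, 0.0], relu)

# Output layer: a single node predicting, say, a cost-of-living index.
y = dense(hidden, [[1.5, 2.0]], [0.5], lambda v: v)

print(y[0])
```

Stacking more layers of `dense` between `x` and `y` is what gives a deep network its extra capacity, which is the sense in which "more layers and nodes" means more potential.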
The trend in natural language processing with AI has been to create ever-larger models, and because GPT-3 is larger than its predecessors, it is more capable of carrying out natural language tasks. In particular, GPT-3 can infer the natural language task that it’s being asked to fulfill: you “talk” to the model in natural language, giving it a direction and perhaps even some examples, and it then attempts to produce the desired output.
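What “giving it a direction and perhaps even some examples” looks like in practice is just structured text. Here is a hypothetical few-shot prompt of the kind one might send to the model: an instruction, a couple of worked examples, and a final line left open for the model to complete (the model name and exact API call are omitted since they vary).

```python
# A hypothetical few-shot prompt: the model would be asked to continue
# the text after the final "French:" label.
prompt = """Translate English to French.

English: cheese
French: fromage

English: good morning
French: bonjour

English: thank you
French:"""

print(prompt)
```

Because the task description lives entirely in the prompt, the same model can be redirected to summarization, classification, or question answering just by changing this text.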
What Applications Can GPT-3 Perform?
The flagship use case of GPT-3 is generating text: it’s so flexible that it can be applied to nearly any task where the output is natural language. It’s useful for summarization, such as producing a brief overview of a meeting transcript. Other tasks that can be assigned to it include classification, writing code, and answering questions. But, as Robert Dale writes in a Cambridge University Press journal, it shouldn’t be trusted when the truth of its answers is vital (say, for medical advice). It still gets answers wrong a fair amount of the time, so for fact-based scenarios its output should be post-edited by a knowledgeable human.
Looking back at our introduction paragraph authored by GPT-3, for instance, the model makes an ambiguous-at-best claim that GPT-3 is “very efficient, and can run on a standard laptop or even a smartphone.” This is inaccurate: GPT-3 is accessed through a web service provided by OpenAI, and the model itself never actually runs on a personal device.
Ethical Use of GPT-3
With such powerful capabilities, language processing models like this should be used responsibly and with AI ethics in mind. OpenAI itself recognizes that some applications of its API “(i.e., ones that enable frictionless generation of large amounts of customizable text via arbitrary prompts) are especially susceptible to misuse.” Safeguards against this potential misuse include required human involvement, post-editing, end-user access restrictions, content limitation, and active monitoring.
As our language model itself says when prompted to “generate a rhyming poem about using GPT-3 responsibly”:
We all know GPT-3 is great
But we must use it responsibly
Or we’ll end up in a terrible state.