There’s been a lot of talk about GPT-3 and generative AI in the news, social media, and probably from every AI practitioner or vendor whom you’ve been speaking with lately.
Everyone is super excited about the future that such AI tools hold.
But what exactly is this AI technology, and what does it mean for your business and AI problems? Let’s explore!
What is GPT-3?
GPT-3 is a large language model developed by OpenAI. It’s the successor to OpenAI’s older language model, GPT-2, which was much smaller in comparison.
So, what’s a language model? A language model is a probability distribution over sequences of words learned from data. This probability distribution can then be used to complete sentences, validate sentence correctness, validate speech recognition predictions, translate one language to another, and much more.
As you can see, language models are pretty powerful, and in concept, this is not new. Language models have been around for decades.
Here’s an example of how a language model completes a sentence:
The car is about to ______
crash => probability 0.08
stop => probability 0.92
Predicted answer: stop
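To make the idea concrete, here is a minimal sketch of a toy language model in Python. It is a simple bigram counter, not how GPT-3 works internally, and the tiny corpus and function names are made up for illustration. It only shows how next-word probabilities can be learned from data and used to complete a sentence.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
CORPUS = [
    "the car is about to stop",
    "the car is about to stop",
    "the car is about to crash",
    "the train is about to stop",
]


def train_bigram_model(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev_word, next_word in zip(words, words[1:]):
            counts[prev_word][next_word] += 1
    return counts


def next_word_probabilities(counts, prev_word):
    """Turn the raw counts for one preceding word into probabilities."""
    following = counts[prev_word]
    total = sum(following.values())
    # An unseen word has no followers, so this returns an empty dict.
    return {word: n / total for word, n in following.items()}


model = train_bigram_model(CORPUS)
probs = next_word_probabilities(model, "to")
prediction = max(probs, key=probs.get)

print(probs)
print("Predicted answer:", prediction)
```

A bigram model only looks at the single previous word. Large models like GPT-3 condition on a much longer context, but the output is the same kind of object: a probability for each candidate next word.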
Leveraging this general language model concept, GPT-3 is a gigantic language model capable of generating sequences of words, code, translations, summaries, or other types of data, starting from a source input, called the prompt.
Traditionally, language models have been trained on small datasets because it’s computationally expensive to train large language models. GPT-3, however, is trained on much of the Web, books, and Wikipedia data, which boils down to it being trained on billions of words. Further, GPT-3 is trained using a very deep and sophisticated neural network, helping it learn complex relationships between words.
This sort of training is not something we can easily replicate, as a single training run can cost millions of dollars. In fact, it reportedly cost approximately 4.6 million dollars to train GPT-3 on Tesla V100 cloud instances over 9 days. But what this level of sophistication means is that GPT-3 can answer all sorts of questions and complete sophisticated tasks with little hand-holding. You can think of GPT-3 as a super-intelligent Q&A machine.
What Can GPT-3 Do?
Some of the capabilities of GPT-3 include:
- Predicting categories on text data
- Generating relevant source code based on a description alone
- Extracting pertinent information from unstructured data to make it more structured
- Becoming your therapy chatbot
- Translating text in one language into several others that it understands
- Writing paragraphs of content from a prompt
- Rewriting article headlines
- Spelling correction
- And much, much more
The Business Benefits of GPT-3
So, what is the benefit of GPT-3 for business applications?
In short: one model that can complete multiple tasks. Years ago, we had to develop a single specialized model for every task that we were looking to solve with AI. We needed the training data, the appropriate ML algorithm, and a data scientist.
But with large language models like GPT-3, for many tasks, you can leverage this single model by briefly teaching the model with examples of what types of output to produce. For certain tasks, you don’t even need that. You can just describe the task and provide the input and GPT-3 will generate relevant output. So, almost anyone can perform the AI “development”.
For example, if you’re performing a classification task, you can prime the model with the expected categories. If you want generated content, you can tell it what type of content you’re expecting. So it essentially democratizes AI development and makes it less time-consuming.
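As a rough illustration of the two approaches, the sketch below builds a prompt that only describes the task and a prompt that primes the model with labeled examples. The prompt layout, function names, and sample reviews are assumptions made up for this example, not an official format. The code stops at building the text and does not call any model.

```python
def build_zero_shot_prompt(task_description, text):
    """Describe the task and provide the input, with no examples."""
    return f"{task_description}\n\nText: {text}\nSentiment:"


def build_few_shot_prompt(task_description, examples, text):
    """Prime the model with labeled examples before the new input."""
    lines = [task_description, ""]
    for example_text, label in examples:
        lines.append(f"Text: {example_text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item is left unlabeled for the model to complete.
    lines.append(f"Text: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)


TASK = "Classify the sentiment of each text as positive or negative."

# Hypothetical labeled examples used to prime the model.
EXAMPLES = [
    ("The delivery was fast and the product works well.", "positive"),
    ("The app keeps crashing and support never replied.", "negative"),
]

zero_shot = build_zero_shot_prompt(TASK, "I would buy this again.")
few_shot = build_few_shot_prompt(TASK, EXAMPLES, "I would buy this again.")

print(few_shot)
```

Either string would then be sent to the model as the prompt, and the text the model generates after the final "Sentiment:" is treated as the predicted category.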
Imagine developing a sentiment classifier with just 5 prompts. Is this too good to be true? The only way to know if it holds water on your data is to evaluate, evaluate, and evaluate. You can never escape evaluation, no matter how sophisticated the model, as I repeatedly discuss in my book.
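Here is a minimal sketch of what that evaluation could look like. The `keyword_classifier` function is a hypothetical stand-in for whatever calls your model, and the four labeled examples are invented. In practice you would use a held-out sample of your own data and a much larger test set.

```python
def evaluate(classify, labeled_examples):
    """Return accuracy and the list of misclassified examples."""
    if not labeled_examples:
        raise ValueError("labeled_examples must not be empty")
    correct = 0
    errors = []
    for text, expected in labeled_examples:
        predicted = classify(text)
        if predicted == expected:
            correct += 1
        else:
            errors.append((text, expected, predicted))
    return correct / len(labeled_examples), errors


def keyword_classifier(text):
    """Hypothetical stand-in for a model call, used only for this demo."""
    return "positive" if "great" in text.lower() else "negative"


# Invented test set; replace with labeled examples from your own data.
TEST_SET = [
    ("The support team was great", "positive"),
    ("Great value for the price", "positive"),
    ("It broke after two days", "negative"),
    ("I love it", "positive"),
]

accuracy, errors = evaluate(keyword_classifier, TEST_SET)
print(f"Accuracy: {accuracy:.2f}")
for text, expected, predicted in errors:
    print(f"Missed: {text!r} (expected {expected}, got {predicted})")
```

Looking at the misses, not just the accuracy number, is what tells you whether a prompt needs better examples or whether the task needs a specialized model.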
Is traditional ML going away because of GPT-3?
No. Task-specific models, smaller language models, and classical ML are not going anywhere anytime soon. GPT-3 only works on tasks that it understands well or tasks that you can make it understand (see examples below). If you have highly domain-specific tasks, you’ll still have to build specialized models that are fine-tuned solely for those tasks.
This only means that it’s going to get much easier to develop ML solutions for certain well-understood tasks. Or these models can be used to generate supplementary input for your specialized ML tasks.
What are the risks of GPT-3?
Now let’s talk about the hard stuff. While GPT-3 has great potential, we still need to consider its broader implications for your AI applications and business. Some of the risks of GPT-3 include:
- Bias propagation—As GPT-3 was predominantly trained on Web data, it has learned both the good and the bad of the Web. This means any embedded bias, errors in data, and non-factual content can easily seep into your applications.
- Potential plagiarism—Having knowledge of the entire Web (almost) also means that GPT-3 can spit out content from various sources word-for-word without attribution. So, if you see familiar content within a third-party application, don’t be surprised—it could be YOUR content. Unfortunately, you may not be able to claim plagiarism and there’s not much we can do about it as the model is already open for public usage.
- Unpredictable performance—GPT-3 is essentially a language generator that can multitask. And because it’s not fine-tuned for your application-specific task, its performance on a single “narrow” task may be unpredictable. One small glitch can result in erroneous output.
- Hallucinations—As GPT-3 computes the probability of generating meaningful output, it could very well stitch together unrelated concepts that IT thinks make sense. This could end up being nonfactual and inaccurate information. If you’re using GPT-3 to generate content, you should validate the facts produced especially on uncommon or time-dependent topics, and topics that are subject to interpretation.
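For the plagiarism risk in particular, one simple check is to look for long word sequences that generated text shares with a source you care about. The sketch below is one possible approach under that assumption, not a complete plagiarism detector. The function names and the two sample sentences are made up for illustration.

```python
import re


def word_ngrams(text, n):
    """Return the set of n-word sequences in the text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(generated, source, n=5):
    """Return the n-word sequences that appear in both texts."""
    return word_ngrams(generated, n) & word_ngrams(source, n)


source = (
    "A language model is a probability distribution over "
    "sequences of words learned from data."
)
generated = (
    "Put simply, a language model is a probability distribution "
    "over sequences of text."
)

overlap = shared_ngrams(generated, source, n=5)
print(f"{len(overlap)} shared 5-word sequences")
for ngram in sorted(overlap):
    print(" ".join(ngram))
```

A larger `n` flags only long verbatim runs, while a smaller `n` also catches common phrases, so the threshold is a judgment call for your content.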
These risks are real and people are already raising these issues in various formats.
Here are two examples of GPT-3 in action.
In this example, GPT-3 is asked to edit English sentences.
GPT-3: Sentence correction
In this example, GPT-3 is given examples of how to classify sentences, and then it classifies the final, unlabeled sentence.
GPT-3: Sentiment orientation prediction
GPT-3 Key Takeaways
- GPT-3 is a large language model that can help you complete multiple tasks with little to no supervision.
- GPT-3 cannot solve every AI problem. It’s only as good as the prompts that you feed it and the tasks that it understands. It may not work well for specialized tasks such as predicting market movements.
- While GPT-3 has great potential, it has its fair share of problems just like any ML model. Some of the potential problems include bias propagation, hallucinations, and unpredictable performance.
- Just as with any AI solution, evaluation is critical to the success of every initiative, and GPT-3 is no exception.