OpenAI GPT-3 in RPA

Nagaraj Vaidya
7 min read · Aug 22, 2020


GPT-3 is the largest language model ever created and is capable of generating text that is indistinguishable from human text in many cases.

GPT-3 comes from a company called OpenAI. OpenAI was founded by Elon Musk and Sam Altman. Elon Musk is a prolific entrepreneur you may know from SpaceX and Tesla, and Sam Altman was one of the founders of Y Combinator, a very famous start-up accelerator. Combined, they invested over a billion dollars in OpenAI to advance the state of the art of artificial intelligence.

What is a language model?

It’s a probability distribution over the next word in a sentence, given the previous word(s). It is used to compute the probability of a piece of text in the language, to predict the next word in a sentence, or to generate a whole new sentence that is likely in the language.
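To make that concrete, here is a minimal bigram sketch in Python. It is a toy illustration only (GPT-3 is a far larger neural network, not a count table), but it shows the same idea: estimate the distribution of the next word given the previous one.

```python
# Toy bigram language model: P(next word | previous word) from a tiny corpus.
from collections import Counter, defaultdict

corpus = "the robot reads the invoice and the robot files the invoice".split()

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_distribution(prev):
    """Return the estimated distribution of the next word after `prev`."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_distribution("the"))
# {'robot': 0.5, 'invoice': 0.5} -- the most likely continuations of "the"
```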

The GPT-3 model has 175 billion parameters. To put that figure into perspective, its predecessor GPT-2, which was considered state of the art and shockingly massive when it was released last year, had 1.5 billion parameters. That was soon eclipsed by Nvidia’s Megatron with 8 billion parameters, followed by Microsoft’s Turing NLG with 17 billion parameters. Now OpenAI turns the tables by releasing a model that is 10 times larger than Turing NLG.

Recent language models comparison

GPT-3 is largely being recognized for its language capabilities. When properly primed by a human, it can write creative fiction. Researchers say that GPT-3 samples are not just close to human level; in fact they are creative, witty, deep, meta and often beautiful. They demonstrate an ability to handle abstractions such as style parodies, write poems, and more. They also said that chatting with GPT-3 feels very similar to chatting with a human.

Some use-cases which have taken Twitter by storm

The first example involves an application called Figma: a developer has built a Figma plug-in that lets users enter a description of the application they want. In this example, a user describes an application with a feed, some icons and a user interface, and when they hit submit, the plug-in actually generates an application that looks very much like Instagram. It has all of the buttons, all the UI, and a scrolling feed. Just astounding in terms of what it’s able to do.

The next example is a coding example from Debuild.co, where a basic to-do list application is described. The software then generates a React application: all of the functions and procedures needed to create a React component that handles to-dos. You can add items to the to-do list, and all the events and triggers for handling the interactions of the application are generated on the fly by artificial intelligence. Just amazing what we’re able to do with AI on both the design and development sides.

This example is within Wikipedia. If you’ve ever faced a long article where you can’t really find the information you need, you can simply use a plugin where you ask the exact question you want answered. It will summarize the data, making it really easy for you to find exactly what you want from large volumes of text and information. Because of the vast amounts of information within GPT-3, you can ask it almost anything. In fact, you can almost use it like a search engine: you ask it a question, and it can give you one result along with a citation of exactly where to find that answer.

Apart from all these, there were many videos on social media where the language model was used to generate SQL queries from plain-English requirements, to complete a fictional story from only its first few lines (with output that reads as if written by a human), and to translate between various languages.
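As a rough illustration of how such demos drive GPT-3, here is a hedged Python sketch using the openai package’s Completion endpoint as it was offered in the 2020 beta. The engine name, prompt wording and example requirement are my own assumptions, not taken from any particular demo.

```python
# Sketch: ask GPT-3 for a SQL query from a plain-English requirement,
# using the 2020-era openai Completion API (engine name assumed).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "Translate the requirement into a SQL query.\n\n"
    "Requirement: list all customers from Berlin\n"
    "SQL: SELECT * FROM customers WHERE city = 'Berlin';\n\n"
    "Requirement: total invoice amount per vendor in 2020\n"
    "SQL:"
)

response = openai.Completion.create(
    engine="davinci",   # assumed GPT-3 base engine name
    prompt=prompt,
    max_tokens=64,
    temperature=0,      # keep the output deterministic for code-like text
    stop=["\n\n"],      # stop once the generated query ends
)

print(response.choices[0].text.strip())
```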

How does it work?

Current state-of-the-art NLP systems struggle to generalize across different tasks: they need to be fine-tuned on data sets of thousands of examples, while humans only need to see a few examples to perform a new language task. This was the goal behind GPT-3: to improve the task-agnostic character of language models. The model is pre-trained and then never touched again. Specifically, OpenAI trained GPT-3 on a data set of about half a trillion words, giving it 175 billion parameters, which is 10 times more than any previous non-sparse language model (Microsoft’s Turing NLG). There is no further fine-tuning of the model, only few-shot demonstrations specified purely via text interaction with it. Take translation, with an English sentence and its French translation: the few-shot setting works by giving k examples of context and completion, and then one final example of context, with the model expected to provide the completion without any change to its parameters. The model sometimes even reaches competitiveness with prior state-of-the-art approaches that are directly fine-tuned on the specific task. In short, it works great because its memory pretty much contains all the text ever published by humans on the internet.
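Here is a tiny sketch of that few-shot format, using the English/French pairs shown as demonstrations in the GPT-3 paper. The model only ever sees this block of text; no parameters are changed.

```python
# Build a few-shot prompt: k context/completion pairs, then one final context
# that GPT-3 is expected to complete on its own.
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
    ("peppermint", "menthe poivrée"),
]

prompt = "Translate English to French.\n\n"
for english, french in examples:      # the k demonstrations
    prompt += f"{english} => {french}\n"
prompt += "plush giraffe =>"          # final context; the model supplies the completion

print(prompt)
```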

At its core, GPT-3 is an extremely sophisticated text predictor. A human gives it a chunk of text as input, and the model generates its best guess as to what the next chunk of text should be. It can then repeat this process, taking the original input together with the newly generated chunk, treating that as a new input and generating a subsequent chunk, until it reaches a length limit. But how does GPT-3 go about generating these predictions? The answer is that it has ingested effectively all of the text available on the internet; the output it generates is language that it calculates to be a statistically plausible response to the input it is given, based on everything that humans have previously published online. Amazingly rich and nuanced insights can be extracted from the patterns latent in massive data sets, far beyond what the human mind can recognize on its own. This is the first premise of modern machine learning: having trained on a data set of half a trillion words, GPT-3 is able to identify, and dazzlingly riff on, the linguistic patterns contained therein.
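The repeat-until-length-limit loop itself is simple. A rough Python sketch could look like the following, with a toy stand-in for the model (obviously an assumption; in reality each call would go to GPT-3):

```python
# Feedback loop: append each predicted chunk and feed the whole text back in.
_canned = iter(["robot ", "process ", "automation ", ""])

def predict_next_chunk(text: str) -> str:
    """Toy stand-in for GPT-3's 'best guess at the next chunk of text'."""
    return next(_canned, "")

def generate(prompt: str, max_length: int = 200) -> str:
    text = prompt
    while len(text) < max_length:
        chunk = predict_next_chunk(text)  # model guesses the next chunk...
        if not chunk:                     # ...until it produces nothing
            break
        text += chunk                     # append and feed the whole thing back in
    return text

print(generate("GPT-3 predicts one chunk at a time: "))
```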

Does it have a solution for everything?

GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, the OpenAI researchers also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.

It also lacks the ability to reason abstractly; basically, it lacks true common sense. For example, consider the following exchange with GPT-3 that a person posted on their Twitter wall.

Snapshot of exchange with GPT-3

Can we use it in RPA?

With the existing architecture of RPA, or with the existing UI-driven way of working that most RPA vendors provide, we can straight away tell that we cannot use the GPT-3 language model in our RPA CoE as-is. One simple reason is that, in most RPA tools, the way of generating the code is simple drag-and-drop. As GPT-3 is essentially a text generator, we would obviously need a plugin developed specifically for each vendor, and even if such a plugin were developed to generate code for each RPA product, it might have to compromise on the user-friendly drag-and-drop feature.

But I think that, using this GPT-3 language model, new players may in future enter the RPA field without drag-and-drop functionality but with much more user-friendly interfaces: something like entering text that captures the essence of the desired output, with a GPT-3-backed generator dishing out the automation code.

I think it could also be consumed to build a tool that generates Python code, VBScript code, or at least SQL queries from requirements written in simple English. This would reduce lead time during the development phase of automating a business process, because in current RPA tools these scripts play a critical role in data massaging that cannot be addressed easily with just the built-in commands of RPA products. The challenge here is that the CoE Project Manager has to be futuristic enough to invest time and resources in this, and open to trying out new initiatives without worrying too much about the end results. Sometimes we may not get the desired results; the team may not succeed in building a program that auto-generates code using the GPT-3 model in one go, but they will certainly take useful lessons away from the learning and development process.
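As a speculative sketch of what such a try-out might look like, the helper below wraps the same 2020-era Completion endpoint to turn a plain-English data-massaging requirement into a draft Python snippet. The function name, prompt and example requirement are hypothetical, and the draft is meant for developer review before it goes anywhere near a production workflow.

```python
# Hypothetical CoE helper: plain-English requirement in, draft script out.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def generate_data_massaging_script(requirement: str) -> str:
    prompt = (
        "Write a short Python snippet for the following data-massaging task.\n\n"
        f"Task: {requirement}\nPython:"
    )
    response = openai.Completion.create(
        engine="davinci",   # assumed engine name
        prompt=prompt,
        max_tokens=150,
        temperature=0,
    )
    return response.choices[0].text.strip()

# The developer reviews the draft before using it in the RPA workflow.
draft = generate_data_massaging_script(
    "trim whitespace and title-case every name in the 'customer' column of a CSV"
)
print(draft)
```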

Job security of a programmer?

What seems to elude a lot of people is that when they use something like GPT-3 or a no-code solution to build something, they think they’re not coding, when in reality they are. They’re just using a different interface to give the computer instructions, compared to, say, a regular programming language. A programming language is just one tool in a software engineer’s tool belt. It’s like how an artist uses a brush to paint a picture. Personally, I think programmers are going to exist for the foreseeable future, but what those programmers use to give computers instructions may change over time. Right now we are using programming languages and typing stuff in IDEs, but 20 years from now maybe we’ll be using a graphical user interface for everything.

Thanks for reading! Follow for more articles on RPA :) Cheers!!!

