
What exactly is Generative AI? Creating artificial intelligence

Generative AI models can carry on conversations, answer questions, write stories, produce source code, and create images and videos of nearly any description. Here’s how generative AI works, how it’s utilised, and why it’s less powerful than you might assume.

Generative AI is a kind of artificial intelligence that creates new content, such as text, images, audio, and video, based on patterns it has learnt from previously created content. Today’s generative AI models are trained with deep learning, or deep neural networks, and they can carry on conversations, answer questions, write stories, generate source code, and create images and videos of any description, all based on brief text inputs, or “prompts.”

The term “generative AI” refers to AI that creates something that didn’t previously exist. That’s what makes it different from discriminative AI, which draws distinctions between different kinds of input. To put it another way, discriminative AI tries to answer questions like “Is this image a drawing of a rabbit or a lion?” whereas generative AI responds to prompts like “Draw me a picture of a lion and a rabbit sitting next to each other.”

This post will introduce you to generative AI and its applications with popular models such as ChatGPT and DALL-E. We’ll also look at the technology’s limits, such as why “too many fingers” has become a dead giveaway for computer-generated art.

The rise of generative AI

Generative AI has been around for years, arguably since ELIZA, a chatbot that simulated talking to a therapist, was developed at MIT in 1966. But years of work on AI and machine learning have recently paid off with the release of new generative AI systems. You’ve almost certainly heard about ChatGPT, a text-based AI chatbot that produces remarkably human-like prose. DALL-E and Stable Diffusion have also drawn plenty of attention for their ability to create vivid and realistic images based on text prompts.

The output of these systems is so uncanny that it has many people asking philosophical questions about the nature of consciousness, as well as worrying about the economic impact of generative AI on human jobs. But while all of these artificial intelligence creations are undeniably big news, there is arguably less going on beneath the surface than some may assume. We’ll get to some of those big-picture questions in a moment. First, let’s take a look at what’s going on under the hood.

How does generative AI work?

Generative AI uses machine learning to analyse huge amounts of visual or textual data, much of it scraped from the internet, and then determines what things are most likely to appear near other things. Much of the programming work of generative AI goes into creating algorithms that can distinguish the “things” of interest to the AI’s creators: words and phrases in the case of chatbots like ChatGPT, or visual elements in the case of DALL-E. But fundamentally, generative AI creates its output by assessing an enormous corpus of data and then responding to prompts with something that falls within the realm of probability as determined by that corpus.

Autocomplete, when your phone or Gmail suggests what the remainder of the word or sentence you’re typing might be, is a low-level form of generative AI. ChatGPT and DALL-E just take the idea to significantly more advanced levels.
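To make “the realm of probability” concrete, here is a deliberately tiny sketch of prediction-by-counting in Python. The corpus, the word pairs, and the autocomplete function are all invented for illustration; real models like GPT learn from billions of words and use far richer statistics than simple word pairs, but the basic idea of predicting what is likely to come next is the same.

from collections import Counter, defaultdict

# A made-up "training corpus" of a few sentences.
corpus = (
    "the cat sat on the mat . the cat sat on the rug . the dog chased the cat ."
).split()

# "Training": count how often each word follows each other word.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def autocomplete(word):
    # Suggest the word that most often followed this one in the corpus.
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else "?"

print(autocomplete("the"))   # 'cat', the most frequent follower of 'the'
print(autocomplete("sat"))   # 'on'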

What is an AI model?

ChatGPT and DALL-E are interfaces to underlying AI functionality that is known, in AI jargon, as a model. An AI model is a mathematical representation, implemented as an algorithm or process, that generates new data that will (ideally) resemble a set of data you already have. You’ll sometimes see ChatGPT and DALL-E themselves referred to as models; strictly speaking this is inaccurate, as ChatGPT is a chatbot that gives users access to several different versions of the underlying GPT model. But in practice, these interfaces are how most people will interact with the models, so don’t be surprised to see the terms used interchangeably.

AI developers assemble a corpus of data of the type they want their models to generate. This corpus is the model’s training set, and the process of developing the model is called training. The GPT models, for example, were trained on a huge corpus of text scraped from the internet, and as a result you can ask it questions in natural language and it will respond in idiomatic English (or any number of other languages, depending on the input).

AI models treat different characteristics of the data in their training sets as vectors: mathematical constructs made up of multiple numbers. A big part of these models’ secret sauce is their ability to translate real-world information into vectors in a meaningful way, and to determine which vectors are similar to one another in a way that allows the model to generate output that is similar to, but not identical to, its training set.
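As a hand-rolled illustration of that idea, the snippet below maps a few words to short lists of numbers and measures how “close” they are with cosine similarity. The three-dimensional vectors here are invented purely for demonstration; real models learn vectors with hundreds or thousands of dimensions from their training data.

import math

# Invented three-dimensional "embeddings"; real ones are learned, not hand-written.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    # 1.0 means the vectors point the same way; values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # roughly 0.99
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # roughly 0.30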

There are many different types of AI models out there, but keep in mind that the various categories are not necessarily mutually exclusive. Some models can fit into more than one category.

Large language models, or LLMs, are probably the AI model type getting the most public attention right now. LLMs are based on the concept of a transformer, first introduced in “Attention Is All You Need,” a 2017 paper from Google researchers. A transformer derives meaning from long sequences of text in order to understand how different words or semantic components might be related to one another, then determines how likely they are to occur in proximity to one another. The GPT models are LLMs, and the T stands for transformer. These transformers are run unsupervised on a vast corpus of natural language text in a process called pretraining (that’s the P in GPT), before being fine-tuned by human beings interacting with the model.
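The attention operation at the heart of the transformer can be sketched in a few lines of Python with NumPy. This is a toy-sized, single-head version for illustration only; production transformers add learned projection matrices, multiple attention heads, and many stacked layers.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Compare every token's query with every token's key...
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # ...turn the scores into weights that sum to 1 (softmax)...
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # ...and use them to mix the value vectors.
    return weights @ V

# Three tokens, each represented as a 4-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
output = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V
print(output.shape)  # (3, 4): one context-aware vector per token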

Diffusion is commonly used in generative AI models that produce images or video. In the diffusion process, the model adds noise (randomness, essentially) to an image, then slowly removes it iteratively, all the while checking against its training set to attempt to match semantically similar images. Diffusion is at the core of AI models that perform text-to-image generation, such as Stable Diffusion and DALL-E.
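The “add noise” half of that process is simple enough to sketch with NumPy. The tiny stand-in image and the noise schedule below are invented for illustration; a real diffusion model’s job is to learn to run this process in reverse, one step at a time.

import numpy as np

rng = np.random.default_rng(42)
image = rng.uniform(0.0, 1.0, size=(8, 8))    # stand-in for a real image's pixel values

num_steps = 1000
betas = np.linspace(1e-4, 0.02, num_steps)    # how much noise to mix in at each step

x = image.copy()
for beta in betas:
    noise = rng.normal(size=x.shape)
    # Each step keeps slightly less of the signal and adds a little fresh noise.
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise

# After enough steps the result is essentially indistinguishable from pure noise.
print(round(float(np.std(image)), 2), round(float(np.std(x)), 2))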

A generative adversarial network, or GAN, is based on a type of reinforcement learning in which two algorithms compete against one another. One generates text or images based on probabilities derived from a big data set. The other, a discriminative AI, assesses whether that output is real or AI-generated. The generative AI repeatedly tries to “trick” the discriminative AI, automatically adapting to favour outcomes that succeed. Once the generative AI consistently “wins” this competition, the discriminative AI gets fine-tuned by humans and the process begins anew.
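The adversarial loop itself can be sketched with a small neural-network framework. The example below uses PyTorch (a choice of convenience; the article does not specify any framework) and a one-dimensional toy distribution rather than images, but the generator-versus-discriminator structure is the same one that image GANs use at far larger scale.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator: turns random noise into candidate "data".
# Discriminator: scores its input from 0 (fake) to 1 (real). Both are tiny toy networks.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(32, 1) + 4.0            # "real" data: numbers clustered near 4
    fake = generator(torch.randn(32, 8))       # generated data

    # Train the discriminator to label real samples 1 and generated samples 0.
    d_opt.zero_grad()
    d_loss = (loss_fn(discriminator(real), torch.ones(32, 1))
              + loss_fn(discriminator(fake.detach()), torch.zeros(32, 1)))
    d_loss.backward()
    d_opt.step()

    # Train the generator to "trick" the discriminator into outputting 1.
    g_opt.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_loss.backward()
    g_opt.step()

# The generator's outputs should drift toward the real data (values near 4).
print(generator(torch.randn(5, 8)).detach().squeeze())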

One of the most important things to keep in mind here is that, while there is human intervention in the training process, most of the learning and adapting happens automatically. So many iterations are required to get the models to produce interesting results that automation is essential. The process is quite computationally intensive, and much of the recent explosion in AI capabilities has been driven by advances in GPU computing power and in techniques for implementing parallel processing on these chips.

Is generative AI conscious?

The mathematics and coding involved in creating and training generative AI models are quite complex, and this article is far from exhaustive. But interacting with the models that emerge from this process can be decidedly uncanny. DALL-E can produce things that look like genuine works of art. ChatGPT can hold conversations that feel like talking to another person. Have scientists really built a thinking machine?

Chris Phipps, a former IBM natural language processing lead who worked on Watson AI products, says no. He describes ChatGPT as a “very good prediction machine.”

It is extremely accurate at predicting what people will find coherent. It isn’t always coherent (though it is most of the time), but that isn’t because ChatGPT “understands.” The opposite is true: the humans who consume the output are quite adept at making whatever implicit assumptions are required for the output to make sense.

Phipps, who is also a comedian, compares it to a popular improv game called Mind Meld.

Two people each think of a word and say it aloud at the same time; you might say “boot” and I might say “tree.” We came up with those words entirely independently, and at first they had nothing to do with each other. The next two participants take those two words and try to come up with something they have in common, saying it aloud at the same time. The game continues until two participants say the same word.

Perhaps both players say “lumberjack.” It seems like magic, but really it’s our human brains reasoning about the input (“boot” and “tree”) and finding a connection. We do the work of understanding, not the machine. There’s a lot more of that going on with ChatGPT and DALL-E than most people realise. ChatGPT can tell a story, but we humans do a lot of work to make it make sense.

Setting computer intelligence to the test

Certain prompts we can give these AI models make Phipps’ point fairly evident. Consider, for instance, the riddle “Which weighs more, a pound of lead or a pound of feathers?” The answer, of course, is that they weigh the same (one pound), and ChatGPT will answer this riddle correctly, as you might expect from a coldly logical computer with no “common sense” to trip it up. But that’s not what’s happening under the hood. ChatGPT isn’t reasoning its way to the answer logically; it’s simply generating output based on its predictions of what should follow a question about a pound of feathers and a pound of lead. Because its training set includes a lot of text explaining the riddle, it assembles a version of the correct answer.

But if you ask ChatGPT whether two pounds of feathers are heavier than a pound of lead, it will confidently tell you that they weigh the same amount, because that is still the most likely output for a prompt about feathers and lead, based on its training set. It can be entertaining to tell the AI it’s wrong and watch it flounder in response; it eventually apologised for its mistake and suggested that two pounds of feathers weigh four times as much as a pound of lead.
