Generative AI – A primer into the past, present, and future of intelligent content generation (2024)

AI

23. July 2024 By Patrick Flege

We are currently living in an exciting period for business and science, with artificial intelligence poised to change our lives more and more. AI has been compared to electricity in its social and economic consequences. This blog post serves as a brief primer on generative AI, or GenAI – systems that generate content instead of just analyzing it – where the technology came from, and where its opportunities and risks lie. If you are interested in how adesso can help your company with GenAI, have a look here.

GenAI – The new electricity

In the last couple of years, we have witnessed a remarkable expansion in the capabilities of Artificial Intelligence (AI). Such a development is not without precedent: in the 1970s and 1980s, government and private funding for machine learning and AI exploded in a similar way, yet the boom was followed by what is often referred to as an ‘AI winter’ – a long period of stagnation in investment and progress. This time, however, things are poised to be different. Already in 2017, Stanford scientist, Google Brain founder, and former chief scientist at Baidu Andrew Ng predicted that advances in hardware would enable continuous progress for years to come. He was not mistaken. Thanks to an architectural pattern called neural networks – the basis of what is often referred to as ‘deep learning’ – and advances in processing power, the capabilities of AI improved continuously. In 2017, with the arrival of a new type of architecture, the transformer model, the content-generation capabilities of computer systems took another leap. Yet it was not until the release of ChatGPT by OpenAI that intelligent, content-generating AI systems, also known as Generative AI or GenAI, became omnipresent in daily life.

While much hype, good and bad, has surrounded GenAI, the economic benefits and opportunities are tangible and cannot be overstated. In 2017, McKinsey estimated that the application of AI could add up to $15.4 trillion in annual economic value to the global economy in the coming decades. In its 2023 report, McKinsey updated this estimate to include up to $4.4 trillion generated annually from the adoption of GenAI in businesses. For comparison, the GDP of Britain in 2021 was $3.1 trillion (see here for the full report). Many of these productivity gains could be realized in knowledge-intensive sectors, such as banking, sales, R&D, the life sciences, and software engineering. According to Google’s and MIT’s Andrew McAfee, GenAI is a transformative technology, like the steam engine or electricity (see here). Like these, it will most likely generate strong growth in, and demand for, new professions that use this kind of technology. Here at adesso, we are at the forefront of this development. For our clients, we are currently developing a broad portfolio of technologies that harness the power of GenAI. More information about the solutions that we provide can be found above.

Yet for all its promise, GenAI remains somewhat of a mystery for most people, even those whose work might be transformed drastically by it. Let’s get a grasp of how GenAI models work, why they are so powerful, and what some of their pitfalls are.

GenAI and Deep Learning

Most generative AI models are a type of model called Large Language Models (LLMs). As the name suggests, these models have their origin in the processing of language. Modern LLMs are what are called ‘foundation models’. Such models can solve a diverse array of problems, not just one task. Earlier architectures and models excelled at one thing only – such as recognizing cats in a picture. The capabilities of foundation models, in contrast, generalize to a large swath of tasks. With regard to language, think of a model that can only translate from English to French – such a model is not a foundation model. Modern systems, like OpenAI’s family of GPTs (Generative Pre-Trained Transformer), are in contrast capable of handling many tasks: they can translate, summarize texts, tell jokes, and so on. Most LLMs today are foundation models, though strictly speaking not all LLMs are. GenAI is technically independent of both terms – it means any AI system that creates content, instead of merely classifying or counting objects. Yet the best-known GenAI systems are LLMs which are also foundation models – got it?

Neural Networks and Transformers

GenAI could not have advanced without a simultaneous increase in the quantity of data made available by digitalization. Why? LLMs, which underpin many GenAI systems, are built on top of a computing architecture called neural networks (NNs). As the name suggests, the basic principle behind them is to mimic human neurons, although the analogy only goes so far. They take many different input signals (say, mathematical representations of words or sentences), and ‘fire’ if the input exceeds a certain level, just like your neurons as you read this sentence. Stack many of those neurons together in layers and take the output of one layer as the input to the next – voilà, a simple neural network. Each neuron has several parameters (it basically represents a mathematical, statistical equation), which must be tuned to generate a good output signal (for example, a good translation into French). NN models can be big – billions of parameters. Although no public information is available, it is estimated that OpenAI’s GPT-4 family has around a trillion learned parameters, and Meta’s yet-to-be-released 400B Llama has 400 billion – and their performance is impressive. But such huge models only make sense if there is a lot of data to train them on. To see why, it helps to put on our history goggles: neural networks have been around since the 1970s (around 50 years!), yet for most of that time they were seen as inferior to other techniques. One reason was their complexity and computing cost. Only with the advent of big data could their full potential be harnessed, sending their performance through the roof. Add to this the need for stronger computing (nowadays often provided in the form of Graphics Processing Units, GPUs), and we can see why it took so long for neural networks to take off.
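
To make the layer idea concrete, here is a minimal sketch of a feedforward pass in plain NumPy. The sizes and values are illustrative toys, not taken from any real model:

```python
import numpy as np

def layer(x, W, b):
    """One layer: a weighted sum of the inputs, then a non-linear 'firing' rule."""
    z = W @ x + b                  # each neuron combines all incoming signals
    return np.maximum(0.0, z)      # ReLU: a neuron only 'fires' above a threshold

rng = np.random.default_rng(0)
x = rng.normal(size=4)                         # e.g. a tiny vector representing a word
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)  # parameters to be tuned during training
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)

hidden = layer(x, W1, b1)    # the output of one layer...
output = W2 @ hidden + b2    # ...becomes the input to the next
print(output)
```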

The final piece of the puzzle was the invention of a new kind of NN architecture. Previously, language-based tasks posed one big problem: the next word in a sentence might depend on a word much further back. As NNs only took the output of the previous layer as input, a workaround was necessary. Until 2017, this workaround was an architecture called Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs). Such networks are called ‘recurrent’ because the same parameters are reused at every step. These networks suffered from shortcomings in the computing capacity they needed to accurately predict the next word in a text-completion task, like translation: they had to store more and more of the previously seen text in memory to predict the next word. With a large text corpus, that method quickly ran into bottlenecks. That all changed in 2017, when scientists from Google Brain and the University of Toronto came up with a more sophisticated architecture.
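
The recurrence idea fits in a few lines. Note how the same parameters are reused at every step, and how everything seen so far must be squeezed into one fixed-size memory vector – the bottleneck mentioned above. The shapes here are arbitrary toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
W_h, W_x = rng.normal(size=(3, 3)), rng.normal(size=(3, 5))

def rnn_step(h, x):
    # The SAME parameters W_h and W_x are reused at every position in the
    # text -- this reuse across steps is what makes the network 'recurrent'.
    return np.tanh(W_h @ h + W_x @ x)

h = np.zeros(3)                    # hidden state: the network's running memory
for x in rng.normal(size=(4, 5)):  # four word vectors, processed one by one
    h = rnn_step(h, x)             # everything seen so far is squeezed into h
print(h)  # the bottleneck: one fixed-size vector must summarize the whole text
```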

This architecture was called the transformer. A seminal paper (‘Attention Is All You Need’, available here) by this team laid out the new architecture. It enabled efficient scaling of processing through parallelization and powerful hardware components like GPUs. Amazingly, transformer models could easily learn various relationships between words in documents and use them to generate texts and other documents in an almost human-like manner. These relationships were learned through a procedure called ‘multi-head attention’ – a head is a mathematical description of one kind of relationship between words in the text. By incorporating many heads inside the transformer, the intricacies of language could now be captured by the model. Transformers now form the foundation of almost every LLM, although the original architecture has since been adapted for different tasks.
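
To give a flavour of what a ‘head’ computes, here is a toy sketch of scaled dot-product attention with two heads in NumPy. It follows the spirit of the paper, but the dimensions are arbitrary, and a real transformer adds output projections, masking, and many stacked layers:

```python
import numpy as np

def attention_head(X, W_q, W_k, W_v):
    """One 'head': scores how strongly each word relates to every other word."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise word-to-word relevance
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # mix word vectors by relevance

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 16))     # six words, each represented as a 16-dim vector
heads = [attention_head(X, *(rng.normal(size=(16, 8)) for _ in range(3)))
         for _ in range(2)]      # each head is free to learn a different relation
out = np.concatenate(heads, axis=-1)  # 'multi-head': combine what every head found
print(out.shape)                      # (6, 16)
```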

Training and Big Data

Transformer-based LLMs are trained on massive corpora of data with self-supervised learning – a process whereby the model tries to, for example, predict the next item in a piece of text, and changes its parameters if it is wrong. Later, to make the model more effective at specific niche tasks, AI engineers and data scientists present it with prompt-completion pairs and penalize it if the completion is inadequate. A prompt is what we enter, for example, into ChatGPT; the completion is its answer. Without the explosion in digital data of the last few years, the crucial first step in training LLMs could not have happened. OpenAI’s GPT-3, for example, is reported to have been trained on approximately 570 GB of filtered text data from all over the internet. The energy costs of this training are non-negligible: by one widely cited estimate, training a model of this size can emit as much CO₂ as five cars over their entire lifetimes.
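
A minimal sketch of the self-supervised next-token objective, in PyTorch. The ‘model’ here is just an embedding plus a linear layer standing in for a full transformer; vocabulary size and sequence length are arbitrary toy choices:

```python
import torch
import torch.nn.functional as F

vocab, dim = 100, 32
emb = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)               # stand-in for a full transformer

tokens = torch.randint(0, vocab, (1, 16))        # a toy 'document' of 16 token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # the target is simply the NEXT token

logits = head(emb(inputs))                       # the model guesses each next token...
loss = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
loss.backward()                                  # ...and its parameters are nudged
print(loss.item())                               # whenever the guess was wrong
```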

Apart from possible environmental costs, other issues may arise with Large Language Models – let’s dive into some.

Models do not perform well on my task

Most LLMs are like Swiss Army knives – relatively good at a lot of things, but perhaps excelling at none. Businesses can choose to fine-tune a model with specific labeled data (data that is marked as desirable or undesirable, for example) so that it gets better at their task. One problem that may arise is called catastrophic forgetting, where the model changes so much that it can no longer perform many of its initial tasks well, even though it improves on your business task. Many tools are available to address this, such as multitask learning, a technique where the model is trained on multiple different skills simultaneously, or parameter-efficient fine-tuning (PEFT). PEFT is a lightweight procedure that either trains only a few model parameters or creates an ‘adapter’ for a specific task, which consumes much less compute than re-training the whole model. Basically, most of the original model parameters are frozen, and only the small adapter is trained (see this paper for an overview of methods: https://arxiv.org/abs/2312.12148). A sketch of the adapter idea follows below.
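
Here is a LoRA-style illustration in PyTorch: the pretrained layer is frozen, and only a small low-rank correction is trained on top of it. The rank and layer sizes are illustrative, not a recommendation:

```python
import torch

class LoRALinear(torch.nn.Module):
    """A frozen pretrained layer plus a small trainable low-rank 'adapter'."""
    def __init__(self, base: torch.nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                    # original weights stay frozen
        self.A = torch.nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(rank, base.out_features))

    def forward(self, x):
        return self.base(x) + x @ self.A @ self.B      # tiny correction on top

layer = LoRALinear(torch.nn.Linear(512, 512))
y = layer(torch.randn(1, 512))                         # works like a normal layer
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"training {trainable} of {total} parameters")   # only a few % of the layer
```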

Models are outdated

Models are trained only on documents available before a certain cutoff date. The question ‘Who is the prime minister of the Netherlands?’ will be answered incorrectly by, say, GPT-4o or Llama 3 in a few months. Take this into account when building solutions. An effective way to address this shortcoming is Retrieval-Augmented Generation (RAG), where the static model knowledge is enriched with documents specific to your use case, as sketched below. adesso implements several GenAI solutions which make use of RAG to solve our customers’ needs. Check out our website for more about our portfolio of GenAI solutions.
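
A bare-bones sketch of the RAG mechanics in Python. The embed function is a toy stand-in (a real system would call an embedding model), the documents are hypothetical, and the final LLM call is omitted – the point is only how retrieved, current text ends up inside the prompt:

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in: a real system would call an embedding model here."""
    seed = int(hashlib.sha256(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=64)
    return v / np.linalg.norm(v)

documents = [  # hypothetical, up-to-date documents for your use case
    "Dick Schoof has been prime minister of the Netherlands since 2 July 2024.",
    "Our support desk is reachable on weekdays between 9:00 and 17:00.",
]
index = [(doc, embed(doc)) for doc in documents]   # a tiny in-memory vector store

def rag_prompt(question: str) -> str:
    q = embed(question)
    best = max(index, key=lambda pair: float(pair[1] @ q))[0]  # nearest document
    # The retrieved text is pasted into the prompt, so the static model can
    # answer from current information instead of stale training data.
    return f"Answer using this context:\n{best}\n\nQuestion: {question}"

print(rag_prompt("Who is the prime minister of the Netherlands?"))
```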

Bias in models

The old programming adage GIGO (garbage in, garbage out) holds true for LLMs as well. As many models are also trained on texts decades or even centuries old, unhealthy stereotypes about, for example, gender can sneak into the models’ outputs. In 2016, a team of researchers from Boston University and Microsoft Research found that the word representations used by language models tend to associate lucrative engineering professions with men, and household and lower-paid jobs with women (see this work). While there are ways to address this issue using mathematical procedures that adjust the mathematical representations of text for LLMs – one such projection trick is sketched below – models can still generate responses biased towards certain groups (see here). Encouragingly, the extent of bias seems to decrease with newer, bigger models!
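
The projection trick can be sketched in a few lines of NumPy, in the spirit of that 2016 work: compute a bias direction from a word pair and remove each word vector’s component along it. The vectors here are random toys, not real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy word vectors; a real system would load pretrained embeddings instead.
he, she, engineer = (rng.normal(size=50) for _ in range(3))

bias_direction = he - she
bias_direction /= np.linalg.norm(bias_direction)

def debias(v: np.ndarray) -> np.ndarray:
    # Remove the component of the word vector that lies along the bias
    # direction, so 'engineer' no longer leans towards 'he' or 'she'.
    return v - (v @ bias_direction) * bias_direction

print(engineer @ bias_direction)           # nonzero: the word leans along he-she
print(debias(engineer) @ bias_direction)   # ~0 after projecting the bias out
```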

Toxic language use and inappropriate responses

LLMs themselves are just mathematical models, and therefore often cannot tell which responses are considered ‘bad’ or unethical by human standards. A model may give ‘helpful’ responses to prompts that are unethical (‘How could I best hack my neighbor’s WiFi?’). A technique called Reinforcement Learning from Human Feedback (RLHF), where the model is rewarded for desired prompt completions and punished for undesired ones, can alleviate this issue (see here). In reinforcement learning, an agent (here, the LLM) learns new behavior (updated parameters) based on feedback, or reinforcement, from its environment. If available, humans are the best judges to punish or reward a model, but nowadays specialized LLMs also exist that supervise bigger LLMs and hand out rewards or punishments. The sketch below shows the kind of preference-based objective typically used to train such a reward signal.
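
In practice, RLHF usually starts by training a reward model on pairs of human-ranked responses. Here is a toy PyTorch sketch of that pairwise objective – the linear ‘reward model’ and random ‘response embeddings’ are stand-ins for the real components:

```python
import torch
import torch.nn.functional as F

# Hypothetical reward model: scores how 'desirable' a response embedding is.
reward_model = torch.nn.Linear(32, 1)

# Stand-ins for embeddings of human-preferred vs. rejected responses.
chosen, rejected = torch.randn(8, 32), torch.randn(8, 32)
r_chosen, r_rejected = reward_model(chosen), reward_model(rejected)

# Pairwise preference loss: push the reward of the preferred response
# above that of the rejected one (a Bradley-Terry-style objective).
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()
print(loss.item())
```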

Hallucinations

LLMs are trained on large corpora of data but may fail to distinguish between fact and fiction. It is important for GenAI practitioners and the public alike to realize that GenAI is based merely on statistical patterns in its input data and may sometimes generate plausible-sounding content that is incorrect. Such behavior is referred to as hallucination. Hallucination may happen because LLMs tend to overgeneralize from data they have encountered; they do not have an actual understanding of the content. Solutions to this ongoing problem, which can have detrimental consequences if made-up content spreads quickly, include RAG with up-to-date backend systems, RLHF or other forms of human auditing, and training the model on more accurate data.

Future directions

GenAI is an exciting area of development that can greatly benefit companies and civil society. Promising research directions include efforts to make GenAI more explainable to the public and practitioners alike – models are often perceived as black boxes that generate content. Furthermore, GenAI has recently expanded to other modalities, such as video, audio, and even whole movies. Ongoing efforts to reduce the size of models while improving their efficiency will deliver the same or better services at lower cost and with less energy use. Lastly, new and specialized tools using GenAI will free workers from many arduous tasks and broaden the time available for more creative and engaging activities, opening up a new era of productivity. At adesso, we are excited to be actively engaged in this new frontier.


Author: Patrick Flege

As a Java developer, Patrick Flege is part of the growing Codesquad of adesso Netherlands.

Category:

AI

Tags:

GenAI, AI