What is an LLM? – Large language models explained

What does LLM stand for?

What a large language model is and what LLM means

OpenAI’s ChatGPT is the world’s most popular AI chatbot, and it runs on a specific type of artificial intelligence called an LLM. Assuming you don’t have your LLM degree yet, you may not know what a large language model is – completely understandable. What is an LLM, and what does language model mean?

What does LLM stand for?

LLM stands for large language model. Popular LLMs include GPT-3, GPT-3.5 Turbo, GPT-4, PaLM 2, and Llama 2. These neural networks are what power AI chatbots like OpenAI’s ChatGPT, Microsoft’s Bing Chat, and Google Bard.

Company   | AI chatbot   | LLM
OpenAI    | ChatGPT      | GPT-4 or GPT-3.5*
OpenAI    | ChatGPT API  | GPT-4 or GPT-3.5 Turbo
Google    | Bard         | PaLM 2
Microsoft | Bing Chat    | GPT-4
Meta      | No chatbot** | Llama 2
Anthropic | Claude       | Claude 2
Comparison of which AI chatbots use which LLMs.

*OpenAI’s GPT-3.5 model is the default for free users of the ChatGPT AI chatbot. Using GPT-4 requires a paid subscription to ChatGPT Plus or ChatGPT Enterprise.

**Meta does not have its own AI chatbot running on Llama 2. Instead, the language model is open-source, and third-party developers are encouraged to create their own interfaces (such as chatbots) for it. HuggingChat and llama2.ai are clear public favourites. Chatbots are not the only use case or type of interface for an LLM, however.

What does LLM mean?

LLM (Large Language Model) refers to the model itself: the algorithm used for NLP (natural language processing) and the parameters, or learned weights, that encode its contextual understanding. The training data set is not strictly part of the LLM, and training can be a one-time or iterative process. ChatGPT itself was fine-tuned after pre-training through a process called Reinforcement Learning from Human Feedback (RLHF), but neither training stage is strictly part of the model. Both are better described as how the model was arrived at.

Most modern large language models use a deep learning architecture called the transformer. This is where the GPT (Generative Pre-trained Transformer) series gets its name.

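A transformer model is a neural network built around a self-attention mechanism that operates on embeddings, the numeric vectors that represent each token. Some transformer models, such as BERT, are bidirectional encoders that read context on both sides of a word. Others, such as GPT-3 and GPT-4, are auto-regressive: they generate text one token at a time, with each new token conditioned on the tokens before it.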
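
To make "auto-regressive" concrete, here is a minimal sketch in Python. The `NEXT_TOKEN_PROBS` table is a hypothetical stand-in for a trained model, and the `generate` function is our own illustration, not how GPT is implemented. A real LLM computes these probabilities with a neural network and looks at every previous token, whereas this toy only looks at the last one.

```python
# Hypothetical next-token probabilities standing in for a trained model.
# A real LLM would compute these with a neural network.
NEXT_TOKEN_PROBS = {
    "<start>": {"large": 0.9, "small": 0.1},
    "large": {"language": 0.8, "models": 0.2},
    "small": {"models": 1.0},
    "language": {"models": 0.7, "model": 0.3},
    "models": {"<end>": 1.0},
    "model": {"<end>": 1.0},
}


def generate(max_tokens=10):
    """Greedy auto-regressive decoding: keep appending the likeliest next token."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        candidates = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(candidates, key=candidates.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the <start> marker


print(" ".join(generate()))  # prints "large language models"
```

Each pass through the loop feeds the output so far back in as the input for the next step, which is the defining trait of an auto-regressive model.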

Transformer models are not exclusively used to create LLMs, but they are well suited for the task. One example of an earlier transformer model is BERT (Bidirectional Encoder Representations from Transformers), produced by Google in 2018 with 110 million parameters in its base version and 340 million in its large version. This is already considered very small by today’s standards.

Rick Merritt of NVIDIA, the hardware manufacturer enabling the AI industry, explains that “Transformer models apply an evolving set of mathematical techniques, called attention or self-attention, to detect subtle ways even distant data elements in a series influence and depend on each other.”
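
The idea of attention can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the `embeddings` values are made up, and the queries, keys, and values are all just the input vectors, whereas a real transformer first passes them through learned projection matrices.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this element against every element in the series.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Blend all vectors together according to those weights.
        mixed = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Made-up 2-dimensional embeddings for a series of three tokens.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
outputs, weights = self_attention(embeddings)
for row in weights:
    print([round(w, 2) for w in row])
```

Each row of weights shows how strongly one token "attends" to every other token in the series, no matter how far apart they sit. Here the first token attends more to the third token than to the second, because their vectors are more alike.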

Transformer architecture follows a long line of recurrent neural network (RNN) technologies, which it has largely replaced for language tasks. Transformer-based LLMs supersede LSTM (Long Short-Term Memory) models, a type of RNN which OpenAI chief scientist Ilya Sutskever was working with prior to co-founding OpenAI in 2015.

What is an LLM?

An LLM is a type of AI for text generation. It takes an input text prompt and generates an output (also of text). So what’s the point?

Well, the input could be just a sentence or two, with an output of hundreds or thousands of words. Not only can it speed up your writing process, it can summarize other people’s text, reformat data, help produce Word, PowerPoint, and Excel documents, and write in languages that you don’t speak yourself – like code!

Large language models are an example of natural language processing. The neural network of an LLM is trained on billions of words – broken down into tokens – and, through a process of machine learning, an artificial understanding of these words (and the relationship between them) is built up.
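
As a rough illustration of what "broken down into tokens" means, the Python sketch below splits text into words and gives each one a numeric ID. This is a simplification on our part: the function names are hypothetical, and real LLM tokenizers usually split text into sub-word pieces rather than whole words.

```python
def tokenize(text):
    """Lowercase the text and split on whitespace (real tokenizers use sub-words)."""
    return text.lower().split()


def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text into the list of token IDs the model would actually see."""
    return [vocab[token] for token in tokenize(text)]


corpus = "The model reads the text"
vocab = build_vocab(tokenize(corpus))
print(vocab)                      # {'the': 0, 'model': 1, 'reads': 2, 'text': 3}
print(encode("the text", vocab))  # [0, 3]
```

The neural network never sees letters or words, only these numbers. A word missing from the vocabulary would raise an error in this sketch, which is one reason real tokenizers fall back on smaller sub-word pieces.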

The degree of relational understanding is quantified as parameters. Where a token can be thought of as the neuron in a human brain, the parameters are the synapses – the connections in between. Without connections, you have a static database of information. With connections, you have a contextual understanding of that information – a neural network that can understand why you’re asking your questions, and even educate you about what you don’t know that you don’t know!
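
In practice, a parameter is simply a number that the network learns during training. The Python sketch below counts the parameters in a tiny, made-up fully connected network. The layer sizes are hypothetical, and a real LLM has billions of parameters spread across attention and feed-forward layers, but the counting principle (one weight per connection, plus one bias per neuron) is the same.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network."""
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs  # one weight per connection
        total += outputs           # one bias per output neuron
    return total


# A made-up network: 4 inputs, a hidden layer of 8 neurons, 2 outputs.
print(count_parameters([4, 8, 2]))  # 58
```

Going from 58 parameters to the billions found in models like GPT-3 is mostly a matter of adding more and wider layers.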

What can GPT do?

An LLM is capable of language translation, fluency in any programming language (if included in the training data), writing and text summarization, human-like conversation, classification, sentiment analysis, and inference beyond that of a search engine.

They do, however, require far more computational resources to run than a search engine. This is because an AI is not simply an IR (Information Retrieval) system, but a generative one. AI creates new data. There is no guarantee that this exact data was not previously produced by someone somewhere sometime before – that would require checking against all human literature, including that not digitally archived, and on the internet outside of the world wide web. Neither AI nor you have access to all that! All generative AI really means is that the responses are not pre-written. The datasets were used as training wheels, but now the LLM program is self-reliant.
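
One way to picture "not pre-written" is that the model picks each next token by sampling from a set of probabilities rather than looking up a stored answer. The Python sketch below illustrates this under our own assumptions: the `scores` are invented, and `temperature` is a common sampling setting rather than anything specific to one product.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    highest = max(scaled)
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(token_scores, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its probability."""
    tokens = list(token_scores)
    probs = softmax([token_scores[t] for t in tokens], temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Invented scores for what might follow "I stroked the ...".
scores = {"cat": 2.0, "dog": 1.5, "car": 0.1}
rng = random.Random(0)  # seeded so repeated runs give the same picks
print([sample_next_token(scores, rng=rng) for _ in range(5)])
```

Because the choice is weighted but random, the same prompt can lead to different responses. With a very low temperature the likeliest token wins almost every time, and with a higher one the output becomes more varied.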

Do large language models work?

Yes, large language models work! These artificial intelligence systems have reportedly passed the bar exam under simulated test conditions, as well as engineering and doctoral exams. While the bachelor of laws title is still exclusive to humans, current AI could, on exam results alone, technically hold a law degree.

LLMs have a great many use cases, and new specific tasks and applications are discovered seemingly by the day!