GPT, or Generative Pre-trained Transformer, is a large language model, or an LLM, that can
generate human-like text.
In this video, we're going to number one, ask what is an LLM and how does it work?
Number two, talk about their applications in law.
And number three, conclude with some regulatory guidelines as well as limitations of using
LLMs.
Starting with number one, what is a large language model and how does it work?
LLMs are based on machine learning, currently the most relevant subfield of artificial
intelligence (AI).
They have these three components, data, architecture, and lastly, training.
The first component, the data, is required in the form of text.
General-purpose models such as ChatGPT, Claude, Gemini, or BERT, for example, are trained
on text data that is publicly available online, such as books, research articles,
Wikipedia, and other websites.
When we say large, these models can be tens of gigabytes in size and trained on enormous
amounts of text data.
We're talking potentially petabytes of data here.
To put that into perspective, a text file that is, for example, one gigabyte in size
can store about 178 million words.
And how many gigabytes are in one petabyte?
Well about one million.
The human brain in comparison is believed to store about two and a half petabytes of
memory data.
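The figures above can be checked with a quick back-of-envelope calculation. The bytes-per-word value is an assumption here: roughly 4.6 characters per English word plus a space, about 5.6 bytes in a plain-text file.

```python
# Back-of-envelope check of the data-size figures.
# BYTES_PER_WORD is an assumed average, not a fixed constant.
BYTES_PER_WORD = 5.6
GIGABYTE = 10**9     # bytes
PETABYTE = 10**15    # bytes

words_per_gb = GIGABYTE / BYTES_PER_WORD
gb_per_pb = PETABYTE // GIGABYTE

# Comes out near the ~178 million words per gigabyte quoted above.
print(f"~{words_per_gb / 1e6:.0f} million words per gigabyte")
print(f"{gb_per_pb:,} gigabytes per petabyte")  # 1,000,000
```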
Moving on to the second component of an LLM: the architecture of these models is an
artificial neural network (ANN), a very complex structure of mathematical instructions
represented in a high number of nodes and the connections between them.
An untrained ANN has unspecified, that is, random, parameters that serve as placeholders
for information.
The model with specified parameters is built through training with usually large amounts
of training data.
In general, the more parameters a model has, the more complex it can be.
And LLMs are among the biggest models when it comes to parameter count.
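To see where parameter counts come from, here is a minimal sketch that counts the weights and biases of a fully connected network. The layer sizes are purely illustrative assumptions, not those of any real LLM.

```python
def dense_params(sizes):
    """Parameter count of a fully connected network: each layer contributes
    (inputs x outputs) weights plus one bias per output node."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

# A tiny illustrative network: 512 inputs, two hidden layers of 256, 10 outputs.
# Already ~200,000 parameters; LLMs reach billions by scaling this idea up.
print(dense_params([512, 256, 256, 10]))  # 199690
```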
And then this architecture is trained on all of this large amount of data.
For the training, the texts are broken down to sentences, parts of sentences, and words.
During the first phase of the training, the model is iteratively fed with incomplete sentences
and predicts the missing word or missing words, usually the next one.
When this phase of the training is completed, a model can generate outputs with correct
syntax.
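This first training phase can be sketched as turning each sentence into (incomplete context, missing next word) pairs. Real LLMs operate on subword tokens; whole words are used here purely for clarity.

```python
# Minimal sketch of how text becomes next-word training examples.
def make_examples(sentence):
    words = sentence.split()
    # Each prefix of the sentence predicts the word that follows it.
    return [(words[:i], words[i]) for i in range(1, len(words))]

for context, target in make_examples("the model predicts the next word"):
    print(context, "->", target)
```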
To perform sufficiently with respect to semantics, the model is trained on questions through
reinforcement, either based on question-answer pairs or with human feedback.
However, generative LLMs that generate syntactically and semantically correct outputs do not understand
grammar rules or the meaning of words.
They just learn to mimic very well.
So instead, existing models merely operate with statistical relationships between words.
They learn which words in which order are likely in a specific context and thereby adjust
their parameters such that they can generate highly probable word combinations.
But real reasoning does not happen.
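The idea of "statistical relationships between words" can be illustrated with the simplest possible statistical language model, a word-pair (bigram) count. This is a toy analogy only; real LLMs learn such statistics implicitly inside a neural network, and the corpus below is invented.

```python
from collections import Counter, defaultdict

corpus = "the court ruled that the contract was void and the court adjourned"

# Count which word follows which -- raw word-order statistics.
follows = defaultdict(Counter)
words = corpus.split()
for w, nxt in zip(words, words[1:]):
    follows[w][nxt] += 1

def most_likely_next(word):
    """Return the statistically most probable successor of a word."""
    return follows[word].most_common(1)[0][0]

# "the" is followed by "court" twice and "contract" once in this corpus.
print(most_likely_next("the"))  # court
```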
Now the model can be fine-tuned on a smaller, more specific data set.
Here the model refines its mimicking to be able to perform this specific task more
accurately.
Presenters
Accessible via
Open access
Duration
00:07:40 min
Recording date
2024-11-29
Uploaded on
2024-11-29 11:41:12
Language
en-US