{"id":1380,"date":"2023-09-26T14:00:00","date_gmt":"2023-09-26T14:00:00","guid":{"rendered":"https:\/\/cherylroll.com\/what-is-generative-ai-how-it-works-432402\/"},"modified":"2023-09-26T14:00:00","modified_gmt":"2023-09-26T14:00:00","slug":"what-is-generative-ai-how-it-works-432402","status":"publish","type":"post","link":"https:\/\/cherylroll.com\/what-is-generative-ai-how-it-works-432402\/","title":{"rendered":"What is generative AI and how does it work?"},"content":{"rendered":"

Generative AI, a subset of artificial intelligence, has emerged as a revolutionary force in the tech world. But what exactly is it? And why is it gaining so much attention?

This in-depth guide will dive into how generative AI models work, what they can and can’t do, and the implications of all these elements.

## What is generative AI?

Generative AI, or genAI, refers to systems that can generate new content, be it text, images, music, or even videos. Traditionally, AI/ML meant three things: supervised, unsupervised, and reinforcement learning. Each analyzes existing data to return insights – a prediction, a cluster or a learned policy – rather than creating new content.

Non-generative AI models make calculations based on input (like classifying an image or translating a sentence). In contrast, generative models produce “new” outputs such as writing essays, composing music, designing graphics, and even creating realistic human faces that don’t exist in the real world.

## The implications of generative AI

The rise of generative AI has significant implications. With the ability to generate content, industries like entertainment, design, and journalism are witnessing a paradigm shift.

For instance, news agencies can use AI to draft reports, while designers can get AI-assisted suggestions for graphics. AI can generate hundreds of ad slogans in seconds – whether those options are any good is another matter.

Generative AI can produce tailored content for individual users. Think of something like a music app that composes a unique song based on your mood or a news app that drafts articles on topics you’re interested in.

The issue is that as AI plays a more integral role in content creation, questions about authenticity, copyright, and the value of human creativity become more prevalent.

## How does generative AI work?

Generative AI, at its core, is about predicting the next piece of data in a sequence, whether that’s the next word in a sentence or the next pixel in an image. Let’s break down how this is achieved.

### Statistical models

Statistical models are the backbone of most AI systems. They use mathematical equations to represent the relationship between different variables.

For generative AI, models are trained to recognize patterns in data and then use these patterns to generate new, similar data.

If a model is trained on English sentences, it learns the statistical likelihood of one word following another, allowing it to generate coherent sentences.

\"Basic\"Basic
Basic demo of how text is selected from an LLM<\/em><\/figcaption><\/figure>\n
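To make that idea concrete, here is a minimal, hypothetical sketch in plain Python (no ML libraries): it counts how often each word follows another in a tiny made-up corpus, then uses those counts as weights when picking the next word. Real language models learn far richer patterns over billions of examples, but the underlying principle of learning next-word likelihoods is the same.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up "training corpus" – purely illustrative.
corpus = "the sky is blue . the sky is clear . the grass is green .".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def generate_next(word):
    # Turn the counts for `word` into a weighted random choice.
    counts = following[word]
    return random.choices(list(counts), weights=list(counts.values()), k=1)[0]

# Generate a short sequence starting from "the".
word = "the"
output = [word]
for _ in range(5):
    word = generate_next(word)
    output.append(word)

print(" ".join(output))  # e.g. "the sky is clear . the"
```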

### Data gathering

Both the quality and quantity of data are crucial. Generative models are trained on vast datasets to understand patterns.

For a language model, this might mean ingesting billions of words from books, websites, and other texts.

For an image model, it could mean analyzing millions of images. The more diverse and comprehensive the training data, the better the model will be at generating diverse outputs.

### How transformers and attention work

Transformers are a type of neural network architecture introduced in a 2017 paper titled “Attention Is All You Need” by Vaswani et al. They have since become the foundation for most state-of-the-art language models. ChatGPT wouldn’t work without transformers.

The “attention” mechanism allows the model to focus on different parts of the input data, much like how humans pay attention to specific words when understanding a sentence.

This mechanism lets the model decide which parts of the input are relevant for a given task, making it highly flexible and powerful.

The code below is a fundamental breakdown of transformer mechanisms, explaining each piece in plain English.

```python
class Transformer:
    def __init__(self, vocab_size, d_model, nhead, num_layers):
        # Convert words to vectors
        # What this is: turns words into "vector embeddings" – basically numbers that represent the words and their relationships to each other.
        # Demo: "the pineapple is cool and tasty" -> [0.2, 0.5, 0.3, 0.8, 0.1, 0.9]
        self.embedding = Embedding(vocab_size, d_model)

        # Add position information to the vectors
        # What this is: Since words in a sentence have a specific order, we add information about each word's position in the sentence.
        # Demo: "the pineapple is cool and tasty" with position -> [0.2+0.01, 0.5+0.02, 0.3+0.03, 0.8+0.04, 0.1+0.05, 0.9+0.06]
        self.positional_encoding = PositionalEncoding(d_model)

        # Stack of transformer layers
        # What this is: Multiple layers of the Transformer model stacked on top of each other to process data in depth.
        # Why it does it: Each layer captures different patterns and relationships in the data.
        # Explained like I'm five: Imagine a multi-story building. Each floor (or layer) has people (or mechanisms) doing specific jobs. The more floors, the more jobs get done!
        self.transformer_layers = [TransformerLayer(d_model, nhead) for _ in range(num_layers)]

        # Convert the output vectors to word probabilities
        # What this is: A way to predict the next word in a sequence.
        # Why it does it: After processing the input, we want to guess what word comes next.
        # Explained like I'm five: After listening to a story, this tries to guess what happens next.
        self.output_layer = Linear(d_model, vocab_size)

    def forward(self, x):
        # Convert words to vectors, as above
        x = self.embedding(x)

        # Add position information, as above
        x = self.positional_encoding(x)

        # Pass through each transformer layer
        # What this is: Sending our data through each floor of our multi-story building.
        # Why it does it: To deeply process and understand the data.
        # Explained like I'm five: It's like passing a note in class. Each person (or layer) adds something to the note before passing it on, which can end up with a coherent story – or a mess.
        for layer in self.transformer_layers:
            x = layer(x)

        # Get the output word probabilities
        # What this is: Our best guess for the next word in the sequence.
        return self.output_layer(x)
```

In code, you might have a Transformer class and a single TransformerLayer class. This is like having a blueprint for a floor vs. an entire building.

This TransformerLayer code shows how the specific components, like multi-head attention and the feed-forward network, work and fit together.

\"Image\"Image
Demonstration of how attention works using different colors<\/em><\/figcaption><\/figure>\n
```python
class TransformerLayer:
    def __init__(self, d_model, nhead):
        # Multi-head attention mechanism
        # What this is: A mechanism that lets the model focus on different parts of the input data simultaneously.
        # Demo: "the pineapple is cool and tasty" might become "this PINEAPPLE is COOL and TASTY" as the model pays more attention to certain words.
        self.attention = MultiHeadAttention(d_model, nhead)

        # Simple feed-forward neural network
        # What this is: A basic neural network that processes the data after the attention mechanism.
        # Demo: "this PINEAPPLE is COOL and TASTY" -> [0.25, 0.55, 0.35, 0.85, 0.15, 0.95] (slight changes in numbers after processing)
        self.feed_forward = FeedForward(d_model)

    def forward(self, x):
        # Apply attention mechanism
        # What this is: The step where we focus on different parts of the sentence.
        # Explained like I'm five: It's like highlighting important parts of a book.
        attention_output = self.attention(x, x, x)

        # Pass the output through the feed-forward network
        # What this is: The step where we process the highlighted information.
        return self.feed_forward(attention_output)
```

A feed-forward neural network is one of the simplest types of artificial neural networks. It consists of an input layer, one or more hidden layers, and an output layer.

The data flows in one direction – from the input layer, through the hidden layers, and to the output layer. There are no loops or cycles in the network.

In the context of the transformer architecture, the feed-forward neural network is used after the attention mechanism in each layer. It’s a simple two-layered linear transformation with a ReLU activation in between.

```python
# Scaled dot-product attention mechanism
class ScaledDotProductAttention:
    def __init__(self, d_model):
        # Scaling factor: helps stabilize the gradients by reducing the variance of the dot product.
        # What this is: A scaling factor based on the size of our model's embeddings.
        # What it does: Helps to make sure the dot products don't get too big.
        # Why it does it: Big dot products can make a model unstable and harder to train.
        # How it does it: By dividing the dot products by the square root of the embedding size. It's used when calculating attention scores.
        # Explained like I'm five: Imagine you shouted something really loud. This scaling factor is like turning the volume down so it's not too loud.
        self.scaling_factor = d_model ** 0.5

    def forward(self, query, key, value):
        # What this is: The function that calculates how much attention each word should get.
        # What it does: Determines how relevant each word in a sentence is to every other word.
        # Why it does it: So we can focus more on important words when trying to understand a sentence.
        # How it does it: By taking the dot product (a way to measure similarity) of the query and key, then scaling it, and finally using that to weigh our values.
        # How it fits into the rest of the code: This function is called whenever we want to calculate attention in our model.
        # Explained like I'm five: Imagine you have a toy and you want to see which of your friends likes it the most. This function is like asking each friend how much they like the toy, and then deciding who gets to play with it based on their answers.

        # Calculate attention scores by taking the dot product of the query and key.
        scores = dot_product(query, key) / self.scaling_factor
        # Convert the raw scores to probabilities using the softmax function.
        attention_weights = softmax(scores)
        # Weight the values using the attention probabilities.
        return dot_product(attention_weights, value)


# Feed-forward neural network
# This is an extremely basic example of a neural network.
class FeedForward:
    def __init__(self, d_model):
        # First linear layer increases the dimensionality of the data.
        self.layer1 = Linear(d_model, d_model * 4)
        # Second linear layer brings the dimensionality back to d_model.
        self.layer2 = Linear(d_model * 4, d_model)

    def forward(self, x):
        # Pass the input through the first layer, apply ReLU activation to introduce
        # non-linearity, and then pass through the second layer.
        # Input: This refers to the data you feed into the neural network.
        # First layer: Neural networks consist of layers, and each layer has neurons. "Passing the input through the first layer" means the input data is processed by the neurons in this layer. Each neuron takes the input, multiplies it by its weights (which are learned during training), and produces an output.
        # ReLU activation: ReLU stands for Rectified Linear Unit. It's a type of activation function – a mathematical function applied to the output of each neuron. If the input is positive, it returns the input value; if the input is negative or zero, it returns zero.
        # Neural networks can model complex relationships in data by introducing non-linearities. Without non-linear activation functions, no matter how many layers you stack in a neural network, it would behave just like a single-layer perceptron, because summing linear layers only gives you another linear model.
        # Non-linearities allow the network to capture complex patterns and make better predictions.
        return self.layer2(relu(self.layer1(x)))


# Positional encoding adds information about the position of each word in the sequence.
class PositionalEncoding:
    def __init__(self, d_model):
        # What this is: A setup to add information about where each word is in a sentence.
        # What it does: Prepares to add a unique "position" value to each word.
        # Why it does it: Words in a sentence have an order, and this helps the model remember that order.
        # How it does it: By creating a special pattern of numbers for each position in a sentence.
        # How it fits into the rest of the code: Before processing words, we add their position info.
        # Explained like I'm five: Imagine you're in a line with your friends. This gives everyone a number to remember their place in line.
        pass

    def forward(self, x):
        # What this is: The main function that adds position info to our words.
        # What it does: Combines the word's original value with its position value.
        # Why it does it: So the model knows the order of words in a sentence.
        # How it does it: By adding the position values we prepared earlier to the word values.
        # How it fits into the rest of the code: This function is called whenever we want to add position info to our words.
        # Explained like I'm five: It's like giving each of your toys a tag that says if it's the 1st, 2nd, 3rd toy, and so on.
        return x


# Helper functions
def dot_product(a, b):
    # Calculate the dot product of two matrices.
    # What this is: A mathematical operation to see how similar two lists of numbers are.
    # What it does: Multiplies matching items in the lists and then adds them up.
    # Why it does it: To measure similarity or relevance between two sets of data.
    # How it does it: By multiplying and summing up.
    # How it fits into the rest of the code: Used in attention to see how relevant words are to each other.
    # Explained like I'm five: Imagine you and your friend have bags of candies. You both pour them out and match each candy type. Then, you count how many matching pairs you have.
    return a @ b.transpose(-2, -1)


def softmax(x):
    # Convert raw scores to probabilities, ensuring they sum up to 1.
    # What this is: A way to turn any list of numbers into probabilities.
    # What it does: Makes the numbers between 0 and 1 and ensures they all add up to 1.
    # Why it does it: So we can understand the numbers as chances or probabilities.
    # How it does it: By using exponentiation and division.
    # How it fits into the rest of the code: Used to convert attention scores into probabilities.
    # Explained like I'm five: Let's go back to our toys. This makes sure that when you share them, everyone gets a fair share, and no toy is left behind.
    return exp(x) / sum(exp(x), axis=-1)


def relu(x):
    # Activation function that introduces non-linearity. It sets negative values to 0.
    # What this is: A simple rule for numbers.
    # What it does: If a number is negative, it changes it to zero. Otherwise, it leaves it as it is.
    # Why it does it: To introduce some simplicity and non-linearity in our model's calculations.
    # How it does it: By checking each number and setting it to zero if it's negative.
    # How it fits into the rest of the code: Used in neural networks to make them more powerful and flexible.
    # Explained like I'm five: Imagine you have some stickers, some are shiny (positive numbers) and some are dull (negative numbers). This rule says to replace all dull stickers with blank ones.
    return max(0, x)
```

## How generative AI works – in simple terms

Think of generative AI as rolling a weighted die. The training data determine the weights (or probabilities).

If the die represents the next word in a sentence, a word that often follows the current word in the training data will have a higher weight. So, “sky” might follow “blue” more often than “banana”. When the AI “rolls the die” to generate content, it’s more likely to choose statistically more probable sequences based on its training.
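As a minimal illustration of that “weighted die,” the sketch below uses made-up probabilities for a handful of candidate words after “blue” (these numbers are invented for the example, not taken from any real model) and samples the next word accordingly.

```python
import random

# Hypothetical next-word weights after the word "blue" – invented numbers
# standing in for probabilities a model would have learned from training data.
weights_after_blue = {"sky": 0.70, "sea": 0.20, "cheese": 0.07, "banana": 0.03}

# "Roll the weighted die": sample the next word according to those weights.
next_word = random.choices(
    population=list(weights_after_blue),
    weights=list(weights_after_blue.values()),
    k=1,
)[0]

print(next_word)  # usually "sky", very occasionally "banana"
```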

So, how can LLMs generate content that “seems” original?

Let’s take a fake listicle – the “best Eid al-Fitr gifts for content marketers” – and walk through how an LLM can generate this list by combining textual cues from documents about gifts, Eid, and content marketers.

Before processing, the text is broken down into smaller pieces called “tokens.” These tokens can be as short as one character or as long as one word.

Example: “Eid al-Fitr is a celebration” becomes [“Eid”, “al-Fitr”, “is”, “a”, “celebration”].

This allows the model to work with manageable chunks of text and understand the structure of sentences.
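Here is a simplified, word-level sketch of that step. Production tokenizers typically split text into subword pieces (for example, byte-pair encoding), so treat this purely as an illustration of text becoming tokens and then integer IDs.

```python
# Simplified word-level tokenization. Real LLM tokenizers split text into
# subword pieces, but the flow is the same: text -> tokens -> integer IDs.
sentence = "Eid al-Fitr is a celebration"

tokens = sentence.split()  # ["Eid", "al-Fitr", "is", "a", "celebration"]

# Build a tiny vocabulary mapping each token to an integer ID.
vocab = {token: idx for idx, token in enumerate(sorted(set(tokens)))}
token_ids = [vocab[token] for token in tokens]

print(tokens)
print(token_ids)  # the numeric form the model actually works with
```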

Each token is then converted into a vector (a list of numbers) using embeddings. These vectors capture the meaning and context of each word.

Positional encoding adds information to each word vector about its position in the sentence, ensuring the model doesn’t lose this order information.
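The sketch below shows both steps in miniature, under two stated assumptions: the embedding values are random placeholders (a trained model learns them from data), and the position values use the sinusoidal scheme described in the transformer paper.

```python
import math
import random

d_model = 8  # size of each word vector (tiny, just for illustration)
tokens = ["Eid", "al-Fitr", "is", "a", "celebration"]

# Embedding lookup: each token gets a vector of numbers.
# The values here are random placeholders; a trained model learns them.
random.seed(0)
embeddings = {tok: [random.uniform(-1, 1) for _ in range(d_model)] for tok in set(tokens)}

def positional_encoding(position, d_model):
    # Sinusoidal scheme from "Attention Is All You Need":
    # even dimensions use sine, odd dimensions use cosine.
    return [
        math.sin(position / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]

# Add position information to each token's embedding vector.
encoded = [
    [e + p for e, p in zip(embeddings[tok], positional_encoding(pos, d_model))]
    for pos, tok in enumerate(tokens)
]

print(len(encoded), "vectors, each of length", len(encoded[0]))  # 5 vectors, each of length 8
```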

Then we use an attention mechanism: this allows the model to focus on different parts of the input text when generating an output. If you remember BERT, this is what was so exciting to Googlers about BERT.

If our model has seen texts about “gifts” and knows that people give gifts during celebrations, and it has also seen texts about “Eid al-Fitr” being a significant celebration, it will pay “attention” to these connections.

Similarly, if it has seen texts about “content marketers” needing specific tools or resources, it can connect the idea of “gifts” to “content marketers”.

\"Image\"Image<\/figure>\n

Now we can combine contexts: as the model processes the input text through multiple Transformer layers, it combines the contexts it has learned.

So, even if the original texts never mentioned “Eid al-Fitr gifts for content marketers,” the model can bring together the concepts of “Eid al-Fitr,” “gifts,” and “content marketers” to generate this content.

This is because it has learned the broader contexts around each of these terms.

After processing the input through the attention mechanism and the feed-forward networks in each Transformer layer, the model produces a probability distribution over its vocabulary for the next word in the sequence.

It might think that after words like “best” and “Eid al-Fitr,” the word “gifts” has a high probability of coming next. Similarly, it might associate “gifts” with potential recipients like “content marketers.”
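To illustrate that final step, the sketch below starts from made-up raw scores (logits) for a few candidate next words and converts them into a probability distribution with softmax; the numbers are invented for the example, not taken from any real model.

```python
import math

# Made-up raw scores (logits) for a few candidate next words after
# "best Eid al-Fitr ..." – invented for illustration only.
logits = {"gifts": 4.1, "recipes": 2.3, "outfits": 1.9, "banana": -1.5}

def softmax(scores):
    # Turn raw scores into probabilities that sum to 1.
    exps = {word: math.exp(score) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

probabilities = softmax(logits)
for word, prob in sorted(probabilities.items(), key=lambda item: item[1], reverse=True):
    print(f"{word}: {prob:.3f}")
# "gifts" gets by far the highest probability, so it is the most likely
# choice for the next word.
```

From that distribution, the model picks (or samples) “gifts,” appends it to the sequence, and repeats the whole process for the following word.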
