ChatGPT: The Secret Sauce Revealed

Monday, March 25, 2024

Like many people, you may find yourself wondering how such a monstrous technology seemingly emerged "out of nowhere." We're here to tell you that there's no such thing as nowhere: every technology stands on the shoulders of giants, adding one or more improvements to existing technologies until a breakthrough becomes possible. As the saying attributed to Lavoisier goes, "Nothing is created; everything is transformed."

Let's take a journey through history and explore the steps that have led us to where we are now.

How Does Weather Forecasting Work?

Long before Google, ChatGPT, and their underlying technologies, we relied on television weather forecasts, believing in them but often wondering how they worked. The heroes of yesteryear were our meteorologists, but their methods were far from magic. Rather, they employed mathematics and logic to unravel the mysteries of weather prediction.

In short, the process goes something like this. Meteorologists examine historical data, such as the weather around March 23rd over the last seven years. Suppose those records show five days of rain (from the 17th to the 22nd) with temperatures of 17 degrees. They can then predict with some certainty that on similar days, with the same wind patterns, we can expect temperatures of 18 degrees and clear skies. The concept boils down to "history repeats itself," given the same elements and chain of events. Of course, actual forecasting is more complex and involves many more parameters, such as detailed wind patterns and the number of similar days on record, but the basic idea remains the same.
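To make the "history repeats itself" idea concrete, here is a minimal sketch in Python. The records, the `forecast` function, and the matching rule (same number of rainy days, same wind) are all made up for illustration; real forecasting models are far more elaborate.

```python
from statistics import mean

# Hypothetical history: (rainy_days_before, wind, next_day_temp_c, next_day_sky)
HISTORY = [
    (5, "west", 18, "clear"),
    (5, "west", 17, "clear"),
    (5, "west", 19, "clear"),
    (0, "north", 12, "rain"),
    (5, "west", 18, "rain"),
]


def forecast(rainy_days_before, wind):
    """Predict tomorrow from past days that followed the same conditions."""
    similar = [r for r in HISTORY if r[0] == rainy_days_before and r[1] == wind]
    if not similar:
        return None  # no comparable day on record, so no prediction
    temps = [r[2] for r in similar]
    skies = [r[3] for r in similar]
    likely_sky = max(set(skies), key=skies.count)
    confidence = skies.count(likely_sky) / len(skies)
    return round(mean(temps)), likely_sky, confidence


prediction = forecast(5, "west")
print(prediction)
```

The more similar days there are on record, the more the confidence figure means, which is why the number of comparable days matters so much.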

How Does Google Ranking Work?

Chances are, you've been using Google for years without ever questioning how it operates. Thankfully, Google's founders described the method in a research paper written at Stanford University, and it is also covered by a patent, shedding light on its inner workings for everyone in the industry.

In simple terms: "Markov Chains." But hold on, don't tune out just yet. We'll break it down for you. A Markov chain describes a system that moves from state to state, where the probability of the next state depends only on the current one. This theory has been thoroughly studied and encapsulated into algorithms. While this might sound dull, its potential to revolutionize our world for the better is anything but.

Picture a small network of numbered points connected by paths. If a baby were to crawl along all available paths indefinitely, your best chance of finding it at any given moment would be to start the search at point 5. Why, you ask? Because more pathways lead to point 5, and those pathways also connect to the other points.

This implies that the website labeled "5" is the most popular. But popular for what, exactly? Imagine a website with a page covering every imaginable topic in the world. Herein lies the magic: it's popular in an absolute sense, regardless of the request or user query. Hence, the ranking, known as PageRank, isn't tied to a specific user input; it's generic. This is also what makes the model scale: because the ranking doesn't depend on the query, it can be computed once, in advance, and reused for every search.
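The crawling-baby picture can be sketched in a few lines of Python. The five-point network below is a made-up example, and the `pagerank` function is a simplified textbook version of the idea, not Google's production code. The damping value of 0.85 (the chance that the crawler follows a link rather than jumping to a random point) is an assumption, chosen because it is the value usually quoted for PageRank.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate how often a random crawler ends up on each page."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page shares its rank equally among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over everyone.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical network: page -> pages it links to. Four pages link to 5.
LINKS = {
    1: [2, 5],
    2: [5],
    3: [5],
    4: [3, 5],
    5: [1, 4],
}

ranks = pagerank(LINKS)
most_popular = max(ranks, key=ranks.get)
print(most_popular, round(ranks[most_popular], 2))
```

Notice that no search query appears anywhere in this code: the ranks come purely from the link structure.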

Now, how does this relate to user queries? Let's consider someone searching for information about hotels. Upon examining the page content, you'll realize that only one page adequately responds to this query, so that page is the one you end up with.

This means that Google keeps a list of pages, each ranked in advance (as a percentage). When you ask for pages, it retrieves the highest-ranked ones that include the query, looking first at the page title, the metadata, and the text of the links. Basically, it simply looks things up in an index.
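As a rough illustration of "rank in advance, then look up," here is a minimal Python sketch. The `INDEX` contents, the rank values, and the `search` function are hypothetical, and a real search engine uses far more signals than a simple word match.

```python
# Hypothetical index: every page already has a rank computed in advance.
INDEX = {
    "page1": {"rank": 0.19, "title": "City news", "text": "Local news and events."},
    "page2": {"rank": 0.11, "title": "Cheap hotels", "text": "Compare hotels in every city."},
    "page3": {"rank": 0.11, "title": "Recipes", "text": "Easy dinners for busy weeks."},
    "page4": {"rank": 0.19, "title": "Travel blog", "text": "Our favorite hotels and beaches."},
    "page5": {"rank": 0.39, "title": "Encyclopedia", "text": "Articles on science and history."},
}


def search(query, index):
    """Return the pages that mention the query, highest rank first."""
    term = query.lower()
    matches = [
        name
        for name, page in index.items()
        if term in page["title"].lower() or term in page["text"].lower()
    ]
    # The ranks are only read here, never recomputed for the query.
    return sorted(matches, key=lambda name: index[name]["rank"], reverse=True)


results = search("hotels", INDEX)
print(results)
```

Even though "page5" has the highest rank overall, it is left out because it never mentions hotels: the query filters, the precomputed rank orders.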

At that time, it wasn't termed artificial intelligence, as it doesn't "learn" in the traditional sense, but rather indexes and retrieves information. However, the crucial aspect is that the dataset (the internet) was pre-learned.

How Does Amazon Work?

You might wonder, "What does Amazon have to do with all of this?" Let me explain.

Amazon excels at predicting the next product you're likely to request based on not only your past purchases but also those of countless other users before you. Here, we delve into the realm of Artificial Intelligence, employing predictive models and machine learning to discern statistical relationships among a chain of events.

For instance, if you've purchased diapers and a highchair, Amazon may suggest you also buy beer for the evening. Why? Because now, as a new dad, your life has become more challenging, and you might need a break when the little one sleeps. The system doesn't know that story, of course; it only sees that other customers who bought diapers and a highchair often bought beer as well. This epitomizes AI at its finest: the additional product we would typically expect to recommend, such as a baby bath or changing table, is surpassed by a nuance that only emerges from the data. Amazon cannot yet predict what you truly desire, so it settles for this kind of pattern, for now.

Artificial intelligence has indeed been in existence for some time, but in this instance, it's simply suggesting an existing product based on other existing products. It doesn't generate anything new; rather, it offers recommendations based on existing data.
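A minimal sketch of this kind of recommendation, in Python. The baskets and the `recommend` function are invented for illustration, and the scoring rule (count what other shoppers bought alongside your items) is an assumption, not Amazon's actual algorithm.

```python
from collections import Counter

# Hypothetical purchase history of other customers.
BASKETS = [
    {"diapers", "highchair", "beer"},
    {"diapers", "beer"},
    {"diapers", "highchair", "beer", "baby bath"},
    {"highchair", "baby bath"},
    {"laptop", "mouse"},
]


def recommend(owned, baskets, top_n=1):
    """Suggest items that were often bought together with the owned ones."""
    scores = Counter()
    for basket in baskets:
        overlap = len(basket & owned)
        if overlap == 0:
            continue  # this shopper has nothing in common with us
        for item in basket - owned:
            # Baskets that share more items with ours count for more.
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


suggestion = recommend({"diapers", "highchair"}, BASKETS)
print(suggestion)
```

Nothing new is generated here: the suggestion is always an existing product taken from someone else's basket.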

In both preceding examples, these intelligent systems excel at ranking, indexing, or automatically recommending new products—an impressive feat that has kept us satisfied thus far.

Then Came Google and Bing Autocomplete

Another tool you use all the time is autocomplete. It suggests the next word you're likely to add to your query based on past queries from others in similar situations. This mirrors the example of Amazon but in the context of search terms. For instance, if you type "Baby diapers," it might suggest "near me" or "size," drawing from thousands of other users who have typed similar queries. It's not reinventing the wheel; rather, it operates on the principle that if 99% of the population has used the term "near me," there's a high probability it will be relevant for you too.
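Here is a minimal Python sketch of that principle. The list of past queries and the `autocomplete` function are hypothetical, and for simplicity it suggests only the single next word (so "near" rather than "near me").

```python
from collections import Counter

# Hypothetical log of what other people typed.
PAST_QUERIES = [
    "baby diapers near me",
    "baby diapers near me",
    "baby diapers size 2",
    "baby diapers near me",
    "baby diapers size chart",
    "baby bottles",
]


def autocomplete(prefix, past_queries, top_n=2):
    """Suggest the most common next words after the prefix, with their share."""
    prefix = prefix.lower().strip()
    next_words = Counter()
    for query in past_queries:
        query = query.lower().strip()
        if query.startswith(prefix + " "):
            remainder = query[len(prefix):].split()
            if remainder:
                next_words[remainder[0]] += 1
    total = sum(next_words.values())
    return [(word, count / total) for word, count in next_words.most_common(top_n)]


suggestions = autocomplete("Baby diapers", PAST_QUERIES)
print(suggestions)
```

The share attached to each word plays the role of the "99% of the population" in the paragraph above: the higher it is, the safer the suggestion.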

The common thread among all these technologies is learning from the past and calculating the probabilities of your next move without explicitly asking you to make a move.

Then came ChatGPT, offering an autocomplete feature on steroids.

Now, imagine not only checking your searches but delving into all internet pages, books, questions, answers, comments, and reviews. Then, we create an autocomplete feature. If you gather 100 books on the same topic, their content will be quite similar, albeit with slight variations. However, if you add another 500 books from a different context, you'll amass a dataset from which you can determine the probability of the next word a human would write.

For instance, consider the query: "Why are children afraid of going to the dentist?"

If you now read all the documents already written on the topic, you will find that they follow a recurring structure, something like:

[IF noun adjective THEN answer with NOUN VERB]

On top of that, you can measure mathematically how likely each next word is:

"Children" (99%); if "Children", then "can" (98%); if "Children can", then "be" (100%); and so forth. The system establishes its logic from the preceding chain of words, akin to how a web crawler navigates from website to website or how rain was followed by sun over seven years in similar situations.
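The word-by-word probabilities can be sketched with a tiny Python example. The three-sentence corpus and the function names are made up, and the model only looks at the single previous word, whereas systems like ChatGPT use neural networks that take a much longer context into account. The counting idea is the same, though.

```python
from collections import Counter, defaultdict

# Hypothetical corpus standing in for "everything written on the topic".
CORPUS = [
    "children are afraid of the dentist",
    "children are afraid of pain",
    "children are nervous at the dentist",
]


def build_model(sentences):
    """Count, for every word, which words followed it and how often."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            model[current][following] += 1
    return model


def next_word_probs(model, word):
    """Turn the raw counts after `word` into probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {following: count / total for following, count in counts.items()}


model = build_model(CORPUS)
probs = next_word_probs(model, "are")
print(probs)
```

After "are", the model has seen "afraid" twice and "nervous" once, so it gives "afraid" two chances out of three.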

But there is more… Let’s ask again.

What? Not the same result for the same query? How come? Because someone designed it this way. Instead of always picking the most probable next word, the system draws the next word at random according to the probabilities, and a parameter called "temperature" adjusts how much variation is allowed. This approach aims to keep it "interesting" and makes it very unlikely that two people receive identical answers to the same question. It's akin to Google changing their ranking per query (although not the case at the moment).
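A minimal Python sketch of what temperature does. The next-word probabilities and the function names are hypothetical. The rescaling used here (raise each probability to the power 1/temperature, then renormalize) is equivalent to the usual description of dividing the model's raw scores by the temperature before converting them to probabilities.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Sharpen (low temperature) or flatten (high temperature) a distribution."""
    scaled = {word: math.exp(math.log(p) / temperature) for word, p in probs.items()}
    total = sum(scaled.values())
    return {word: value / total for word, value in scaled.items()}


def sample(probs, temperature, rng):
    """Draw one next word at random from the adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    words = list(adjusted)
    weights = [adjusted[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical probabilities for the next word.
NEXT_WORD = {"afraid": 0.7, "nervous": 0.2, "curious": 0.1}

cold = apply_temperature(NEXT_WORD, 0.1)  # almost always "afraid"
hot = apply_temperature(NEXT_WORD, 2.0)   # the alternatives get a real chance

rng = random.Random()
print([sample(NEXT_WORD, 2.0, rng) for _ in range(5)])
```

Run the last line a few times and the five words change from run to run, which is exactly why asking the same question twice gives different answers.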

Ah-ha, no voodoo magic here, just parameters, mathematics, and probability—crafted by humans, for humans to comprehend.

And there is more…