Created at 4pm, Mar 24
t2ruva
Artificial Intelligence
What Is ChatGPT Doing … and Why Does It Work?
Contract ID: Wz7zoWAy8-1v7xwIhJfn9nK25PC68tBYzCAayhUJXuo
File Type: PDF
Entry Count: 145
Embedding Model: jina_embeddings_v2_base_en
Index Type: hnsw

That ChatGPT can automatically generate something that reads even superficially like human-written text is remarkable, and unexpected. But how does it do it? And why does it work? My purpose here is to give a rough outline of what’s going on inside ChatGPT, and then to explore why it is that it can do so well in producing what we might consider to be meaningful text. I should say at the outset that I’m going to focus on the big picture of what’s going on, and while I’ll mention some engineering details, I won’t get deeply into them. (And the essence of what I’ll say applies just as well to other current “large language models” [LLMs] as to ChatGPT.)

So how in more detail does this work for the digit recognition network? We can think of the network as consisting of 11 successive layers, which we might summarize iconically like this (with activation functions shown as separate layers): At the beginning we’re feeding into the first layer actual images, represented by 2D arrays of pixel values. And at the end, from the last layer, we’re getting out an array of 10 values, which we can think of as saying how certain the network is that the image corresponds to each of the digits 0 through 9.
id: a1512a9aa1e2c8de4038b5becdb2a498 - page: 42
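As a rough illustration of the data flow described in the excerpt above (a 2D array of pixel values in, an array of 10 values out), here is a minimal Python sketch. It is not the 11-layer network from the text: the 28x28 image size, the single hidden layer of 32 neurons, the ReLU activation, and the random untrained weights are all assumptions chosen to keep the example short.

```python
import random

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Activation function applied as its own step, mirroring 'separate layers'."""
    return [max(0.0, v) for v in values]

def random_layer(n_in, n_out, rng):
    """Hypothetical untrained weights; a real network would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)

# Assumed input: a 28x28 grayscale image as a 2D array of pixel values.
image = [[rng.random() for _ in range(28)] for _ in range(28)]
pixels = [p for row in image for p in row]  # flatten to 784 values

w1, b1 = random_layer(784, 32, rng)
w2, b2 = random_layer(32, 10, rng)

hidden = relu(dense(pixels, w1, b1))
scores = dense(hidden, w2, b2)  # one value per digit 0 through 9

print(len(pixels), len(hidden), len(scores))  # 784 32 10
```

With untrained weights the 10 output values carry no meaning; the point is only the shape of the computation, from pixels through successive layers to one value per digit.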
Feed in the image and the values of the neurons in that last layer are: In other words, the neural net is by this point incredibly certain that this image is a 4, and to actually get the output 4 we just have to pick out the position of the neuron with the largest value. But what if we look one step earlier? The very last operation in the network is a so-called softmax which tries to force certainty. But before that’s been applied the values of the neurons are: The neuron representing 4 still has the highest numerical value.
id: 74dc6b8e6873771d2c874f36d28c1aef - page: 43
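The softmax step and the "pick the largest value" step from the excerpt above can be sketched in a few lines of Python. The pre-softmax values below are invented for illustration (the figures with the actual values are not part of this listing), so treat the numbers as hypothetical.

```python
import math

def softmax(values):
    """Turn arbitrary scores into positive values that sum to 1."""
    largest = max(values)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pre-softmax values, one per digit 0 through 9.
pre_softmax = [-2.1, -4.0, 0.3, -1.2, 12.5, -0.8, -3.3, 1.1, 0.9, -2.7]

probabilities = softmax(pre_softmax)

# The predicted digit is the position of the neuron with the largest value.
predicted = max(range(len(probabilities)), key=probabilities.__getitem__)

print(predicted)  # 4
```

Softmax preserves the ordering of its inputs, which is why the neuron for 4 has the highest value both before and after it is applied. What changes is the spread: a moderate lead before softmax becomes near-total certainty after it.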
But there’s also information in the values of the other neurons. And we can expect that this list of numbers can in a sense be used to characterize the “essence” of the image, and thus to provide something we can use as an embedding. And so, for example, each of the 4s here has a slightly different “signature” (or “feature embedding”), all very different from the 8s: Here we’re essentially using 10 numbers to characterize our images. But it’s often better to use many more than that. And for example in our digit recognition network we can get an array of 500 numbers by tapping into the preceding layer. And this is probably a reasonable array to use as an “image embedding”. If we want to make an explicit visualization of “image space” for handwritten digits we need to “reduce the dimension”, effectively by projecting the 500-dimensional vector we’ve got into, say, 3D
id: ebb40ee093f0e531572c7606799c75ba - page: 43
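To make the idea of a "signature" concrete, the sketch below compares hypothetical 10-number signatures with cosine similarity, then projects a 500-number vector down to 3D. Both choices are assumptions: the excerpt does not say how signatures are compared or which dimension-reduction method is used, and the random projection here is only a simple stand-in. All vectors are invented for illustration.

```python
import math
import random

def cosine_similarity(a, b):
    """Close to 1.0 when two vectors point the same way; negative when opposed."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def random_projection(vector, out_dim=3, seed=0):
    """Project onto out_dim random directions (a stand-in, not the author's method)."""
    rng = random.Random(seed)  # fixed seed: every vector is projected by the same matrix
    return [
        sum(rng.gauss(0.0, 1.0) * x for x in vector) / math.sqrt(out_dim)
        for _ in range(out_dim)
    ]

# Hypothetical 10-number signatures for two images of a 4 and one image of an 8.
four_a = [-2.1, -4.0, 0.3, -1.2, 12.5, -0.8, -3.3, 1.1, 0.9, -2.7]
four_b = [-1.5, -3.2, -0.4, -2.0, 11.0, 0.2, -2.5, 2.0, 0.1, -1.0]
eight_a = [-1.0, -3.5, 1.2, 2.0, -2.2, 0.5, 0.8, -4.0, 13.0, 1.5]

sim_44 = cosine_similarity(four_a, four_b)
sim_48 = cosine_similarity(four_a, eight_a)
print(sim_44 > sim_48)  # True: the two 4s are closer to each other than to the 8

# Hypothetical 500-number embedding, reduced to a 3D point for plotting.
rng = random.Random(1)
fake_500d = [rng.random() for _ in range(500)]
point_3d = random_projection(fake_500d)
print(len(point_3d))  # 3
```

In practice a method such as principal component analysis is a more common choice for visualization than a random projection, since it keeps the directions along which the data varies most.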
And we can do the same thing much more generally for images if we have a training set that identifies, say, which of 5000 common types of object (cat, dog, chair, ...) each image is of. And in this way we can make an image embedding that’s “anchored” by our identification of common objects, but then “generalizes around that” according to the behavior of the neural net. And the point is that insofar as that
id: c8d2461fd362c0e8955600fbe9c58b70 - page: 44
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "Wz7zoWAy8-1v7xwIhJfn9nK25PC68tBYzCAayhUJXuo", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "Wz7zoWAy8-1v7xwIhJfn9nK25PC68tBYzCAayhUJXuo", "level": 2}'
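For callers who prefer Python to curl, the search request above could be built with the standard library roughly as follows. The URL, headers, and payload fields are copied from the curl example. The function name `build_search_request` and the sample query are hypothetical, and the response format is not documented in this listing, so the sketch stops short of sending the request or parsing a reply.

```python
import json
import urllib.request

SEARCH_URL = "https://search.dria.co/hnsw/search"
CONTRACT_ID = "Wz7zoWAy8-1v7xwIhJfn9nK25PC68tBYzCAayhUJXuo"

def build_search_request(query, api_key, top_n=10, rerank=True):
    """Build (but do not send) a POST request mirroring the curl search example."""
    payload = {
        "rerank": rerank,
        "top_n": top_n,
        "contract_id": CONTRACT_ID,
        "query": query,
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical query; replace the placeholder with a real API key before sending.
request = build_search_request("How does softmax work?", "<YOUR_API_KEY>")
print(request.get_method(), request.full_url)

# To send it (requires network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     body = response.read().decode("utf-8")
```

Building the request separately from sending it makes it easy to inspect the payload first, which helps when a call is rejected.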