What is ChatGPT doing … and why does it work?

Stephen Wolfram excels at explaining complicated topics in comprehensible ways, so if you want to be able to explain ChatGPT to other mortals, here's your guide.


18 June 2024 — A number of years ago I had the opportunity to attend an event at the Edge Foundation, an organization that brings together people working at the edge of a broad range of scientific and technical fields. It was where I first met Stephen Wolfram. If you ever get invited to an Edge dinner or event, just go. You might find yourself sitting next to Richard Dawkins, Jared Diamond, John Tooby, David Deutsch, Nicholas Carr, Alex Pentland, Nassim Nicholas Taleb, Martin Rees, A.C. Grayling, etc. Or all of them.

Last year Wolfram published a piece on ChatGPT that has become the foundational "explainer" for all the chat/LLM systems out there. I wanted to share it again, since many of us are being dragged down that rabbit hole, and the subject is getting more complicated than ever. From the intro:

That ChatGPT can automatically generate something that reads even superficially like human-written text is remarkable, and unexpected. But how does it do it? And why does it work? My purpose here is to give a rough outline of what’s going on inside ChatGPT—and then to explore why it is that it can do so well in producing what we might consider to be meaningful text. I should say at the outset that I’m going to focus on the big picture of what’s going on—and while I’ll mention some engineering details, I won’t get deeply into them. (And the essence of what I’ll say applies just as well to other current “large language models” [LLMs] as to ChatGPT.) 
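To make that "big picture" concrete: the core loop Wolfram goes on to describe is that the model repeatedly asks "given the text so far, what token is likely to come next?", picks one, appends it, and repeats. Here is a deliberately tiny Python sketch of that loop. The `BIGRAMS` table and the function names are hypothetical stand-ins for illustration only; a real LLM replaces the lookup table with a neural network over billions of learned weights, but the generate-one-token-at-a-time structure is the same.

```python
import random

# Toy next-token "model": a hand-built bigram table standing in for the
# learned weights of a real LLM. Purely illustrative, not Wolfram's code.
BIGRAMS = {
    "the": [("cat", 0.5), ("dog", 0.3), ("best", 0.2)],
    "cat": [("sat", 0.6), ("ran", 0.4)],
    "dog": [("barked", 0.7), ("slept", 0.3)],
    "sat": [("quietly", 1.0)],
}

def predict_next_token(token):
    """Sample the next token from the toy probability table."""
    choices = BIGRAMS.get(token)
    if not choices:
        return None  # no known continuation: stop generating
    words, probs = zip(*choices)
    return random.choices(words, weights=probs, k=1)[0]

def generate(prompt_token, max_tokens=10):
    """Generate text one token at a time, feeding each pick back in."""
    tokens = [prompt_token]
    for _ in range(max_tokens):
        nxt = predict_next_token(tokens[-1])
        if nxt is None:
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("the"))  # e.g. "the cat sat quietly"
```

The sampling step (rather than always taking the single most probable token) is why the output varies from run to run, a point Wolfram dwells on in the full piece.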

For the full piece, click here.
