Stephen Wolfram, What Is ChatGPT Doing … and Why Does It Work? (Illinois: Wolfram Media Inc., March 2023)
Reviewed by Shaza Arif
Artificial Intelligence (AI) models such as ChatGPT have become increasingly popular owing to the impressive, human-like responses they generate in an unexpectedly short time. Individuals from all age groups continue to explore such models with great excitement, given their captivating potential in fields ranging from content writing to coding. While it is interesting to use and explore ChatGPT in particular, it is equally important to examine its internal workings to gain a better understanding of this fascinating chatbot. Authored by Stephen Wolfram, the book What Is ChatGPT Doing … and Why Does It Work? is a recent take on the subject. Wolfram is well known for his contributions to Mathematics, Physics and Computer Science and is the creator of Wolfram Alpha (a computational knowledge engine used to perform advanced computational tasks) and Mathematica (a computer algebra software system). In roughly 100 pages, he deconstructs how and why ChatGPT can answer the questions we ask in such a ‘human-like’ writing style.
The book is structured into two parts. The initial part provides an overview of the relevant Machine Learning (ML) concepts behind ChatGPT. The focus is on the explanation of neural nets (p.17), neural net training (p.35), loss functions (p.31), embeddings (p.46) and the process of training models. The subsequent part of the book sheds light on the content-generating capability of ChatGPT. In short, the book describes the working of ChatGPT in three simple stages. In the first stage, the human prompt is broken down into smaller parts (referred to as tokens) and converted into a series of numbers through a process called ‘embedding’ (the step that puts the text into a form the computer can work with). In the next stage, this series of numbers is passed through successive layers of an Artificial Neural Network (ANN), where it is modified to form a new kind of embedding. The last part of this embedding is then assessed by the neural network, which scores thousands of candidate values to predict the next part of the text. In the final stage, the complex structure of the neural network, the exact working of which is still not understood, generates the final output that captures the nuance of human-like thinking in written text. The training data used by ChatGPT in the process comprises abundant text from online sources such as digital books and websites.
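The stages summarised above can be illustrated with a small sketch. This is not Wolfram's code and not how ChatGPT is actually implemented: the six-word vocabulary, the random (untrained) weights, and helper names such as `tokenize`, `embed`, `apply_layers` and `next_token_probabilities` are all assumptions made for illustration. In particular, the toy layers below transform each token on its own, whereas the real network also mixes information across tokens.

```python
import math
import random

# Hypothetical toy vocabulary; a real model uses tens of thousands of sub-word tokens.
VOCAB = ["the", "cat", "sat", "on", "mat", "<unk>"]
EMBED_DIM = 4

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible


def random_matrix(rows, cols):
    """Build a rows x cols matrix of random weights (stand-in for trained values)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


EMBEDDINGS = random_matrix(len(VOCAB), EMBED_DIM)             # one vector per token
LAYER_WEIGHTS = [random_matrix(EMBED_DIM, EMBED_DIM) for _ in range(2)]
OUTPUT_WEIGHTS = random_matrix(EMBED_DIM, len(VOCAB))         # vector -> score per token


def tokenize(prompt):
    """Stage 1a: split the prompt into tokens and map each to a vocabulary index."""
    unknown = VOCAB.index("<unk>")
    return [VOCAB.index(w) if w in VOCAB else unknown for w in prompt.lower().split()]


def embed(token_ids):
    """Stage 1b: replace each token index with its vector of numbers."""
    return [EMBEDDINGS[i] for i in token_ids]


def matvec(vector, matrix):
    """Multiply a length-n vector by an n x m matrix, giving a length-m vector."""
    return [sum(v * row[j] for v, row in zip(vector, matrix))
            for j in range(len(matrix[0]))]


def apply_layers(vectors):
    """Stage 2: pass every vector through each layer (linear map, then tanh)."""
    for weights in LAYER_WEIGHTS:
        vectors = [[math.tanh(x) for x in matvec(vec, weights)] for vec in vectors]
    return vectors


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probabilities(prompt):
    """Stage 3: use the last transformed vector to score every candidate next token."""
    hidden = apply_layers(embed(tokenize(prompt)))
    return softmax(matvec(hidden[-1], OUTPUT_WEIGHTS))


probs = next_token_probabilities("the cat sat on the")
best = VOCAB[probs.index(max(probs))]  # most probable next token under random weights
```

Because the weights are random rather than trained, the token stored in `best` is meaningless; the sketch only shows the shape of the computation, in which text becomes numbers, the numbers are transformed layer by layer, and the result is a probability for each possible next token.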
The author skillfully employs straightforward language, making intricate concepts more approachable for the reader. Significantly, the book highlights the remarkable progress in computer capabilities, particularly in areas such as human-like writing, which were once considered challenging for machines but are now achievable with relative ease. In this context, ChatGPT operates on a model with a well-defined underlying structure and approximately 175 billion parameters (p.13). These parameters are sufficient to generate human-like text. He concludes that the computational process used in ChatGPT is more straightforward than was generally assumed (what he calls a ‘computationally shallower process’).
Wolfram also offers a compelling exploration of the intersection between Technology, Science, and Philosophy, delving into the intricacies and philosophical underpinnings of these fields. He thoughtfully presents the paradox of human achievement in technology and exploration, highlighting our significant strides in innovation and discovery while acknowledging the perpetual mysteries that elude our understanding, particularly in the realm of Artificial Neural Networks (ANNs). The narrative underscores that, despite our advancements, some aspects of technology, especially those as complex as ANNs, may always remain beyond our understanding, such as exactly how an ANN processes text. Similarly, syntax is not fed into the ChatGPT algorithms; it is something that the system has implicitly learnt, and it remains for us to find out the exact process which enables it to do so. At its core, the book poignantly captures the essence of our human experience, reflecting on how our deep-seated awe and curiosity about our own intellect find expression in our fascination with innovations like ChatGPT (dare we call them ‘creations’?). It looks into the profound realisation that our amazement is not merely with the technology itself but with what it reveals about us: our capacity for thought, innovation, and the creation of entities like AI and chatbots that mirror our intellectual prowess. This reflection on ChatGPT becomes a mirror, allowing us to marvel at the complexities and capabilities of the human mind, a journey that continually surprises and inspires us as we see our intellectual reflections in these advanced technologies.
The book culminates with a comparison between ChatGPT and Wolfram Alpha, shedding light on their distinct functionalities and applications. This section is particularly enlightening, offering a clear perspective on how these technologies differ and complement each other. Additionally, the latter part of the book is enhanced by illustrative examples that not only elucidate the inner workings of ChatGPT but also serve as thoughtful interludes, giving readers respite from the more technical discussion.
Despite its compact size, the book presents a rich and intellectually demanding exploration of advanced topics. It is best suited to readers who already possess a foundational understanding of the technical concepts involved, such as neural networks and training data. This is both its strength and a relative weakness: the text assumes familiarity with these principles, which are not extensively elaborated upon but rather discussed at a more advanced level. As a result, multiple readings may be needed for a complete grasp of the content, making it a challenging yet rewarding experience for those seeking deeper insights into these complex subjects.
While the book may not be an engaging read for every audience, it is highly recommended for AI and ML professionals, as well as students aspiring to explore these fields. It provides a deep dive into the intricate world of AI and ML, enhancing understanding, especially of ChatGPT and its broader implications. Additionally, it is useful for policymakers, clarifying the capabilities and limitations of ChatGPT, which is crucial for informed decision-making on future technology regulations. For anyone with a basic grasp of AI and ML looking to expand their knowledge or explore the potential impacts of these technologies, this book is a valuable resource. Its depth and clarity make it a worthwhile read for those intrigued by the evolving AI landscape.
Shaza Arif is a Research Assistant at the Centre for Aerospace & Security Studies (CASS), Islamabad, Pakistan. She can be reached at firstname.lastname@example.org.