Books that have shaped my understanding of ML, AI Engineering, and building production-grade systems. Ranked by impact on my professional journey.
AI Engineering
Chip Huyen • O'Reilly Media • 2024
A guide to building real-world applications with pre-trained large language and multimodal models. It distinguishes AI Engineering from traditional machine learning by focusing on adapting and integrating existing models rather than training them from scratch. I particularly appreciated Huyen's perspective on AI Engineering. Coming from a standard ML background, I found myself lost when I started working in this field. This book gave me a solid foundation and practical insights into the challenges and best practices of AI Engineering, helping me find my place in the fast-paced AI environment.
The StatQuest Illustrated Guide To Machine Learning
Josh Starmer • StatQuest • 2022
This book takes machine learning algorithms, no matter how complicated, and breaks them down into small, bite-sized pieces that are easy to understand. I believe Josh Starmer has done an incredible job of making complex topics accessible and engaging. It's the go-to book for anyone looking to get a solid understanding of machine learning without getting bogged down in technical jargon. Of course, it cannot replace a deep dive into the mathematics behind the algorithms, but it is a great starting point for getting the big picture and understanding how the algorithms work at a high level.
The StatQuest Illustrated Guide to Neural Networks and AI
Josh Starmer • StatQuest • 2025
This book explains neural networks from the basic concepts all the way through the state-of-the-art Transformers that power modern AI tools like ChatGPT, and it also includes hands-on tutorials in PyTorch. I was very optimistic about this book since I loved the previous one by the same author. However, I found this one a bit disappointing. While it covers a lot of ground, I felt it lacked the same clarity, particularly in the PyTorch examples. They tend to be repetitive: they help you build the architectures from scratch, but they don't offer much in the way of deeper insights or variations you could try to get more familiar with the concepts. That said, I still think it's a good resource for getting a high-level understanding of neural networks and how they work, especially if you're new to the field. The premise is similar to the previous book: it is not a deep dive into the mathematics behind neural networks, but rather a high-level overview of the concepts.