The Mechanics of LLMs

Theory, Architecture and Practice for Engineers

Why this book?

As an engineer and Chief Information Officer, the author adopts an architectural and decision-making approach: not just “what a model does”, but “how” and “under what conditions” it integrates into an information system.

Since the emergence of the Transformer architecture, artificial intelligence has undergone a major shift: a language model is no longer a mysterious black box but an engineering architecture that can be understood.

This book dissects LLMs with the same rigour as a complex IT architecture. No magic promises: principles, equations and executable code, with an explicit IT decision-maker’s perspective.


Overview: 15 progressive chapters

Part I: Fundamentals (Chapters 1-3)

Mathematical and architectural foundations

  • Ch. 1 – Introduction to Natural Language Processing

    • Classic NLP vs modern approaches
    • Sequence prediction paradigm
  • Ch. 2 – Text Representation and Sequential Models

    • Tokenisation (BPE, WordPiece, SentencePiece)
    • Embeddings and vector representations
    • RNN, LSTM, GRU models
  • Ch. 3 – Transformer Architecture

    • Self-attention: formula, intuition, calculations (sketched in code after this list)
    • Multi-head attention and its benefits
    • Normalisation (LayerNorm) and residual connections
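
As a taste of Chapter 3, single-head scaled dot-product self-attention fits in a few lines. The sketch below is a minimal illustration (NumPy only, toy dimensions, variable names chosen for clarity); it is not one of the book's scripts:

  import numpy as np

  def softmax(x, axis=-1):
      x = x - x.max(axis=axis, keepdims=True)   # stabilise before exponentiating
      e = np.exp(x)
      return e / e.sum(axis=axis, keepdims=True)

  def self_attention(X, Wq, Wk, Wv):
      """Attend every position of X to every other position."""
      Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project to queries, keys, values
      d_k = Q.shape[-1]
      scores = Q @ K.T / np.sqrt(d_k)           # similarities, scaled by sqrt(d_k)
      weights = softmax(scores, axis=-1)        # each row sums to 1
      return weights @ V                        # weighted mix of value vectors

  rng = np.random.default_rng(0)
  X = rng.normal(size=(4, 8))                   # sequence of 4 tokens, width 8
  Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
  print(self_attention(X, Wq, Wk, Wv).shape)    # (4, 8)

Multi-head attention (also Chapter 3) runs several such heads in parallel on lower-dimensional projections and concatenates the results.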

Part II: Architecture & Optimisation (Chapters 4-8)

Building and training at scale

  • Ch. 4 – Transformer-Derived Models

    • BERT, GPT, T5: architectures and applications
    • Vision Transformers (ViT)
  • Ch. 5 – Architectural Optimisations

    • Linear attention and approximations
    • Key-Value Cache and efficient inference (see the sketch after this list)
  • Ch. 6 – Mixture-of-Experts (MoE) Architecture

    • Routing algorithms
    • Scaling laws with MoE
  • Ch. 7 – LLM Pre-training

    • Pre-training objectives
    • Data, tokenisation, and loss functions
    • Scaling laws: compute vs data vs model size
  • Ch. 8 – Training Optimisations

    • Activation (gradient) checkpointing
    • Distributed training: DDP, FSDP
    • Optimisers: Adam, AdamW, modern variations
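
To make Chapter 5's Key-Value Cache concrete: during autoregressive decoding, each past token's key and value are computed once and stored, so a new token computes only one attention row instead of re-attending from scratch. A minimal sketch under the same illustrative assumptions as above:

  import numpy as np

  rng = np.random.default_rng(0)
  d = 8
  Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
  K_cache = np.empty((0, d))                    # grows by one row per token
  V_cache = np.empty((0, d))

  def decode_step(x):
      """Attend from the newest token x over all cached positions."""
      global K_cache, V_cache
      K_cache = np.vstack([K_cache, x @ Wk])    # this token's key, computed once
      V_cache = np.vstack([V_cache, x @ Wv])    # ... and its value
      scores = (x @ Wq) @ K_cache.T / np.sqrt(d)
      w = np.exp(scores - scores.max())
      w /= w.sum()
      return w @ V_cache                        # one attention row, not a full matrix

  for _ in range(5):                            # five simulated decoding steps
      out = decode_step(rng.normal(size=d))
  print(out.shape)                              # (8,)

The cache trades memory for compute: per-step attention cost drops from quadratic in the sequence length to linear, which is why it is central to efficient inference.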

Part III: Learning & Alignment (Chapters 9-12)

From raw model to useful assistant

  • Ch. 9 – Supervised Fine-Tuning (SFT)

    • Instruction tuning
    • LoRA and QLoRA: parameter reduction (LoRA sketched after this list)
    • Resource-efficient fine-tuning
  • Ch. 10 – Alignment with Human Preferences

    • RLHF (Reinforcement Learning from Human Feedback)
    • Reward models and their challenges
    • Implicit vs explicit preferences
  • Ch. 11 – Generation and Inference Strategies

    • Sampling, temperature, top-k and top-p (see the sketch after this list)
    • Beam search and guided generation
    • Logits processors and constraints
  • Ch. 12 – Reasoning Models

    • Chain-of-Thought (CoT)
    • Tree-of-Thought (ToT)
    • Self-consistency and majority voting
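
Chapter 9's LoRA idea in one picture: freeze the pre-trained weight W and learn a low-rank update B·A with rank r far smaller than the layer width, so the trainable parameter count collapses. A minimal NumPy sketch with illustrative shapes; it is not the book's 08_lora_finetuning_example.py:

  import numpy as np

  rng = np.random.default_rng(0)
  d_in, d_out, r = 1024, 1024, 8

  W = rng.normal(size=(d_in, d_out))            # frozen pre-trained weight
  A = rng.normal(size=(d_in, r)) * 0.01         # trainable low-rank factor
  B = np.zeros((r, d_out))                      # trainable, zero-init: update starts at 0

  def lora_forward(x, alpha=16):
      return x @ W + (alpha / r) * (x @ A @ B)  # frozen path + scaled low-rank path

  x = rng.normal(size=(1, d_in))
  print(lora_forward(x).shape)                  # (1, 1024)
  full, lora = W.size, A.size + B.size
  print(f"trainable: {lora:,} vs {full:,} ({100 * lora / full:.2f}%)")

Here only about 1.6% of the layer's parameters are trained, which is the whole point of parameter-efficient fine-tuning.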
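
Chapter 11's decoding knobs compose naturally: temperature rescales the logits, top-k keeps only the k most likely tokens, and top-p (nucleus) keeps the smallest set whose probability mass reaches p. A minimal illustrative sketch, distinct from the book's 03_temperature_softmax.py:

  import numpy as np

  def sample(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
      """Sample one token id after temperature / top-k / top-p filtering."""
      rng = rng or np.random.default_rng()
      logits = np.asarray(logits, dtype=float) / temperature   # T<1 sharpens, T>1 flattens
      if top_k is not None:                     # drop everything below the k-th logit
          cutoff = np.sort(logits)[-top_k]
          logits = np.where(logits < cutoff, -np.inf, logits)
      probs = np.exp(logits - logits.max())
      probs /= probs.sum()
      if top_p is not None:                     # nucleus: smallest set with mass >= p
          order = np.argsort(probs)[::-1]
          below = np.cumsum(probs[order]) - probs[order] < top_p
          mask = np.zeros_like(probs, dtype=bool)
          mask[order[below]] = True
          probs = np.where(mask, probs, 0.0)
          probs /= probs.sum()
      return rng.choice(len(probs), p=probs)

  print(sample([2.0, 1.0, 0.5, -1.0], temperature=0.7, top_k=3, top_p=0.9))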

Part IV: Agentic Ecosystem (Chapters 13-15)

Deployment and autonomous use

  • Ch. 13 – Augmented Systems and RAG

    • Retrieval-Augmented Generation (retrieval sketched after this list)
    • Vector databases and similarity search
    • Chunking strategies and indexing
  • Ch. 14 – Standard Agentic Protocols (MCP)

    • Model Context Protocol
    • Tool calling and function definitions
    • Agent loops and orchestration
  • Ch. 15 – Critical Evaluation of Agentic Flows

    • Quality metrics (BLEU, ROUGE, BERTScore)
    • Evaluation frameworks
    • Limitations and hallucinations
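
The retrieval half of Chapter 13's RAG pipeline is nearest-neighbour search in embedding space. In the minimal sketch below, a deterministic toy embedding stands in for a real embedding model and cosine similarity ranks the chunks; it is illustrative only, while the book's 04_rag_minimal.py covers the full pipeline:

  import zlib
  import numpy as np

  def embed(text, dim=64):
      """Toy embedding: seeded random unit vector; a real system calls a model."""
      rng = np.random.default_rng(zlib.crc32(text.encode()))
      v = rng.normal(size=dim)
      return v / np.linalg.norm(v)

  docs = [
      "Transformers rely on self-attention.",
      "LoRA adapts models with low-rank updates.",
      "The KV cache speeds up autoregressive decoding.",
  ]
  index = np.stack([embed(d) for d in docs])    # one unit vector per chunk

  def retrieve(query, k=2):
      scores = index @ embed(query)             # cosine similarity (unit vectors)
      top = np.argsort(scores)[::-1][:k]
      return [(docs[i], float(scores[i])) for i in top]

  for doc, score in retrieve("How does attention work?"):
      print(f"{score:+.3f}  {doc}")

In production the index lives in a vector database, and the chunking strategy (also Chapter 13) strongly influences retrieval quality.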

Included resources

9 Executable Python Scripts

All theoretical concepts are illustrated with working code:

  • 01_tokenization_embeddings.py — Tokenisation and vectors
  • 02_multihead_attention.py — Self-attention in detail
  • 03_temperature_softmax.py — Sampling and temperature
  • 04_rag_minimal.py — Minimal RAG pipeline
  • 05_pass_at_k_evaluation.py — Model evaluation with pass@k (sketched after this list)
  • 06_react_agent_bonus.py — ReAct agents
  • 07_llamaindex_rag_advanced.py — Advanced RAG
  • 08_lora_finetuning_example.py — LoRA and fine-tuning
  • 09_mini_assistant_complet.py — Complete integrated mini-assistant

All scripts:

  • ✅ Executable without any external API (demo/simulation mode)
  • ✅ Documented and explained line by line
  • ✅ Compatible with Python 3.9+
  • ✅ Freely available on GitHub
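
As an example of what evaluation means in practice: the pass@k metric behind 05_pass_at_k_evaluation.py has a standard unbiased estimator, popularised by the HumanEval benchmark. Given n generated samples of which c pass the tests, it is the probability that at least one of k randomly drawn samples passes. A short sketch:

  from math import comb

  def pass_at_k(n, c, k):
      """Unbiased pass@k: 1 - C(n-c, k) / C(n, k)."""
      if n - c < k:                 # fewer failures than draws: a pass is guaranteed
          return 1.0
      return 1.0 - comb(n - c, k) / comb(n, k)

  print(pass_at_k(n=20, c=5, k=1))  # 0.25, the raw pass rate
  print(pass_at_k(n=20, c=5, k=5))  # ≈ 0.81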

Book characteristics

  • Author: Mustapha Alouani
  • Pages: 153
  • Chapters: 15 technical chapters
  • Format: 6 × 9 inches
  • Language: French (English edition available)
  • Audience: Engineers, advanced students, technical leaders
  • Prerequisites: Probability, linear algebra, Python practice
  • Level: Intermediate → advanced
  • Status: ✅ Published (2025)

Who is this book for?

  • Engineers wanting to understand LLMs beyond an API
  • Students in computer science, ML and AI seeking a rigorous resource
  • Data scientists transitioning to LLMs
  • Technical leaders needing to integrate LLMs
  • Researchers in NLP and ML looking for a reference
  • Developers curious about what happens “under the hood”

Not recommended for: readers just looking to “use ChatGPT”.


What the reader gains

After reading this book, the reader will be able to:

  • Explain how a Transformer really works
  • Analyse trade-offs between quality and computational cost
  • Justify architectural choices (number of layers, heads, hidden size)
  • Evaluate an AI system critically
  • Implement key concepts in code
  • Argue in a structured way in technical discussions
  • Make informed decisions about using LLMs in an information system

How to get the book

Available via the Kindle ecosystem (e-reader, tablet or computer) or in paperback format.


Author’s note

This book was born from a recurring need observed among technical teams and decision-makers: to understand what is really happening behind a language model API, in order to make informed decisions. It is designed to be read with pen in hand, taking time to follow the reasoning, formulas and code.

It is an engineering book, oriented towards decision-making. It is aimed at those who build systems as well as those who decide on their use.

Mustapha Alouani


English edition available on Amazon (paperback and Kindle).