New Book Releases For Large Language Models

Date: 2024-04-27

In March 2023, we published a comprehensive survey paper titled “A Survey of Large Language Models”. This survey has been updated to its 13th version, encompassing 83 pages of main content and covering over 900 references. The aim of this survey is to systematically organize the research advancements and core technologies related to large language models, discussing a significant number of relevant works. Since the preprint of this survey was released, it has garnered considerable attention from readers.

Following the release of the English survey, several readers asked whether a Chinese version was available. In response, we published a Chinese translation of the survey in August 2023. To provide further Chinese-language study materials on large language models, we began compiling a Chinese book at the end of December 2023 and have recently completed the initial draft. Unlike the English survey, the Chinese book is aimed at readers new to large language model technology, so we have substantially updated and reorganized the content to present a complete framework and roadmap of the field. The book is intended for advanced undergraduate and early graduate students with a background in deep learning and can serve as an introductory reference. For more information about the Chinese book project, please visit the following link:

Chinese Book Download Links

Download Link 1:

Download Link 2:

Organization of the Book Chapters

Part One: Background and Fundamentals

Chapter 1: Introduction

- Development history of large language models

- Overview of key technologies

Chapter 2: Fundamentals

- Scaling laws

- Development history of the GPT series models
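The scaling laws covered in Chapter 2 are commonly written as power laws relating loss to model size and data size. As a hedged illustration (the Chinchilla-style parametric form from the literature, not content taken from the book itself):

```latex
% Loss as a power law in parameter count N and training tokens D:
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
% E is the irreducible loss; A, B, \alpha, \beta are empirical constants
% fitted per training setup (the Chinchilla paper reports roughly
% \alpha \approx 0.34 and \beta \approx 0.28).
```

Under a fixed compute budget, minimizing this form over $N$ and $D$ yields the familiar guidance that model size and data size should be scaled up together rather than model size alone.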

Chapter 3: Large Language Model Resources

- Open-source models

- Data

- Code libraries

Part Two: Pre-training

Chapter 4: Data Preparation

- Data collection

- Data cleaning

- Data balancing

- Curriculum learning methods

Chapter 5: Model Architecture

- Transformer structure

- Mainstream architectures of large language models

- Detailed improvements

Chapter 6: Model Pre-training

- Pre-training tasks

- Optimization parameter settings

- Parallel training methods

Part Three: Fine-tuning and Alignment

Chapter 7: Instruction Fine-tuning

- Instruction data collection and synthesis methods

- Instruction fine-tuning strategies and effects

Chapter 8: Human Alignment

- 3H criteria (helpful, honest, harmless)

- RLHF algorithms

- Non-RL algorithms

Part Four: Using Large Language Models

Chapter 9: Decoding and Deployment

- Decoding generation algorithms

- Decoding acceleration algorithms

- Model compression algorithms
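The decoding generation algorithms listed in Chapter 9 can be illustrated with a minimal sketch of temperature and top-k sampling over raw logits. This is a standalone, standard-library-only illustration; the function name and signature are our own for this sketch, not from the book or any particular library:

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample a token index from raw logits with temperature and optional top-k.

    A minimal illustration of common decoding strategies; real decoders
    operate on tensors and typically add top-p, repetition penalties, etc.
    """
    rng = rng or random
    # Sort (index, logit) pairs by logit, highest first.
    indexed = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)
    # Keep only the top-k highest logits if requested.
    if top_k is not None:
        indexed = indexed[:top_k]
    # Softmax with temperature, stabilized by subtracting the max.
    scaled = [logit / temperature for _, logit in indexed]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the truncated distribution.
    r = rng.random()
    cum = 0.0
    for (idx, _), p in zip(indexed, probs):
        cum += p
        if r <= cum:
            return idx
    return indexed[-1][0]  # guard against floating-point rounding
```

With `top_k=1` this reduces to greedy decoding (always the argmax token); raising the temperature flattens the distribution and increases output diversity, while lowering it concentrates probability on the highest-scoring tokens.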

Chapter 10: Prompt Learning

- Basic prompting methods

- In-context learning

- Chain of thought

Chapter 11: Planning and Agents

- Complex planning methods

- Building agents

Part Five: Evaluation and Applications

Chapter 12: Evaluation

- Evaluation metrics and methods

- Basic and advanced capability evaluation

- Evaluation systems

Chapter 13: Applications

- Overview of applications in research and professional fields

During the writing of this book, we received extensive feedback and numerous revisions from many colleagues, for which we express our heartfelt gratitude. We hope readers will continue to support and follow our Chinese book on large language models; your support and feedback are our greatest motivation. This first edition is only a starting point, and we plan to continuously update and improve the content online. We especially welcome valuable criticism and suggestions, and we will acknowledge significant contributors on our website. If you have any comments, feedback, or suggestions, please use the GitHub Issues page or contact us via email.

To better organize and disseminate the latest advancements and technical frameworks of large model technology, we offer the following supplementary resources for readers to reference and use while reading this book.

LLMBox: We have developed a comprehensive code toolkit called LLMBox for the development and implementation of large language models, built around a standardized training process and a comprehensive model evaluation framework. LLMBox aims to serve as a one-stop pipeline for training and utilizing large language models, integrating numerous practical features to ensure high flexibility and efficiency during both training and inference. Toolkit link:

YuLan: The YuLan series models are chat-capable large language models developed collaboratively by the faculty and students of the Gaoling School of Artificial Intelligence at Renmin University of China. The name "YuLan" is derived from the university's emblematic flower. The latest version was pre-trained entirely from scratch and then supervised fine-tuned with curriculum learning techniques on bilingual Chinese and English data, including high-quality instructions and human preference data. Model link: