Nano Course: Building Large Language Models for Code
In this free Nano GenAI course on Building Large Language Models for Code, you will:
- Learn how to train LLMs for code from scratch, covering training data curation, data preparation, model architecture, training, and evaluation frameworks.
- Explore each step in depth, including the algorithms and techniques used to create StarCoder, a 15B-parameter code generation model trained on 80+ programming languages.
- Learn the best practices for training your own StarCoder model on your data.
Key Takeaways from the “Nano Course: Building Large Language Models for Code”
- Learn how to train LLMs for code from scratch
- Take a deep dive into the StarCoder journey
- Understand the algorithms and techniques used at each step of StarCoder's development
- Learn best practices for training your own StarCoder model on your data
- Explore the model architecture, training, and evaluation frameworks for code LLMs
Course curriculum
1. Building Large Language Models for Code
- Introduction
- Agenda
- BigCode Community
- Training LLMs for Code from Scratch: Training Data Curation
- Training Data Formatting and Preprocessing
- Model Architecture
- BigCode Ecosystem
- Training Frameworks
- Model Evaluation
- Tools and Descendants of StarCoder
Instructor
Loubna Ben Allal, ML Engineer at Hugging Face
