

SMOL-GPT
A minimal PyTorch implementation for training your own small LLM from scratch. Designed for educational purposes and simplicity, featuring efficient training, flash attention, and modern sampling techniques.
Cost / License
- Free
- Open Source
Platforms
- Python
- Windows
- Mac
- Linux
- BSD
- PyTorch
Features
- Python-based
What is SMOL-GPT?
A minimal PyTorch implementation for training your own small LLM from scratch. Designed for educational purposes and simplicity, featuring efficient training, flash attention, and modern sampling techniques.
Features:
- Minimal Codebase: Pure PyTorch implementation with no abstraction overhead
- Modern Architecture: GPT model with:
  - Flash Attention (when available)
  - RMSNorm and SwiGLU
  - Efficient top-k/p/min-p sampling
  - Rotary position embeddings (RoPE, optional)
- Training Features:
  - Mixed precision (bfloat16/float16)
  - Gradient accumulation
  - Learning rate decay with warmup
  - Weight decay & gradient clipping
- Dataset Support: Built-in TinyStories dataset processing
- Custom Tokenizer: SentencePiece tokenizer training integration
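
To make "learning rate decay with warmup" from the list above concrete, here is a minimal sketch of one common schedule: a linear warmup followed by cosine decay. The listing does not say which decay curve SMOL-GPT uses, so the cosine shape, the function name `lr_at_step`, and all default values are assumptions chosen for illustration only.

```python
import math


def lr_at_step(step, max_lr=6e-4, min_lr=6e-5, warmup_steps=100, decay_steps=1000):
    """Linear warmup to max_lr, then cosine decay to min_lr.

    All names and default values are hypothetical, not taken from SMOL-GPT.
    """
    if step < warmup_steps:
        # Linear ramp; (step + 1) avoids a zero learning rate on the first step.
        return max_lr * (step + 1) / warmup_steps
    if step >= decay_steps:
        # Past the decay horizon, hold the floor value.
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - warmup_steps) / (decay_steps - warmup_steps)
    # Cosine coefficient falls from 1 to 0 as progress goes from 0 to 1.
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (max_lr - min_lr)
```

In a training loop, the returned value would typically be written into each optimizer parameter group's `lr` entry before the optimizer step.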

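The top-k/p/min-p sampling mentioned in the feature list can be illustrated with a small, dependency-free sketch. It works on a plain Python list of probabilities rather than on PyTorch logits, and it is not SMOL-GPT's own code; the function names `filter_probs` and `sample` are hypothetical. It assumes `top_k >= 1` and `0 <= min_p <= 1`, so the most probable token always survives filtering.

```python
import random


def filter_probs(probs, top_k=None, top_p=None, min_p=None):
    """Zero out tokens rejected by top-k, top-p (nucleus), or min-p, then renormalize."""
    # Token ids sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)
    if top_k is not None:
        # Keep only the k most probable tokens.
        keep &= set(order[:top_k])
    if top_p is not None:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        nucleus, total = [], 0.0
        for i in order:
            nucleus.append(i)
            total += probs[i]
            if total >= top_p:
                break
        keep &= set(nucleus)
    if min_p is not None:
        # Keep tokens at least min_p times as likely as the most probable token.
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    norm = sum(filtered)
    return [p / norm for p in filtered]


def sample(probs, rng=random, **filters):
    """Draw one token id from the filtered distribution."""
    filtered = filter_probs(probs, **filters)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

For example, `sample([0.5, 0.3, 0.15, 0.05], top_k=2)` can only return token 0 or 1, because the other two tokens are given zero weight before the draw.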




