nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training.


Cost / License

  • Free
  • Open Source

Platforms

  • Python
  • Mac
  • Windows
  • Linux
  • BSD

nanoGPT information

  • Developed by

    Andrej Karpathy
  • Licensing

    Open Source (MIT) and free.
  • Written in

    Python
  • Alternatives

    13 alternatives listed
  • Supported Languages

    • English

AlternativeTo Categories

  • AI Tools & Services
  • Development

GitHub repository

  •  51,120 Stars
  •  8,556 Forks
  •  325 Open Issues
View on GitHub
nanoGPT was added to AlternativeTo by Paul.

What is nanoGPT?

The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: train.py is a ~300-line boilerplate training loop and model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
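For a concrete sense of how small that surface area is, here is a minimal sketch of loading one of OpenAI's GPT-2 checkpoints through model.py and sampling a few tokens. It assumes a local clone of nanoGPT with torch, transformers, and tiktoken installed, and that GPT.from_pretrained and generate carry the signatures found in recent versions of the repo:

    # Hedged sketch: load OpenAI's GPT-2 (124M) weights through nanoGPT's
    # model.py and sample from the result. Assumes a local clone of nanoGPT
    # with torch, transformers, and tiktoken installed; GPT.from_pretrained
    # and generate are the entry points recent versions of model.py expose.
    import torch
    import tiktoken
    from model import GPT  # nanoGPT's ~300-line model definition

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = GPT.from_pretrained('gpt2', dict(dropout=0.0))  # 124M checkpoint
    model.eval()
    model.to(device)

    # Start from GPT-2's <|endoftext|> token (id 50256) and generate 20 tokens.
    idx = torch.tensor([[50256]], dtype=torch.long, device=device)
    with torch.no_grad():
        out = model.generate(idx, max_new_tokens=20, temperature=0.8, top_k=200)

    enc = tiktoken.get_encoding('gpt2')
    print(enc.decode(out[0].tolist()))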

Because the code is so simple, it is very easy to hack to your needs, train new models from scratch, or finetune pretrained checkpoints (e.g. the biggest one currently available as a starting point is the GPT-2 1.5B model from OpenAI).
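As an illustration of what such a finetuning run might look like, the sketch below shows a nanoGPT-style config file. nanoGPT configs are plain Python files whose assignments override the defaults declared at the top of train.py; the option names (init_from, dataset, learning_rate, and so on) match recent versions of the repo, while my_dataset and out-finetune are placeholder names:

    # config/finetune_example.py -- a hedged sketch of a finetuning config.
    # nanoGPT config files are plain Python; each assignment overrides a
    # default declared at the top of train.py. Option names match recent
    # versions of the repo; 'my_dataset' and 'out-finetune' are placeholders.

    out_dir = 'out-finetune'
    eval_interval = 250
    eval_iters = 40

    init_from = 'gpt2-xl'    # start from OpenAI's largest released GPT-2 checkpoint
    dataset = 'my_dataset'   # expects prepared data/my_dataset/{train,val}.bin files

    # Finetuning typically wants a small constant learning rate and few steps.
    batch_size = 1
    gradient_accumulation_steps = 32
    max_iters = 2000
    learning_rate = 3e-5
    decay_lr = False

    # Run with:  python train.py config/finetune_example.py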

Official Links