GPT2-Chinese

Chinese version of GPT2 training code, using a BERT tokenizer or a BPE tokenizer. It is based on the extremely awesome Transformers repository from the HuggingFace team. Can write poems, news, and novels, or train general language models. Supports char-level, word-level, and BPE-level tokenization, as well as large training corpora.
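As a rough illustration of the core idea (a GPT-2 language model paired with a BERT-style Chinese tokenizer via HuggingFace Transformers), a minimal generation sketch might look like the following. The checkpoint path for the GPT-2 model is a placeholder assumption, not a file shipped by this repository:

```python
# Minimal sketch: pair a GPT-2 LM head model with a BERT tokenizer for Chinese.
# "bert-base-chinese" is a standard char-level vocabulary on the HuggingFace Hub;
# the model checkpoint path below is hypothetical and must point to your own
# trained weights.
from transformers import BertTokenizerFast, GPT2LMHeadModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = GPT2LMHeadModel.from_pretrained("path/to/your/checkpoint")  # hypothetical

prompt = "今天天气"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation from the model.
outputs = model.generate(
    inputs["input_ids"],
    max_length=50,
    do_sample=True,
    top_k=40,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the tokenizer operates on the BERT vocabulary, generation here is effectively character-level for Chinese text; a BPE tokenizer could be swapped in for subword-level modeling.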