Corus
Corus: a collection of Russian-language NLP datasets, listed on the site since December 22, 2022 06:31
Links to publicly available Russian corpora, plus code for loading and parsing them. More than 20 datasets, over 350 GB of text.
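
To illustrate what "code for loading and parsing" typically means for corpora of this size, here is a minimal sketch of a lazy loader that streams records from a gzip-compressed CSV file. It is not Corus's own API: the `Record` class, the `load_records` function, and the `title`/`text` column names are hypothetical and chosen only for illustration.

```python
import csv
import gzip
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Record:
    # Hypothetical record layout; real corpora have their own fields.
    title: str
    text: str


def load_records(path: str) -> Iterator[Record]:
    """Lazily yield records from a gzip-compressed CSV file.

    Assumes a header row with 'title' and 'text' columns (an assumption
    made for this sketch, not a format defined by any specific corpus).
    """
    # Text mode decompresses on the fly, so the file is never fully in memory.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            yield Record(title=row["title"], text=row["text"])
```

Because the function is a generator, a caller can take the first record with `next(load_records(path))` or iterate over the whole file without holding it in memory, which matters when a collection runs to hundreds of gigabytes.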