Posted May 04, 2023 16:11
This is a custom project from the Stanford CS224N Winter 2023 class. The goal of this project is to answer the question: will alignment from human feedback also help small language models such as GPT-2? The answer is yes! With reinforcement learning from human feedback (RLHF), evaluation shows that ChatGPT prefers the aligned GPT-2 outputs over the vanilla GPT-2 outputs 96% of the time, and over the supervised fine-tuning baseline 88% of the time. Please see the technical report for more details.