CLD3

на сайте с 12 сентября 2022, 18:06
CLD3 is a neural network model for language identification. This package contains the inference code and a trained model. The inference code extracts character ngrams from the input text and computes the fraction of times each of them appears. For example, as shown in the figure below, if the input text is "banana", then one of the extracted trigrams is "ana" and the corresponding fraction is 2/4. The ngrams are hashed down to an id within a small range, and each id is represented by a dense...