Individual results
View in-depth performance of a single language model on a single test suite.
Region-by-region surprisal
Sample item for Center Embedding
The first item of the test suite is shown below for quick reference. Please visit the page for Center Embedding to see the full list of items.
Item | Condition | intro | np_1 | that | det_2 | np_2 | verb1 | verb2 |
---|---|---|---|---|---|---|---|---|
1 | plaus | The | painting | that | the | artist | painted | deteriorated |
1 | implaus | The | painting | that | the | artist | deteriorated | painted |
Prediction performance for TinyLSTM on Center Embedding
Accuracy | Prediction formula | Description |
---|---|---|
82.14% | ( (634,plaus/6,verb1) + (634,plaus/7,verb2) ) < ( (633,implaus/6,verb1) + (633,implaus/7,verb2) ) | The summed surprisal at verb1 and verb2 should be lower in the match (plaus) condition than in the mismatch (implaus) condition. In the mismatch condition, the verb plausibility matches a first-in-first-out ordering of the nouns. |
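
As a rough illustration of how a prediction like the one above could be scored, the sketch below sums the surprisal at verb1 and verb2 for each condition and reports the fraction of items where the plaus sum is lower. It assumes per-region surprisals are already available as plain dictionaries; the names `prediction_holds`, `accuracy`, and `toy_items` are hypothetical, and the numbers are made up for the example rather than taken from TinyLSTM.

```python
from typing import Dict, List

# One item: condition name -> region name -> surprisal (assumed layout).
ItemSurprisals = Dict[str, Dict[str, float]]

# Regions that enter the prediction formula.
CRITICAL_REGIONS = ("verb1", "verb2")


def prediction_holds(item: ItemSurprisals) -> bool:
    """True when summed verb surprisal is lower for plaus than for implaus."""
    plaus = sum(item["plaus"][r] for r in CRITICAL_REGIONS)
    implaus = sum(item["implaus"][r] for r in CRITICAL_REGIONS)
    return plaus < implaus


def accuracy(items: List[ItemSurprisals]) -> float:
    """Fraction of items for which the prediction holds."""
    if not items:
        raise ValueError("need at least one item")
    return sum(prediction_holds(i) for i in items) / len(items)


# Made-up surprisal values, for illustration only (not model output).
toy_items = [
    {"plaus": {"verb1": 9.0, "verb2": 11.0},
     "implaus": {"verb1": 13.0, "verb2": 12.0}},
    {"plaus": {"verb1": 10.0, "verb2": 14.0},
     "implaus": {"verb1": 11.0, "verb2": 12.0}},
]

print(f"{accuracy(toy_items):.2%}")  # prints 50.00%
```

With these toy values the prediction holds for the first item (20.0 < 25.0) and fails for the second (24.0 is not below 23.0), so the reported accuracy is 50.00%.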