Individual results
View in-depth performance of a single language model on a single test suite.
Region-by-region surprisal
Sample item for Reflexive Number Agreement (masculine; with subject relative clause)
The first item of the test suite is shown below for quick reference. Please visit the page for Reflexive Number Agreement (masculine; with subject relative clause) to see the full list of items.
Item | Condition | intro | np_subject | that | embed_vp | the | embed_np | matrix_v | reflexive |
---|---|---|---|---|---|---|---|---|---|
1 | match_sing | The | author | that | liked | the | senators | hurt | himself |
1 | mismatch_sing | The | author | that | liked | the | senators | hurt | themselves |
1 | match_plural | The | authors | that | liked | the | senator | hurt | themselves |
1 | mismatch_plural | The | authors | that | liked | the | senator | hurt | himself |
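
For readers who want to work with the item programmatically, the table above can be read as a mapping from condition name to an ordered list of region contents. The following is a minimal sketch in Python; the names `REGIONS`, `ITEM_1`, `sentence`, and `region` are hypothetical and are not part of any test suite API.

```python
# Region names, in the order they appear in the table above.
REGIONS = ["intro", "np_subject", "that", "embed_vp",
           "the", "embed_np", "matrix_v", "reflexive"]

# Item 1, keyed by condition name (hypothetical in-memory layout).
ITEM_1 = {
    "match_sing":      ["The", "author", "that", "liked", "the", "senators", "hurt", "himself"],
    "mismatch_sing":   ["The", "author", "that", "liked", "the", "senators", "hurt", "themselves"],
    "match_plural":    ["The", "authors", "that", "liked", "the", "senator", "hurt", "themselves"],
    "mismatch_plural": ["The", "authors", "that", "liked", "the", "senator", "hurt", "himself"],
}


def sentence(condition: str) -> str:
    """Join the regions of one condition into a plain sentence."""
    return " ".join(ITEM_1[condition])


def region(condition: str, name: str) -> str:
    """Return the content of a named region for one condition."""
    return ITEM_1[condition][REGIONS.index(name)]


print(sentence("match_sing"))
print(region("mismatch_plural", "reflexive"))
```

Within each singular or plural pair, the two conditions differ only in the `reflexive` region, so any difference in model behavior between them can be attributed to the reflexive pronoun.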
Prediction performance for GPT-2 on Reflexive Number Agreement (masculine; with subject relative clause)
Accuracy | Prediction formula | Description |
---|---|---|
68.42% | (598,match_sing/8,reflexive) < (600,mismatch_sing/8,reflexive) | No description provided. |
78.95% | (599,match_plural/8,reflexive) < (597,mismatch_plural/8,reflexive) | No description provided. |
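
Each prediction in the table above appears to compare the surprisal at region 8 (`reflexive`) between a matching and a mismatching condition, with accuracy being the share of items for which the inequality holds. The sketch below illustrates that reading in Python. It is an assumption about how the score is computed, not the site's own code; the function `prediction_accuracy` and the surprisal values are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: one dict per item, mapping condition name to the
# surprisal (in bits) measured at the `reflexive` region.
Item = Dict[str, float]


def prediction_accuracy(items: List[Item], lower: str, higher: str) -> float:
    """Fraction of items whose surprisal in `lower` is strictly below `higher`."""
    if not items:
        raise ValueError("need at least one item")
    hits = sum(1 for item in items if item[lower] < item[higher])
    return hits / len(items)


# Made-up surprisal values for illustration only (not GPT-2 output).
toy_items = [
    {"match_sing": 2.1, "mismatch_sing": 5.4},  # prediction holds
    {"match_sing": 4.0, "mismatch_sing": 3.2},  # prediction fails
    {"match_sing": 1.5, "mismatch_sing": 1.9},  # prediction holds
    {"match_sing": 3.3, "mismatch_sing": 6.0},  # prediction holds
]

acc = prediction_accuracy(toy_items, "match_sing", "mismatch_sing")
print(f"{acc:.2%}")  # 75.00%
```

Under this reading, the reported 68.42% and 78.95% would be consistent with 13 and 15 correct items out of 19, although the number of items in the suite is not stated in this section.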