Individual results
View in-depth performance of a single language model on a single test suite.
Region-by-region surprisal
Sample item for Reflexive Number Agreement (feminine; with subject relative clause)
The first item of the test suite is shown below for quick reference. Please visit the page for Reflexive Number Agreement (feminine; with subject relative clause) to see the full list of items.
Item | Condition | intro | np_subject | that | embed_vp | the | embed_np | matrix_v | reflexive |
---|---|---|---|---|---|---|---|---|---|
1 | match_sing | The | author | that | liked | the | senators | hurt | herself |
1 | mismatch_sing | The | author | that | liked | the | senators | hurt | themselves |
1 | match_plural | The | authors | that | liked | the | senator | hurt | themselves |
1 | mismatch_plural | The | authors | that | liked | the | senator | hurt | herself |
Prediction performance for GPT-2 on Reflexive Number Agreement (feminine; with subject relative clause)
Accuracy | Prediction formula | Description |
---|---|---|
15.79% | (664,match_sing/8,reflexive) < (666,mismatch_sing/8,reflexive) | No description provided. |
100.00% | (665,match_plural/8,reflexive) < (663,mismatch_plural/8,reflexive) | No description provided. |
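
Each prediction in the table above appears to compare surprisal at region 8 (`reflexive`) between a matching and a mismatching condition, with accuracy being the share of items for which the inequality holds. The following is a minimal sketch under that reading only. The function name `prediction_accuracy`, the data layout, and every surprisal value are hypothetical and not taken from this page; the leading numbers in each formula (such as 664) are ignored here because they look like internal identifiers.

```python
from typing import Dict, Tuple

# Hypothetical layout: (item_number, condition) -> surprisal in bits at the
# `reflexive` region (region 8). All values below are invented for illustration.
Surprisals = Dict[Tuple[int, str], float]


def prediction_accuracy(
    surprisals: Surprisals, expected_lower: str, expected_higher: str
) -> float:
    """Fraction of items where surprisal in `expected_lower` is strictly
    below surprisal in `expected_higher` at the critical region."""
    items = sorted({item for item, _ in surprisals})
    hits = sum(
        surprisals[(item, expected_lower)] < surprisals[(item, expected_higher)]
        for item in items
    )
    return hits / len(items)


# Four made-up items, each with all four conditions.
example: Surprisals = {
    (1, "match_sing"): 9.1, (1, "mismatch_sing"): 7.4,
    (1, "match_plural"): 5.0, (1, "mismatch_plural"): 11.2,
    (2, "match_sing"): 8.7, (2, "mismatch_sing"): 6.9,
    (2, "match_plural"): 4.8, (2, "mismatch_plural"): 10.5,
    (3, "match_sing"): 6.2, (3, "mismatch_sing"): 7.0,
    (3, "match_plural"): 5.3, (3, "mismatch_plural"): 12.0,
    (4, "match_sing"): 9.9, (4, "mismatch_sing"): 8.1,
    (4, "match_plural"): 4.1, (4, "mismatch_plural"): 9.7,
}

sing_acc = prediction_accuracy(example, "match_sing", "mismatch_sing")
plural_acc = prediction_accuracy(example, "match_plural", "mismatch_plural")
print(f"singular: {sing_acc:.2%}, plural: {plural_acc:.2%}")
```

On this invented data the singular prediction holds for one item out of four and the plural prediction for all four, which mirrors the pattern in the table: a model that prefers `themselves` regardless of the subject would score high on the plural prediction and low on the singular one.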