Individual results
View in-depth performance of a single language model on a single test suite.
Region-by-region surprisal
Sample item for Cataphor Prediction
The first item of the test suite is shown below for quick reference. Please visit the page for Cataphor Prediction to see the full list of items.
Item | Condition | adjunct_comp | adjunct_subject | adjunct_verb | adjunct_rest | main_subj | main_adverb | main_verb | main_object | main_rest |
---|---|---|---|---|---|---|---|---|---|---|
1 | cata_match | When | he | was | at the party, | the boy | cruelly | teased | the girl | about something. |
1 | cata_mismatch | When | she | was | at the party, | the boy | cruelly | teased | the girl | about something. |
1 | new_referent | While | I | was | at the party, | the boy | cruelly | teased | the girl | about something. |
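To make the region layout concrete, here is a minimal sketch that stores item 1 as plain Python data, rebuilds each sentence, and looks up a region's 1-based position. The names `REGIONS`, `ITEM_1`, `sentence`, and `region_number` are hypothetical helpers introduced for illustration and are not part of any SyntaxGym API; the strings are copied from the table above.

```python
# Region names in the order they appear in the table header.
REGIONS = [
    "adjunct_comp", "adjunct_subject", "adjunct_verb", "adjunct_rest",
    "main_subj", "main_adverb", "main_verb", "main_object", "main_rest",
]

# Item 1, copied from the table: one list of region strings per condition.
# (Hypothetical in-memory layout, not the site's storage format.)
_SHARED_TAIL = ["was", "at the party,", "the boy", "cruelly",
                "teased", "the girl", "about something."]
ITEM_1 = {
    "cata_match": ["When", "he"] + _SHARED_TAIL,
    "cata_mismatch": ["When", "she"] + _SHARED_TAIL,
    "new_referent": ["While", "I"] + _SHARED_TAIL,
}


def sentence(condition):
    """Join the regions of one condition into a full sentence."""
    return " ".join(ITEM_1[condition])


def region_number(name):
    """Return the 1-based position of a region name."""
    return REGIONS.index(name) + 1


if __name__ == "__main__":
    for condition in ITEM_1:
        print(f"{condition}: {sentence(condition)}")
    print("main_subj is region", region_number("main_subj"))
```

Under this layout, `main_subj` is the fifth region, which appears to be what the `/5,main_subj` part of the prediction formulas refers to.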
Prediction performance for GPT-2 on Cataphor Prediction
Accuracy | Prediction formula | Description |
---|---|---|
50.00% | ((679,cata_match/5,main_subj) < (680,cata_mismatch/5,main_subj)) | No description provided. |
100.00% | ((679,cata_match/5,main_subj) < (681,new_referent/5,main_subj)) | No description provided. |
0.00% | ((680,cata_mismatch/5,main_subj) > (681,new_referent/5,main_subj)) | No description provided. |
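The sketch below illustrates one way such an accuracy figure could be computed. It rests on two assumptions not stated on this page: that a term such as `(679,cata_match/5,main_subj)` denotes the model's surprisal at region 5 (`main_subj`) in the `cata_match` condition, with `679` being an internal identifier, and that accuracy is the fraction of items for which the inequality holds. The function `prediction_accuracy` and the surprisal values in `TOY_SURPRISALS` are invented for illustration; they are not GPT-2 output and do not reproduce the figures in the table.

```python
import operator

# Comparison operators that appear in the prediction formulas.
OPS = {"<": operator.lt, ">": operator.gt}


def prediction_accuracy(items, left, op, right, region="main_subj"):
    """Fraction of items where surprisal[left] <op> surprisal[right] at `region`.

    `items` is a list of dicts: condition name -> region name -> surprisal.
    This is an assumed reading of the formulas, not the site's implementation.
    """
    compare = OPS[op]
    hits = [compare(item[left][region], item[right][region]) for item in items]
    return sum(hits) / len(hits)


# Made-up surprisal values for four hypothetical items (NOT real model output).
TOY_SURPRISALS = [
    {"cata_match": {"main_subj": 4.0}, "cata_mismatch": {"main_subj": 6.0},
     "new_referent": {"main_subj": 7.0}},
    {"cata_match": {"main_subj": 5.0}, "cata_mismatch": {"main_subj": 3.0},
     "new_referent": {"main_subj": 8.0}},
    {"cata_match": {"main_subj": 2.5}, "cata_mismatch": {"main_subj": 4.5},
     "new_referent": {"main_subj": 4.0}},
    {"cata_match": {"main_subj": 3.0}, "cata_mismatch": {"main_subj": 6.5},
     "new_referent": {"main_subj": 9.0}},
]

if __name__ == "__main__":
    for left, op, right in [
        ("cata_match", "<", "cata_mismatch"),
        ("cata_match", "<", "new_referent"),
        ("cata_mismatch", ">", "new_referent"),
    ]:
        acc = prediction_accuracy(TOY_SURPRISALS, left, op, right)
        print(f"{left} {op} {right}: {acc:.2%}")
```

Keeping the comparison as a lookup in `OPS` means the same function handles both the `<` and `>` predictions listed in the table without parsing the full formula syntax.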