Individual results
View in-depth performance of a single language model on a single test suite.
Region-by-region surprisal
Sample item for Cleft Structure
The first item of the test suite is shown below for quick reference. Please visit the page for Cleft Structure to see the full list of items.
Item | Condition | intro | subj | verb | passive | verb.1 | matrix_v |
---|---|---|---|---|---|---|---|
1 | np_mismatch | What | he | did | was | the meal | |
1 | np_match | What | he | ate | was | the meal | |
1 | vp_match | What | he | did | was | prepare | the meal |
1 | vp_mismatch | What | he | ate | was | prepare | the meal |
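
Each row of the table above splits one sentence into named regions. As a minimal sketch, the item can be held as a mapping from condition to region contents, and the full sentences recovered by joining the non-empty cells. The names `REGIONS`, `ITEM_1`, and `sentence` are illustrative assumptions, not part of the site; the cell contents are copied from the table exactly as displayed.

```python
# Region names in the order of the table columns.
REGIONS = ["intro", "subj", "verb", "passive", "verb.1", "matrix_v"]

# Item 1, copied cell by cell from the table as displayed.
# An empty string stands for an empty cell.
ITEM_1 = {
    "np_mismatch": ["What", "he", "did", "was", "the meal", ""],
    "np_match":    ["What", "he", "ate", "was", "the meal", ""],
    "vp_match":    ["What", "he", "did", "was", "prepare", "the meal"],
    "vp_mismatch": ["What", "he", "ate", "was", "prepare", "the meal"],
}


def sentence(cells):
    """Join the non-empty region contents into one sentence string."""
    return " ".join(cell for cell in cells if cell)


for condition, cells in ITEM_1.items():
    # Every condition must supply exactly one cell per region.
    assert len(cells) == len(REGIONS)
    print(f"{condition:12s} {sentence(cells)}")
```

Read this way, the four conditions differ only in the verb ("did" or "ate") and in whether the material after "was" is a noun phrase or a verb phrase.
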
Prediction performance for Transformer XL on Cleft Structure
Accuracy | Formula | Description |
---|---|---|
95.00% | ((547,np_mismatch/6,matrix_v)-(545,np_match/6,matrix_v))+(((546,vp_mismatch/5,verb.1)+(546,vp_mismatch/6,matrix_v))-((548,vp_match/5,verb.1)+(548,vp_match/6,matrix_v)))>0 | We expect the Matrix Verb to have lower surprisal in the NP Match condition, where the preceding verb is lexicalized (“ate” instead of “did”). In addition, we expect the summed surprisal of Verb 1 + Matrix Verb to be lower in the VP Match condition, where the preceding verb is “did”; in the VP Mismatch condition the verb phrase would have to be the object of a lexicalized verb such as “ate,” which is not possible. The sum of these two differences (mismatch minus match) should therefore be greater than zero. |
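
The formula in the table can be evaluated per item as in the sketch below. It assumes each term of the form `(id,condition/region_number,region_name)` stands for the surprisal of that region in that condition, that the leading numbers (545 to 548) are internal identifiers that can be ignored, and that the reported accuracy is the fraction of items for which the inequality holds. The surprisal values are invented for illustration and are not Transformer XL output; `prediction_holds` and `accuracy` are hypothetical helper names.

```python
def prediction_holds(item):
    """Evaluate the Cleft Structure formula on one item's region surprisals."""
    # NP part: matrix_v surprisal, mismatch minus match.
    np_diff = item["np_mismatch"]["matrix_v"] - item["np_match"]["matrix_v"]
    # VP part: summed surprisal of verb.1 + matrix_v, mismatch minus match.
    vp_mismatch = item["vp_mismatch"]["verb.1"] + item["vp_mismatch"]["matrix_v"]
    vp_match = item["vp_match"]["verb.1"] + item["vp_match"]["matrix_v"]
    return np_diff + (vp_mismatch - vp_match) > 0


def accuracy(items):
    """Fraction of items for which the prediction holds."""
    return sum(prediction_holds(item) for item in items) / len(items)


# Made-up surprisal values (illustration only, not model output).
item_a = {
    "np_mismatch": {"matrix_v": 12.5},
    "np_match":    {"matrix_v": 8.0},
    "vp_mismatch": {"verb.1": 10.0, "matrix_v": 6.5},
    "vp_match":    {"verb.1": 7.0, "matrix_v": 5.0},
}
item_b = {
    "np_mismatch": {"matrix_v": 7.0},
    "np_match":    {"matrix_v": 9.0},
    "vp_mismatch": {"verb.1": 5.0, "matrix_v": 4.0},
    "vp_match":    {"verb.1": 6.0, "matrix_v": 4.0},
}

print(prediction_holds(item_a))               # True:  4.5 + 4.5 = 9.0 > 0
print(prediction_holds(item_b))               # False: -2.0 + -1.0 = -3.0
print(f"{accuracy([item_a, item_b]):.2%}")    # 50.00%
```

Because the two differences are added before the comparison, a large effect in one pair of conditions can offset a small reversed effect in the other; under this reading an item fails only when the combined difference is zero or negative.
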