Individual results

View in-depth performance of a single language model on a single test suite.

Region-by-region surprisal
Sample item for Filler-Gap Dependencies (4 sentential embeddings)
| Item | Condition | prefix | comp | embedding | subj | verb | obj | continuation |
|---|---|---|---|---|---|---|---|---|
| 1 | what_gap | I know | what | our mother said her friend remarked that the park attendant reported the cop thinks | your friend | threw | | into the trash can |
| 1 | that_gap | I know | that | our mother said her friend remarked that the park attendant reported the cop thinks | your friend | threw | | into the trash can |
| 1 | what_no-gap | I know | what | our mother said her friend remarked that the park attendant reported the cop thinks | your friend | threw | the plastic | into the trash can |
| 1 | that_no-gap | I know | that | our mother said her friend remarked that the park attendant reported the cop thinks | your friend | threw | the plastic | into the trash can |
Prediction performance for Transformer XL on Filler-Gap Dependencies (4 sentential embeddings)
| Accuracy | Formula | Description |
|---|---|---|
| 33.33% | what_no-gap.obj > that_no-gap.obj | We expect the object to be less surprising in the that_no-gap condition than in the what_no-gap condition, because an upstream wh-word should set up an expectation for a gap. |
| 66.67% | what_gap.continuation < that_gap.continuation | We expect the continuation to be less surprising in the what_gap condition than in the that_gap condition, because gaps must be licensed by an upstream wh-word (such as "what"). |
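To make the scoring concrete, here is a minimal sketch of how a prediction formula of this kind is typically evaluated: for each item, compare region-level surprisals across conditions, then report the fraction of items where the inequality holds. The data structure and surprisal values below are invented for illustration, not taken from the actual test suite.

```python
# Sketch (hypothetical data): scoring a region-level surprisal prediction.
# Each item maps condition -> region -> total surprisal; values are invented.
items = [
    {"what_gap": {"continuation": 4.1}, "that_gap": {"continuation": 6.3}},
    {"what_gap": {"continuation": 5.8}, "that_gap": {"continuation": 5.2}},
    {"what_gap": {"continuation": 3.9}, "that_gap": {"continuation": 7.0}},
]

def accuracy(items, pred):
    """Fraction of items for which the prediction holds."""
    return sum(1 for item in items if pred(item)) / len(items)

# Prediction: what_gap.continuation < that_gap.continuation
pred = lambda it: it["what_gap"]["continuation"] < it["that_gap"]["continuation"]
print(f"{accuracy(items, pred) * 100:.2f}%")  # → 66.67%
```

With three items and the inequality holding for two of them, the score is 66.67%, matching the granularity seen in the table above (accuracies in steps of 1/3).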