Individual results

View in-depth performance of a single language model on a single test suite.

Region-by-region surprisal
Sample item for Across The Board Wh-Movement and CSC
Item | Condition | Beginning | Comp | SubjLeftConj | VerbLeftConj | ObjLeftConj | AdvLeftConj | and | SubjRightConj | VerbRightConj | ObjRightConj | AdvRightConj
1 | ATB-what-2gaps | I know | what | the guy | broke |  | accidentally | and | the mechanic | fixed |  | skilfully.
1 | ATB-what-1gapL | I know | what | the guy | broke |  | accidentally | and | the mechanic | fixed | the engine | skilfully.
1 | ATB-what-1gapR | I know | what | the guy | broke | the thing | accidentally | and | the mechanic | fixed |  | skilfully.
1 | ATB-what-0gaps | I know | what | the guy | broke | the thing | accidentally | and | the mechanic | fixed | the engine | skilfully.
1 | ATB-that-2gaps | I know | that | the guy | broke |  | accidentally | and | the mechanic | fixed |  | skilfully.
1 | ATB-that-0gaps | I know | that | the guy | broke | the thing | accidentally | and | the mechanic | fixed | the engine | skilfully.
Prediction performance for GPT-2 on Across The Board Wh-Movement and CSC
Accuracy | Prediction | Description
100.00% | ((682,ATB-what-2gaps/11,AdvRightConj) < (684,ATB-what-1gapR/11,AdvRightConj)) | No description provided.
50.00% | ((687,ATB-that-0gaps/10,ObjRightConj) < (683,ATB-what-1gapL/10,ObjRightConj)) | No description provided.
100.00% | ((684,ATB-what-1gapR/5,ObjLeftConj) > (687,ATB-that-0gaps/5,ObjLeftConj)) | No description provided.
100.00% | ((684,ATB-what-1gapR/7,and) > (687,ATB-that-0gaps/7,and)) | No description provided.
100.00% | ((682,ATB-what-2gaps/6,AdvLeftConj) < (686,ATB-that-2gaps/6,AdvLeftConj)) | No description provided.
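
The page does not say how each accuracy figure is derived. A plausible reading is that every prediction compares the model's surprisal at one region across two conditions, and that accuracy is the share of items for which the inequality holds. The sketch below illustrates that reading only. The nested-dictionary layout, the `prediction_accuracy` helper, and all surprisal values are invented for illustration; they are not GPT-2's actual numbers and not SyntaxGym's API.

```python
import operator
from typing import Dict, Tuple

# Hypothetical layout: surprisals[item][condition][region_number] -> surprisal
Surprisals = Dict[int, Dict[str, Dict[int, float]]]
# A prediction term: (condition name, region number), e.g. ("ATB-that-0gaps", 10)
Term = Tuple[str, int]

_OPS = {"<": operator.lt, ">": operator.gt}


def prediction_accuracy(surprisals: Surprisals, left: Term, op: str, right: Term) -> float:
    """Return the fraction of items for which `left op right` holds.

    Assumption: accuracy is a per-item proportion; the page does not confirm this.
    """
    compare = _OPS[op]
    hits = 0
    for conditions in surprisals.values():
        left_value = conditions[left[0]][left[1]]
        right_value = conditions[right[0]][right[1]]
        if compare(left_value, right_value):
            hits += 1
    return hits / len(surprisals)


# Invented surprisal values for two items at region 10 (ObjRightConj).
toy = {
    1: {"ATB-that-0gaps": {10: 7.5}, "ATB-what-1gapL": {10: 9.0}},
    2: {"ATB-that-0gaps": {10: 8.0}, "ATB-what-1gapL": {10: 6.5}},
}

accuracy = prediction_accuracy(toy, ("ATB-that-0gaps", 10), "<", ("ATB-what-1gapL", 10))
print(f"{accuracy:.2%}")  # prints 50.00%
```

With these made-up values the inequality holds for item 1 and fails for item 2, which gives 50.00%.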