Viewing test suite Subject-Verb Number Agreement (with prepositional phrase)

Reference
Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models.
Number of items
19
Tags
Models evaluated
88% (8/9)
Description
This task tests how well a language model predicts the number marking on English finite present-tense verbs: whether the verb should take the third-person singular form or the non-third-person-singular form. The latter is generally called the plural form for simplicity, although technically it is also the form for first- and second-person singular. In controlled, targeted versions of this test, multiple NPs precede the verb: the verb's actual subject, as well as a distractor NP whose number differs from that of the subject. A successful language model should place higher probability on the verb form that agrees with the subject, not the distractor. We have three versions of this test suite with different types of intervening material.
Items for number_prep
Item  Condition        intro  np_subject  prep         the  prep_np   matrix_v   continuation
1     match_sing       The    author      next to      the  senators  is         good
1     mismatch_sing    The    author      next to      the  senators  are        good
1     match_plural     The    authors     next to      the  senator   are        good
1     mismatch_plural  The    authors     next to      the  senator   is         good
2     match_sing       The    pilot       behind       the  teachers  brings     love to people
2     mismatch_sing    The    pilot       behind       the  teachers  bring      love to people
2     mismatch_plural  The    pilots      behind       the  teacher   brings     love to people
2     match_plural     The    pilots      behind       the  teacher   bring      love to people
3     match_sing       The    doctor      in front of  the  guards    interests  people
3     mismatch_sing    The    doctor      in front of  the  guards    interest   people
Showing 1 to 10 of 76 entries
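The four conditions of each item follow directly from the design in the description: the distractor noun always has the opposite number from the subject, and only the verb form decides whether a condition is a match or a mismatch. The following is a minimal Python sketch of that construction, not the suite's actual generation code. The names `Item` and `build_conditions` are hypothetical, and the fixed "The ... the ..." template is an assumption read off the rows shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One item of the suite (hypothetical container, for illustration only)."""
    subj_sg: str
    subj_pl: str
    prep: str
    distractor_sg: str
    distractor_pl: str
    verb_sg: str
    verb_pl: str
    continuation: str


def build_conditions(item: Item) -> dict[str, str]:
    """Return the four conditions of an item as full sentences.

    The distractor always takes the opposite number from the subject;
    the verb agrees with the subject in match_* and disagrees in mismatch_*.
    """
    def sentence(subj: str, distractor: str, verb: str) -> str:
        return f"The {subj} {item.prep} the {distractor} {verb} {item.continuation}"

    return {
        "match_sing": sentence(item.subj_sg, item.distractor_pl, item.verb_sg),
        "mismatch_sing": sentence(item.subj_sg, item.distractor_pl, item.verb_pl),
        "match_plural": sentence(item.subj_pl, item.distractor_sg, item.verb_pl),
        "mismatch_plural": sentence(item.subj_pl, item.distractor_sg, item.verb_sg),
    }


# Item 1 from the table above.
item1 = Item("author", "authors", "next to", "senator", "senators", "is", "are", "good")
conditions = build_conditions(item1)
for name, text in conditions.items():
    print(f"{name:16s} {text}")
```

With 19 items and four conditions each, this design yields the 76 entries reported for the table.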
Predictions for number_prep
Formula                                           Description
match_plural.matrix_v < mismatch_plural.matrix_v  No description provided.
match_sing.matrix_v < mismatch_sing.matrix_v      No description provided.
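To make the two formulas concrete, here is a minimal Python sketch of how a prediction's accuracy could be computed. It assumes the value compared at `matrix_v` is the model's surprisal (negative log probability), so that "less than" means the agreeing verb form is more probable, which is consistent with the description. It also assumes the predictions are numbered in the order listed. The surprisal values and the helper `prediction_accuracy` are invented for illustration and are not real model output.

```python
# Hypothetical surprisals (in bits) at the matrix_v region, keyed by item
# number and condition. These numbers are made up for illustration.
surprisals = {
    1: {"match_sing": 2.1, "mismatch_sing": 6.4, "match_plural": 3.0, "mismatch_plural": 2.5},
    2: {"match_sing": 4.0, "mismatch_sing": 7.2, "match_plural": 3.8, "mismatch_plural": 6.9},
    3: {"match_sing": 5.5, "mismatch_sing": 8.1, "match_plural": 4.4, "mismatch_plural": 7.0},
}

# Each prediction says: surprisal in the first condition should be lower
# than surprisal in the second. Numbering by listing order is an assumption.
PREDICTIONS = {
    "Prediction 1": ("match_plural", "mismatch_plural"),
    "Prediction 2": ("match_sing", "mismatch_sing"),
}


def prediction_accuracy(surprisals, lower, higher):
    """Fraction of items where surprisal[lower] < surprisal[higher]."""
    hits = sum(1 for conds in surprisals.values() if conds[lower] < conds[higher])
    return hits / len(surprisals)


accuracies = {
    name: prediction_accuracy(surprisals, lower, higher)
    for name, (lower, higher) in PREDICTIONS.items()
}
for name, acc in accuracies.items():
    print(f"{name}: {acc:.2%}")
```

In this toy data, item 1 fails the plural prediction because the mismatching verb is less surprising than the matching one, so that prediction passes on two of three items while the singular prediction passes on all three.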
Model            Prediction 1 accuracy  Prediction 2 accuracy
GPT-2            78.95%                 100.00%
GPT-2 XL         78.95%                 100.00%
JRNN             68.42%                 100.00%
Ordered Neurons  78.95%                 78.95%
RNNG             84.21%                 100.00%
TinyLSTM         5.26%                  52.63%
Transformer XL   73.68%                 89.47%
Vanilla LSTM     5.26%                  52.63%
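Because the suite has 19 items, each reported accuracy corresponds to a whole number of items passed. The short Python sketch below recovers those counts from the percentages in the table, assuming each percentage is simply items passed divided by 19 and rounded to two decimals. The helper `items_correct` is hypothetical.

```python
N_ITEMS = 19  # "Number of items" reported for this suite

# (Prediction 1 accuracy, Prediction 2 accuracy) in percent, copied from the table.
reported = {
    "GPT-2": (78.95, 100.00),
    "GPT-2 XL": (78.95, 100.00),
    "JRNN": (68.42, 100.00),
    "Ordered Neurons": (78.95, 78.95),
    "RNNG": (84.21, 100.00),
    "TinyLSTM": (5.26, 52.63),
    "Transformer XL": (73.68, 89.47),
    "Vanilla LSTM": (5.26, 52.63),
}


def items_correct(percent, n_items=N_ITEMS):
    """Recover the number of items passed from a percentage rounded to 2 decimals."""
    count = round(percent / 100 * n_items)
    # Sanity check: the recovered count must reproduce the reported percentage.
    assert round(100 * count / n_items, 2) == percent
    return count


counts = {
    model: tuple(items_correct(p) for p in accs)
    for model, accs in reported.items()
}
for model, (p1, p2) in counts.items():
    print(f"{model:16s} {p1:2d}/{N_ITEMS}  {p2:2d}/{N_ITEMS}")
```

Read this way, 78.95% is 15 of 19 items, and the 5.26% reported for TinyLSTM and Vanilla LSTM on Prediction 1 is a single item.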