Viewing test suite Subject-Verb Number Agreement (with prepositional phrase)
Reference
Marvin, R., & Linzen, T. (2018). Targeted syntactic evaluation of language models.
Number of items
19
Models evaluated
88% (8/9)
Description
This task tests how well a language model predicts the number marking on English finite present-tense verbs: whether the verb should take the third-person singular form or the non-third-person-singular form (generally called the plural form for simplicity, although it is also the form for first- and second-person singular). In controlled, targeted versions of this test, multiple NPs precede the verb: the verb's actual subject and a distractor NP whose number differs from that of the subject. A successful language model should place higher probability on the verb form that agrees with the subject, not the distractor. We have three versions of this test suite with different types of intervening material; in this version, the intervening material is a prepositional phrase.
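The 2x2 design described above (subject number crossed with verb number, with the distractor always taking the opposite number from the subject) can be sketched as follows. This is a minimal illustration, not the suite's own generation code: the function name `make_conditions` and its argument layout are assumptions, while the region and condition names follow the item table for this suite.

```python
from typing import Dict, Tuple

# Region names as they appear in the item table for this suite.
REGIONS = ["intro", "np_subject", "prep", "the", "prep_np", "matrix_v", "continuation"]


def make_conditions(
    subject: Tuple[str, str],      # (singular, plural) forms of the subject noun
    distractor: Tuple[str, str],   # (singular, plural) forms of the distractor noun
    prep: str,
    verb: Tuple[str, str],         # (singular, plural) forms of the verb
    continuation: str,
) -> Dict[str, Dict[str, str]]:
    """Build the four conditions of one item (hypothetical helper).

    The distractor always has the opposite number from the subject;
    "match" means the verb agrees with the subject.
    """
    subj_sg, subj_pl = subject
    dist_sg, dist_pl = distractor
    verb_sg, verb_pl = verb

    def row(subj: str, dist: str, v: str) -> Dict[str, str]:
        return dict(zip(REGIONS, ["The", subj, prep, "the", dist, v, continuation]))

    return {
        "match_sing": row(subj_sg, dist_pl, verb_sg),
        "mismatch_sing": row(subj_sg, dist_pl, verb_pl),
        "match_plural": row(subj_pl, dist_sg, verb_pl),
        "mismatch_plural": row(subj_pl, dist_sg, verb_sg),
    }


# Item 1 of the suite, rebuilt from its nouns, preposition, and verb forms.
item1 = make_conditions(
    subject=("author", "authors"),
    distractor=("senator", "senators"),
    prep="next to",
    verb=("is", "are"),
    continuation="good",
)
for condition, regions in item1.items():
    print(condition, "->", " ".join(regions.values()))
```

Keeping each sentence as a mapping from region name to text makes it straightforward to look up the critical `matrix_v` region when comparing conditions.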
Items for number_prep
Item | Condition | intro | np_subject | prep | the | prep_np | matrix_v | continuation |
---|---|---|---|---|---|---|---|---|
1 | match_sing | The | author | next to | the | senators | is | good |
1 | mismatch_sing | The | author | next to | the | senators | are | good |
1 | match_plural | The | authors | next to | the | senator | are | good |
1 | mismatch_plural | The | authors | next to | the | senator | is | good |
2 | match_sing | The | pilot | behind | the | teachers | brings | love to people |
2 | mismatch_sing | The | pilot | behind | the | teachers | bring | love to people |
2 | mismatch_plural | The | pilots | behind | the | teacher | brings | love to people |
2 | match_plural | The | pilots | behind | the | teacher | bring | love to people |
3 | match_sing | The | doctor | in front of | the | guards | interests | people |
3 | mismatch_sing | The | doctor | in front of | the | guards | interest | people |
3 | mismatch_plural | The | doctors | in front of | the | guard | interests | people |
3 | match_plural | The | doctors | in front of | the | guard | interest | people |
4 | match_sing | The | farmer | near | the | clerks | knows | many people |
4 | mismatch_sing | The | farmer | near | the | clerks | know | many people |
4 | mismatch_plural | The | farmers | near | the | clerk | knows | many people |
4 | match_plural | The | farmers | near | the | clerk | know | many people |
5 | match_sing | The | manager | to the side of | the | architects | likes | to gamble |
5 | mismatch_sing | The | manager | to the side of | the | architects | like | to gamble |
5 | mismatch_plural | The | managers | to the side of | the | architect | likes | to gamble |
5 | match_plural | The | managers | to the side of | the | architect | like | to gamble |
6 | match_sing | The | customer | across from | the | athletes | enjoys | playing tennis |
6 | mismatch_sing | The | customer | across from | the | athletes | enjoy | playing tennis |
6 | mismatch_plural | The | customers | across from | the | athlete | enjoys | playing tennis |
6 | match_plural | The | customers | across from | the | athlete | enjoy | playing tennis |
7 | match_sing | The | officer | next to | the | actors | is | good |
7 | mismatch_sing | The | officer | next to | the | actors | are | good |
7 | mismatch_plural | The | officers | next to | the | actor | is | good |
7 | match_plural | The | officers | next to | the | actor | are | good |
8 | match_sing | The | teacher | behind | the | ministers | is | good |
8 | mismatch_sing | The | teacher | behind | the | ministers | are | good |
8 | mismatch_plural | The | teachers | behind | the | minister | is | good |
8 | match_plural | The | teachers | behind | the | minister | are | good |
9 | match_sing | The | senator | in front of | the | actors | is | good |
9 | mismatch_sing | The | senator | in front of | the | actors | are | good |
9 | mismatch_plural | The | senators | in front of | the | actor | is | good |
9 | match_plural | The | senators | in front of | the | actor | are | good |
10 | match_sing | The | consultant | near | the | secretaries | is | good |
10 | mismatch_sing | The | consultant | near | the | secretaries | are | good |
10 | mismatch_plural | The | consultants | near | the | secretary | is | good |
10 | match_plural | The | consultants | near | the | secretary | are | good |
11 | match_sing | The | guard | to the side of | the | executives | is | good |
11 | mismatch_sing | The | guard | to the side of | the | executives | are | good |
11 | mismatch_plural | The | guards | to the side of | the | executive | is | playing tennis |
11 | match_plural | The | guards | to the side of | the | executive | are | playing tennis |
12 | match_sing | The | clerk | across from | the | authors | is | good |
12 | mismatch_sing | The | clerk | across from | the | authors | are | good |
12 | mismatch_plural | The | clerks | across from | the | author | is | good |
12 | match_plural | The | clerks | across from | the | author | are | good |
13 | match_sing | The | architect | next to | the | pilots | is | good |
13 | mismatch_sing | The | architect | next to | the | pilots | are | good |
13 | mismatch_plural | The | architects | next to | the | pilot | is | good |
13 | match_plural | The | architects | next to | the | pilot | are | good |
14 | match_sing | The | athlete | behind | the | doctors | brings | good feelings |
14 | mismatch_sing | The | athlete | behind | the | doctors | bring | good feelings |
14 | mismatch_plural | The | athletes | behind | the | doctor | brings | good feelings |
14 | match_plural | The | athletes | behind | the | doctor | bring | good feelings |
15 | match_sing | The | actor | in front of | the | farmers | interests | people |
15 | mismatch_sing | The | actor | in front of | the | farmers | interest | people |
15 | mismatch_plural | The | actors | in front of | the | farmer | interests | people |
15 | match_plural | The | actors | in front of | the | farmer | interest | people |
16 | match_sing | The | minister | near | the | managers | knows | many people |
16 | mismatch_sing | The | minister | near | the | managers | know | many people |
16 | mismatch_plural | The | ministers | near | the | manager | knows | tennis |
16 | match_plural | The | ministers | near | the | manager | know | tennis |
17 | match_sing | The | taxi driver | to the side of | the | customers | likes | to gamble |
17 | mismatch_sing | The | taxi driver | to the side of | the | customers | like | to gamble |
17 | mismatch_plural | The | taxi drivers | to the side of | the | customer | likes | tennis |
17 | match_plural | The | taxi drivers | to the side of | the | customer | like | tennis |
18 | match_sing | The | secretary | across from | the | officers | enjoys | playing tennis |
18 | mismatch_sing | The | secretary | across from | the | officers | enjoy | playing tennis |
18 | mismatch_plural | The | secretaries | across from | the | officer | enjoys | tennis |
18 | match_plural | The | secretaries | across from | the | officer | enjoy | tennis |
19 | match_sing | The | executive | next to | the | teachers | is | good |
19 | mismatch_sing | The | executive | next to | the | teachers | are | good |
19 | mismatch_plural | The | executives | next to | the | teacher | is | good |
19 | match_plural | The | executives | next to | the | teacher | are | good |
Predictions for number_prep
Formula | Description |
---|---|
(594,match_sing/6,matrix_v) < (596,mismatch_sing/6,matrix_v) | No description provided. |
(595,match_plural/6,matrix_v) < (593,mismatch_plural/6,matrix_v) | No description provided. |
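Each formula compares a measurement at region 6 (`matrix_v`) between a match condition and its mismatch counterpart. The sketch below assumes the compared quantity is surprisal (negative log probability), so that a lower value in the match condition corresponds to the model placing higher probability on the agreeing verb form. The probabilities are hypothetical values chosen for illustration, not model output, and `check_item` is a hypothetical helper.

```python
import math
from typing import Dict, Tuple


def surprisal(prob: float) -> float:
    """Surprisal in bits: the negative base-2 log of a probability."""
    return -math.log2(prob)


def check_item(verb_surprisal: Dict[str, float]) -> Tuple[bool, bool]:
    """Evaluate both predictions for one item (hypothetical helper).

    `verb_surprisal` maps each condition name to the surprisal measured
    at the matrix_v region in that condition.
    """
    prediction_1 = verb_surprisal["match_sing"] < verb_surprisal["mismatch_sing"]
    prediction_2 = verb_surprisal["match_plural"] < verb_surprisal["mismatch_plural"]
    return prediction_1, prediction_2


# Hypothetical verb probabilities for a single item (NOT real model output).
example = {
    "match_sing": surprisal(0.20),       # agreeing verb after a singular subject
    "mismatch_sing": surprisal(0.05),    # non-agreeing verb after a singular subject
    "match_plural": surprisal(0.10),     # agreeing verb after a plural subject
    "mismatch_plural": surprisal(0.25),  # non-agreeing verb after a plural subject
}
print(check_item(example))
```

With these invented values the first prediction holds and the second fails, since the non-agreeing verb is assigned the higher probability in the plural case.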
Results for number_prep
Model | Prediction 1 accuracy | Prediction 2 accuracy |
---|---|---|
TinyLSTM | 5.26% | 52.63% |
GPT-2 | 78.95% | 100.00% |
GPT-2 XL | 78.95% | 100.00% |
JRNN | 68.42% | 100.00% |
Ordered Neurons | 78.95% | 78.95% |
RNNG | 84.21% | 100.00% |
Transformer XL | 73.68% | 89.47% |
Vanilla LSTM | 5.26% | 52.63% |
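The percentages in this table can be read as item counts. The sketch below assumes each accuracy is the share of the suite's 19 items for which the prediction holds; under that assumption, 78.95% corresponds to 15 of 19 items and 5.26% to 1 of 19. The function names are illustrative.

```python
N_ITEMS = 19  # number of items in this suite


def accuracy_percent(n_correct: int, n_items: int = N_ITEMS) -> float:
    """Percentage of items for which a prediction holds, to two decimals."""
    return round(100 * n_correct / n_items, 2)


def items_correct(percent: float, n_items: int = N_ITEMS) -> int:
    """Recover the item count behind a reported percentage."""
    return round(percent / 100 * n_items)


# Reported (Prediction 1, Prediction 2) accuracies for a few models in the table.
reported = {
    "GPT-2": (78.95, 100.00),
    "RNNG": (84.21, 100.00),
    "Transformer XL": (73.68, 89.47),
    "Vanilla LSTM": (5.26, 52.63),
}
for model, (p1, p2) in reported.items():
    print(f"{model}: {items_correct(p1)}/{N_ITEMS} and {items_correct(p2)}/{N_ITEMS} items")
```

With only 19 items, one item is worth about 5.26 percentage points, so small differences between models in this table amount to one or two sentences.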