Viewing test suite Negative Polarity Licensing (any; with subject relative clause)

Reference
Marvin, R., & Linzen, T. (2018). Targeted syntactic evaluation of language models.
Number of items
38
Tags
Models evaluated
88% (8/9)
Description
The words "any" and "ever", in their most common uses, are "negative polarity items" (NPIs): they can only be used in an appropriate syntactic-semantic environment, which is, to a first approximation, the scope of negation. For example, the determiner "no" can license NPIs, but its NP has to structurally command the NPI. A sentence with an NPI can therefore be ungrammatical even when a negative determiner appears earlier in it: if "no" is embedded inside a modifier of the main-clause subject (here, a subject relative clause), it does not command the NPI, as in "The author that liked no senators has had any success".
Items for npi_src_any
Item Condition Licensor np compl rc_verb rc_dp rc_obj matrix_v npi continuation
1 neg_pos No author that liked the senators has had any success
1 neg_neg No author that liked no senators has had any success
1 pos_pos The author that liked the senators has had any success
1 pos_neg The author that liked no senators has had any success
2 neg_pos No pilot that praised the consultants has shown any courage
2 neg_neg No pilot that praised no consultants has shown any courage
2 pos_pos The pilot that praised the consultants has shown any courage
2 pos_neg The pilot that praised no consultants has shown any courage
3 neg_pos No surgeon that hated the guards has known any setback
3 neg_neg No surgeon that hated no guards has known any setback
3 pos_pos The surgeon that hated the guards has known any setback
3 pos_neg The surgeon that hated no guards has known any setback
4 neg_pos No farmer that discussed the clerks has had any luck
4 neg_neg No farmer that discussed no clerks has had any luck
4 pos_pos The farmer that discussed the clerks has had any luck
4 pos_neg The farmer that discussed no clerks has had any luck
5 neg_pos No manager that loved the architects has completed any project
5 neg_neg No manager that loved no architects has completed any project
5 pos_pos The manager that loved the architects has completed any project
5 pos_neg The manager that loved no architects has completed any project
6 neg_pos No customer that liked the managers has spent any money
6 neg_neg No customer that liked no managers has spent any money
6 pos_pos The customer that liked the managers has spent any money
6 pos_neg The customer that liked no managers has spent any money
7 neg_pos No officer that praised the journalists has had any violations
7 neg_neg No officer that praised no journalists has had any violations
7 pos_pos The officer that praised the journalists has had any violations
7 pos_neg The officer that praised no journalists has had any violations
8 neg_pos No teacher that hated the ministers has failed any student
8 neg_neg No teacher that hated no ministers has failed any student
8 pos_pos The teacher that hated the ministers has failed any student
8 pos_neg The teacher that hated no ministers has failed any student
9 neg_pos No senator that admired the taxi drivers has passed any bills
9 neg_neg No senator that admired no taxi drivers has passed any bills
9 pos_pos The senator that admired the taxi drivers has passed any bills
9 pos_neg The senator that admired no taxi drivers has passed any bills
10 neg_pos No consultant that loved the secretaries has passed up any project
10 neg_neg No consultant that loved no secretaries has passed up any project
10 pos_pos The consultant that loved the secretaries has passed up any project
10 pos_neg The consultant that loved no secretaries has passed up any project
11 neg_pos No guard that liked the executives has seen any crimes
11 neg_neg No guard that liked no executives has seen any crimes
11 pos_pos The guard that liked the executives has seen any crimes
11 pos_neg The guard that liked no executives has seen any crimes
12 neg_pos No chef that praised the authors has burned any dishes
12 neg_neg No chef that praised no authors has burned any dishes
12 pos_pos The chef that praised the authors has burned any dishes
12 pos_neg The chef that praised no authors has burned any dishes
13 neg_pos No architect that hated the pilots has had any fame
13 neg_neg No architect that hated no pilots has had any fame
13 pos_pos The architect that hated the pilots has had any fame
13 pos_neg The architect that hated no pilots has had any fame
14 neg_pos No athlete that helped the doctors has broken any equipment
14 neg_neg No athlete that helped no doctors has broken any equipment
14 pos_pos The athlete that helped the doctors has broken any equipment
14 pos_neg The athlete that helped no doctors has broken any equipment
15 neg_pos No dancer that loved the farmers has broken any bones
15 neg_neg No dancer that loved no farmers has broken any bones
15 pos_pos The dancer that loved the farmers has broken any bones
15 pos_neg The dancer that loved no farmers has broken any bones
16 neg_pos No minister that liked the managers has missed any services
16 neg_neg No minister that liked no managers has missed any services
16 pos_pos The minister that liked the managers has missed any services
16 pos_neg The minister that liked no managers has missed any services
17 neg_pos No taxi driver that praised the customers has crashed any car
17 neg_neg No taxi driver that praised no customers has crashed any car
17 pos_pos The taxi driver that praised the customers has crashed any car
17 pos_neg The taxi driver that praised no customers has crashed any car
18 neg_pos No assistant that hated the officers has missed any phone calls
18 neg_neg No assistant that hated no officers has missed any phone calls
18 pos_pos The assistant that hated the officers has missed any phone calls
18 pos_neg The assistant that hated no officers has missed any phone calls
19 neg_pos No executive that knew the teachers has fired any worker
19 neg_neg No executive that knew no teachers has fired any worker
19 pos_pos The executive that knew the teachers has fired any worker
19 pos_neg The executive that knew no teachers has fired any worker
20 neg_pos No authors that loved the officer have read any poetry
20 neg_neg No authors that loved no officer have read any poetry
20 pos_pos The authors that loved the officer have read any poetry
20 pos_neg The authors that loved no officer have read any poetry
21 neg_pos No pilots that liked the teacher have landed any plane
21 neg_neg No pilots that liked no teacher have landed any plane
21 pos_pos The pilots that liked the teacher have landed any plane
21 pos_neg The pilots that liked no teacher have landed any plane
22 neg_pos No doctors that praised the senator have failed any surgery
22 neg_neg No doctors that praised no senator have failed any surgery
22 pos_pos The doctors that praised the senator have failed any surgery
22 pos_neg The doctors that praised no senator have failed any surgery
23 neg_pos No farmers that hated the consultant have planted any beans
23 neg_neg No farmers that hated no consultant have planted any beans
23 pos_pos The farmers that hated the consultant have planted any beans
23 pos_neg The farmers that hated no consultant have planted any beans
24 neg_pos No managers that respected the guard have had any luck
24 neg_neg No managers that respected no guard have had any luck
24 pos_pos The managers that respected the guard have had any luck
24 pos_neg The managers that respected no guard have had any luck
25 neg_pos No customers that loved the chef have purchased any apparel
25 neg_neg No customers that loved no chef have purchased any apparel
25 pos_pos The customers that loved the chef have purchased any apparel
25 pos_neg The customers that loved no chef have purchased any apparel
26 neg_pos No officers that liked the architect have arrested any criminal
26 neg_neg No officers that liked no architect have arrested any criminal
26 pos_pos The officers that liked the architect have arrested any criminal
26 pos_neg The officers that liked no architect have arrested any criminal
27 neg_pos No teachers that praised the athlete have failed any students
27 neg_neg No teachers that praised no athlete have failed any students
27 pos_pos The teachers that praised the athlete have failed any students
27 pos_neg The teachers that praised no athlete have failed any students
28 neg_pos No senators that hated the dancer have passed any bills
28 neg_neg No senators that hated no dancer have passed any bills
28 pos_pos The senators that hated the dancer have passed any bills
28 pos_neg The senators that hated no dancer have passed any bills
29 neg_pos No consultants that impressed the minister have missed any projects
29 neg_neg No consultants that impressed no minister have missed any projects
29 pos_pos The consultants that impressed the minister have missed any projects
29 pos_neg The consultants that impressed no minister have missed any projects
30 neg_pos No guards that loved the taxi driver have caught any thieves
30 neg_neg No guards that loved no taxi driver have caught any thieves
30 pos_pos The guards that loved the taxi driver have caught any thieves
30 pos_neg The guards that loved no taxi driver have caught any thieves
31 neg_pos No clerks that liked the assistant have burned any dishes
31 neg_neg No clerks that liked no assistant have burned any dishes
31 pos_pos The clerks that liked the assistant have burned any dishes
31 pos_neg The clerks that liked no assistant have burned any dishes
32 neg_pos No architects that praised the executive have passed any projects
32 neg_neg No architects that praised no executive have passed any projects
32 pos_pos The architects that praised the executive have passed any projects
32 pos_neg The architects that praised no executive have passed any projects
33 neg_pos No athletes that hated the pilot have broken any equipment
33 neg_neg No athletes that hated no pilot have broken any equipment
33 pos_pos The athletes that hated the pilot have broken any equipment
33 pos_neg The athletes that hated no pilot have broken any equipment
34 neg_pos No journalists that contacted the surgeon have broken any bones
34 neg_neg No journalists that contacted no surgeon have broken any bones
34 pos_pos The journalists that contacted the surgeon have broken any bones
34 pos_neg The journalists that contacted no surgeon have broken any bones
35 neg_pos No ministers that loved the farmer have advocated any evil
35 neg_neg No ministers that loved no farmer have advocated any evil
35 pos_pos The ministers that loved the farmer have advocated any evil
35 pos_neg The ministers that loved no farmer have advocated any evil
36 neg_pos No taxi drivers that liked the manager have crashed any cars
36 neg_neg No taxi drivers that liked no manager have crashed any cars
36 pos_pos The taxi drivers that liked the manager have crashed any cars
36 pos_neg The taxi drivers that liked no manager have crashed any cars
37 neg_pos No secretaries that praised the customer have missed any appointments
37 neg_neg No secretaries that praised no customer have missed any appointments
37 pos_pos The secretaries that praised the customer have missed any appointments
37 pos_neg The secretaries that praised no customer have missed any appointments
38 neg_pos No executives that hated the officer have refused any compensation
38 neg_neg No executives that hated no officer have refused any compensation
38 pos_pos The executives that hated the officer have refused any compensation
38 pos_neg The executives that hated no officer have refused any compensation
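The four conditions of each item follow a 2x2 design: the first half of the condition name says whether the matrix subject takes "No" (neg) or "The" (pos), and the second half says whether the relative-clause object takes "no" or "the". A minimal sketch of that construction, assuming the nine regions named in the table header; the function name `build_conditions` and the grouping of "has had" into the single `matrix_v` region are illustrative assumptions, not the suite's own tooling:

```python
from itertools import product

def build_conditions(np, rc_verb, rc_obj, matrix_v, continuation,
                     compl="that", npi="any"):
    """Return the four condition sentences for one item (hypothetical helper)."""
    conditions = {}
    for matrix, embedded in product(("neg", "pos"), repeat=2):
        licensor = "No" if matrix == "neg" else "The"   # matrix-subject determiner
        rc_dp = "no" if embedded == "neg" else "the"    # determiner inside the relative clause
        regions = [licensor, np, compl, rc_verb, rc_dp,
                   rc_obj, matrix_v, npi, continuation]
        conditions[f"{matrix}_{embedded}"] = " ".join(regions)
    return conditions

# Item 1 from the table above.
item1 = build_conditions("author", "liked", "senators", "has had", "success")
for name, sentence in item1.items():
    print(name, sentence)
```

Only the neg_pos and neg_neg sentences have a licensor that commands the NPI; in pos_neg the negative determiner is present but buried inside the relative clause.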
Predictions for npi_src_any
Formula Description
(583,neg_pos/8,npi) < (581,pos_pos/8,npi) No description provided.
(582,neg_neg/8,npi) < (584,pos_neg/8,npi) No description provided.
(583,neg_pos/8,npi) < (584,pos_neg/8,npi) No description provided.
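Each formula appears to compare a model's surprisal at region 8 (the `npi` region) between two conditions, with the licensed condition expected to be less surprising. The sketch below shows how per-item checks could be turned into a per-prediction accuracy. It assumes that reading of the formulas; the surprisal values are made up for illustration and the function names are hypothetical.

```python
# Each pair (a, b) encodes the prediction: surprisal(a) < surprisal(b) at the NPI region.
PREDICTIONS = [
    ("neg_pos", "pos_pos"),  # Prediction 1
    ("neg_neg", "pos_neg"),  # Prediction 2
    ("neg_pos", "pos_neg"),  # Prediction 3
]

def evaluate_item(npi_surprisal):
    """Return one boolean per prediction for a single item."""
    return [npi_surprisal[a] < npi_surprisal[b] for a, b in PREDICTIONS]

def prediction_accuracy(items):
    """Fraction of items on which each prediction holds."""
    results = [evaluate_item(s) for s in items]
    return [sum(r[i] for r in results) / len(results)
            for i in range(len(PREDICTIONS))]

# Hypothetical NPI-region surprisals (in bits) for two items.
toy_items = [
    {"neg_pos": 5.0, "neg_neg": 6.0, "pos_pos": 9.0, "pos_neg": 7.0},
    {"neg_pos": 8.0, "neg_neg": 6.5, "pos_pos": 9.5, "pos_neg": 7.0},
]
print(prediction_accuracy(toy_items))
```

Prediction 3 is the most demanding of the three: it requires a distant but structurally valid licensor to beat a nearby but structurally invalid one.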
Model Prediction 1 accuracy Prediction 2 accuracy Prediction 3 accuracy
TinyLSTM 100.00% 100.00% 0.00%
GPT-2 97.37% 92.11% 60.53%
GPT-2 XL 97.37% 100.00% 97.37%
JRNN 97.37% 94.74% 81.58%
Ordered Neurons 100.00% 100.00% 0.00%
RNNG 100.00% 100.00% 5.26%
Transformer XL 100.00% 86.84% 15.79%
Vanilla LSTM 100.00% 100.00% 0.00%
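Because the suite has 38 items, each accuracy above corresponds to a whole number of items. A small sketch of that conversion, assuming the percentages are item-level accuracies rounded to two decimals; the helper name `items_passed` is hypothetical:

```python
N_ITEMS = 38  # "Number of items" reported for this suite

def items_passed(accuracy_percent, n_items=N_ITEMS):
    """Convert a reported accuracy percentage back to a count of items."""
    return round(accuracy_percent / 100 * n_items)

# Prediction 3 accuracies from the table above.
prediction3 = {
    "GPT-2": 60.53,
    "GPT-2 XL": 97.37,
    "JRNN": 81.58,
    "RNNG": 5.26,
    "Transformer XL": 15.79,
}
for model, pct in prediction3.items():
    print(f"{model}: {items_passed(pct)}/{N_ITEMS} items")
```

Read this way, GPT-2 XL misses Prediction 3 on a single item, while RNNG satisfies it on only two.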