This is a beta release of SyntaxGym. Please send questions and comments to contact@syntaxgym.org.

User profile

Jon G

jon@gauthiers.net

Contributed test suites

Name	Language	Reference	Models evaluated	Average performance	Tags
Cleft Structure	English		8 / 9	89.38%	Long-Distance Dependencies
Center Embedding (with modifier)	English	"Wilcox E. Levy R. & Futrell R. (2019). Hierarchical representation in neural language models: Suppression and recovery of expectations."	8 / 9	70.54%	Center Embedding
Subject-Verb Number Agreement (with subject relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	54.61%	Agreement
Reflexive Number Agreement (masculine; with prepositional phrase)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	49.34%	Licensing
Filler-Gap Dependencies (hierarchy)	English	"Wilcox E. Levy R. & Futrell R. (2019). What Syntactic Structures block Dependencies in RNN Language Models?"	8 / 9	53.12%	Long-Distance Dependencies
Subordination (with object relative clause)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	8 / 9	75.00%	Gross Syntactic State
NP/Z Garden-path Ambiguity with Modifier (Overt Object)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	91.07%	Garden-Path Effects
Reflexive Number Agreement (feminine; with prepositional phrase)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	15.13%	Licensing
NP/Z Garden-path Ambiguity (Verb Transitivity)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	79.17%	Garden-Path Effects
Negative Polarity Licensing (any; with object relative clause)	English		8 / 9	41.12%	Licensing
Negative Polarity Licensing (any; with subject relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	31.91%	Licensing
Negative Polarity Licensing (ever; with subject relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	29.61%	Licensing
Negative Polarity Licensing (ever; with object relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	38.16%	Licensing
Subject-Verb Number Agreement (with prepositional phrase)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	53.95%	Agreement
Reflexive Number Agreement (masculine; with subject relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	46.71%	Licensing
Cleft Structure (with modifier)	English	No published reference	8 / 9	65.00%	Long-Distance Dependencies
Subordination	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	8 / 9	79.35%	Gross Syntactic State
Reflexive Number Agreement (feminine; with object relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	17.76%	Licensing
Main-verb/Reduced-relative Garden-path Disambiguation	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	65.82%	Garden-Path Effects
Filler-Gap Dependencies (extraction from prepositional phrase)	English	"Wilcox E. Levy R. Morita T. & Futrell R. (2018). What do RNN Language Models Learn about Filler-Gap Dependencies?"	8 / 9	51.56%	Long-Distance Dependencies
Filler-Gap Dependencies (4 sentential embeddings)	English	"Wilcox E. Levy R. & Futrell R. (2019). What Syntactic Structures block Dependencies in RNN Language Models?" Wilcox et al. 2018	7 / 9	50.34%	Long-Distance Dependencies
Subordination (with prepositional phrase)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	86.96%	Gross Syntactic State
Center Embedding	English	"Wilcox E. Levy R. & Futrell R. (2019). Hierarchical representation in neural language models: Suppression and recovery of expectations."	8 / 9	85.27%	Center Embedding
NP/Z Garden-path Ambiguity with Modifier (Verb Transitivity)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	66.67%	Garden-Path Effects
Filler-Gap Dependencies (object extraction)	English	"Wilcox E. Levy R. Morita T. & Futrell R. (2018). What do RNN Language Models Learn about Filler-Gap Dependencies?"	8 / 9	78.65%	Long-Distance Dependencies
Subordination (with subject relative clause)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	8 / 9	72.83%	Gross Syntactic State
Main-verb/Reduced-relative Garden-path Disambiguation (with modifier)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	67.35%	Garden-Path Effects
NP/Z Garden-path Ambiguity (Overt Object)	English	"Futrell R. Wilcox E. Morita T. Qian P. Ballesteros M. & Levy R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state."	7 / 9	95.24%	Garden-Path Effects
Subject-Verb Number Agreement (with object relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	34.21%	Agreement
Reflexive Number Agreement (masculine; with object relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	32.24%	Licensing
Reflexive Number Agreement (feminine; with subject relative clause)	English	"Marvin R. & Linzen T. (2018). Targeted syntactic evaluation of language models. "	8 / 9	13.82%	Licensing
Filler-Gap Dependencies (3 sentential embeddings)	English	"Wilcox E. Levy R. & Futrell R. (2019). What Syntactic Structures block Dependencies in RNN Language Models?" Wilcox et al. 2018	7 / 9	61.22%	Long-Distance Dependencies
Filler-Gap Dependencies (subject extraction)	English	"Wilcox E. Levy R. Morita T. & Futrell R. (2018). What do RNN Language Models Learn about Filler-Gap Dependencies?"	8 / 9	72.40%	Long-Distance Dependencies

Contributed models

Name	Description	Owner	Language	Author	Date added	Docker image	Status	Average performance
Transformer XL	None	Jon G	English	Zihang Dai et al.	2020-01-21	`cpllab/language-models`:`transformer-xl`	Validated	76.81%
JRNN	None	Jon G	English	Jozefowicz et al.	2020-01-21	`cpllab/language-models`:`jrnn`	Validated	76.09%
Vanilla LSTM	None	Jon G	English	Hochreiter & Schmidhuber	2020-01-30	`cpllab/language-models`:`vanilla-lstm`	Validated	65.59%
RNNG	None	Jon G	English	Dyer et al.	2020-01-30	`cpllab/language-models`:`rnng`	Validated	74.22%
Ordered Neurons	None	Jon G	English	Shen et al.	2020-01-30	`cpllab/language-models`:`ordered-neurons`	Validated	72.47%
GPT-2	None	Jon G	English	Radford et al. (OpenAI)	2020-01-21	`cpllab/language-models`:`gpt2`	Validated	84.93%
TinyLSTM	None	Jon G	English	Hochreiter & Schmidhuber	2020-07-06	`cpllab/language-models`:`tinylstm`	Validated	63.19%
GPT-2 XL	None	Jon G	English	Radford et al. (OpenAI)	2020-01-21	`cpllab/language-models`:`gpt2-xl`	Validated	89.97%