Syntacticus: A treebank of early Indo-European languages

Syntacticus provides easy access to around a million morphosyntactically annotated sentences from a range of early Indo-European languages.

Syntacticus is an umbrella project for the PROIEL Treebank, the TOROT Treebank and the ISWOC Treebank, which all use the same annotation system and share similar linguistic priorities. In total, Syntacticus contains 80,138 sentences or 936,874 tokens in 10 languages.

We are constantly adding new material to Syntacticus. The ultimate goal is to have a representative sample of different text types from each branch of early Indo-European. We maintain lists of the texts we are currently working on, which you can find on the PROIEL Treebank and the TOROT Treebank pages. Annotation is extremely time-consuming work, so please be patient!

The current focus for Syntacticus is to consolidate and edit our documentation so that it is easier to approach. We are very aware that the existing documentation is inadequate! New features and better integration with our development toolchain are also planned for the near future.

Language	Size
Ancient Greek	250,449 tokens
Latin	202,140 tokens
Classical Armenian	23,513 tokens
Gothic	57,211 tokens
Portuguese	36,595 tokens
Spanish	54,661 tokens
Old English	29,406 tokens
Old French	2,340 tokens
Old Russian	209,334 tokens
Old Church Slavonic	71,225 tokens
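
The per-language sizes in the table can be checked against the overall figure of 936,874 tokens quoted earlier. The following is only a minimal sketch: the token counts are copied from the table, while the names `TOKENS` and `REPORTED_TOTAL` are made up for this illustration and are not part of any Syntacticus tool or API.

```python
# Token counts per language, copied from the table above.
# TOKENS and REPORTED_TOTAL are illustrative names, not a Syntacticus API.
TOKENS = {
    "Ancient Greek": 250_449,
    "Latin": 202_140,
    "Classical Armenian": 23_513,
    "Gothic": 57_211,
    "Portuguese": 36_595,
    "Spanish": 54_661,
    "Old English": 29_406,
    "Old French": 2_340,
    "Old Russian": 209_334,
    "Old Church Slavonic": 71_225,
}

# Overall token count as stated in the project description.
REPORTED_TOTAL = 936_874

total = sum(TOKENS.values())

# The per-language counts should add up to the reported total.
assert total == REPORTED_TOTAL, f"expected {REPORTED_TOTAL}, got {total}"

# Print each language with its token count and share of the whole,
# largest first.
for language, count in sorted(TOKENS.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{language:<20} {count:>8,} {count / total:6.1%}")
```

The same dictionary can be reused to compare the relative weight of individual languages or branches as new texts are added.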

Ancient Greek

Herodotus, Histories
Sphrantzes, Chronicles (post-1453)
The Greek New Testament
Dictionary

Old Russian

Charter of Prince Jurij Svjatoslavich of Smolensk on the alliance with Poland and Lithuania, 1386
The taking of Pskov
Uspenskij sbornik
Russkaja pravda
The First Novgorod Chronicle, Synodal manuscript
Missive from Prince Ivan of Pskov, 1463–1465
Novgorod’s treaty with Grand Prince Jaroslav Jaroslavich, 1266
The tale of the fall of Constantinople
Varlaam’s donation charter to the Xutyn monastery
Missive from the Archbishop of Riga to the Prince of Smolensk
Statute of Prince Vladimir
Life of Sergij of Radonezh
The 1229 Treaty between Smolensk, Riga and Gotland (version A)
The Primary Chronicle, Codex Hypatianus
The Suzdal Chronicle, Codex Laurentianus
Mstislav’s letter
Afanasij Nikitin’s journey beyond three seas
Domostroj
Birch bark letters
The Kiev Chronicle, Codex Hypatianus
The Life of Avvakum
The Tale of Luka Koločskij
The Primary Chronicle, Codex Laurentianus
Vesti-Kuranty
The Tale of Dracula
Dictionary

Latin

Jerome's Vulgate
Cicero, Epistulae ad Atticum
Caesar, Commentarii belli Gallici
Peregrinatio Aetheriae
Dictionary

Old Church Slavonic

Codex Suprasliensis
Codex Zographensis
Dictionary

Gothic

The Gothic Bible
Dictionary

Spanish

Libro delos claros varones
Alfonso X, el sabio, Estoria de Espanna I
Crónica de Alfonso XI
General Estoria parte IV Daniel
El Conde Lucanor
Crónica de España
Dictionary

Portuguese

Crónica Geral de Espanha 155-167
Crónica Geral de Espanha 2-12
Diogo de Couto, Décadas Livro 5, VIII, 9-14
Dictionary

Old English

Ælfric's Lives of Saints
West-Saxon Gospels
Apollonius of Tyre
Anglo-Saxon Chronicles
Orosius
Dictionary

Classical Armenian

The Armenian New Testament
Dictionary

Old French

La Vie Saint Eustace
Dictionary