German Reference Corpus Search Results

German Reference Corpus

The German Reference Corpus (original: Deutsches Referenzkorpus; short: DeReKo) is an electronic archive of text corpora of contemporary written German....

4 KB (537 words) - 20:49, 27 January 2023

Lancaster-Oslo-Bergen Corpus

The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between...

3 KB (230 words) - 02:09, 26 March 2025

Corpus linguistics

Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). Corpora are balanced, often stratified collections...

20 KB (2,335 words) - 10:40, 25 June 2025

Brown Corpus

The Brown University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples...

11 KB (1,270 words) - 02:43, 26 March 2025

Enron Corpus

The Enron Corpus is a database of over 600,000 emails generated by 158 employees of the Enron Corporation in the years leading up to the company's collapse...

7 KB (725 words) - 03:40, 16 April 2025

Oxford English Corpus

The Oxford English Corpus (OEC) is a text corpus of 21st-century English, used by the makers of the Oxford English Dictionary and by Oxford University...

4 KB (348 words) - 21:01, 11 January 2025

Cambridge English Corpus

The Cambridge International Corpus (CIC) is a collection of over 2 billion words of real spoken and written English. The texts are stored in a database...

8 KB (1,028 words) - 00:21, 18 January 2025

Europarl Corpus

The Europarl Corpus is a corpus (set of documents) that consists of the proceedings of the European Parliament from 1996 to 2012. In its first release...

6 KB (800 words) - 11:02, 15 September 2022

Habeas corpus

and corpus, accusative singular of corpus "body". In reference to more than one person, the phrase is habeas corpora. The writ of habeas corpus was described...

67 KB (8,173 words) - 23:55, 20 July 2025

PropBank (redirect from PropBank Corpus)

is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank". Although "PropBank" refers to a specific corpus produced...

4 KB (390 words) - 18:00, 28 June 2025

British National Corpus

The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British...

31 KB (3,894 words) - 01:18, 14 June 2024

American National Corpus

The American National Corpus (ANC) is a text corpus of American English containing 22 million words of written and spoken data produced since 1990. Currently...

5 KB (605 words) - 10:56, 26 January 2025

Corpus of Contemporary American English

The Corpus of Contemporary American English (COCA) is a one-billion-word corpus of contemporary American English. It was created by Mark Davies, retired...

9 KB (1,135 words) - 14:04, 24 May 2025

Switchboard Telephone Speech Corpus

The Switchboard Telephone Speech Corpus is a corpus of spoken English consisting of almost 260 hours of speech. It was created in 1990 by Texas...

4 KB (459 words) - 18:16, 28 June 2025

Bank of English (category Corpus linguistics stubs)

French, German and Spanish corpora. Corpus of Contemporary American English (COCA); British National Corpus (BNC); The Collins Corpus; COBUILD...

1 KB (153 words) - 18:12, 28 June 2025

Thesaurus Linguae Graecae

lemmatization of the Greek corpus (2006) – a substantial undertaking, given the highly inflected nature of Greek and the complexity of the corpus, covering more than...

5 KB (599 words) - 20:04, 26 August 2024

COBUILD (category Articles lacking reliable references from December 2023)

have been the creation and analysis of an electronic corpus of contemporary text, the Collins Corpus, later leading to the development of the Bank of English...

2 KB (181 words) - 18:11, 28 June 2025

TenTen Corpus Family

(Danish web corpus), deTenTen (German web corpus), elTenTen (Greek web corpus), enTenTen (English web corpus), esTenTen (Spanish web corpus with European/American...

12 KB (1,204 words) - 06:39, 22 November 2024

Czech National Corpus

The Czech National Corpus (CNC) (Czech: Český národní korpus) is a large electronic corpus of written and spoken Czech language, developed by the Institute...

4 KB (466 words) - 11:24, 12 July 2025

Sketch Engine (category Corpus linguistics)

Sketch Engine is a corpus manager and text analysis software developed by Lexical Computing since 2003. Its purpose is to enable people studying language...

16 KB (1,437 words) - 13:48, 10 July 2025

VerbNet (category Corpus linguistics stubs)

Corpus; German Reference Corpus; Hamshahri Corpus; National Corpus of Polish; Neo-Assyrian Text Corpus Project; Persian Speech Corpus; Quranic Arabic Corpus; Russian...

1 KB (96 words) - 02:16, 17 May 2025

Feast of Corpus Christi

The Feast of Corpus Christi (Ecclesiastical Latin: Dies Sanctissimi Corporis et Sanguinis Domini Iesu Christi, lit. 'Day of the Most Holy Body and Blood...

48 KB (5,104 words) - 16:30, 12 July 2025

Quranic Arabic Corpus

The Quranic Arabic Corpus (Arabic: المدونة القرآنية العربية, romanized: al-modwana al-Qurʾāni al-ʿArabiyya) is an annotated linguistic resource consisting...

6 KB (599 words) - 01:25, 28 March 2025

Arabic Speech Corpus

The Arabic Speech Corpus is a Modern Standard Arabic (MSA) speech corpus for speech synthesis. The corpus contains phonetic and orthographic transcriptions...

4 KB (388 words) - 18:44, 27 July 2023

Bijankhan Corpus

The Bijankhan corpus (Persian: پیکرهٔ بی‌جن‌خان) is a tagged corpus that is suitable for natural language processing (NLP) research on the Persian language...

2 KB (158 words) - 12:41, 15 June 2025

Russian National Corpus

The Russian National Corpus (Russian: Национальный корпус русского языка, lit. 'National Corpus of the Russian Language') is a corpus of the Russian language...

4 KB (379 words) - 18:21, 29 October 2024

Tatoeba

lexicographic references for language learners. The JMdict Japanese-English dictionary selects its example sentences from the Tatoeba Corpus. OpenRussian...

23 KB (2,075 words) - 19:12, 23 June 2025

List of text corpora (category Corpus linguistics)

Corpus; Slovenian National Corpus; Czech National Corpus; National Corpus of Polish; Slovak National Corpora; German Reference Corpus (DeReKo): more than 4 billion...

23 KB (2,460 words) - 20:27, 20 June 2025

International Corpus of English

The International Corpus of English (ICE) is a set of text corpora representing varieties of English from around the world. Over twenty countries or groups...

11 KB (1,229 words) - 00:56, 27 February 2025

Hamshahri Corpus

The Hamshahri Corpus (Persian: پیکره همشهری) is a sizable Persian corpus based on the Iranian newspaper Hamshahri, one of the first online Persian-language...

3 KB (327 words) - 20:27, 20 June 2025