In linguistics and natural language processing, a corpus (pl.: corpora) or text corpus is a dataset, consisting of natively digital and older, digitalized...
8 KB (858 words) - 09:48, 14 November 2024
begin being deciphered. Large collections of parallel texts are called parallel corpora (see text corpus). Alignments of parallel corpora at sentence level...
12 KB (1,182 words) - 13:40, 27 July 2024
Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). Corpora are balanced, often stratified collections...
20 KB (2,335 words) - 01:45, 24 May 2025
Habeas corpus (/ˈheɪbiəs ˈkɔːrpəs/; from Medieval Latin, lit. 'you should have the body') is a legal procedure by which a report can be made to a court...
76 KB (9,363 words) - 12:59, 25 May 2025
The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between...
3 KB (230 words) - 02:09, 26 March 2025
Text corpora (singular: text corpus) are large and structured sets of texts, which have been systematically collected. Text corpora are used by both AI...
23 KB (2,470 words) - 10:37, 24 May 2025
The Neo-Assyrian Text Corpus Project is an international scholarly project aimed at collecting and publishing ancient Assyrian texts of the Neo-Assyrian...
10 KB (117 words) - 00:24, 25 February 2025
The Brown University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples of American...
11 KB (1,270 words) - 02:43, 26 March 2025
The Electronic Text Corpus of Sumerian Literature (ETCSL) is an online digital library of texts and translations of Sumerian literature that was created...
4 KB (368 words) - 11:40, 17 March 2024
The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British...
31 KB (3,894 words) - 01:18, 14 June 2024
Spanish. Each estimate comes from an analysis of a different text corpus. A text corpus is a large collection of samples of written and/or spoken language...
27 KB (750 words) - 08:51, 5 May 2025
The Oxford English Corpus (OEC) is a text corpus of 21st-century English, used by the makers of the Oxford English Dictionary and by Oxford University...
4 KB (348 words) - 21:01, 11 January 2025
The AsoSoft text corpus is the first large-scale Kurdish text corpus, collected and processed by the AsoSoft research and development group. It contains...
1 KB (132 words) - 18:09, 24 November 2023
is also called the corpus cavernosum urethrae in older texts. The proximal part of the corpus spongiosum is expanded to form the urethral bulb, and lies...
4 KB (405 words) - 05:31, 2 May 2025
2019, the corpus had grown to 560 million words. As of November 2021, the Corpus of Contemporary American English is composed of 485,202 texts. According...
9 KB (1,135 words) - 14:04, 24 May 2025
The Scottish Corpus of Texts & Speech (SCOTS) is an ongoing project to build a corpus of modern-day (post-1940) written and spoken texts in Scottish English...
3 KB (349 words) - 01:27, 28 May 2025
"The Electronic Text Corpus of Sumerian Literature". Etcsl.orinst.ox.ac.uk. Retrieved 30 December 2018. "The Electronic Text Corpus of Sumerian Literature"...
20 KB (2,183 words) - 14:28, 28 April 2025
Look up corpus, corpora, or corpuses in Wiktionary, the free dictionary. Corpus (plural corpora) is Latin for "body". It may refer to: Text corpus, in linguistics...
2 KB (317 words) - 00:15, 8 March 2025
Corpus (OEC), a massive text corpus that is written in the English language. In total, the texts in the Oxford English Corpus contain more than 2 billion...
16 KB (872 words) - 06:35, 28 April 2025
The Feast of Corpus Christi (Ecclesiastical Latin: Dies Sanctissimi Corporis et Sanguinis Domini Iesu Christi, lit. 'Day of the Most Holy Body and Blood...
48 KB (5,128 words) - 10:18, 15 April 2025
Word list (category Articles lacking in-text citations from December 2023)
analysis within a given text corpus, and is used in corpus linguistics to investigate genealogies and evolution of languages and texts. A word which appears...
27 KB (2,849 words) - 03:54, 27 May 2025
Corpus of Sumerian Literature. Archived from the original on 2012-05-15. Retrieved 2010-02-20. "A balbale to Nanna (Nanna B)". Electronic Text Corpus...
40 KB (4,064 words) - 00:32, 25 May 2025
The American National Corpus (ANC) is a text corpus of American English containing 22 million words of written and spoken data produced since 1990. Currently...
5 KB (605 words) - 10:56, 26 January 2025
The Corpus Hermeticum is a collection of 17 Greek writings whose authorship is traditionally attributed to the legendary Hellenistic figure Hermes Trismegistus...
11 KB (1,200 words) - 13:20, 14 March 2025
The Corpus Juris (or Iuris) Civilis ("Body of Civil Law") is the modern name for a collection of fundamental works in jurisprudence, enacted from 529 to...
22 KB (2,736 words) - 20:39, 8 May 2025
Sumerian literature (redirect from Sumerian texts)
Sumerian literature constitutes the earliest known corpus of recorded literature, including the religious writings and other traditional stories maintained...
9 KB (1,026 words) - 04:33, 26 October 2024
The corpus callosum (Latin for "tough body"), also callosal commissure, is a wide, thick nerve tract, consisting of a flat bundle of commissural fibers...
32 KB (3,648 words) - 12:02, 6 February 2025
Lydian language (category Articles containing Ancient Greek (to 1453)-language text)
Dictionary of the Ancient Anatolian Corpus Languages (eDiAna)". Ludwig-Maximilians-Universität München. Lydian Corpus Palaeolexicon - Word study tool of...
44 KB (3,541 words) - 17:01, 28 May 2025
The Habeas Corpus Suspension Act, 12 Stat. 755 (1863), entitled An Act relating to Habeas Corpus, and regulating Judicial Proceedings in Certain Cases...
37 KB (4,826 words) - 15:24, 11 May 2025
This is a list of Amarna letters–Text corpus, categorized by: Amarna letters–localities and their rulers. It includes countries, regions, and the cities...
9 KB (156 words) - 22:24, 24 October 2024