  • In linguistics and natural language processing, a corpus (pl.: corpora) or text corpus is a dataset, consisting of natively digital and older, digitalized...
    8 KB (858 words) - 09:48, 14 November 2024
  • Parallel text
    begin being deciphered. Large collections of parallel texts are called parallel corpora (see text corpus). Alignments of parallel corpora at sentence level...
    12 KB (1,182 words) - 13:40, 27 July 2024
  • Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). Corpora are balanced, often stratified collections...
    20 KB (2,335 words) - 01:45, 24 May 2025
  • British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British...
    31 KB (3,894 words) - 01:18, 14 June 2024
  • The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between...
    3 KB (230 words) - 02:09, 26 March 2025
  • The Oxford English Corpus (OEC) is a text corpus of 21st-century English, used by the makers of the Oxford English Dictionary and by Oxford University...
    4 KB (348 words) - 21:01, 11 January 2025
  • The Neo-Assyrian Text Corpus Project is an international scholarly project aimed at collecting and publishing ancient Assyrian texts of the Neo-Assyrian...
    10 KB (117 words) - 00:24, 25 February 2025
  • Text corpora (singular: text corpus) are large and structured sets of texts, which have been systematically collected. Text corpora are used by both AI...
    23 KB (2,470 words) - 10:37, 24 May 2025
  • Electronic Text Corpus of Sumerian Literature
    The Electronic Text Corpus of Sumerian Literature (ETCSL) is an online digital library of texts and translations of Sumerian literature that was created...
    4 KB (368 words) - 11:40, 17 March 2024
  • The American National Corpus (ANC) is a text corpus of American English containing 22 million words of written and spoken data produced since 1990. Currently...
    5 KB (605 words) - 10:56, 26 January 2025
  • The Scottish Corpus of Texts & Speech (SCOTS) is an ongoing project to build a corpus of modern-day (post-1940) written and spoken texts in Scottish English...
    3 KB (349 words) - 01:27, 28 May 2025
  • Brown Corpus
    University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples of American...
    11 KB (1,270 words) - 02:43, 26 March 2025
  • The AsoSoft text corpus is the first large-scale Kurdish text corpus, collected and processed by the AsoSoft research and development group. It contains...
    1 KB (132 words) - 18:09, 24 November 2023
  • Corpus spongiosum
    is also called the corpus cavernosum urethrae in older texts. The proximal part of the corpus spongiosum is expanded to form the urethral bulb, and lies...
    4 KB (405 words) - 03:40, 3 June 2025
  • Spanish. Each estimate comes from an analysis of a different text corpus. A text corpus is a large collection of samples of written and/or spoken language...
    27 KB (750 words) - 08:51, 5 May 2025
  • Corpus (plural corpora) is Latin for "body". It may refer to: Text corpus, in linguistics...
    2 KB (317 words) - 00:15, 8 March 2025
  • Aratta
    "The Electronic Text Corpus of Sumerian Literature". Etcsl.orinst.ox.ac.uk. Retrieved 30 December 2018. "The Electronic Text Corpus of Sumerian Literature"...
    20 KB (2,183 words) - 14:28, 28 April 2025
  • 2019, the corpus had grown to 560 million words. As of November 2021, the Corpus of Contemporary American English is composed of 485,202 texts. According...
    9 KB (1,135 words) - 14:04, 24 May 2025
  • AleAhmad built on this corpus and created the first Persian text collection suitable for information retrieval evaluation tasks. This corpus was created by crawling...
    3 KB (333 words) - 17:46, 28 October 2024
  • Corpus (OEC), a massive text corpus that is written in the English language. In total, the texts in the Oxford English Corpus contain more than 2 billion...
    16 KB (872 words) - 06:35, 28 April 2025
  • The Corpus of Electronic Texts, or CELT, is an online database of contemporary and historical documents relating to Irish history and culture. As of 8...
    3 KB (214 words) - 22:31, 24 February 2024
  • Feast of Corpus Christi
    The Feast of Corpus Christi (Ecclesiastical Latin: Dies Sanctissimi Corporis et Sanguinis Domini Iesu Christi, lit. 'Day of the Most Holy Body and Blood...
    48 KB (5,128 words) - 10:18, 15 April 2025
  • This is a list of Amarna letters–Text corpus, categorized by: Amarna letters–localities and their rulers. It includes countries, regions, and the cities...
    9 KB (156 words) - 22:24, 24 October 2024
  • Sumerian religion
    Corpus of Sumerian Literature. Archived from the original on 2012-05-15. Retrieved 2010-02-20. "A balbale to Nanna (Nanna B)". Electronic Text Corpus...
    40 KB (4,064 words) - 00:32, 25 May 2025
  • transliterations and translations of texts in a given corpus, and many offer supplementary material such as an introduction to the corpus, discussion of its historical...
    10 KB (399 words) - 23:07, 12 May 2024
  • Habeas corpus (/ˈheɪbiəs ˈkɔːrpəs/; from Medieval Latin, lit. 'you should have the body') is a legal procedure by which a report can be made to a court...
    76 KB (9,363 words) - 12:59, 25 May 2025
  • The Europarl Corpus is a corpus (set of documents) that consists of the proceedings of the European Parliament from 1996 to 2012. In its first release...
    6 KB (800 words) - 11:02, 15 September 2022
  • Word list
    analysis within a given text corpus, and is used in corpus linguistics to investigate genealogies and evolution of languages and texts. A word which appears...
    27 KB (2,849 words) - 03:54, 27 May 2025
  • Corpus Juris Civilis
    The Corpus Juris (or Iuris) Civilis ("Body of Civil Law") is the modern name for a collection of fundamental works in jurisprudence, enacted from 529 to...
    22 KB (2,736 words) - 20:39, 8 May 2025
  • Corpus callosum
    The corpus callosum (Latin for "tough body"), also callosal commissure, is a wide, thick nerve tract, consisting of a flat bundle of commissural fibers...
    32 KB (3,648 words) - 03:07, 2 June 2025