• A speech corpus (or spoken corpus) is a database of speech audio files and text transcriptions. In speech technology, speech corpora are used, among other...
    5 KB (474 words) - 09:15, 19 April 2024
  • The BABEL speech corpus is a corpus of recorded speech materials from five Central and Eastern European languages. Intended for use in speech technology...
    7 KB (893 words) - 14:11, 1 March 2024
  • The Arabic Speech Corpus is a Modern Standard Arabic (MSA) speech corpus for speech synthesis. The corpus contains phonetic and orthographic transcriptions...
    4 KB (388 words) - 18:44, 27 July 2023
  • example of annotating a corpus is part-of-speech tagging, or POS-tagging, in which information about each word's part of speech (verb, noun, adjective...
    8 KB (879 words) - 07:37, 2 May 2024
  • The Brown University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples...
    9 KB (1,056 words) - 13:22, 29 February 2024
  • supervised by Eric Atwell. The annotated corpus includes: A manually verified part-of-speech tagged Quranic Arabic corpus. An annotated treebank of Quranic Arabic...
    6 KB (599 words) - 10:00, 22 April 2024
  • The Persian Speech Corpus is a Modern Persian speech corpus for speech synthesis. The corpus contains phonetic and orthographic transcriptions of about...
    3 KB (355 words) - 07:48, 10 May 2024
  • text of speech or writing that aim to represent a given linguistic variety. Today, corpora are generally machine-readable data collections. Corpus linguistics...
    23 KB (2,576 words) - 21:31, 8 May 2024
  • In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST), also called grammatical tagging, is the process of marking up a word...
    16 KB (2,266 words) - 02:30, 11 May 2024
  • structured set of texts Speech corpus, in linguistics, a large set of speech audio files Corpus linguistics, a branch of linguistics Corpus (album), by Sebastian...
    2 KB (315 words) - 06:44, 26 April 2024
  • user-defined part of speech) Note that the corpus is available only through the web interface, due to copyright restrictions. The corpus of Global Web-based...
    10 KB (1,135 words) - 07:01, 8 May 2024
  • linguists whose goal was a corpus of modern (at the time of building the corpus), naturally occurring language in the form of speech and text or writing that...
    31 KB (3,894 words) - 13:28, 29 February 2024
  • A child speech corpus is a speech corpus documenting first-language acquisition. Such databases are used in the development of computer-assisted...
    9 KB (714 words) - 19:58, 9 February 2024
  • The Switchboard Telephone Speech Corpus is a corpus of spoken English consisting of almost 260 hours of speech. It was created in 1990 by Texas...
    4 KB (453 words) - 14:58, 28 January 2024
  • The corpus callosum (Latin for "tough body"), also callosal commissure, is a wide, thick nerve tract, consisting of a flat bundle of commissural fibers...
    31 KB (3,605 words) - 21:27, 18 April 2024
  • Habeas corpus (/ˈheɪbiəs ˈkɔːrpəs/; from Medieval Latin, lit. 'that you have the body') is a recourse in law by which a report can be made to a court...
    75 KB (9,431 words) - 02:21, 23 March 2024
  • The Enron Corpus is a database of over 600,000 emails generated by 158 employees of the Enron Corporation in the years leading up to the company's collapse...
    7 KB (728 words) - 10:15, 10 March 2024
  • Spoken English Corpus (SEC) is a speech corpus collection of recordings of spoken British English compiled during 1984–1987. The corpus manual can be found...
    13 KB (1,278 words) - 04:41, 13 November 2023
  • The Oxford English Corpus (OEC) is a text corpus of 21st-century English, used by the makers of the Oxford English Dictionary and by Oxford University...
    4 KB (345 words) - 10:40, 19 November 2022
  • N-gram (category Corpus linguistics)
    extracted from a speech-recording dataset, or adjacent base pairs extracted from a genome. They are collected from a text corpus or speech corpus. If Latin numerical...
    8 KB (684 words) - 18:05, 15 February 2024
  • essential to compile a speech corpus to produce acoustic models for speech recognition projects. VoxForge is a free speech corpus and acoustic model repository...
    7 KB (798 words) - 14:26, 14 March 2023
  • TIMIT (category Speech recognition)
    TIMIT is a corpus of phonemically and lexically transcribed speech of American English speakers of different sexes and dialects. Each transcribed element...
    4 KB (561 words) - 09:25, 19 April 2024
  • then assess the realization of linguistic variables in the resulting speech corpus. Other research methods in sociolinguistics include matched-guise tests...
    34 KB (4,100 words) - 08:20, 29 April 2024
  • Washington. EARS funded the collection of the Switchboard telephone speech corpus containing 260 hours of recorded conversations from over 500 speakers...
    113 KB (12,457 words) - 20:58, 3 May 2024
  • The corpus has also been tagged, i.e. part-of-speech categories have been assigned to every word.[citation needed] LOB Corpus Manual LOB Corpus Manual...
    2 KB (151 words) - 15:51, 29 February 2024
  • Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. Here is a listing of such...
    12 KB (841 words) - 19:25, 1 April 2024
  • included in earlier corpora such as the British National Corpus. It is annotated for part of speech and lemma, shallow parse, and named entities. The ANC...
    5 KB (605 words) - 13:14, 3 February 2023
  • Speech is human vocal communication using language. Each language uses phonetic combinations of vowel and consonant sounds that form the sound of its...
    30 KB (3,449 words) - 11:42, 23 April 2024
  • non-native speech, only attributes of the non-native part of the corpus are listed. Most of the corpora are collections of read speech. If the corpus instead...
    15 KB (1,419 words) - 22:59, 5 May 2022
  • Corpus callosotomy is a palliative surgical procedure for the treatment of medically refractory epilepsy. In this procedure the corpus callosum is cut...
    16 KB (1,669 words) - 21:46, 13 March 2024