• Transformer (deep learning architecture)
    The transformer is a deep learning architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called...
    106 KB (13,108 words) - 21:15, 5 June 2025
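The snippet above names the multi-head attention mechanism at the core of the transformer. A minimal single-head sketch in plain Python (names and shapes are illustrative, not the article's implementation) computes softmax(QKᵀ/√d)V:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention for one head.

    Q, K, V are lists of d-dimensional vectors (lists of floats).
    Returns one output vector per query: softmax(QK^T / sqrt(d)) V.
    """
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Multi-head attention runs several such heads on learned projections of Q, K, V and concatenates the results; the projections are omitted here for brevity.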
  • speech processing. Language modeling · Transformer (machine learning model) · State-space model · Recurrent neural network. The name comes from...
    11 KB (1,159 words) - 19:42, 16 April 2025
  • In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable...
    84 KB (14,123 words) - 01:54, 6 June 2025
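The diffusion-model entry describes a class of latent-variable generative models. A common formulation (an assumption here, since the snippet is truncated) defines a forward noising process with schedule βₜ and ᾱₜ = ∏ₛ(1−βₛ), so that xₜ can be sampled in closed form from x₀:

```python
import math
import random

def alpha_bar_schedule(betas):
    """Cumulative product abar_t = prod_{s<=t} (1 - beta_s) for a noise schedule."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out

def noise_sample(x0, abar_t, rng=random):
    """Closed-form forward process:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps,  eps ~ N(0, I)."""
    return [math.sqrt(abar_t) * x + math.sqrt(1.0 - abar_t) * rng.gauss(0.0, 1.0)
            for x in x0]
```

A score/denoising network is then trained to predict the added noise; that training loop is beyond this sketch.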
  • Vision transformer
    of 1.6 exaFLOPs. Transformer (machine learning model) · Convolutional neural network · Attention (machine learning) · Perceiver · Deep learning · PyTorch · TensorFlow...
    37 KB (4,127 words) - 20:13, 29 April 2025
  • language model (LLM) is a machine learning model designed for natural language processing tasks, especially language generation. LLMs are language models with...
    113 KB (11,798 words) - 13:02, 5 June 2025
  • Generative pre-trained transformer
    that is used in natural language processing by machines. It is based on the transformer deep learning architecture, pre-trained on large data sets of...
    65 KB (5,278 words) - 15:49, 30 May 2025
  • and Survey. Attention (machine learning) · Transformer (machine learning model) · Seq2seq. Koehn, Philipp (2020). Neural Machine Translation. Cambridge University...
    36 KB (3,901 words) - 17:39, 23 May 2025
  • Attention (machine learning)
    2021. Zhang, Ruiqi (2024). "Trained Transformers Learn Linear Models In-Context" (PDF). Journal of Machine Learning Research. 25: 1–55. arXiv:2306.09927...
    35 KB (3,425 words) - 14:56, 8 June 2025
  • Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn...
    140 KB (15,571 words) - 05:51, 9 June 2025
  • (2023), and Muse (2023). Unlike later models, DALL-E is not a diffusion model. Instead, it uses a decoder-only Transformer that autoregressively generates a...
    9 KB (2,212 words) - 22:40, 1 June 2025
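The DALL-E snippet says a decoder-only transformer "autoregressively generates" output. The core of any such decoder, regardless of the model inside, is a loop that repeatedly feeds the growing sequence back in; a minimal greedy-decoding sketch (with a caller-supplied `step_fn` standing in for the transformer, an assumption for illustration):

```python
def generate(step_fn, prompt, max_new_tokens):
    """Greedy autoregressive decoding.

    step_fn(tokens) -> list of logits over the vocabulary for the next token.
    Appends the argmax token each step and feeds the sequence back in.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = step_fn(tokens)
        next_tok = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_tok)
    return tokens
```

Real systems usually sample from the logits (temperature, top-k, nucleus) instead of taking the argmax; greedy decoding keeps the sketch deterministic.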
  • includes every stage from beginning with a raw dataset to building a machine learning model ready for deployment. AutoML was proposed as an artificial intelligence-based...
    9 KB (1,046 words) - 02:47, 26 May 2025
  • Transfer Transformer) is a series of large language models developed by Google AI introduced in 2019. Like the original Transformer model, T5 models are encoder-decoder...
    20 KB (1,932 words) - 03:55, 7 May 2025
  • Attention Is All You Need
    in machine learning authored by eight scientists working at Google. The paper introduced a new deep learning architecture known as the transformer, based...
    15 KB (3,910 words) - 20:36, 1 May 2025
  • reward model to represent preferences, which can then be used to train other models through reinforcement learning. In classical reinforcement learning, an...
    62 KB (8,617 words) - 19:50, 11 May 2025
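The snippet above mentions training a reward model to represent preferences. The snippet does not name the loss, but a standard choice for pairwise preference data is the Bradley–Terry objective, −log σ(r_chosen − r_rejected):

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Pairwise Bradley-Terry loss for reward-model training:
    -log sigmoid(r_chosen - r_rejected), written in the numerically
    equivalent form log(1 + exp(-(margin)))."""
    margin = r_chosen - r_rejected
    return math.log(1.0 + math.exp(-margin))
```

The loss is log 2 when the reward model is indifferent and shrinks as the model scores the preferred response higher, which is what drives it toward the labeled preferences.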
  • Ensemble learning trains two or more machine learning algorithms on a specific classification or regression task. The algorithms within the ensemble model are...
    53 KB (6,685 words) - 14:14, 8 June 2025
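The ensemble-learning entry describes training several algorithms on one task and combining them. The simplest combination rule for classification, hard majority voting, can be sketched in a few lines (a generic illustration, not any particular library's API):

```python
from collections import Counter

def majority_vote(predictions):
    """Hard-voting ensemble.

    predictions: list of per-model label lists, all the same length
    (predictions[m][i] is model m's label for sample i).
    Returns the most common label per sample.
    """
    n_samples = len(predictions[0])
    return [Counter(model[i] for model in predictions).most_common(1)[0][0]
            for i in range(n_samples)]
```

Regression ensembles typically average instead; soft voting averages predicted probabilities before taking the argmax.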
  • common attacks in adversarial machine learning include evasion attacks, data poisoning attacks, Byzantine attacks and model extraction. At the MIT Spam...
    69 KB (7,819 words) - 08:26, 24 May 2025
  • approaches. Whisper is a weakly-supervised deep learning acoustic model, made using an encoder-decoder transformer architecture. Whisper Large V2 was released...
    15 KB (1,613 words) - 00:22, 7 April 2025
  • Self-supervised learning (SSL) is a paradigm in machine learning where a model is trained on a task using the data itself to generate supervisory signals...
    18 KB (2,047 words) - 12:49, 25 May 2025
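The self-supervised learning snippet says the data itself generates the supervisory signals. A masked-prediction pretext task is one concrete instance: corrupt part of the input and use the withheld part as the target. A small sketch (the `[MASK]` token and 15% rate are common conventions, assumed here, not taken from the article):

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, rng=random):
    """Build a masked-prediction training pair from unlabeled text.

    Masked positions carry the original token as the target;
    unmasked positions have target None (no loss computed there).
    """
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(None)
    return inputs, targets
```

No human labels are needed: the (inputs, targets) pairs come entirely from the raw text, which is the defining property of the paradigm.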
  • Hasbro Transformers: The Ride 3D, theme park rides located in several Universal Studios parks · Transformer (machine learning model) · Transformer (disambiguation)...
    2 KB (226 words) - 22:13, 5 February 2025
  • Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer model of...
    55 KB (4,923 words) - 20:03, 12 May 2025
  • model (LLM) is a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with...
    64 KB (3,361 words) - 16:05, 24 May 2025
  • simplification · Transformer (machine learning model) · Truecasing · Question answering · Word2vec. "NLP". Hutchins, J. (2005). "The history of machine translation...
    54 KB (6,592 words) - 04:13, 4 June 2025
  • In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms...
    65 KB (9,071 words) - 06:34, 24 May 2025
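The SVM entry calls them supervised max-margin models. The soft-margin objective they minimize, ‖w‖²/2 + C·Σ max(0, 1 − yᵢ(w·xᵢ + b)), can be written directly (a plain evaluation of the objective, not a trained solver):

```python
def hinge_objective(w, b, X, y, C=1.0):
    """Soft-margin SVM objective: ||w||^2 / 2 + C * sum of hinge losses.

    X is a list of feature vectors, y the matching list of +1/-1 labels.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack = sum(max(0.0, 1.0 - yi * (sum(wi * xi for wi, xi in zip(w, x)) + b))
                for x, yi in zip(X, y))
    return margin_term + C * slack
```

Points with margin ≥ 1 contribute zero hinge loss, so only points near or across the decision boundary (the support vectors) shape the solution.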
  • pre-trained transformer (GPT) models and is fine-tuned for conversational applications using a combination of supervised learning and reinforcement learning from...
    197 KB (16,777 words) - 10:43, 8 June 2025
  • GPT-2
    Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained...
    44 KB (3,264 words) - 01:17, 16 May 2025
  • Deep learning
    Deep learning is a subset of machine learning that focuses on utilizing multilayered neural networks to perform tasks such as classification, regression...
    180 KB (17,772 words) - 15:04, 30 May 2025
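The deep-learning entry describes multilayered neural networks used for classification and regression. The forward pass of such a network is just alternating affine maps and nonlinearities; a tiny sketch with ReLU between layers (layer shapes and the ReLU choice are illustrative assumptions):

```python
def relu(v):
    """Elementwise rectified linear unit."""
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    """Affine map: W x + b, with W a list of weight rows."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def mlp_forward(x, layers):
    """Forward pass through a multilayer perceptron.

    layers: list of (W, b) pairs; ReLU between layers, linear output.
    """
    for i, (W, b) in enumerate(layers):
        x = linear(W, b, x)
        if i < len(layers) - 1:
            x = relu(x)
    return x
```

"Deep" refers to stacking many such layers; training fits the (W, b) parameters by gradient descent on a loss over the network's outputs.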
  • Changliang; Wong, Derek F.; Chao, Lidia S. (2019). "Learning Deep Transformer Models for Machine Translation". arXiv:1906.01787 [cs.CL]. Xiong, Ruibin;...
    35 KB (5,361 words) - 06:41, 9 June 2025
  • which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain. In setting a learning rate...
    9 KB (1,108 words) - 10:15, 30 April 2024
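The learning-rate snippet describes the step-size parameter (the "gain") governing how a model learns. Its role is clearest in plain gradient descent, x ← x − lr·∇f(x); a minimal sketch:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Plain gradient descent on a scalar parameter.

    grad: function returning df/dx at x.  The learning rate lr scales
    each update; too large diverges, too small converges slowly.
    """
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x
```

For f(x) = (x − 3)², each step shrinks the error by a factor (1 − 2·lr), so lr = 0.1 converges geometrically while lr > 1 would overshoot and diverge.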
  • information retrieval. Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently...
    16 KB (2,383 words) - 06:50, 4 June 2025
  • intelligence (AI), a foundation model (FM), also known as large X model (LxM), is a machine learning or deep learning model trained on vast datasets so that...
    44 KB (4,719 words) - 15:41, 30 May 2025