The transformer is a deep learning architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called...
106 KB (13,108 words) - 21:15, 5 June 2025
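To make the attention mechanism named in this snippet concrete, here is a minimal single-head sketch of scaled dot-product attention in NumPy. Feeding the same input as queries, keys, and values, and omitting the learned projection matrices and multiple heads, are simplifying assumptions for illustration, not the full architecture the article describes.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # token-to-token similarities
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of value vectors

# Toy self-attention over 4 tokens with 8-dimensional representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)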
speech processing[citation needed]. See also: Language modeling, Transformer (machine learning model), State-space model, Recurrent neural network. The name comes from...
11 KB (1,159 words) - 19:42, 16 April 2025
In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable...
84 KB (14,123 words) - 01:54, 6 June 2025
of 1.6 exaFLOPs. See also: Transformer (machine learning model), Convolutional neural network, Attention (machine learning), Perceiver, Deep learning, PyTorch, TensorFlow...
37 KB (4,127 words) - 20:13, 29 April 2025
language model (LLM) is a machine learning model designed for natural language processing tasks, especially language generation. LLMs are language models with...
113 KB (11,798 words) - 13:02, 5 June 2025
that is used in natural language processing by machines. It is based on the transformer deep learning architecture, pre-trained on large data sets of...
65 KB (5,278 words) - 15:49, 30 May 2025
and Survey. See also: Attention (machine learning), Transformer (machine learning model), Seq2seq. Koehn, Philipp (2020). Neural Machine Translation. Cambridge University...
36 KB (3,901 words) - 17:39, 23 May 2025
2021. Zhang, Ruiqi (2024). "Trained Transformers Learn Linear Models In-Context" (PDF). Journal of Machine Learning Research. 25: 1–55. arXiv:2306.09927...
35 KB (3,425 words) - 14:56, 8 June 2025
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn...
140 KB (15,571 words) - 05:51, 9 June 2025
(2023), and Muse (2023). Unlike later models, DALL-E is not a diffusion model. Instead, it uses a decoder-only Transformer that autoregressively generates a...
9 KB (2,212 words) - 22:40, 1 June 2025
includes every stage, from the initial raw dataset to a machine learning model ready for deployment. AutoML was proposed as an artificial intelligence-based...
9 KB (1,046 words) - 02:47, 26 May 2025
Transfer Transformer) is a series of large language models developed by Google AI introduced in 2019. Like the original Transformer model, T5 models are encoder-decoder...
20 KB (1,932 words) - 03:55, 7 May 2025
in machine learning authored by eight scientists working at Google. The paper introduced a new deep learning architecture known as the transformer, based...
15 KB (3,910 words) - 20:36, 1 May 2025
reward model to represent preferences, which can then be used to train other models through reinforcement learning. In classical reinforcement learning, an...
62 KB (8,617 words) - 19:50, 11 May 2025
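As a rough sketch of how a reward model can represent preferences, such models are often fit with a pairwise Bradley–Terry objective: the response a human preferred should receive a higher scalar reward than the rejected one. The scalar inputs below stand in for a reward network's outputs; this is an illustrative assumption, not the specific method of the article above.

import math

def preference_loss(r_chosen, r_rejected):
    # -log sigmoid(r_chosen - r_rejected): small when the preferred
    # response already outscores the rejected one, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(preference_loss(2.0, 0.5))  # ~0.20: preference respected
print(preference_loss(0.5, 2.0))  # ~1.70: preference violated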
Ensemble learning trains two or more machine learning algorithms on a specific classification or regression task. The algorithms within the ensemble model are...
53 KB (6,685 words) - 14:14, 8 June 2025
common attacks in adversarial machine learning include evasion attacks, data poisoning attacks, Byzantine attacks and model extraction. At the MIT Spam...
69 KB (7,819 words) - 08:26, 24 May 2025
approaches. Whisper is a weakly supervised deep learning acoustic model built on an encoder-decoder transformer architecture. Whisper Large V2 was released...
15 KB (1,613 words) - 00:22, 7 April 2025
Self-supervised learning (SSL) is a paradigm in machine learning where a model is trained on a task using the data itself to generate supervisory signals...
18 KB (2,047 words) - 12:49, 25 May 2025
Hasbro; Transformers: The Ride 3D, theme park rides located in several Universal Studios parks; Transformer (machine learning model); Transformer (disambiguation)...
2 KB (226 words) - 22:13, 5 February 2025
GPT-3 (redirect from Generative Pre-trained Transformer 3)
Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer model of...
55 KB (4,923 words) - 20:03, 12 May 2025
model (LLM) is a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with...
64 KB (3,361 words) - 16:05, 24 May 2025
simplification, Transformer (machine learning model), Truecasing, Question answering, Word2vec. "NLP". Hutchins, J. (2005). "The history of machine translation...
54 KB (6,592 words) - 04:13, 4 June 2025
In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms...
65 KB (9,071 words) - 06:34, 24 May 2025
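A minimal usage sketch of such a max-margin classifier, assuming scikit-learn is available; the toy data and the linear kernel are arbitrary choices for illustration.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters as toy training data.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear")  # fits the maximum-margin separating hyperplane
clf.fit(X, y)
print(clf.predict([[0.5, 0.5], [3.0, 3.5]]))  # expected: [0 1]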
ChatGPT (redirect from Chat Generative Pre-trained Transformer)
pre-trained transformer (GPT) models and is fine-tuned for conversational applications using a combination of supervised learning and reinforcement learning from...
197 KB (16,777 words) - 10:43, 8 June 2025
GPT-2 (redirect from Generative Pre-trained Transformer 2)
Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained...
44 KB (3,264 words) - 01:17, 16 May 2025
Deep learning is a subset of machine learning that focuses on utilizing multilayered neural networks to perform tasks such as classification, regression...
180 KB (17,772 words) - 15:04, 30 May 2025
Changliang; Wong, Derek F.; Chao, Lidia S. (2019). "Learning Deep Transformer Models for Machine Translation". arXiv:1906.01787 [cs.CL]. Xiong, Ruibin;...
35 KB (5,361 words) - 06:41, 9 June 2025
which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain. In setting a learning rate...
9 KB (1,108 words) - 10:15, 30 April 2024
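As a toy illustration of the step-size role described in this snippet, the sketch below runs plain gradient descent on f(x) = x**2, where the learning rate scales every update; the function and the value 0.1 are assumptions chosen purely for demonstration.

def grad(x):
    return 2.0 * x  # derivative of f(x) = x**2

x = 5.0
learning_rate = 0.1  # the "gain": how far each update step moves
for _ in range(50):
    x -= learning_rate * grad(x)  # step against the gradient
print(f"{x:.6f}")  # approaches the minimum at x = 0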
information retrieval. Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently...
16 KB (2,383 words) - 06:50, 4 June 2025
intelligence (AI), a foundation model (FM), also known as large X model (LxM), is a machine learning or deep learning model trained on vast datasets so that...
44 KB (4,719 words) - 15:41, 30 May 2025