The transformer is a deep learning architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called...
106 KB (13,108 words) - 21:15, 5 June 2025
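To make the attention mechanism named in this snippet concrete, here is a minimal single-head sketch of scaled dot-product attention in NumPy. Feeding the same input as queries, keys, and values, and omitting the learned projection matrices and multiple heads, are simplifying assumptions for illustration, not the full architecture the article describes.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # token-to-token similarities
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of value vectors

# Toy self-attention over 4 tokens with 8-dimensional representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)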
speech processing[citation needed]. See also: Language modeling, Transformer (machine learning model), State-space model, Recurrent neural network. The name comes from...
11 KB (1,159 words) - 19:42, 16 April 2025
In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable...
84 KB (14,123 words) - 01:54, 6 June 2025
of 1.6 exaFLOPs. See also: Transformer (machine learning model), Convolutional neural network, Attention (machine learning), Perceiver, Deep learning, PyTorch, TensorFlow...
37 KB (4,127 words) - 20:13, 29 April 2025
language model (LLM) is a machine learning model designed for natural language processing tasks, especially language generation. LLMs are language models with...
113 KB (11,798 words) - 13:02, 5 June 2025
that is used in natural language processing by machines. It is based on the transformer deep learning architecture, pre-trained on large data sets of...
65 KB (5,278 words) - 15:49, 30 May 2025
and Survey. See also: Attention (machine learning), Transformer (machine learning model), Seq2seq. Koehn, Philipp (2020). Neural Machine Translation. Cambridge University...
36 KB (3,901 words) - 17:39, 23 May 2025
2021. Zhang, Ruiqi (2024). "Trained Transformers Learn Linear Models In-Context" (PDF). Journal of Machine Learning Research. 25: 1–55. arXiv:2306.09927...
35 KB (3,425 words) - 14:56, 8 June 2025
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn...
140 KB (15,571 words) - 05:51, 9 June 2025
(2023), and Muse (2023). Unlike later models, DALL-E is not a diffusion model. Instead, it uses a decoder-only Transformer that autoregressively generates a...
9 KB (2,212 words) - 22:40, 1 June 2025
includes every stage, from the initial raw dataset to a machine learning model ready for deployment. AutoML was proposed as an artificial intelligence-based...
9 KB (1,046 words) - 02:47, 26 May 2025
Transfer Transformer) is a series of large language models developed by Google AI introduced in 2019. Like the original Transformer model, T5 models are encoder-decoder...
20 KB (1,932 words) - 03:55, 7 May 2025
in machine learning authored by eight scientists working at Google. The paper introduced a new deep learning architecture known as the transformer, based...
15 KB (3,910 words) - 20:36, 1 May 2025
reward model to represent preferences, which can then be used to train other models through reinforcement learning. In classical reinforcement learning, an...
62 KB (8,617 words) - 19:50, 11 May 2025
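As a rough sketch of how a reward model can represent preferences, such models are often fit with a pairwise Bradley–Terry objective: the response a human preferred should receive a higher scalar reward than the rejected one. The scalar inputs below stand in for a reward network's outputs; this is an illustrative assumption, not the specific method of the article above.

import math

def preference_loss(r_chosen, r_rejected):
    # -log sigmoid(r_chosen - r_rejected): small when the preferred
    # response already outscores the rejected one, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(preference_loss(2.0, 0.5))  # ~0.20: preference respected
print(preference_loss(0.5, 2.0))  # ~1.70: preference violated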
Ensemble learning trains two or more machine learning algorithms on a specific classification or regression task. The algorithms within the ensemble model are...
53 KB (6,685 words) - 14:14, 8 June 2025
common attacks in adversarial machine learning include evasion attacks, data poisoning attacks, Byzantine attacks and model extraction. At the MIT Spam...
69 KB (7,819 words) - 08:26, 24 May 2025
approaches. Whisper is a weakly supervised deep learning acoustic model built on an encoder-decoder transformer architecture. Whisper Large V2 was released...
15 KB (1,613 words) - 00:22, 7 April 2025
Self-supervised learning (SSL) is a paradigm in machine learning where a model is trained on a task using the data itself to generate supervisory signals...
18 KB (2,047 words) - 12:49, 25 May 2025
Hasbro; Transformers: The Ride 3D, theme park rides located in several Universal Studios parks; Transformer (machine learning model); Transformer (disambiguation)...
2 KB (226 words) - 22:13, 5 February 2025
GPT-3 (redirect from Generative Pre-trained Transformer 3)
Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer model of...
55 KB (4,923 words) - 20:03, 12 May 2025
model (LLM) is a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with...
64 KB (3,361 words) - 16:05, 24 May 2025
simplification, Transformer (machine learning model), Truecasing, Question answering, Word2vec. "NLP". Hutchins, J. (2005). "The history of machine translation...
54 KB (6,592 words) - 04:13, 4 June 2025
In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms...
65 KB (9,071 words) - 06:34, 24 May 2025
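A minimal usage sketch of such a max-margin classifier, assuming scikit-learn is available; the toy data and the linear kernel are arbitrary choices for illustration.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters as toy training data.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear")  # fits the maximum-margin separating hyperplane
clf.fit(X, y)
print(clf.predict([[0.5, 0.5], [3.0, 3.5]]))  # expected: [0 1]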
ChatGPT (redirect from Chat Generative Pre-trained Transformer)
pre-trained transformer (GPT) models and is fine-tuned for conversational applications using a combination of supervised learning and reinforcement learning from...
197 KB (16,777 words) - 10:43, 8 June 2025
GPT-2 (redirect from Generative Pre-trained Transformer 2)
Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained...
44 KB (3,264 words) - 01:17, 16 May 2025
Deep learning is a subset of machine learning that focuses on utilizing multilayered neural networks to perform tasks such as classification, regression...
180 KB (17,772 words) - 15:04, 30 May 2025
Changliang; Wong, Derek F.; Chao, Lidia S. (2019). "Learning Deep Transformer Models for Machine Translation". arXiv:1906.01787 [cs.CL]. Xiong, Ruibin;...
35 KB (5,361 words) - 06:41, 9 June 2025
which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain. In setting a learning rate...
9 KB (1,108 words) - 10:15, 30 April 2024
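As a toy illustration of the step-size role described in this snippet, the sketch below runs plain gradient descent on f(x) = x**2, where the learning rate scales every update; the function and the value 0.1 are assumptions chosen purely for demonstration.

def grad(x):
    return 2.0 * x  # derivative of f(x) = x**2

x = 5.0
learning_rate = 0.1  # the "gain": how far each update step moves
for _ in range(50):
    x -= learning_rate * grad(x)  # step against the gradient
print(f"{x:.6f}")  # approaches the minimum at x = 0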
information retrieval. Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently...
16 KB (2,383 words) - 06:50, 4 June 2025
intelligence (AI), a foundation model (FM), also known as large X model (LxM), is a machine learning or deep learning model trained on vast datasets so that...
44 KB (4,719 words) - 15:41, 30 May 2025