A text-to-video model is a machine learning model that uses a natural language description as input to produce a video relevant to the input text. Advancements...
25 KB (2,248 words) - 12:14, 28 April 2025
a text-to-video model developed by OpenAI. The model generates short video clips based on user prompts, and can also extend existing short videos. Sora...
14 KB (1,300 words) - 19:47, 23 April 2025
Dream Machine is a text-to-video model created by Luma Labs and launched in June 2024. It generates video output based on user prompts or still images...
10 KB (906 words) - 14:33, 10 March 2025
A text-to-image model is a machine learning model which takes an input natural language description and produces an image matching that description. Text-to-image...
20 KB (1,925 words) - 13:44, 30 April 2025
Flux (also known as FLUX.1) is a text-to-image model developed by Black Forest Labs, based in Freiburg im Breisgau, Germany. Black Forest Labs were founded...
23 KB (1,853 words) - 04:09, 20 April 2025
Kuaishou (redirect from Kling (text-to-video model))
transformer text-to-video model, Kling, which they claimed could generate two minutes of video at 30 frames per second and in 1080p resolution. The model has...
24 KB (1,983 words) - 01:17, 29 April 2025
OpenAI (section Text-to-video)
for the GPT family of large language models, the DALL-E series of text-to-image models, and a text-to-video model named Sora. Its release of ChatGPT in...
219 KB (19,127 words) - 19:00, 30 April 2025
VideoPoet is a large language model developed by Google Research in 2023 for video generation. It can be asked to animate still images. The model accepts...
3 KB (214 words) - 16:45, 13 January 2025
Multimodal learning (redirect from Multimodal model)
to as modalities, such as text, audio, images, or video. This integration allows for a more holistic understanding of complex data, improving model performance...
9 KB (2,338 words) - 08:44, 24 October 2024
Transformer (deep learning architecture) (redirect from Transformer model)
a text-to-video model. It is a bidirectional masked transformer conditioned on pre-computed text tokens. The generated tokens are then decoded to a video...
106 KB (13,091 words) - 21:14, 29 April 2025
learning on a vast amount of text. The largest and most capable LLMs are generative pretrained transformers (GPTs). Modern models can be fine-tuned for specific...
114 KB (11,942 words) - 05:35, 30 April 2025
encodes both a text prompt and an image prompt. Make-A-Video (2022) is a text-to-video diffusion model. CM3leon (2023) is not a diffusion model, but an autoregressive...
85 KB (14,257 words) - 03:27, 16 April 2025
Runway (company) (category Text-to-video generation)
and models for generating videos, images, and various multimedia content. It is most notable for developing the commercial text-to-video and video generative...
17 KB (1,590 words) - 12:47, 2 May 2025
T5 (Text-to-Text Transfer Transformer) is a series of large language models developed by Google AI introduced in 2019. Like the original Transformer model...
20 KB (1,932 words) - 22:58, 21 March 2025
OX2 receptors; Sora-Q, tiny lunar rover developed in Japan; Sora (text-to-video model), developed by OpenAI; Southern Rails Cooperative, with reporting...
2 KB (301 words) - 07:51, 29 April 2025
Sora-like technology to achieve artificial general intelligence (AGI). In July 2024, they debuted their "Ying" text-to-video model. After OpenAI announced...
8 KB (679 words) - 06:36, 22 April 2025
artificial intelligence that uses generative models to produce text, images, videos, or other forms of data. These models learn the underlying patterns and structures...
163 KB (13,826 words) - 19:09, 30 April 2025
with self-supervised learning on a vast amount of text. This page lists notable large language models. For the training cost column, 1 petaFLOP-day = 1...
64 KB (3,361 words) - 09:20, 29 April 2025
MiniMax (company) (category Text-to-video generation)
language model consumer platform that provides AI text and music-generating features. In September 2024, MiniMax launched video-01, a text-to-video model under...
10 KB (863 words) - 07:00, 2 May 2025
Computer animation (redirect from Computer-generated video)
The Road to El Dorado, Spirit: Stallion of the Cimarron and Sinbad: Legend of the Seven Seas. A text-to-video model is a machine learning model that uses...
54 KB (5,907 words) - 05:42, 2 May 2025
moving "from a text-based publishing model to video... a reaction to the fact that Facebook has changed their algorithms in favor of video instead of referral...
25 KB (2,682 words) - 23:10, 2 May 2025
Synthetic media (redirect from Text-to-scene)
synthesis; Slop (artificial intelligence); Text-to-image model; Text-to-video model; Transformer (machine learning model); WaveNet. Goodstein, Anastasia. "Will...
77 KB (7,508 words) - 18:36, 22 April 2025
could process multiple types of data simultaneously, including text, images, audio, video, and computer code. It had been developed as a collaboration between...
52 KB (4,226 words) - 20:15, 19 April 2025
synthesis; Brain–computer interfaces; Time series anomaly detection; Text-to-video model; Rhythm learning; Music composition; Grammar learning; Handwriting recognition...
89 KB (10,413 words) - 06:01, 17 April 2025
Adobe Firefly (category Text-to-image generation)
generative artificial intelligence models for creative production. Its capabilities include text-to-image and text-to-video. It is part of Adobe Creative Cloud...
9 KB (690 words) - 10:30, 24 April 2025
Convolutional neural network (redirect from CNN (machine learning model))
application can be seen in text-to-video models. CNNs have also been explored for natural language processing. CNN models are effective for various...
138 KB (15,599 words) - 06:42, 18 April 2025
Stable Diffusion (category Text-to-image generation)
Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology...
66 KB (6,183 words) - 06:03, 14 April 2025
Automatic summarization (redirect from Text Summaries)
to "tag" or index a text document, or key sentences (including headings) that collectively comprise an abstract, and representative images or video segments...
52 KB (6,825 words) - 23:13, 23 July 2024
Reinforcement learning from human feedback (category Language modeling)
tasks such as text summarization and conversational agents, computer vision tasks like text-to-image models, and the development of video game bots. While...
62 KB (8,615 words) - 05:24, 30 April 2025
videos, and sound content, as well as ideograms known as emoji (happy faces, sad faces, and other icons), and on various instant messaging apps. Text...
145 KB (16,849 words) - 11:38, 19 April 2025