AI_alignment Search Results

AI alignment

intelligence (AI), alignment aims to steer AI systems toward a person's or group's intended goals, preferences, or ethical principles. An AI system is considered...

132 KB (12,973 words) - 16:13, 26 April 2025

AI takeover

39 KB (4,241 words) - 18:29, 28 April 2025

Existential risk from artificial intelligence (redirect from Existential risk of AI)

published The Alignment Problem, which details the history of progress on AI alignment up to that time. In March 2023, key figures in AI, such as Musk...

127 KB (13,292 words) - 18:26, 28 April 2025

Paul Christiano (category AI safety scientists)

artificial intelligence (AI), with a specific focus on AI alignment, which is the subfield of AI safety research that aims to steer AI systems toward human...

14 KB (1,222 words) - 05:14, 21 April 2025

Jan Leike (category OpenAI people)

Jan Leike (born 1986 or 1987) is an AI alignment researcher who has worked at DeepMind and OpenAI. He joined Anthropic in May 2024. Jan Leike obtained...

6 KB (452 words) - 15:26, 19 April 2025

The Alignment Problem

criticism of its accuracy and bias towards certain demographics. One of AI's main alignment challenges is its black box nature (inputs and outputs are identifiable...

8 KB (804 words) - 19:31, 31 January 2025

Alignment Research Center

focused on the theoretical challenges of AI alignment. They attempt to develop scalable methods for training AI systems to behave honestly and helpfully...

7 KB (601 words) - 14:38, 25 February 2025

AI safety

intelligence (AI) systems. It encompasses machine ethics and AI alignment, which aim to ensure AI systems are moral and beneficial, as well as monitoring AI systems...

87 KB (10,322 words) - 20:49, 28 April 2025

Llama (language model) (redirect from Llama AI)

performed better than larger but lower-quality third-party datasets. For AI alignment, reinforcement learning with human feedback (RLHF) was used with a combination...

53 KB (4,940 words) - 16:55, 22 April 2025

Anthropic (section Constitutional AI)

Krieger: Chief Product Officer Jan Leike: ex-OpenAI alignment researcher Claude incorporates "Constitutional AI" to set safety guidelines for the model's output...

31 KB (2,841 words) - 09:41, 26 April 2025

Artificial general intelligence (redirect from Hard AI)

human brain AI effect AI safety – Research area on making AI safe and beneficial AI alignment – AI conformance to the intended objective A.I. Rising – 2018...

131 KB (14,472 words) - 20:27, 29 April 2025

Alignment

performance and tire wear AI alignment, steering artificial intelligence systems towards the intended objective Alignment level, an audio recording/engineering...

4 KB (473 words) - 10:33, 1 March 2025

Hallucination (artificial intelligence) (redirect from AI hallucination)

scenarios. AI alignment AI effect AI safety Artifact Artificial stupidity Turing test Uncanny valley Shaw, Mary (17 October 2024). "tl;dr: Chill, y'all: AI Will...

69 KB (7,007 words) - 11:32, 30 April 2025

Multi-agent reinforcement learning (section AI alignment)

into AI alignment. The relationship between the different agents in a MARL setting can be compared to the relationship between a human and an AI agent...

29 KB (3,030 words) - 14:51, 14 March 2025

Superintelligence: Paths, Dangers, Strategies

Kurzweil's The Singularity Is Near. Age of Artificial Intelligence AI alignment AI safety Future of Humanity Institute Human Compatible Life 3.0 Philosophy...

13 KB (1,273 words) - 06:15, 3 April 2025

Eliezer Yudkowsky (redirect from Rationality: From AI to Zombies)

introduce the debate about AI alignment to the mainstream, leading a reporter to ask President Joe Biden a question about AI safety at a press briefing...

23 KB (1,864 words) - 07:36, 23 April 2025

History of artificial intelligence (redirect from History of AI)

mitigating the risks and unintended consequences of AI became known as "the value alignment problem" or AI alignment. At the same time, machine learning systems...

166 KB (19,442 words) - 15:28, 29 April 2025

P(doom)

artificial general intelligence Statement on AI risk of extinction AI alignment AI takeover AI safety Conditional on A.I. not being "strongly regulated", time...

10 KB (739 words) - 14:51, 23 April 2025

Ethics of artificial intelligence (redirect from AI ethics)

dynamics, AI safety and alignment, technological unemployment, AI-enabled misinformation, how to treat certain AI systems if they have a moral status (AI welfare...

144 KB (14,880 words) - 04:56, 30 April 2025

Statement on AI risk of extinction

AI alignment Existential risk from artificial general intelligence Pause Giant AI Experiments: An Open Letter "Statement on AI Risk". Center for AI Safety...

7 KB (776 words) - 21:47, 15 February 2025

AI trust paradox

AI outputs are both reliable and ethically sound. AI effect AI alignment Polanyi's paradox Trisha Ray, "The paradox of innovation and trust...

7 KB (753 words) - 09:02, 3 January 2025

Technology

agents. Within the field of AI ethics, significant yet-unsolved research problems include AI alignment (ensuring that AI behaviors are aligned with their...

108 KB (10,440 words) - 00:38, 1 May 2025

Friendly artificial intelligence (redirect from Friendly AI)

AI systems may be complex and difficult to interpret, leading to concerns about transparency and accountability. Affective computing AI alignment AI effect...

24 KB (2,709 words) - 23:28, 4 January 2025

Effective accelerationism

on – AI existential risk. Effective altruists (particularly longtermists) argue that AI companies should be cautious and strive to develop safe AI systems...

23 KB (1,988 words) - 14:43, 27 April 2025

Waluigi effect (section History and implications for AI)

located the desired Luigi, it's much easier to summon the Waluigi". AI alignment Hallucination Existential risk from AGI Reinforcement learning from human...

6 KB (627 words) - 16:31, 13 February 2025

EleutherAI

EleutherAI is a "decentralized grassroots collective of volunteer researchers, engineers, and developers focused on AI alignment, scaling, and open-source AI...

34 KB (2,917 words) - 03:24, 29 April 2025

Machine Intelligence Research Institute

from artificial general intelligence. MIRI's work has focused on a friendly AI approach to system design and on predicting the rate of technology development...

16 KB (1,149 words) - 04:07, 16 February 2025

Intelligent agent (redirect from AI agents)

cybercrime, ethical challenges, as well as problems related to AI safety and AI alignment. Other issues involve data privacy. Additional challenges include...

55 KB (5,400 words) - 05:18, 30 April 2025

The Monkey's Paw

Project. Novels portal Rabbit's foot Unintended consequences Hand of Glory AI alignment "The Monkey's Paw", Harper's Monthly, September, 1902. page 634. HathiTrust...

15 KB (1,613 words) - 09:17, 1 April 2025

Neural scaling law (redirect from AI scaling law)

diffusion, generative modeling, multimodal learning, contrastive learning, AI alignment, AI capabilities, robotics, out-of-distribution (OOD) generalization, continual...

44 KB (5,830 words) - 01:56, 30 March 2025