AI Glossary

What is Reinforcement Learning?

Reinforcement Learning (RL) is a machine learning paradigm in which an agent learns a behavior policy by interacting with an environment, receiving rewards or penalties for its actions, and adjusting its behavior to maximize cumulative reward. In AI language models, RLHF (Reinforcement Learning from Human Feedback) is applied after pre-training to align model outputs with human preferences. RL has also driven breakthroughs in game playing (AlphaGo, AlphaZero), robotics, and autonomous systems. Recent developments include RLAIF (RL from AI Feedback), which replaces human preference labels with labels generated by an AI model; DPO (Direct Preference Optimization), a simpler alternative to PPO-based (Proximal Policy Optimization) RLHF that trains directly on preference pairs without a separate reward model; and RL-based reasoning training, which enables models to "think" before answering.
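
To make the DPO idea concrete, here is a minimal sketch of the DPO loss for a single preference pair, written in plain Python. The function and parameter names are illustrative assumptions, not the API of any particular library; the inputs are assumed to be summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    All names here are hypothetical. Each *_logp argument is the total
    log-probability a model assigns to a full response.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model, in log space.
    chosen_log_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_log_ratio = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the policy shifts probability toward the
    # chosen response relative to the rejected one.
    return -log_sigmoid(beta * (chosen_log_ratio - rejected_log_ratio))
```

When the policy is identical to the reference model, the loss equals log 2; it falls below that value as the policy comes to favor the chosen response more strongly than the reference does. The beta parameter (0.1 here is only an example value) controls how strongly the policy is kept close to the reference.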

Frequently Asked Questions

What is reinforcement learning?

Reinforcement learning is a type of machine learning where an agent learns by trial and error: it receives rewards for good actions and penalties for bad ones, and gradually adjusts its behavior to maximize the total reward it collects.
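
As an illustration of trial-and-error learning, the following is a minimal sketch of tabular Q-learning, one classic RL algorithm, on a toy environment invented for this example: a short corridor where the agent starts at the left end and is rewarded for reaching the right end. The environment, reward values, and hyperparameters are all assumptions chosen for illustration.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.1,
                     gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor with states 0..n_states-1.

    The goal is the rightmost state. Reaching it gives reward +1;
    every other step costs 0.01.
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = step left, action 1 = step right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon (or when the two
            # estimates are tied); otherwise take the best-known action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # The corridor has walls: positions are clamped to [0, goal].
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else -0.01

            # Q-learning update: move the estimate toward the observed
            # reward plus the discounted best value of the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-goal state."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]
```

Calling greedy_policy(train_q_learning()) returns the action the agent has learned to prefer in each state. Nothing in the code tells the agent that moving right is correct; that preference emerges only from the rewards it collects.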

What is RLHF?

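RLHF (Reinforcement Learning from Human Feedback) trains AI models to produce outputs that humans prefer. In a typical setup, human annotators compare or rank model outputs, a reward model is trained on those comparisons, and the language model is then fine-tuned with RL to maximize the reward model's score. It is a key technique used to align LLM-based assistants such as ChatGPT and Claude.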
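
One common way to turn human preferences into a training signal is to fit a reward model on pairs of responses in which annotators marked one as better. The sketch below shows the pairwise (Bradley-Terry style) loss often used for that step, in plain Python. The function names are hypothetical, and the two inputs are assumed to be scalar scores that a reward model has already produced for the preferred and non-preferred response.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss for one human comparison.

    The loss is small when the reward model scores the human-preferred
    response well above the rejected one, and large when it ranks the
    pair the wrong way round.
    """
    return -log_sigmoid(reward_chosen - reward_rejected)


def preference_probability(reward_chosen, reward_rejected):
    """Model's implied probability that the chosen response is better."""
    return math.exp(log_sigmoid(reward_chosen - reward_rejected))
```

With equal scores the implied probability is 0.5 and the loss is log 2. Minimizing this loss over many comparisons pushes the reward model to agree with the annotators, and the resulting scores can then serve as the reward for RL fine-tuning.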