What is Reinforcement Learning (RL), and what are the 4 key components?
Reinforcement Learning (RL) involves an agent interacting with an environment by taking actions.
Key components:
State → Representation of the world at a moment.
Action → Decision made by the agent.
Reward → Feedback from the environment.
Policy function → Defines which action the agent takes in a given state.
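Tying the four components together, here is a minimal self-contained sketch; the toy 1-D environment and the policy below are made up for illustration, not from the source:

import random

# Toy environment: the agent starts at position 0 and tries to reach position 3.
def policy(state):
    # Policy: decides the action in a given state; here, step right with probability 0.8
    return +1 if random.random() < 0.8 else -1

state, done = 0, False
while not done:
    action = policy(state)            # Action: decision made by the agent
    state = state + action            # State: representation of the world after the step
    reward = 1 if state == 3 else 0   # Reward: feedback from the environment
    done = (state == 3)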
Draw the Reinforcement Learning loop
What is Reinforcement Learning from Human Feedback (RLHF)? Name the 2 process steps
RLHF optimizes language models by using human preferences to guide training.
Reward function R(s; prompt), where s is the model's output.
The reward is higher when humans prefer the output.
Process:
1. Estimate the reward function R(s; prompt).
2. Find the best generative model p that maximizes the expected reward.
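In symbols (a sketch in the card's own notation, with p(s | prompt) the model's distribution over outputs s):

maximize over p:   E_{s ~ p(s | prompt)} [ R(s; prompt) ]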
What are the 2 main approaches for estimating the reward R, and how do they compare?
Two main approaches:
Ask humans to provide absolute scores for each output → human judgement on different instances / by different people can be noisy or mis-calibrated.
Ask for pairwise comparisons → can be more reliable (see the sketch below).
Scaling reward models: a large enough reward model trained on enough data approaches human performance.
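To make the pairwise approach concrete, here is a minimal Bradley-Terry-style loss sketch; reward_model is a hypothetical network that scores a (prompt, output) pair with a scalar, not an API from the source:

import torch.nn.functional as F

def pairwise_loss(reward_model, prompt, preferred, rejected):
    r_pref = reward_model(prompt, preferred)   # scalar reward for the preferred output
    r_rej = reward_model(prompt, rejected)     # scalar reward for the rejected output
    # Train the model to rank the preferred output above the rejected one
    return -F.logsigmoid(r_pref - r_rej).mean()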
Name the 5 Steps for Reinforcement Learning from Human Feedback
1. Collect a dataset of human preferences → human annotators rank outputs by preferability.
2. Use this data to train a reward model → the reward model returns a scalar reward that numerically represents the human preference.
3. Learn a policy (a language model) that optimizes against the reward model.
4. Periodically retrain the reward model with more samples and human feedback.
5. Add a penalty term penalizing deviations from the distribution of the pretrained LM (one common form is shown below) → this fixes the problem that the system otherwise learns to "cheat" by producing gibberish / irrelevant outputs that maximize reward.
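One common form of that penalty (a sketch following the InstructGPT-style formulation; β is the penalty weight, p_RL the tuned policy, p_PT the pretrained LM):

R_total(s; prompt) = R(s; prompt) − β · log( p_RL(s | prompt) / p_PT(s | prompt) )

The log-ratio term grows when the tuned model drifts away from the pretrained distribution, so gibberish that games R gets penalized.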
What are the challenges in RLHF?
Challenges:
The model may learn to "cheat" by generating gibberish outputs that maximize reward scores.
Difficult to train good reward models.
Requires a lot of human annotations.
How do reward models improve LLM behavior?
Reward models can enforce desired behaviors, such as:
Avoiding bias.
Avoiding toxic outputs.
Staying within the model’s knowledge scope.
Name 2 limitations of Reinforcement Learning
Tricky to get right
Training a good reward model may require a lot of annotations
Name 2 approaches for overcoming the challenge that LMs need to process massive amounts of data
Scale up the model and train it with a longer context window → bottleneck: memory usage and the number of operations in self-attention grow quadratically with sequence length.
Sparse attention patterns → improve efficiency by making the attention operations sparse.
What are Sparse Attention Patterns, and how do they help LLMs?
Sparse Attention reduces the number of attention computations by focusing only on selected tokens.
Implementation strategies:
Different layers and attention heads follow different sparsity patterns.
Earlier layers use sparser attention (see the sketch below).
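As a concrete illustration, here is a minimal sketch of one such pattern, a causal sliding-window mask; the sizes n and w are made-up values, not from the source:

import torch

n, w = 8, 2                           # sequence length, local window size
i = torch.arange(n).unsqueeze(1)      # query positions (column vector)
j = torch.arange(n).unsqueeze(0)      # key positions (row vector)
mask = (j <= i) & (j >= i - w)        # causal + local window: O(n*w) work, not O(n^2)
# Attention scores outside the mask would be set to -inf before the softmax:
# scores = scores.masked_fill(~mask, float("-inf"))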
What is Retrieval-Augmented Generation (RAG), and why is it useful?
RAG combines LLMs with external data to improve responses.
Why is it useful?
LLMs cannot memorize all knowledge.
LLM knowledge can be outdated and hard to update.
LLM output is challenging to interpret and verify.
LLMs are expensive to train; reducing their size while retrieving information is more efficient.
What are the 3 key design questions when implementing RAG?
Key design decisions in RAG:
Memories: What should be stored as memory? (Documents, databases, etc.).
Memory retrieval method: How should retrieval work? (Pretrained retriever or off-the-shelf search).
Retrieved-memory usage: How should retrieved information be used? (fused into the prompt or the model input; see the sketch below).
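A minimal sketch of the prompt-fusion design; retriever and generate are hypothetical stand-ins for a retrieval backend and an LLM call, not APIs from the source:

def rag_answer(question, retriever, generate, k=3):
    docs = retriever(question, k=k)    # memory retrieval step: fetch top-k documents
    context = "\n".join(docs)          # the retrieved memories
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)            # the LLM conditions on the retrieved text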
What are common failures in Retrieval-Augmented Generation (RAG)?
Two major failure modes:
Underutilization → The model ignores retrieved memories.
Overreliance → The model depends too much on retrieved information.
Why do LLMs need external retrieval mechanisms?
Limitations of storing all knowledge in LLMs:
Cannot store long-tail knowledge efficiently.
Difficult to update (LLM retraining is expensive).
Hard to verify generated facts.
Draw a regular decoder with IR embeddings