Reinforcement Learning from Human Feedback
A training method in which an AI model is improved using human evaluations of its outputs. Typically, humans rank or compare candidate responses, those preference judgments are used to train a reward model, and the reward model's scores then guide reinforcement-learning fine-tuning of the original model. This approach helps align model behavior with human preferences and values while improving response quality.
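The core loop can be illustrated with a deliberately tiny sketch, not a real RLHF pipeline: the reward model is reduced to a single learned weight over a scalar "response feature", trained Bradley-Terry style from pairwise human preferences, and the policy step is simplified to picking the candidate the reward model scores highest. All names and data here are hypothetical.

```python
import math

def train_reward_model(prefs, lr=0.1, steps=200):
    # prefs: list of (preferred_feature, rejected_feature) pairs from human raters.
    # Bradley-Terry model: P(a preferred over b) = sigmoid(w * (a - b)).
    w = 0.0
    for _ in range(steps):
        for a, b in prefs:
            p = 1.0 / (1.0 + math.exp(-w * (a - b)))
            # Gradient ascent on the log-likelihood of the observed preference.
            w += lr * (1.0 - p) * (a - b)
    return w

def pick_response(candidates, w):
    # Stand-in for the policy-optimization step: favour the candidate
    # that the learned reward model scores highest.
    return max(candidates, key=lambda x: w * x)

# Toy preference data: raters consistently preferred higher-feature outputs.
prefs = [(0.9, 0.2), (0.8, 0.1), (0.7, 0.3)]
w = train_reward_model(prefs)
best = pick_response([0.1, 0.5, 0.95], w)
```

In a real system the reward model is itself a neural network over full responses, and the final step uses an RL algorithm (commonly PPO) with a penalty that keeps the fine-tuned model close to its pretrained behavior.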