Mick' Blog

tag: RL

Reinforcement Learning

Personal takeaways of RL/RLHF/DPO

January 16, 2024 9 min Mick

© 2025 Mick' Blog CC BY-SA Powered by Hugo & PaperModX