How Good Are the Latest Open LLMs? And Is DPO Better Than PPO?

What a month! We had four major open LLM releases: Mixtral, Meta AI's Llama 3, Microsoft's Phi-3, and Apple's OpenELM. In my new article, I review and discuss all four of these major transformer-based LLM releases, followed by new research on reinforcement learning from human feedback (RLHF) methods for instruction finetuning using the PPO and DPO algorithms.
Published on May 11, 2024 23:03

