Wei Yi, Author at Future Tech Stocks

Understand REINFORCE, Actor-Critic and PPO in one go

Use the loss function of the Policy Gradient algorithm as the key to understanding several reinforcement learning algorithms: REINFORCE, Actor-Critic, and PPO. These are the theoretical groundwork for understanding Reinforcement Learning from Human Feedback (RLHF), the algorithm used to build ChatGPT. Published in Towards Data Science · 37 min read. Studying reinforcement learning can be frustrating because the field is cursed with confusing jargon and…

Wei Yi July 24, 2024

How Does an Image-Text Foundation Model Work

Learn how an image-text multi-modality model can perform image classification, image retrieval, and image captioning. Published in Towards Data Science · 18 min read. Nowadays, there is a surge of multi-modality foundation models. They understand different kinds of data, including text, images, video, and audio, and can perform tasks that require the knowledge of…

Wei Yi June 1, 2024
