Don’t Limit Your RAG Knowledgebase to Just Text

Steal this plug-n-play Python script to easily implement images into your chatbot’s Knowledgebase

Photo by Nitish Meena on Unsplash

When building a Knowledgebase, a common challenge is converting everything into plain text. This can be limiting when dealing with media sources like slides, PDFs, images and more.

So, how can we make proper use of data that’s not in plain text?

⛳ Don’t have medium membership? I got you covered: use this free article link. Please consider leaving highlights, claps, follow, and comments

Thanks to recent advancements in AI, it’s now easier and cheaper than ever. By using Large Language Models (LLMs) with vision capabilities, we can transcribe thousands of images, not just capturing the text but also understanding how the contents are related. These models can even describe visual objects within an image if needed, offering a far richer and more detailed transcription than OCR ever could.

We’ll get started with these three simple steps:

  1. Collect Data: Gather the images you plan to use, ensuring they are well-organized and not overloaded with information.
  2. Upload Data: Set up an AWS S3 bucket to store your images, making sure the cloud-based AI model can…