Dr. Leon Eversberg, Author at Future Tech Stocks

How to Evaluate Multilingual LLMs With Global-MMLU

Evaluation of language-specific LLM accuracy on the global Massive Multitask Language Understanding benchmark in Python

Published in Towards Data Science · 7 min read

[Image: Photo by Joshua Fuller on Unsplash]

As soon as a new LLM is released, the obvious question we ask ourselves is this: Is this LLM better than the one I’m currently using? LLMs are typically evaluated against a large number of ...

Dr. Leon Eversberg December 9, 2024

How to Create a RAG Evaluation Dataset From Documents

Automatically create domain-specific datasets in any language using LLMs

Published in Towards Data Science · 12 min read

[Image: Our automatically generated RAG evaluation dataset on the Hugging Face Hub (PDF input file from the European Union licensed under CC BY 4.0). Image by the author]

In this article I will show you how to create your own RAG dataset consisting of contexts, questions, and answers ...

Dr. Leon Eversberg November 3, 2024

How to Use HyDE for Better LLM RAG Retrieval

Building an advanced local LLM RAG pipeline with hypothetical document embeddings

Published in Towards Data Science · 9 min read

[Image: Implementing HyDE is very simple in Python. Image by the author]

Large Language Models (LLMs) can be improved by giving them access to external knowledge through documents. The basic Retrieval Augmented Generation (RAG) pipeline consists of a user query, an embedding model that converts text ...

Dr. Leon Eversberg October 4, 2024

How to Improve LLM Responses With Better Sampling Parameters

A deep dive into stochastic decoding with temperature, top_p, top_k, and min_p

Published in Towards Data Science · 10 min read

When calling the OpenAI API with the Python SDK, have you ever wondered what exactly the temperature and top_p parameters do? When you ask a Large Language Model (LLM) a question, the model outputs a probability for every possible token in its vocabulary. After ...

Dr. Leon Eversberg September 2, 2024

How to Use Hybrid Search for Better LLM RAG Retrieval

Building an advanced local LLM RAG pipeline by combining dense embeddings with BM25

Published in Towards Data Science · 11 min read

[Image: Code snippet from the hybrid search we are going to implement in this article. Image by author]

The basic Retrieval-Augmented Generation (RAG) pipeline uses an encoder model to search for similar documents when given a query. This is also called semantic search because ...

Dr. Leon Eversberg August 11, 2024

How to Reduce Embedding Size and Increase RAG Retrieval Speed

Flexible text embedding with Matryoshka Representation Learning (MRL)

Published in Towards Data Science · 7 min read

[Image: Matryoshka dolls are nesting dolls of decreasing size. Photo by Sandy Millar on Unsplash]

Introduction

Text embeddings are high-dimensional vector representations of single words or entire sentences.

Dr. Leon Eversberg May 26, 2024

Safeguard Your LLM Chatbot With Llama Guard 2

How to apply content moderation to your LLM’s inputs and outputs for a more responsible AI system

Published in Towards Data Science · 9 min read

[Image: Llama Guard. Image created by author with Adobe Photoshop’s AI image generation.]

Introduction

Responsible AI is a big umbrella term that has seen increased interest since the rise of ChatGPT and Large Language Models (LLMs).

Dr. Leon Eversberg May 13, 2024

How to Build a Local Open-Source LLM Chatbot With RAG

Talking to PDF documents with Google’s Gemma-2b-it, LangChain, and Streamlit

Published in Towards Data Science · 12 min read

[Image: The LLM chatbot with RAG we will build in this article answers specific questions using a washing machine user manual. Image by author]

Introduction

Large Language Models (LLMs) are remarkable at compressing knowledge about the world into their billions of parameters. However, LLMs have two major ...

Dr. Leon Eversberg March 31, 2024
