Improving Text Embeddings with Large Language Models: Analysis of Training Hyperparameters | HackerNoon

Auto Encoder: How to Ignore the Signal Noise
October 9, 2024
8:00 pm

Authors:

(1) Liang Wang, Microsoft Corporation, correspondence to ([email protected]);

(2) Nan Yang, Microsoft Corporation, correspondence to ([email protected]);

(3) Xiaolong Huang, Microsoft Corporation;

(4) Linjun Yang, Microsoft Corporation;

(5) Rangan Majumder, Microsoft Corporation;

(6) Furu Wei, Microsoft Corporation, correspondence to ([email protected]).

Table of Links

Abstract and 1 Introduction

2 Related Work

3 Method

3.1 Synthetic Data Generation

3.2 Training

4 Experiments

4.1 Statistics of the Synthetic Data

4.2 Model Fine-tuning and Evaluation

4.3 Main Results

4.4 Multilingual Retrieval

5 Analysis

5.1 Is Contrastive Pre-training Necessary?

5.2 Extending to Long Text Embeddings and 5.3 Analysis of Training Hyperparameters

6 Conclusion and References

A Implementation Details

B Test Set Contamination Analysis

C Prompts for Synthetic Data Generation

D Instructions for Training and Evaluation

5.2 Extending to Long Text Embeddings

5.3 Analysis of Training Hyperparameters

Table 4 presents the results under different configurations. The Mistral-7B initialization holds an advantage over LLaMA-2 7B, in line with the findings of the Mistral-7B technical report [19]. The choice of pooling type and LoRA rank does not substantially affect overall performance, so we keep the default setting despite the marginal superiority of LoRA rank 8. By contrast, how instructions are added has a considerable impact on performance. We conjecture that natural language instructions better inform the model about the embedding task at hand and thus enable it to generate more discriminative embeddings. Our framework also provides a way to customize the behavior of text embeddings through instructions, without the need to fine-tune the model or rebuild the document index.
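To make the compared knobs concrete, the following is a minimal sketch of two pooling strategies and of prepending a natural language instruction to a query. It is illustrative only and not the authors' implementation: the function names, the `Instruct: ... Query: ...` template string, and the toy arrays are assumptions introduced here, and the hidden states stand in for the output of a real language model.

```python
import numpy as np


def build_query(task_definition: str, query: str) -> str:
    """Prepend a natural language instruction to a query.

    The exact template string is an assumption made for illustration.
    """
    return f"Instruct: {task_definition}\nQuery: {query}"


def mean_pool(hidden: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average hidden states over non-padding tokens.

    hidden: (batch, seq_len, dim); mask: (batch, seq_len), 1 for real tokens.
    """
    m = mask[:, :, None].astype(hidden.dtype)
    # Clip the token count to avoid dividing by zero on an empty sequence.
    return (hidden * m).sum(axis=1) / np.clip(m.sum(axis=1), 1.0, None)


def last_token_pool(hidden: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Take the hidden state of the last real token (assumes right padding)."""
    last_idx = mask.astype(int).sum(axis=1) - 1
    return hidden[np.arange(hidden.shape[0]), last_idx]


# Toy hidden states: 2 sequences, 3 token positions, 2 dimensions.
hidden = np.arange(12, dtype=float).reshape(2, 3, 2)
mask = np.array([[1, 1, 0], [1, 1, 1]])  # first sequence has one padding token

mean_emb = mean_pool(hidden, mask)
last_emb = last_token_pool(hidden, mask)
prompt = build_query(
    "Given a web search query, retrieve relevant passages",
    "what is lora",
)
```

If the instruction is applied only on the query side (an assumption of this sketch), changing `task_definition` changes the query embedding while previously computed document embeddings stay valid, which is one way to read the claim that the document index does not need to be rebuilt.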
