Towards infinite LLM context windows

It all started with GPT having an input context window of 512 tokens. Only 5 years later, the newest LLMs can handle inputs of 1M+ tokens. Where's the limit?


I like to think of LLMs (specifically, of the model's parameters, i.e., the weights of the neural network layers and…