CoWPE: Adaptive Context Window Adjustment in LLMs for Complex Input Queries

Main Article Content

Venkata Mohit Tamanampudi

Abstract

Recent work has shown that large language models (LLMs) benefit from adapting their processing context windows to the nuance and complexity of the input query. Recent studies have attempted to extend the context window of LLMs by modifying rotary position embedding (RoPE), a popular position encoding technique used by prominent LLMs such as LLaMA and GPT-NeoX. In this work, we identify the inherent need for the attention entropy of LLMs (i.e., the information entropy of their attention scores) to remain stable, and we introduce a novel extension to RoPE that combines adjusting RoPE's base frequency with scaling the attention logits, helping LLMs adapt efficiently to a larger context window based on the complexity and nuance of the input query. Our proposal, CoWPE, accomplishes this by constructing bi-level attention information, neighbor attention and grouped attention, to adjust the context window of LLMs. Neighbor attention captures relationships between adjacent tokens within a specified range, while grouped attention captures dependencies among tokens that are far apart. During inference, both attention levels are computed with the original model's self-attention mechanism. CoWPE requires no fine-tuning and can extend the context window of existing LLMs with only minor code changes. We carry out extensive experiments on several benchmarks, and the results demonstrate that CoWPE can effectively extend the context window length of existing LLMs.
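The abstract gives no equations, so the following is only a minimal sketch of the mechanisms it describes, not the authors' released implementation: an NTK-style adjustment of RoPE's base frequency, a logarithmic scaling of the attention logits intended to keep attention entropy roughly stable, and a remapping of relative positions into exact "neighbor" positions and coarse "grouped" positions. All function names, the scaling formulas, and the remapping rule are assumptions introduced for illustration; because they only change how positions and logits are computed at inference time, they can be applied to an existing checkpoint without fine-tuning, consistent with the abstract's claim.

```python
# Hypothetical sketch of the three ingredients described in the abstract (PyTorch).
import math
import torch


def rope_inverse_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies for a given base."""
    return 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))


def adjusted_rope_base(orig_base: float, trained_len: int, target_len: int, head_dim: int) -> float:
    """Raise the RoPE base so rotations at positions up to target_len stay within
    the range seen during training (NTK-style base adjustment; assumed formula)."""
    scale = target_len / trained_len
    return orig_base * scale ** (head_dim / (head_dim - 2))


def entropy_stable_logits(q: torch.Tensor, k: torch.Tensor, trained_len: int, target_len: int) -> torch.Tensor:
    """Scale attention logits by log(target_len)/log(trained_len) so attention
    entropy stays roughly stable as the context grows (assumed scaling rule)."""
    head_dim = q.size(-1)
    temperature = math.log(target_len) / math.log(trained_len)
    return temperature * (q @ k.transpose(-2, -1)) / math.sqrt(head_dim)


def bi_level_relative_positions(rel_pos: torch.Tensor, neighbor_window: int, group_size: int) -> torch.Tensor:
    """Keep exact relative positions for nearby tokens (neighbor attention) and
    coarse, grouped positions for distant tokens (grouped attention)."""
    grouped = rel_pos // group_size + (neighbor_window - neighbor_window // group_size)
    return torch.where(rel_pos <= neighbor_window, rel_pos, grouped)
```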

Article Details

How to Cite
Tamanampudi, V. M. (2024). CoWPE: Adaptive Context Window Adjustment in LLMs for Complex Input Queries. Journal of Artificial Intelligence General Science (JAIGS), ISSN: 3006-4023, 5(1), 438–450. https://doi.org/10.60087/jaigs.v5i1.221
Section
Articles