Add Adaptive Cost-Aware Cache Eviction Policy #33

@Safaael25

Motivation

The current cache eviction mechanism relies primarily on traditional recency-based strategies. These work well in many scenarios, but they ignore two things that determine how valuable a cached entry is: how expensive it is to retrieve or regenerate, and how often it is accessed.

In Retrieval-Augmented Generation (RAG) systems, some cached entries are significantly more expensive to regenerate than others. Evicting these entries too early can increase latency and reduce overall cache efficiency.

Proposed Enhancement

Implement an adaptive cost-aware eviction policy that combines:

- Access frequency
- Recency of access
- Retrieval/generation cost
- Chunk size

Each cache entry will receive a dynamic score:

score = α * recency
      + β * frequency
      + γ * retrieval_cost
      - δ * size_penalty

The eviction mechanism will remove entries with the lowest overall score.
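
A minimal sketch of how the scoring and eviction above could fit together. Everything here is hypothetical and not taken from the existing codebase: the names `Entry` and `CostAwareCache`, the byte-based capacity, and the way each term is normalised (recency as a decaying function of idle time, frequency as a log of the hit count, cost in seconds, size as a fraction of capacity) are illustrative assumptions. The weights are fixed here; an adaptive variant would tune them at runtime.

```python
import math
import time
from dataclasses import dataclass


@dataclass
class Entry:
    value: object
    size_bytes: int
    retrieval_cost_s: float  # assumed: measured time to regenerate the entry
    hits: int = 0
    last_access: float = 0.0


class CostAwareCache:
    """Illustrative cache that evicts the entry with the lowest score."""

    def __init__(self, capacity_bytes, alpha=1.0, beta=1.0, gamma=1.0,
                 delta=1.0, clock=time.monotonic):
        self.capacity_bytes = capacity_bytes
        self.alpha, self.beta, self.gamma, self.delta = alpha, beta, gamma, delta
        self._clock = clock
        self._entries = {}
        self._used = 0

    def _score(self, e, now):
        # Assumed normalisations; the proposal does not specify them.
        recency = 1.0 / (1.0 + (now - e.last_access))  # in (0, 1], newer is higher
        frequency = math.log1p(e.hits)                 # dampens very hot keys
        size_penalty = e.size_bytes / self.capacity_bytes
        return (self.alpha * recency
                + self.beta * frequency
                + self.gamma * e.retrieval_cost_s
                - self.delta * size_penalty)

    def get(self, key):
        e = self._entries.get(key)
        if e is None:
            return None
        e.hits += 1
        e.last_access = self._clock()
        return e.value

    def put(self, key, value, size_bytes, retrieval_cost_s):
        if size_bytes > self.capacity_bytes:
            return False  # entry can never fit
        old = self._entries.pop(key, None)
        if old is not None:
            self._used -= old.size_bytes
        # Evict lowest-scoring entries until the new one fits.
        while self._used + size_bytes > self.capacity_bytes:
            self._evict_one()
        self._entries[key] = Entry(value, size_bytes, retrieval_cost_s,
                                   last_access=self._clock())
        self._used += size_bytes
        return True

    def _evict_one(self):
        now = self._clock()
        victim = min(self._entries,
                     key=lambda k: self._score(self._entries[k], now))
        self._used -= self._entries.pop(victim).size_bytes
        return victim
```

With equal weights, a cheap entry that was touched most recently can still be evicted ahead of an older entry that is expensive to regenerate, which is the behaviour a pure recency policy cannot express. The linear scan in `_evict_one` is O(n) per eviction; a real implementation would likely need an indexed structure or sampling.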

Expected Benefits

- Higher cache hit rate
- Lower average retrieval latency
- Better utilization of limited cache capacity
- Improved throughput under realistic workloads

Evaluation Plan

Compare the proposed policy against the current baseline using:

- Cache hit rate
- Mean latency
- P95 latency
- Throughput
- Memory utilization
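
As a rough illustration of how most of these metrics could be computed from a benchmark run, the sketch below summarises a list of per-request latencies and hit/miss flags. The function name `summarize`, its inputs, and the choice of the nearest-rank method for P95 are assumptions made for this example; memory utilization is left out because it depends on how the cache reports its footprint.

```python
from statistics import fmean


def summarize(latencies_s, hit_flags, wall_time_s):
    """Summarise one benchmark run (hypothetical helper).

    latencies_s: per-request latency in seconds
    hit_flags:   True for a cache hit, False for a miss (same order)
    wall_time_s: total wall-clock duration of the run in seconds
    """
    n = len(latencies_s)
    if n == 0 or n != len(hit_flags) or wall_time_s <= 0:
        raise ValueError("need matching, non-empty inputs and positive wall time")

    ordered = sorted(latencies_s)
    # Nearest-rank percentile: rank = ceil(0.95 * n), using integer arithmetic.
    rank = (95 * n + 99) // 100
    return {
        "hit_rate": sum(hit_flags) / n,
        "mean_latency_s": fmean(latencies_s),
        "p95_latency_s": ordered[rank - 1],
        "throughput_rps": n / wall_time_s,
    }
```

Running the same request trace through the baseline and the proposed policy and comparing the two resulting dictionaries keeps the comparison like for like.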
