Retrieval Improvements Plan

Goal: improve MentisDB retrieval quality and scalability without breaking append-only semantics.

The four tracks are mostly parallelizable if we agree on shared interfaces first.

Phase 0: Shared Interfaces

Do this first, then split work.

0.1 Add Retrieval Pipeline Concepts

Define internal structs/enums, likely in src/search/ranked.rs or a new src/search/pipeline.rs.

pub enum RetrievalRoute {
    Lexical,
    SemanticVector,
    ExplicitGraph,
    ImplicitGraph,
    PprGraph,
    PrfExpandedLexical,
    SummaryHierarchy,
}

pub struct RouteScore {
    pub route: RetrievalRoute,
    pub score: f32,
}

pub struct QueryIntent {
    pub temporal: bool,
    pub entity_focused: bool,
    pub agent_focused: bool,
    pub causal: bool,
    pub semantic: bool,
    pub summary_or_global: bool,
}

0.2 Preserve Current Defaults

All new behavior should be gated behind fields in RankedSearchQuery / graph config first.

Example:

{
  "graph": {
    "mode": "bidirectional",
    "algorithm": "bfs",
    "max_depth": 2
  }
}

Later allowed values:

algorithm: "bfs" | "ppr"

Initial default: current behavior.
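A minimal sketch of how the `algorithm` field could be parsed so that a missing value keeps the current behavior. The names `GraphAlgorithm` and `resolve_algorithm` are hypothetical and not taken from the codebase; the real field would presumably live on RankedSearchQuery or the graph config.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the `algorithm` config field.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum GraphAlgorithm {
    /// Current behavior; stays the default so existing callers see no change.
    #[default]
    Bfs,
    /// Opt-in Personalized PageRank expansion.
    Ppr,
}

impl FromStr for GraphAlgorithm {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "bfs" => Ok(GraphAlgorithm::Bfs),
            "ppr" => Ok(GraphAlgorithm::Ppr),
            other => Err(format!("unknown graph algorithm: {other}")),
        }
    }
}

/// Resolve the optional config value; an absent field means "current behavior".
pub fn resolve_algorithm(field: Option<&str>) -> Result<GraphAlgorithm, String> {
    match field {
        None => Ok(GraphAlgorithm::default()),
        Some(raw) => raw.parse(),
    }
}
```

Rejecting unknown values instead of silently falling back to BFS is a design choice in this sketch: it keeps typos in the request visible.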

0.3 Shared Test Fixtures

Create common integration fixtures for:

explicit relations
implicit auto edges
temporal memories
summaries with Summarizes
entity-tagged thoughts
vocabulary mismatch queries

Useful test file:

tests/search_pipeline_integration_tests.rs

Workstream A: PPR Graph Expansion

Can run independently after Phase 0.

Objective

Replace or augment bounded BFS with Personalized PageRank over explicit + implicit graph edges.

Files

src/search/graph.rs
src/search/expansion.rs
src/search/ranked.rs
tests/search_pipeline_integration_tests.rs

Design

Build a weighted graph view:

| Edge Source | Weight |
| --- | --- |
| explicit `References` | 1.0 |
| explicit `Supports` | 1.1 |
| explicit `DerivedFrom` | 1.2 |
| explicit `Corrects` / `Invalidates` | query-dependent |
| implicit cosine edge | cosine score, e.g. `0.85..1.0` |
| temporal adjacency | optional low weight, e.g. `0.15` |

Add a PPR function:

pub struct PprConfig {
    pub damping: f32,
    pub max_iters: usize,
    pub tolerance: f32,
    pub max_nodes: usize,
    pub include_implicit_edges: bool,
}

pub struct PprResult {
    pub scores: HashMap<ThoughtLocator, f32>,
    pub seed_paths: HashMap<ThoughtLocator, Vec<GraphExpansionPath>>,
}

Algorithm:

Seed vector from lexical/vector top results.
Expand graph neighborhood up to max_nodes.
Run power iteration.
Return ranked graph scores.
Merge with existing ranked scoring via current RRF/score fields.
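The power-iteration step above can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: plain `u32` node ids stand in for `ThoughtLocator`, edge weights are assumed non-negative, the graph is assumed to be already truncated to `max_nodes`, and dangling nodes return their mass to the seeds (one common convention). `BTreeMap` is used so iteration order, and therefore floating-point summation order, is the same on every run.

```rust
use std::collections::BTreeMap;

/// Stand-in for `ThoughtLocator` (an assumption for this sketch).
pub type NodeId = u32;

/// Personalized PageRank by power iteration over a weighted adjacency list.
/// `edges[u]` lists `(v, weight)` pairs; `seeds` is the unnormalized restart vector.
pub fn personalized_pagerank(
    edges: &BTreeMap<NodeId, Vec<(NodeId, f32)>>,
    seeds: &BTreeMap<NodeId, f32>,
    damping: f32,
    max_iters: usize,
    tolerance: f32,
) -> BTreeMap<NodeId, f32> {
    let seed_total: f32 = seeds.values().sum();
    if seed_total <= 0.0 {
        return BTreeMap::new();
    }
    // Normalize the seeds into a restart distribution.
    let restart: BTreeMap<NodeId, f32> = seeds
        .iter()
        .map(|(&node, &w)| (node, w / seed_total))
        .collect();

    let mut scores = restart.clone();
    for _ in 0..max_iters {
        // Every iteration starts with the restart mass (1 - damping).
        let mut next: BTreeMap<NodeId, f32> = restart
            .iter()
            .map(|(&node, &r)| (node, (1.0 - damping) * r))
            .collect();
        for (&u, &mass) in &scores {
            let out: &[(NodeId, f32)] = match edges.get(&u) {
                Some(list) => list.as_slice(),
                None => &[],
            };
            let total: f32 = out.iter().map(|&(_, w)| w).sum();
            if total <= 0.0 {
                // Dangling node: hand its mass back to the seeds.
                for (&node, &r) in &restart {
                    *next.entry(node).or_insert(0.0) += damping * mass * r;
                }
            } else {
                // Spread mass along out-edges in proportion to edge weight.
                for &(v, w) in out {
                    *next.entry(v).or_insert(0.0) += damping * mass * w / total;
                }
            }
        }
        // L1 change between iterations, over the union of both key sets.
        let mut delta = 0.0f32;
        for (node, s) in &next {
            delta += (s - scores.get(node).copied().unwrap_or(0.0)).abs();
        }
        for (node, s) in &scores {
            if !next.contains_key(node) {
                delta += s.abs();
            }
        }
        scores = next;
        if delta < tolerance {
            break;
        }
    }
    scores
}
```

Nodes that are never reached from a seed do not appear in the result, which is one way the "2-hop relevant node above unrelated match" test could be expressed.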

Tests

PPR ranks a 2-hop relevant node above unrelated lexical match.
Implicit cosine edges contribute to PPR.
PPR respects max_nodes.
PPR is deterministic across runs.
Disabling PPR preserves the exact existing BFS behavior.

Verification

cargo test ppr
cargo test --test search_pipeline_integration_tests
Compare LoCoMo R@10 against BFS.

Parallel Output

Commit:

feat(search): add personalized pagerank graph expansion

Workstream B: PRF Query Expansion

Can run independently after Phase 0.

Objective

Improve lexical recall by expanding queries using pseudo-relevance feedback from top lexical hits.

Files

src/search/lexical.rs
src/search/ranked.rs
src/search/query_expansion.rs (new)
tests/search_query_expansion_tests.rs

Design

Start with non-LLM PRF.

Pipeline:

Run original lexical query.
Take top N hits, e.g. 5.
Extract high-IDF candidate terms.
Remove stopwords, original query terms, too-common terms.
Weight terms using Rocchio-like formula.
Run expanded lexical query.
Fuse original + expanded route via RRF.
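The term-selection steps of this pipeline might look like the sketch below. It assumes pre-tokenized documents, a document-frequency table supplied by the lexical index, and a simplified Rocchio-style weight that keeps only the positive-feedback centroid (mean IDF contribution across the feedback docs). The function name and signature are hypothetical; stopwords and too-common terms are both handled here by the `min_idf` cutoff.

```rust
use std::collections::{HashMap, HashSet};

/// Pick expansion terms from feedback documents.
/// `doc_freq` maps term -> number of corpus documents containing it.
pub fn select_expansion_terms(
    query_terms: &[&str],
    feedback_docs: &[Vec<&str>],
    doc_freq: &HashMap<&str, usize>,
    corpus_size: usize,
    min_idf: f32,
    max_terms: usize,
) -> Vec<(String, f32)> {
    // No feedback docs means no expansion.
    if feedback_docs.is_empty() {
        return Vec::new();
    }
    let original: HashSet<&str> = query_terms.iter().copied().collect();
    let mut weights: HashMap<&str, f32> = HashMap::new();
    for doc in feedback_docs {
        for &term in doc {
            // Original query terms are never re-added.
            if original.contains(term) {
                continue;
            }
            let df = doc_freq.get(term).copied().unwrap_or(1).max(1);
            let idf = (corpus_size as f32 / df as f32).ln();
            // Low-IDF terms (stopwords, very common words) are dropped.
            if idf < min_idf {
                continue;
            }
            // Each occurrence adds idf / |feedback docs| to the term weight.
            *weights.entry(term).or_insert(0.0) += idf / feedback_docs.len() as f32;
        }
    }
    let mut ranked: Vec<(String, f32)> = weights
        .into_iter()
        .map(|(term, w)| (term.to_string(), w))
        .collect();
    // Highest weight first; ties broken alphabetically for determinism.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(max_terms);
    ranked
}
```

The function returns new terms only and leaves the original query untouched, so the caller can run the expanded query as a separate route and fuse it afterwards.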

Config:

pub struct PrfConfig {
    pub enabled: bool,
    pub feedback_docs: usize,
    pub expansion_terms: usize,
    pub min_idf: f32,
    pub original_weight: f32,
    pub expansion_weight: f32,
}

Expose later as:

{
  "query_expansion": {
    "mode": "none",
    "feedback_docs": 5,
    "expansion_terms": 8
  }
}

Later allowed values:

mode: "none" | "prf"

Important Guardrails

Do not mutate the stored query.
Do not write expansion terms into thoughts.
Keep expansion route visible in result metadata.
Avoid expansion if top lexical hits are weak/noisy.

Tests

Query "trip cost" expands to "invoice vendor payment" from feedback docs.
Expanded route recovers a result missed by original lexical.
Expansion disabled preserves exact old results.
No feedback docs means no expansion.
Very common terms are filtered.

Verification

cargo test query_expansion
LoCoMo smoke test first 200 queries.
LongMemEval check for no R@5 regression.

Parallel Output

Commit:

feat(search): add pseudo-relevance query expansion route

Workstream C: Append-Only Hierarchical Summaries

Can run mostly independently. Needs final integration with ranked search.

Objective

Create persistent summary thoughts that improve long-horizon retrieval without rewriting original memories.

Files

src/lib.rs
src/search/ranked.rs
src/search/summary_index.rs (new)
tests/hierarchical_summary_tests.rs
dashboard/TUI (optional, later)

Design

Use existing primitives:

ThoughtType::Summary
ThoughtRelationKind::Summarizes
refs
timestamps
agent/session metadata

No destructive compaction.

Add API:

pub struct SummaryBuildConfig {
    pub window_size: usize,
    pub overlap: usize,
    pub by_session: bool,
    pub by_agent: bool,
    pub by_entity_type: bool,
}

pub fn build_summary_candidates(&self, config: SummaryBuildConfig) -> Vec<SummaryCandidate>;

Important: MentisDB core should not require an LLM. So split into two layers:

Core:

selects ranges/clusters that need summaries
returns candidate source thought IDs
can create extractive summaries if needed

Daemon/API:

optional LLM-generated summary content
append as normal Summary thought
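For the core layer, range selection could be as simple as a sliding window. The sketch below assumes thoughts are addressed by chain index and that existing summaries are known as half-open index ranges; the `SummaryCandidate` fields and the function name are hypothetical. It is a pure function, so it cannot mutate the chain.

```rust
/// Hypothetical candidate: a half-open range [start, end) of thought indices.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SummaryCandidate {
    pub start: usize,
    pub end: usize,
}

/// Slide a window of `window_size` with `overlap` over `total` thoughts and
/// skip windows that an existing summary range already fully covers.
pub fn window_candidates(
    total: usize,
    window_size: usize,
    overlap: usize,
    already_summarized: &[(usize, usize)],
) -> Vec<SummaryCandidate> {
    // An overlap >= window size would never advance; treat it as invalid config.
    if window_size == 0 || overlap >= window_size {
        return Vec::new();
    }
    let step = window_size - overlap;
    let mut out = Vec::new();
    let mut start = 0;
    // Only full windows are emitted; a trailing partial window waits for more thoughts.
    while start + window_size <= total {
        let end = start + window_size;
        let covered = already_summarized
            .iter()
            .any(|&(s, e)| s <= start && end <= e);
        if !covered {
            out.push(SummaryCandidate { start, end });
        }
        start += step;
    }
    out
}
```

Re-running the selection after a summary has been appended then only requires passing the newly covered range in `already_summarized`.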

Summary Hierarchy

Level 0:

raw thoughts

Level 1:

session/topic summaries

Level 2:

chain-level rolling summaries

Relations:

summary -> raw thoughts via Summarizes
summary -> previous summary via ContinuesFrom or Summarizes

Retrieval Integration

When query looks broad/global:

search summaries first
expand down from summaries to summarized raw thoughts
boost raw thoughts whose parent summary matched
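One way to express the expand-and-boost step, assuming `u64` ids as a stand-in for thought locators and a precomputed map of Summarizes edges. The function name and the multiplicative `boost` factor are illustrative assumptions, not part of the plan.

```rust
use std::collections::HashMap;

/// Propagate summary match scores down to the raw thoughts they summarize.
/// `summarizes` maps summary id -> ids of the summarized raw thoughts.
pub fn boost_from_summaries(
    raw_scores: &mut HashMap<u64, f32>,
    summary_hits: &[(u64, f32)],
    summarizes: &HashMap<u64, Vec<u64>>,
    boost: f32,
) {
    for &(summary_id, summary_score) in summary_hits {
        if let Some(children) = summarizes.get(&summary_id) {
            for &raw_id in children {
                // Raw thoughts missing from the result set are pulled in
                // with the boost as their only score ("expand down").
                *raw_scores.entry(raw_id).or_insert(0.0) += boost * summary_score;
            }
        }
    }
}
```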

Tests

Building summary candidates does not mutate chain.
Appended summaries preserve hash-chain integrity.
Summary retrieval can find raw thought through Summarizes.
Re-running candidate selection skips already summarized ranges.
Works across sessions and entity types.

Verification

cargo test hierarchical_summary
Full integrity test after summary append.
Benchmark global/query-summary questions separately.

Parallel Output

Commit:

feat(memory): add append-only hierarchical summary candidates

Workstream D: Query-Aware Routing

Can run independently after Phase 0, but best integrated after A/B/C.

Objective

Route queries to the right retrieval signals before scoring.

Files

src/search/query_intent.rs (new)
src/search/ranked.rs
src/server.rs (only if API fields are needed)
tests/query_intent_tests.rs

Design

Start deterministic, no LLM.

Heuristics:

| Query Pattern | Intent |
| --- | --- |
| contains date/time words | temporal |
| contains “who”, “which agent”, “by” | agent-focused |
| contains “why”, “because”, “caused” | causal |
| contains “summarize”, “overall”, “all about” | summary/global |
| contains known entity type/concept | entity-focused |
| short abstract query | semantic |

Output:

pub struct QueryRoutingPlan {
    pub lexical_weight: f32,
    pub vector_weight: f32,
    pub graph_weight: f32,
    pub ppr_weight: f32,
    pub temporal_weight: f32,
    pub summary_weight: f32,
    pub enable_prf: bool,
}
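A deterministic classifier along these lines could be sketched as below. The keyword lists, the "three words or fewer" proxy for a short abstract query, and every weight value are placeholder assumptions for illustration; entity detection is left out because it needs the chain's known entity vocabulary.

```rust
#[derive(Debug, Default, Clone, PartialEq)]
pub struct QueryIntent {
    pub temporal: bool,
    pub entity_focused: bool,
    pub agent_focused: bool,
    pub causal: bool,
    pub semantic: bool,
    pub summary_or_global: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct QueryRoutingPlan {
    pub lexical_weight: f32,
    pub vector_weight: f32,
    pub graph_weight: f32,
    pub ppr_weight: f32,
    pub temporal_weight: f32,
    pub summary_weight: f32,
    pub enable_prf: bool,
}

/// True if any query word appears in the keyword list.
fn has_word(words: &[&str], list: &[&str]) -> bool {
    words.iter().any(|w| list.contains(w))
}

/// Keyword heuristics only; no LLM involved.
pub fn classify(query: &str) -> QueryIntent {
    let lower = query.to_lowercase();
    let words: Vec<&str> = lower
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .collect();
    QueryIntent {
        temporal: has_word(&words, &["when", "date", "yesterday", "today", "before", "after"]),
        agent_focused: has_word(&words, &["who", "by"]) || lower.contains("which agent"),
        causal: has_word(&words, &["why", "because", "caused"]),
        summary_or_global: has_word(&words, &["summarize", "overall"])
            || lower.contains("all about"),
        // Needs a known entity vocabulary, which this sketch does not have.
        entity_focused: false,
        // Crude proxy for "short abstract query".
        semantic: words.len() <= 3,
    }
}

/// Map an intent to route weights, starting from an assumed legacy baseline.
pub fn plan_for(intent: &QueryIntent) -> QueryRoutingPlan {
    let mut plan = QueryRoutingPlan {
        lexical_weight: 1.0,
        vector_weight: 1.0,
        graph_weight: 1.0,
        ppr_weight: 0.0,
        temporal_weight: 0.0,
        summary_weight: 0.0,
        enable_prf: false,
    };
    if intent.temporal {
        plan.temporal_weight = 1.0;
    }
    if intent.causal {
        plan.graph_weight = 1.5;
    }
    if intent.semantic {
        plan.vector_weight = 1.5;
    }
    if intent.summary_or_global {
        plan.summary_weight = 1.0;
    }
    plan
}
```

Intents are independent flags rather than a single label, so a query such as "why did the deploy fail yesterday" can raise both the causal and temporal signals.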

Integration

In ranked search:

Build QueryIntent.
Build QueryRoutingPlan.
Execute selected routes.
Fuse route results.
Include route metadata in response.

Tests

“when did…” boosts temporal route.
“who said…” searches agent metadata.
“why did…” boosts causal/DerivedFrom/CausedBy edges.
“summarize…” searches summaries first.
Routing disabled keeps legacy weighting.

Verification

cargo test query_intent
LoCoMo category breakdown if labels available.
Manual probes against existing chains.

Parallel Output

Commit:

feat(search): add query-aware retrieval routing

Integration Phase

After A-D land independently.

I.1 Add Combined Ranked Pipeline

In src/search/ranked.rs:

Order:

Parse query intent.
Run lexical original.
Optionally run PRF expanded lexical.
Run vector route.
Run graph route: BFS or PPR.
Optionally run summary hierarchy route.
Fuse with RRF.
Add route score diagnostics.
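The fusion and diagnostics steps can be illustrated with a plain Reciprocal Rank Fusion sketch, where each route contributes `1 / (k + rank)` with 1-based ranks. The `FusedHit` type, the `u64` ids, and the route names are assumptions for this example; the existing RRF code in ranked search may be shaped differently.

```rust
use std::collections::BTreeMap;

/// Fused result with per-route contributions kept for diagnostics.
#[derive(Debug, Clone, PartialEq)]
pub struct FusedHit {
    pub id: u64,
    pub total: f32,
    pub contributions: Vec<(String, f32)>,
}

/// Reciprocal Rank Fusion over named routes, each a best-first list of ids.
pub fn rrf_fuse(routes: &[(&str, Vec<u64>)], k: f32) -> Vec<FusedHit> {
    let mut by_id: BTreeMap<u64, FusedHit> = BTreeMap::new();
    for (route, ranked_ids) in routes {
        for (index, &id) in ranked_ids.iter().enumerate() {
            // Ranks are 1-based, so the top hit scores 1 / (k + 1).
            let score = 1.0 / (k + (index + 1) as f32);
            let hit = by_id.entry(id).or_insert_with(|| FusedHit {
                id,
                total: 0.0,
                contributions: Vec::new(),
            });
            hit.total += score;
            hit.contributions.push((route.to_string(), score));
        }
    }
    let mut hits: Vec<FusedHit> = by_id.into_values().collect();
    // Highest fused score first; ties broken by id for deterministic output.
    hits.sort_by(|a, b| b.total.total_cmp(&a.total).then_with(|| a.id.cmp(&b.id)));
    hits
}
```

Because each contribution is recorded alongside the total, the same structure could feed the per-result route metadata without a second pass.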

I.2 Response Metadata

Expose per-result route contributions:

{
  "score": {
    "total": 12.4,
    "lexical": 4.1,
    "vector": 3.2,
    "graph": 2.8,
    "ppr": 1.7,
    "prf": 0.6
  }
}

I.3 Config Flags

Env vars or API fields:

MENTISDB_GRAPH_ALGORITHM=bfs|ppr
MENTISDB_PRF_QUERY_EXPANSION=true|false
MENTISDB_QUERY_ROUTING=true|false
MENTISDB_SUMMARY_ROUTE=true|false

Prefer API fields first, env defaults second.
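The precedence rule could be captured in one small resolver. This is a sketch: `resolve_flag` and `prf_enabled` are hypothetical helpers, and treating an unparsable env value as "unset" is an assumption rather than a stated requirement.

```rust
use std::env;

/// Parse the `true|false` values named in the plan; anything else is ignored.
pub fn parse_bool_flag(raw: &str) -> Option<bool> {
    match raw.trim() {
        "true" => Some(true),
        "false" => Some(false),
        _ => None,
    }
}

/// API field wins, then the env value, then the built-in default.
pub fn resolve_flag(
    api_field: Option<bool>,
    env_value: Option<&str>,
    built_in_default: bool,
) -> bool {
    api_field
        .or_else(|| env_value.and_then(parse_bool_flag))
        .unwrap_or(built_in_default)
}

/// Example wiring for one flag; the built-in default stays `false`
/// so that current behavior is preserved when nothing is set.
pub fn prf_enabled(api_field: Option<bool>) -> bool {
    let env_value = env::var("MENTISDB_PRF_QUERY_EXPANSION").ok();
    resolve_flag(api_field, env_value.as_deref(), false)
}
```

Keeping the environment lookup out of `resolve_flag` makes the precedence logic testable without touching process state.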

Benchmark Plan

Smoke

LoCoMo first 200 queries
LongMemEval 100-query subset if available
Compare:
- baseline current
- PPR only
- PRF only
- PPR + PRF
- all routes

Full

LoCoMo-10P R@10
LongMemEval R@5/R@10/R@20

Acceptance Criteria

Ship if:

LoCoMo-10P R@10 improves, or regresses by no more than 0.5pp
LongMemEval R@5 does not drop by more than 2pp
Latency increase is acceptable, ideally under 25% with default settings
New features can be disabled
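These criteria are simple enough to encode as a release gate. The sketch below treats the two recall thresholds as hard limits and the 25% latency figure as a soft warning, which is one reading of "ideally"; the type and function names are hypothetical.

```rust
/// Candidate-minus-baseline deltas from a benchmark run.
#[derive(Debug, Clone, Copy)]
pub struct BenchmarkDelta {
    /// LoCoMo-10P R@10 change in percentage points.
    pub locomo_r10_pp: f32,
    /// LongMemEval R@5 change in percentage points.
    pub longmemeval_r5_pp: f32,
    /// Latency increase in percent, default settings.
    pub latency_increase_pct: f32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GateDecision {
    Ship,
    ShipWithLatencyWarning,
    Block,
}

/// Apply the acceptance thresholds to one benchmark comparison.
pub fn release_gate(delta: BenchmarkDelta) -> GateDecision {
    // Hard limits: at most 0.5pp lost on LoCoMo, at most 2pp lost on LongMemEval.
    if delta.locomo_r10_pp < -0.5 || delta.longmemeval_r5_pp < -2.0 {
        return GateDecision::Block;
    }
    // Soft limit: latency above 25% is flagged, not blocked.
    if delta.latency_increase_pct > 25.0 {
        GateDecision::ShipWithLatencyWarning
    } else {
        GateDecision::Ship
    }
}
```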

Suggested Parallel Assignment

Agent 1: PPR Graph

Owns Workstream A.

Returns:

implementation
tests
benchmark notes

Agent 2: PRF Query Expansion

Owns Workstream B.

Returns:

implementation
tests
top expansion examples

Agent 3: Summary Hierarchy

Owns Workstream C.

Returns:

candidate builder
append-only summary flow
tests

Agent 4: Query Routing

Owns Workstream D.

Returns:

intent classifier
routing weights
tests

Coordinator

Owns:

Phase 0 interfaces
integration
benchmarks
docs/changelog
final release gate

Best First Slice

If we want maximum impact with minimum risk:

Implement PRF query expansion first.
Implement PPR graph expansion second.
Benchmark both independently.
Only then add query routing and summary hierarchy.

Reason: PRF and PPR directly target retrieval quality and can be evaluated fast. Summary hierarchy is valuable but has more product/API surface area.
