Incremental updates silently lose all unchanged nodes — merge-batch-graphs.py drops batch-existing.json #402

@maizoro87

Summary

On incremental /understand updates, the merged graph silently loses every unchanged node/edge — the result contains only the freshly re-analyzed (changed) files.

Version: 2.7.6 (Claude Code plugin)

Root cause

skills/understand/SKILL.md (Phase 2 — ANALYZE → Incremental update path) instructs:

Write the pruned existing nodes/edges as batch-existing.json in the intermediate directory … Run the same merge script — it will combine batch-existing.json with the fresh batch-*.json files.

But skills/understand/merge-batch-graphs.py discovers batch files with a numeric-only regex:

batch-(\d+)(?:-part-(\d+))?\.json

batch-existing.json does not match \d+, so the merge silently skips it, discarding all surviving (unchanged) nodes and edges. The SKILL.md itself even warns about this regex in the full path (re: fused names), but the incremental instruction violates it.
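
The mismatch can be checked in isolation. This is a minimal sketch using only the pattern quoted above; the filenames are illustrative, and the use of `fullmatch` is an assumption about how the script applies the pattern, not a copy of its code.

```python
import re

# Pattern quoted in this issue. How merge-batch-graphs.py applies it
# (fullmatch vs. match/search) is an assumption made for this sketch.
BATCH_RE = re.compile(r"batch-(\d+)(?:-part-(\d+))?\.json")

# Illustrative filenames: three numeric batches plus the documented name.
names = [
    "batch-1.json",
    "batch-2-part-3.json",
    "batch-900.json",
    "batch-existing.json",
]

matched = [n for n in names if BATCH_RE.fullmatch(n)]
skipped = [n for n in names if not BATCH_RE.fullmatch(n)]

print(matched)  # ['batch-1.json', 'batch-2-part-3.json', 'batch-900.json']
print(skipped)  # ['batch-existing.json']
```

Because `batch-` must be followed by at least one digit, `batch-existing.json` also fails under `re.match` or `re.search`, so the result does not depend on which of these the script uses.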

Reproduction

  1. Run /understand (full) on a repo → graph has N nodes.
  2. Commit a change touching a few files.
  3. Run /understand again (incremental).
  4. Observe the node count collapse to only the changed files' nodes; all unchanged nodes are gone.
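
To make step 4 less manual, the two graphs can be compared programmatically. The sketch below assumes the graph JSON has top-level `nodes` and `edges` arrays and that each node carries an `id` field; both are assumptions about the file layout, and the helper names are hypothetical.

```python
def graph_size(graph):
    """Return (node_count, edge_count) for a graph dict.

    Assumes top-level "nodes" and "edges" lists (layout not confirmed).
    """
    return len(graph.get("nodes", [])), len(graph.get("edges", []))


def lost_node_ids(before, after):
    """Return sorted ids present in `before` but missing from `after`.

    Assumes every node dict has an "id" key (hypothetical field name).
    """
    before_ids = {node["id"] for node in before.get("nodes", [])}
    after_ids = {node["id"] for node in after.get("nodes", [])}
    return sorted(before_ids - after_ids)


# Tiny in-memory stand-ins for the graph before and after an incremental run.
before = {
    "nodes": [{"id": "a"}, {"id": "b"}, {"id": "c"}],
    "edges": [{"source": "a", "target": "b"}],
}
after = {"nodes": [{"id": "c"}], "edges": []}

print(graph_size(before))            # (3, 1)
print(graph_size(after))             # (1, 0)
print(lost_node_ids(before, after))  # ['a', 'b']
```

With real data, the two dicts would come from `json.load` on a copy of the graph saved before step 3 and on the graph produced after it.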

Evidence

Real run, 2026-06-05, v2.7.6, a ~250-node TS/React project:

  • As written (batch-existing.json): merge produced 94 nodes / 164 edges (157 surviving nodes lost).
  • Renaming the pruned file to batch-900.json: merge produced the correct 251 nodes / 536 edges.

Impact

Silent data loss on the skill's headline incremental feature — and it's invisible (no error; the graph just shrinks).

Suggested fix (any one)

  1. merge-batch-graphs.py: widen the regex, e.g. batch-(\d+|existing)(?:-part-(\d+))?\.json and treat existing as a normal source.
  2. SKILL.md: instruct the incremental path to write the pruned graph as a high numeric index (e.g. batch-900.json) instead of batch-existing.json.
  3. Have the incremental path write batch-<maxBatchIndex+1>.json.

Option 1 is the most robust (keeps the documented name working).
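
A rough sketch of what option 1 could look like. The widened pattern is the one proposed above; the function name `select_batches`, the use of `fullmatch`, and the choice to order `existing` ahead of the numeric batches are assumptions for illustration, not the script's actual behavior.

```python
import re

# Widened pattern from option 1: a numeric index or the literal "existing".
BATCH_RE = re.compile(r"batch-(\d+|existing)(?:-part-(\d+))?\.json")


def select_batches(filenames):
    """Return matching batch filenames in merge order (hypothetical helper).

    "existing" is placed before all numeric batches, which is an assumed
    ordering; numeric batches are ordered by index, then by part number.
    """
    selected = []
    for name in filenames:
        m = BATCH_RE.fullmatch(name)
        if m is None:
            continue  # not a batch file, e.g. notes.json
        index, part = m.group(1), m.group(2)
        if index == "existing":
            key = (0, 0, 0)
        else:
            key = (1, int(index), int(part or 0))
        selected.append((key, name))
    return [name for _, name in sorted(selected)]


print(select_batches([
    "batch-2.json",
    "notes.json",
    "batch-existing.json",
    "batch-1-part-2.json",
    "batch-1-part-1.json",
]))
# ['batch-existing.json', 'batch-1-part-1.json', 'batch-1-part-2.json', 'batch-2.json']
```

Whether the surviving nodes should be merged first or last depends on how the merge resolves duplicate nodes, which this issue does not describe, so the ordering here should be checked against the real script.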
