
Knowledge Tree: Structured Long-Term Memory for LLMs

Moving beyond context limits with navigable, structured knowledge trees

June 4, 2024


Abstract

Large Language Models are stateless, and context windows are finite. Existing solutions - RAG, extended context, recurrence - each have fundamental tradeoffs. Knowledge Tree takes a different approach: build a navigable tree of structured nodes from long-form content, then let the LLM reason about which branches to explore. Unlike prior work using plain text summaries, each node contains typed metadata (content types, decisions, actions, events) that informs navigation. Combined with partial-answer detection and multi-path retry logic, this enables question answering over corpora far exceeding context limits.


1. Introduction

Large Language Models have transformed how we interact with information. They can summarize documents, answer questions, generate code, and reason through complex problems. Yet they suffer from a fundamental limitation: they are stateless. Every conversation starts fresh. Every context window has a hard limit. The moment content exceeds that limit, something must be discarded.

This matters because real-world knowledge work isn't stateless. Projects accumulate months of decisions, discussions, and documentation. Teams need to recall why a particular architecture was chosen six months ago, or what blockers were discussed in last quarter's retrospectives. The information exists - scattered across tickets, documents, and meeting notes - but it far exceeds what any context window can hold.

1.1 The Limits of Current Approaches

Several approaches attempt to bridge this gap, each with fundamental tradeoffs:

Extended Context Windows

Models now support 100K, 200K, even 1M tokens. But longer isn't always better:

  • Attention quality degrades with distance; models exhibit "lost in the middle" effects where information in the center of long contexts is poorly recalled
  • Positional bias causes models to favor content at the beginning or end
  • Cost scales linearly (or worse) with context length
  • The context window remains finite - eventually, you hit the wall

Retrieval-Augmented Generation (RAG)

RAG systems retrieve relevant chunks and inject them into the prompt. This works well for document search but struggles with coherent long-text understanding:

  • Retrieval is optimized for similarity, not relevance to complex queries
  • Chunks are selected independently, losing narrative coherence
  • Multi-hop reasoning ("find X, then use X to find Y") requires multiple retrieval rounds
  • No mechanism to backtrack when retrieved content proves unhelpful

Recurrence and Summarization

Recurrent approaches compress earlier content into summaries, carrying them forward:

  • Each compression step loses information
  • Older content fades as it passes through multiple summarization layers
  • No way to recover detail once it's been compressed away
  • The compression isn't query-aware - important details for future questions may be discarded

1.2 A Different Approach: Memory as Navigation

Knowledge Tree treats long-form memory as a structure to navigate rather than content to retrieve or compress.

This mirrors how humans handle large bodies of knowledge. We don't load everything into working memory at once. We build mental models - hierarchies of concepts, indexes of where to find what, intuitions about which areas are relevant to which questions. When we need specific information, we navigate to it: "That decision was made during the architecture review... which happened after the Q2 planning... let me look at the technical decisions from that period."

Applied to LLMs, this means:

  1. Build a navigable structure: Transform long-form content into a tree of nodes, each containing structured metadata about what lies beneath
  2. Navigate, don't retrieve: When a query arrives, the LLM reasons about which branches are most likely to contain relevant information
  3. Extract structured knowledge: Nodes aren't just text summaries - they contain typed metadata (decisions, actions, events, topics) that inform navigation
  4. Recover from wrong turns: If a path proves unhelpful, backtrack and try alternatives

The LLM becomes an active explorer rather than a passive recipient of retrieved chunks. It reasons about where to look, evaluates what it finds, and adapts when its first guess is wrong.

1.3 Contributions

This paper presents Knowledge Tree, extending prior work on interactive reading (MemWalker) with several key innovations:

  • Structured node extraction: Beyond plain summaries, we extract typed metadata - content types, decisions, actions, events - that enables more informed navigation
  • Content type taxonomy: A predefined categorization scheme ensures consistent tagging and enables categorical reasoning ("look for decisions, not meeting notes")
  • Robust partial-answer handling: Explicit detection of incomplete answers triggers systematic exploration of alternative paths
  • Multi-attempt navigation: Configurable retry logic across branches and leaves prevents early termination on wrong paths

Together, these enable question answering over corpora far exceeding context limits.

1.4 Paper Organization

  • Section 2 reviews MemWalker and identifies opportunities for improvement
  • Section 3 details Knowledge Tree's key innovations
  • Section 4 explains the construction and navigation algorithms
  • Section 5 walks through a practical example with real project data
  • Section 6 discusses implementation considerations
  • Section 7 addresses limitations and future directions
  • Section 8 concludes


2. Background: MemWalker

MemWalker (Chen et al., 2023) introduced an interactive reading approach for long-context understanding:

What MemWalker Got Right

  • Two-stage approach: Build a tree structure, then navigate it
  • Iterative prompting: LLM decides which branch to explore
  • Revert capability: Can backtrack if a path proves unfruitful
  • Working memory: Carries context from visited nodes

What MemWalker Left on the Table

  • Plain text summaries: Nodes contain only text, no structure
  • Generic navigation: Chooses based on summary similarity alone
  • Binary outcomes: Limited handling of partial information
  • Single-purpose: Tree is built for one query, then discarded


3. Knowledge Tree: Key Innovations

3.1 Structured Node Extraction

Where MemWalker creates text summaries, Knowledge Tree extracts structured knowledge:

Field             | Purpose                      | Example
Summary           | Concise overview of content  | "Team discussed Q3 roadmap priorities"
Content Types     | Categorization from taxonomy | ["Meeting minutes", "Project plans"]
Critical Actions  | Action items, tasks, TODOs   | "Design review scheduled for Friday"
Decisions         | Choices made, commitments    | "Decided to use PostgreSQL over MongoDB"
Noteworthy Events | Important occurrences        | "Client approved the proposal"
About             | Topics/entities mentioned    | ["authentication", "API redesign", "Q3 goals"]
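These fields map naturally onto a small data class. A minimal Python sketch (the class and attribute names are our own, chosen to mirror the table above):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NodeSummary:
    """Structured metadata extracted for each Knowledge Tree node."""
    summary: str                                              # concise overview
    content_types: list[str] = field(default_factory=list)    # from the taxonomy
    critical_actions: list[str] = field(default_factory=list) # tasks, TODOs
    decisions: Optional[str] = None                           # choices, commitments
    noteworthy_events: Optional[str] = None                   # important occurrences
    about: list[str] = field(default_factory=list)            # topics/entities

# Example: the metadata from the table above
leaf = NodeSummary(
    summary="Team discussed Q3 roadmap priorities",
    content_types=["Meeting minutes", "Project plans"],
    critical_actions=["Design review scheduled for Friday"],
    decisions="Decided to use PostgreSQL over MongoDB",
    noteworthy_events="Client approved the proposal",
    about=["authentication", "API redesign", "Q3 goals"],
)
```

Keeping the optional fields nullable matters later: empty `decisions` or `critical_actions` are themselves navigation signals ("this node records no choices").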

The Content Type Taxonomy

A predefined taxonomy of 50+ content types enables:

  • Consistent categorization across nodes
  • Filtering during navigation ("look for decisions, not meeting notes")
  • Future hybrid retrieval (semantic + category)

Example categories:

  • Meeting notes & minutes
  • Task records & tickets
  • Design documents
  • Decisions & agreements
  • Requirements & specifications

The taxonomy is domain-specific - define categories that match your content.

Why Structure Helps Navigation

Plain summary comparison:

"Summary 0: The team met to discuss project updates..."
"Summary 1: Technical review of the authentication system..."


Structured comparison:

"Option 0: Meeting minutes | Decisions: None | About: [status updates, timeline]"
"Option 1: Design document | Decisions: OAuth2 selected | About: [authentication, security]"


The LLM can now reason: "The question asks about auth decisions. Option 1 explicitly contains decisions about authentication."


3.2 Informed Navigation

Navigation prompts present full metadata, not just summaries:

Options:
- Index: 0
  Summary: "..."
  Content Types: [Meeting minutes]
  Decisions: None
  Critical Actions: ["Schedule follow-up"]
  About: [roadmap, timeline]

- Index: 1
  Summary: "..."
  Content Types: [Design document]
  Decisions: "Selected OAuth2 for authentication"
  Critical Actions: None
  About: [authentication, security, API]

This gives the LLM multiple signals to reason about relevance.
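Assembling that Options section from node metadata is mechanical. A sketch (the `format_options` helper and its dict keys are illustrative, not the system's actual API):

```python
def format_options(options: list[dict]) -> str:
    """Render child-node metadata as the Options section of a navigation prompt.

    The dict keys (summary, content_types, ...) are illustrative field
    names, not the system's actual schema.
    """
    lines = ["Options:"]
    for i, opt in enumerate(options):
        lines.append(f"- Index: {i}")
        lines.append(f'  Summary: "{opt.get("summary", "")}"')
        lines.append(f"  Content Types: {opt.get('content_types', [])}")
        lines.append(f"  Decisions: {opt.get('decisions') or 'None'}")
        lines.append(f"  Critical Actions: {opt.get('critical_actions') or 'None'}")
        lines.append(f"  About: {opt.get('about', [])}")
    return "\n".join(lines)

prompt = format_options([
    {"summary": "...", "content_types": ["Design document"],
     "decisions": "Selected OAuth2 for authentication",
     "about": ["authentication", "security", "API"]},
])
```

Rendering absent fields as an explicit `None` (rather than omitting them) lets the LLM use their absence as a signal.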


3.3 Graceful Degradation

Real-world queries often require exploring multiple paths. Knowledge Tree handles this with:

Partial Answer vs No Answer

Response        | Meaning                                | Action
No Answer       | Content is irrelevant to query         | Try different leaf/branch
Partial Answer  | Some information found, but incomplete | Try additional paths, may combine
Complete Answer | Query fully satisfied                  | Return response
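The three outcomes map naturally onto an enum plus a small classifier over the leaf response's boolean flags. A minimal sketch (names are our own):

```python
from enum import Enum

class AnswerStatus(Enum):
    NO_ANSWER = "no_answer"  # content irrelevant: try a different leaf/branch
    PARTIAL = "partial"      # incomplete: try additional paths, may combine
    COMPLETE = "complete"    # query fully satisfied: return response

def classify(response: dict) -> AnswerStatus:
    """Map the leaf-evaluation booleans onto the three outcomes above."""
    if response.get("No Answer"):
        return AnswerStatus.NO_ANSWER
    if response.get("Partial Answer"):
        return AnswerStatus.PARTIAL
    return AnswerStatus.COMPLETE
```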

Multi-Attempt Strategy

max_branch_attempts = 3
leaves_per_branch = 2

for each branch attempt:
    select best branch
    for each leaf attempt:
        select best leaf
        try to answer
        if complete: return
        if partial/none: remove leaf, retry
    if still incomplete: remove branch, retry

This systematic exploration prevents early termination on wrong paths.
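The strategy above can be fleshed out into runnable form. In this sketch, `select_best` and `try_answer` are hypothetical stand-ins for the LLM navigation and leaf-evaluation calls, and the `"leaves"` key is an assumed branch structure:

```python
def navigate(branches, query, select_best, try_answer,
             max_branch_attempts=3, leaves_per_branch=2):
    """Multi-attempt navigation: retry across leaves, then across branches.

    `select_best(options, query)` and `try_answer(leaf, query)` stand in
    for the LLM calls; `try_answer` returns (status, answer) where status
    is "complete", "partial", or "none".
    """
    best_partial = None
    branches = list(branches)
    for _ in range(max_branch_attempts):
        if not branches:
            break
        branch = select_best(branches, query)
        leaves = list(branch["leaves"])
        for _ in range(leaves_per_branch):
            if not leaves:
                break
            leaf = select_best(leaves, query)
            status, answer = try_answer(leaf, query)
            if status == "complete":
                return answer
            if status == "partial" and best_partial is None:
                best_partial = answer
            leaves.remove(leaf)        # remove tried leaf, retry
        branches.remove(branch)        # still incomplete: remove branch, retry
    return best_partial                # best partial answer, or None
```

Removing tried leaves and branches from the candidate lists is what prevents the loop from re-selecting the same dead end.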


4. How It Works

4.1 Tree Construction

Diagram

flowchart TB
    subgraph "Input"
        CONTENT[/"Long-form content<br/>(documents, tickets, notes)"/]
    end

    subgraph "Stage 1: Segmentation"
        CONTENT --> SEG["Split into fixed-size chunks<br/>(~5000 characters each)"]
    end

    subgraph "Stage 2: Leaf Extraction"
        SEG --> LEAF["For each chunk:<br/>Extract structured summary<br/>+ Content Types<br/>+ Decisions<br/>+ Actions<br/>+ Events<br/>+ About"]
        LEAF --> LEAVES[("LEAF nodes")]
    end

    subgraph "Stage 3: Branch Synthesis"
        LEAVES --> GROUP["Group leaves<br/>(5-8 per branch)"]
        GROUP --> BRANCH["Aggregate summaries<br/>Merge Content Types<br/>Merge About topics"]
        BRANCH --> BRANCHES[("BRANCH nodes")]
    end

    subgraph "Stage 4: Root Creation"
        BRANCHES --> ROOT["Synthesize final summary<br/>from all branches"]
        ROOT --> ROOTNODE[("ROOT node")]
    end

Each node contains:

  • Content: Original text (for leaves) or child references (for branches/root)
  • Summary: Structured metadata object with all extracted fields
  • Parents: References to child nodes for traversal (note: despite the name, this field points downward, to children)
  • Level: Node type identifier (Leaf, Branch, Root)
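The four construction stages can be sketched end to end. Here `extract_summary` is a stand-in for the LLM summarization call, and the node dicts use the field names above:

```python
def build_tree(text, extract_summary, chunk_size=5000, group_size=6):
    """Four-stage construction sketch: chunks -> leaves -> branches -> root.

    `extract_summary(content, level)` stands in for the LLM summarization
    call and returns a structured-metadata dict.
    """
    # Stage 1: fixed-size segmentation (~5000 characters per chunk)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Stage 2: one leaf per chunk, each with structured metadata
    leaves = [{"level": "Leaf", "content": c,
               "summary": extract_summary(c, "Leaf")} for c in chunks]
    # Stage 3: group 5-8 leaves per branch and aggregate their summaries
    branches = []
    for i in range(0, len(leaves), group_size):
        group = leaves[i:i + group_size]
        branches.append({"level": "Branch", "parents": group,
                         "summary": extract_summary(
                             [l["summary"] for l in group], "Branch")})
    # Stage 4: synthesize the root from all branches
    root = {"level": "Root", "parents": branches,
            "summary": extract_summary(
                [b["summary"] for b in branches], "Root")}
    return root
```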

4.2 Query Navigation

Diagram

flowchart TB
    START([Query Received]) --> ROOT[Start at ROOT]
    ROOT --> PRESENT[Present child nodes<br/>with full metadata]
    PRESENT --> SELECT{LLM selects<br/>best option}

    SELECT --> |"Selected BRANCH"| DESCEND[Descend to branch]
    DESCEND --> PRESENT

    SELECT --> |"Selected LEAF"| ATTEMPT[Attempt to answer<br/>from leaf content]
    ATTEMPT --> CHECK{Answer<br/>complete?}

    CHECK --> |"Complete"| RETURN([Return answer])
    CHECK --> |"Partial/None"| BACKTRACK[Remove tried leaf<br/>Backtrack]
    BACKTRACK --> RETRY{More leaves<br/>to try?}
    RETRY --> |"Yes"| PRESENT
    RETRY --> |"No"| BRANCHBACK[Backtrack to<br/>parent branch]
    BRANCHBACK --> RETRY2{More branches<br/>to try?}
    RETRY2 --> |"Yes"| PRESENT
    RETRY2 --> |"No"| BEST([Return best<br/>partial answer])

The navigation loop continues until either:

  1. A complete answer is found
  2. All retry attempts are exhausted (returns best partial answer)
  3. No relevant content exists (returns "no answer")

4.3 The Navigation Loop

At each node, the LLM:

  1. Observes structured metadata from all child options
  2. Reasons about which option most likely contains relevant information
  3. Decides which path to explore
  4. Evaluates whether the answer is complete
  5. Adapts by backtracking if the path proves unfruitful

This loop continues until a complete answer is found or all paths are exhausted.

See Section 5 for a complete end-to-end walkthrough with real data.

5. Practical Example: Project Management

To illustrate Knowledge Tree in practice, we walk through a real scenario: a project management assistant that needs to answer questions about a software project's history spanning several months of activity.

5.1 The Scenario

Input corpus: 6 months of project data for "Botterfly MVP" including:

  • 50+ Jira tickets with descriptions, status changes, and assignments
  • Project documentation (features, architecture, integrations)
  • Team information and roles
  • Business model and pitch deck content

Total content size: ~40,000 tokens - beyond many models' context windows at the time, and too costly to process in full on every query

Goal: Answer natural language questions like:

  • "What UI issues were reported in November?"
  • "Who is working on the notification system?"
  • "What integrations are planned?"


5.2 Tree Construction

The construction phase transforms raw content into a navigable structure:

Diagram

flowchart TB
    subgraph Input
        RAW[/"40,000 tokens of project data"/]
    end

    subgraph "Stage 1: Chunking"
        RAW --> C1[Chunk 1<br/>5000 chars]
        RAW --> C2[Chunk 2<br/>5000 chars]
        RAW --> C3[Chunk 3<br/>5000 chars]
        RAW --> C4[...]
        RAW --> C8[Chunk 8<br/>5000 chars]
    end

    subgraph "Stage 2: Leaf Extraction"
        C1 --> L1[Leaf 1]
        C2 --> L2[Leaf 2]
        C3 --> L3[Leaf 3]
        C4 --> L4[...]
        C8 --> L8[Leaf 8]
    end

    subgraph "Stage 3: Branch Aggregation"
        L1 --> B1[Branch 1]
        L2 --> B1
        L3 --> B1
        L4 --> B2[Branch 2]
        L8 --> B2
    end

    subgraph "Stage 4: Root Synthesis"
        B1 --> ROOT[Root]
        B2 --> ROOT
    end

Each node stores structured metadata, not just text summaries.


5.3 Node Structure

A leaf node extracted from ticket data might look like:

LEAF NODE: L3
├── level: "Leaf"
├── content: [raw chunk - 5000 chars of ticket data]
└── summary:
    ├── Summary: "UI refinements for dashboard including navbar
    │            fixes, margin adjustments, and scroll behavior"
    ├── Content Types: ["Bug & issue tracking records",
    │                   "Task lists & tickets"]
    ├── Critical Actions: ["Fix navbar font color",
    │                      "Adjust margins per Figma",
    │                      "Make columns scrollable"]
    ├── Decisions: "Replace history icon with new chat icon"
    ├── Noteworthy Events: None
    └── About: ["UI", "dashboard", "navbar", "Beenish Khan",
                "BMVP-66", "scroll behavior"]

A branch node aggregating multiple leaves:

BRANCH NODE: B1
├── level: "Branch"
├── parents: [L1._id, L2._id, L3._id]
└── summary:
    ├── Summary: "Frontend development tasks including UI fixes,
    │            component design, and dashboard implementation"
    ├── Content Types: ["Bug & issue tracking records",
    │                   "Task lists & tickets",
    │                   "Design documents"]
    ├── Critical Actions: [aggregated from children]
    ├── Decisions: [aggregated from children]
    └── About: ["UI", "dashboard", "React", "frontend", ...]
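Branch aggregation - the "[aggregated from children]" entries above - can be sketched as a merge over child metadata. The deduplicating union is an assumption about the merge policy; in practice the branch-level Summary would come from an LLM synthesis call rather than the simple join shown here:

```python
def aggregate_children(children: list[dict]) -> dict:
    """Merge leaf summaries into branch-level metadata (sketch).

    Content Types and About become order-preserving deduplicated unions;
    Decisions are collected from children that have them; the Summary
    join is a placeholder for an LLM synthesis call.
    """
    def union(key):
        seen, out = set(), []
        for child in children:
            for item in child.get(key, []):
                if item not in seen:
                    seen.add(item)
                    out.append(item)
        return out

    return {
        "Summary": " / ".join(c.get("Summary", "") for c in children),
        "Content Types": union("Content Types"),
        "Critical Actions": union("Critical Actions"),
        "Decisions": [c["Decisions"] for c in children if c.get("Decisions")],
        "About": union("About"),
    }
```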

5.4 Navigation Walkthrough

Query: "What UI issues were reported and who is fixing them?"

Diagram

flowchart TB
    subgraph "Step 1: Root Selection"
        ROOT[ROOT] --> |"Present options"| CHOOSE1{LLM Chooses}
        CHOOSE1 --> |"Option 0: Frontend tasks<br/>Content Types: Bug tracking, Tasks<br/>About: UI, dashboard, React"| B1[Branch 1 ✓]
        CHOOSE1 -.-> |"Option 1: Backend & integrations<br/>Content Types: API docs, Integration<br/>About: MS Teams, APIs"| B2[Branch 2]
    end

    subgraph "Step 2: Branch Selection"
        B1 --> |"Present leaves"| CHOOSE2{LLM Chooses}
        CHOOSE2 --> |"Leaf 2: Sign-up changes<br/>Decisions: Role dropdown"| L2[Leaf 2]
        CHOOSE2 -.-> |"Leaf 3: Dashboard UI issues<br/>Actions: Fix navbar, margins<br/>About: UI, Beenish Khan"| L3[Leaf 3 ✓]
    end

    subgraph "Step 3: Answer Attempt"
        L2 --> |"First attempt"| ANS1{Try Answer}
        ANS1 --> |"Partial Answer:<br/>Found assignee but<br/>not all UI issues"| RETRY[Backtrack]
        RETRY --> L3
        L3 --> ANS2{Try Answer}
        ANS2 --> |"Complete Answer"| DONE[Return Response]
    end

    style B1 fill:#90EE90
    style L3 fill:#90EE90
    style DONE fill:#90EE90

5.5 The Navigation in Detail

At ROOT, the LLM sees:

Option | Summary                                | Content Types         | About
0      | Frontend tasks: UI fixes, dashboard... | Bug tracking, Tasks   | UI, React, dashboard
1      | Backend: APIs, integrations, auth...   | API docs, Integration | MS Teams, Jira, API

The LLM reasons: "Question asks about UI issues. Option 0 explicitly mentions 'UI fixes' and has Content Type 'Bug tracking'. Selecting Option 0."

At Branch 1, the LLM sees leaves:

Option | Summary                                | Critical Actions            | About
0      | Convert frontend from VueJS to ReactJS | None                        | VueJS, ReactJS
1      | Sign-up form changes                   | None                        | sign-up, dropdown
2      | Dashboard UI issues                    | Fix navbar, margins, scroll | UI, Beenish Khan, BMVP-66
3      | Sidebar implementation                 | None                        | sidebar, dashboard

The LLM's first pick is Option 1 (Leaf 2): its summary mentions form changes, a plausible UI match - but, as the next step shows, an incomplete one.

At Leaf 2 (first attempt - wrong path):

The LLM finds sign-up related content but not comprehensive UI issues.

Response: { "Partial Answer": true, "Answer": "Found role dropdown change..." }

Retry mechanism triggers → removes Leaf 2, tries Leaf 3

At Leaf 3 (correct path):

The LLM finds the full ticket content:

"Font color in navbar is all white. Not visible. Paddings and margins are bigger than in the UI design... Each column needs to be individually scrollable..."

Assignee: Beenish Khan

Response: { "Partial Answer": false, "Answer": "Several UI issues were reported including navbar visibility, margin adjustments, scroll behavior, and icon changes. Beenish Khan is assigned to fix these issues (BMVP-66)." }


5.6 Where Knowledge Tree Made the Difference

Structured Metadata Enabled Precise Navigation

Without Structure                | With Structure
"Summary mentions dashboard..."  | Content Types: ["Bug tracking"] → clearly issue-related
"Might be about UI?"             | About: ["UI", "navbar", "Beenish Khan"] → confirms relevance
"Unknown who's responsible"      | Critical Actions list shows specific fixes

Partial Answer Handling Prevented False Negatives

Diagram

flowchart LR
    Q[Query] --> L2[Leaf 2]
    L2 --> |"MemWalker would stop here<br/>with incomplete answer"| FAIL[❌ Incomplete]

    Q --> L2B[Leaf 2]
    L2B --> |"Partial Answer = true"| RETRY[Retry]
    RETRY --> L3[Leaf 3]
    L3 --> |"Complete Answer"| SUCCESS[✓ Full Answer]

    style FAIL fill:#FFB6C1
    style SUCCESS fill:#90EE90

The original MemWalker approach would have returned after the first leaf, missing critical information. Knowledge Tree's explicit partial answer handling triggered exploration of additional leaves.

Content Type Taxonomy Avoided Wrong Branches

The query mentioned "issues" - the taxonomy distinguished between:

  • Bug & issue tracking records (correct)
  • Meeting minutes & notes (wrong - would contain discussion, not tickets)
  • Project plans & roadmaps (wrong - future-looking, not issues)

This categorical signal helped the LLM avoid branches that might have similar keywords but wrong content types.


5.7 Tree Statistics

For this example corpus:

Metric                | Value
Input tokens          | ~40,000
Leaf nodes            | 8
Branch nodes          | 2
Tree depth            | 3 (Root → Branch → Leaf)
Avg navigation steps  | 2.5
Tokens read per query | ~15,000 (37% of total)

The tree structure reduced the tokens processed per query by 63% compared to feeding the entire corpus.


6. Implementation Considerations

Model Selection

  • Reasoning capability is critical (per original MemWalker findings)
  • 70B+ parameter models recommended for complex navigation
  • Smaller models may work for simpler trees / fewer branches

Chunk Size & Tree Depth Tradeoffs

Larger chunks                 | Smaller chunks
Fewer nodes, shallower tree   | More nodes, deeper tree
More context per leaf         | More precise localization
Risk losing detail in summary | Risk fragmenting coherent content

Recommended starting point: ~5000 characters per leaf segment.
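Fixed-size slicing can cut a sentence in half. A softer variant packs whole paragraphs up to the target size; the paragraph-boundary heuristic below is our own suggestion, not part of the original pipeline:

```python
def split_chunks(text: str, target: int = 5000) -> list[str]:
    """Split on paragraph boundaries, packing paragraphs up to ~target chars.

    Chunks stay near the recommended size without cutting a paragraph
    mid-sentence; a single paragraph longer than `target` still becomes
    its own (oversized) chunk.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > target:
            chunks.append(current)       # flush: adding para would overshoot
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks
```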

Token Budget

  • Navigation prompts grow with number of children
  • Limit children per node (5-8 recommended)
  • Working memory may need truncation for deep traversals


7. Limitations & Future Work

Current Limitations

  • Scaling: Very large corpora produce large trees; construction cost grows linearly
  • Static structure: Tree is built once; updates require reconstruction
  • Single-tree: One tree per corpus; no cross-tree navigation

Future Directions

  • Incremental updates: Add new content without full rebuild
  • Hybrid retrieval: Combine tree navigation with vector similarity
  • Multi-tree federation: Navigate across multiple knowledge trees
  • Self-improvement: Use query patterns to restructure tree over time


8. Conclusion

Knowledge Tree treats long-form memory as a structure to navigate rather than text to retrieve. The LLM decides which branches to explore, evaluates what it finds, and backtracks when needed. Three properties make this work:

Structure enables reasoning. Plain text summaries force the LLM to infer relevance from prose. Structured metadata provides explicit signals: content types enable categorical matching, decision fields surface choices directly, action fields highlight tasks. The LLM reasons about structure, not just semantics.

Graceful degradation beats early termination. Real queries often require information scattered across multiple locations. Explicit partial-answer detection and multi-path retry logic keep the system exploring until it finds complete answers or exhausts its options.

The tree is reusable. Unlike embeddings, Knowledge Tree nodes contain human-readable structured knowledge. A tree built for one query serves future queries, and can support purposes beyond Q&A: generating summaries, identifying patterns, onboarding.

When to Use Knowledge Tree

Knowledge Tree is well-suited for scenarios involving:

  • Coherent long-form content: Project histories, documentation, conversation logs - where context and narrative matter
  • Complex queries: Questions requiring reasoning across multiple pieces of information
  • Persistent knowledge bases: Corpora that will be queried repeatedly, justifying upfront construction cost
  • Explainability needs: The navigation trace shows exactly which content informed each answer

It is less suited for:

  • Simple keyword lookup (traditional search is faster)
  • Rapidly changing content (tree reconstruction has cost)
  • Single-use queries over disposable content

Looking Forward

The limitations noted in Section 7 - static construction, single-tree scope - are engineering challenges, not fundamental barriers. As LLM reasoning capabilities improve, structured navigation approaches become more viable.

The shift is conceptual: from "what text matches this query?" to "where should I look?" That reframing sidesteps the context window problem entirely.


Knowledge Tree builds on the MemWalker approach introduced by Chen et al. (2023), extending it with structured extraction, content taxonomies, and robust partial-answer handling.

Appendix: Prompt Templates

Knowledge Tree uses three core prompt templates: one for tree construction (summarization) and two for navigation (branch selection and leaf evaluation).


A.1 Summarization Prompt (Tree Construction)

Used to extract structured metadata from content chunks during tree building.

Instruction: |
  Evaluate content and extract structured metadata.
  Return JSON only in the format specified under 'Provide Answers'.

Context Explanation:
  Strategy: |
    Build a memory tree that condenses long texts into structured
    summaries, enabling guided navigation to segments relevant
    to user queries.

  Memory Tree Details:
    Total Levels: [Root, Branch, Leaf]
    Current Level: {level}  # Leaf, Branch, or Root

Field Explanations:
  About: |
    Should help traverse the memory tree easily. List everything
    mentioned or discussed in the current content - entities,
    topics, people, systems, etc.

  Content Types: |
    Categorize content using the taxonomy. Avoid over-tagging;
    prefer high-level types when multiple apply. Introduce new
    types if none fit.

  Possible Content Types:
    - Meeting notes & minutes
    - Task records & tickets
    - Design documents
    - Decisions & agreements
    # ... [domain-specific categories]

Content: {content_chunk}

Provide Answers:
  Summary: <Concise overview of content>
  Content Types: [<List of applicable types>]
  Critical Actions: <Action items, tasks, TODOs if any>
  Decisions: <Choices made, commitments if any>
  Noteworthy Events: <Important occurrences if any>
  About: [<List of topics, entities, people mentioned>]

Key design decisions:

  • Explicit field explanations reduce ambiguity
  • Taxonomy provided in-prompt ensures consistency
  • "If any" qualifiers prevent hallucinated metadata
  • About field optimized for navigation relevance
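Wiring this template into code amounts to filling placeholders, calling the model, and parsing JSON with tolerant defaults. A sketch where `call_llm` stands in for any chat-completion API, with the prompt abridged from the full template above:

```python
import json

def summarize_chunk(chunk: str, level: str, call_llm) -> dict:
    """Fill an (abridged) summarization template, call the model, parse JSON.

    `call_llm(prompt) -> str` is a stand-in for any chat-completion API;
    optional "if any" fields default to None instead of raising KeyError.
    """
    prompt = (
        "Evaluate content and extract structured metadata. "
        "Return JSON only.\n"
        f"Current Level: {level}\n"
        f"Content: {chunk}\n"
    )
    raw = json.loads(call_llm(prompt))
    fields = ["Summary", "Content Types", "Critical Actions",
              "Decisions", "Noteworthy Events", "About"]
    return {name: raw.get(name) for name in fields}
```

Normalizing missing fields to `None` keeps downstream navigation code simple: every node exposes the same six keys regardless of what the model chose to emit.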


A.2 Navigation Prompt (Branch/Node Selection)

Used at non-leaf nodes to select which child to explore.

Instruction: |
  The user has asked a question related to the project. Evaluate
  the options and select the option index with highest potential
  to answer the user's question.

  Don't worry about options not fully answering the question.
  Consider 'Content Types', 'Critical Actions', 'Noteworthy Events',
  and 'Decisions' to determine each option's potential.

  Your response will be used to navigate further down the tree.
  Return JSON only in format under 'Provide Answers'.

Memory Tree Details:
  Total Levels: [Root, Branch, Leaf]
  Current Level: {level}  # Branch or Leaf

Project Context:
  Project Name: {project_name}
  Project Root Summary: {root_summary}

# Included when navigating from a branch (provides breadcrumb context)
Selected Branch:  # Optional - only present after first navigation
  Summary: {branch_summary}
  Content Types: {branch_content_types}
  Critical Actions: {branch_actions}
  Decisions: {branch_decisions}
  Noteworthy Events: {branch_events}
  About: {branch_about}

Options:
  - Index: 0
    Summary: {option_0_summary}
    Content Types: {option_0_types}
    Critical Actions: {option_0_actions}
    Decisions: {option_0_decisions}
    Noteworthy Events: {option_0_events}
    About: {option_0_about}

  - Index: 1
    Summary: {option_1_summary}
    # ... same fields

  # ... additional options

Question: {user_query}

Provide Answers:
  Selected Option Index: <integer>
  Selection Reason: <brief justification>

Key design decisions:

  • Full metadata per option enables informed comparison
  • "Selection Reason" captures reasoning (useful for debugging/explainability)
  • Branch context (when present) provides navigation breadcrumb
  • Explicit instruction to consider all metadata fields, not just summary


A.3 Answer Prompt (Leaf Evaluation)

Used at leaf nodes to attempt answering from content, with explicit partial/no answer detection.

Instruction: |
  You are a knowledge assistant. Answer the user's question based
  on the provided context.

  Determine the completeness of your answer:
  - Set 'Partial Answer' to true if any part of the question
    remains unanswered due to insufficient context
  - Set 'No Answer' to true only if you cannot provide any
    response based on the context

Answer Requirements: |
  Write a full, conversational answer under 'Answer' field.
  Ensure all parts of the user's question are addressed.

Response Format: JSON under 'Provide Answers'

Context: {leaf_content}

Question: {user_query}

Partial Answer Explained: |
  Set to true if the provided context is not sufficient to
  completely satisfy ALL aspects of the user's question, or
  if required information is missing. If any sub-question
  remains unanswered, set to true.

No Answer Explained: |
  Set to true ONLY if the provided context does not allow
  for any form of answer whatsoever.

Provide Answers:
  Answer: <full conversational response or null>
  No Answer: <boolean>
  Partial Answer: <boolean>

Key design decisions:

  • Explicit definitions for Partial vs No Answer prevent ambiguity
  • "ALL aspects" language catches multi-part questions
  • Conversational answer requirement ensures useful responses
  • Boolean flags enable programmatic retry logic


A.4 Prompt Design Principles

The prompts follow several principles that improve reliability:

Principle            | Implementation
Structured output    | JSON format with explicit schema
Field explanations   | Each field has inline documentation
Taxonomy constraints | Content types from predefined list
Graceful unknowns    | "If any" qualifiers for optional fields
Reasoning capture    | Selection reason for explainability
Explicit thresholds  | Clear definitions for partial/no answer
Context preservation | Branch breadcrumbs during navigation

A.5 Example: Summarization Prompt in Action

Input content chunk:

BMVP-66: Fix minor UI issues in Dashboard view
Status: Selected for Development
Assignee: Beenish Khan

Description: Font color in navbar is all white. Not visible.
Paddings and margins are bigger than in the UI design.
Make sure to follow the margins used in the figma file.
Icons to collapse each section are inverted...

LLM Response:

{
  "Summary": "UI bug fixes for dashboard view including navbar
              visibility, margin adjustments, and icon corrections",
  "Content Types": ["Bug & issue tracking records",
                    "Task lists & tickets"],
  "Critical Actions": "Fix navbar font color, adjust margins per
                       Figma, correct collapse icons, make columns
                       scrollable",
  "Decisions": "Replace history icon with new chat icon",
  "Noteworthy Events": null,
  "About": ["dashboard", "UI", "navbar", "Beenish Khan",
            "BMVP-66", "Figma", "scroll behavior", "margins"]
}

This structured output enables precise navigation when a user later asks "What UI bugs were reported?" - the Content Type Bug & issue tracking records and About field ["UI", "dashboard", ...] directly signal relevance.


References

Chen, H., Pasunuru, R., Weston, J., & Celikyilmaz, A. (2023). Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading. arXiv:2310.05029.