RAGLib — Plan9Basic Documentation

RAGLib — Knowledge Retrieval Library

RAGLib provides 13 functions to index, search, and retrieve content from a local document collection. Given any free-text query, it scores and ranks the most relevant documents and returns them as plain text or structured JSON. Use the results however your applet needs: display them directly, feed them into an AI prompt via AILib, or build search features over your own data. RAGLib handles the indexing, scoring, and formatting; your program supplies a query and gets relevant content back.

Category	Count	Functions
Engine Lifecycle	3	rag#, rag_free, rag_rebuild#
Core Retrieval	3	rag_retrieve$ (text), rag_retrieve_json$ (JSON), rag_retrieve_budget$ (token-limited)
Direct Lookup	3	rag_doc$ (by ID), rag_functions$ (by function names), rag_tags$ (by tags)
Query Analysis	1	rag_analyze$ (intent, keywords, category hints as JSON)
Information	3	rag_count, rag_funccount, rag_summary$

How It Works

Your Query → RAGLib (score & rank) → Relevant Docs → AI Prompt → Better Answer

The RAG engine reads markdown documents from a directory tree, builds a search index, and scores documents by keyword matching, identifier overlap, and tag relevance. Results are returned as formatted text ready for display or prompt injection, or as structured JSON with scores and metadata.
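
To make the scoring description above concrete, here is a minimal Python sketch that combines keyword matches, identifier overlap, and tag relevance into one score. It is an illustration only: this page does not document RAGLib's actual weights or tokenization, so the `Doc` record, the `score` and `rank` functions, and the weights 1.0 / 3.0 / 2.0 are all assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical in-memory document record (not RAGLib's real index format)."""
    doc_id: str
    text: str
    tags: set = field(default_factory=set)
    identifiers: set = field(default_factory=set)


def tokenize(s):
    """Lowercase and split into word-like tokens, keeping _ and - inside names."""
    return re.findall(r"[a-z0-9][a-z0-9_\-]*", s.lower())


def score(doc, query, w_keyword=1.0, w_ident=3.0, w_tag=2.0):
    """Weighted sum of three signals; the weights are assumed values."""
    q = set(tokenize(query))
    keyword_hits = len(q & set(tokenize(doc.text)))
    ident_hits = len(q & {i.lower() for i in doc.identifiers})
    tag_hits = len(q & {t.lower() for t in doc.tags})
    return w_keyword * keyword_hits + w_ident * ident_hits + w_tag * tag_hits


def rank(docs, query):
    """Return (score, doc_id) pairs for matching docs, best first."""
    scored = [(score(d, query), d.doc_id) for d in docs]
    return sorted([p for p in scored if p[0] > 0], key=lambda p: (-p[0], p[1]))


docs = [
    Doc("returns", "How to return a product for a refund.", {"returns", "policy"}),
    Doc("shipping", "Standard shipping takes several days.", {"shipping"}),
]
print(rank(docs, "return policy for a product"))
```

With these assumed weights the sketch prints `[(6.0, 'returns')]`: four keyword hits plus one tag hit, while the shipping document scores zero and is dropped. The sketch counts common words such as "a" and "for" as keyword hits; a real engine would likely filter them out.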

ⓘ Works with any document collection. RAGLib is domain-agnostic — point it at a folder of markdown files and it becomes a search engine for that content. Combine it with AILib to build AI assistants grounded in your own data: product manuals, FAQs, recipes, travel guides, personal notes, or any structured text.

Knowledge Base Structure

A knowledge base is simply a directory tree of markdown files. The structure and subject matter can be anything:

knowledge/
  ├─ products/
  │   ├─ modelx-overview.md
  │   ├─ modelx-setup.md
  │   └─ troubleshooting.md
  ├─ faq/
  │   ├─ billing.md
  │   └─ shipping.md
  ├─ guides/
  │   └─ getting-started.md
  └─ policies/
      └─ returns.md

Each document contains free-text content and can include tags and categories. Technical documents may also list identifiers (such as function names) that the engine indexes for precise lookup. The broader the collection, the richer the search results.

Engine Lifecycle

Function	Signature	Description
`rag#(path$)`	`rag#@$`	Create RAG engine from a knowledge base directory
`rag_free(eng#)`	`rag_free@#`	Free RAG engine and release resources
`rag_rebuild#(eng#)`	`rag_rebuild#@#`	Rebuild the search index from source documents (saved to disk)

lifecycle.bas
let eng# = rag#("knowledge/")
println "Documents: " + str$(rag_count(eng#))
println "Functions: " + str$(rag_funccount(eng#))

' Rebuild after adding/changing docs
rag_rebuild#(eng#)
println "Index rebuilt!"

rag_free(eng#)

ⓘ Index persistence. The index is saved to disk as JSON after rag_rebuild#. On subsequent runs, rag# loads the cached index instantly. Only rebuild when documents change.
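
The load-or-rebuild behavior described in the note can be sketched in Python as follows. This is a conceptual illustration, not RAGLib's implementation: the index layout (relative path mapped to word count), the cache location, and the function names `build_index` and `load_or_build` are assumptions.

```python
import json
import os
import tempfile


def build_index(root):
    """Walk the tree and record each markdown file's relative path and word count."""
    index = {}
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.endswith(".md"):
                path = os.path.join(folder, name)
                with open(path, encoding="utf-8") as f:
                    words = len(f.read().split())
                index[os.path.relpath(path, root).replace(os.sep, "/")] = words
    return index


def load_or_build(root, cache_path, rebuild=False):
    """Reuse the cached JSON index unless it is missing or a rebuild is requested."""
    if not rebuild and os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f), "cache"
    index = build_index(root)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index, "built"


# Demo with a throwaway knowledge base containing one document.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "knowledge")
    os.makedirs(os.path.join(root, "faq"))
    with open(os.path.join(root, "faq", "billing.md"), "w", encoding="utf-8") as f:
        f.write("Invoices are sent monthly.")
    cache = os.path.join(tmp, "index.json")
    first = load_or_build(root, cache)   # no cache yet, so the index is built
    second = load_or_build(root, cache)  # cache exists, so it is loaded
    print(first, second)
```

The first call builds and saves the index; the second reads the saved JSON without touching the documents. That is why a cached index goes stale until a rebuild is requested after documents change.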

Core Retrieval

Three ways to search the knowledge base, differing only in output format and budget control. All three accept the same free-text query:

Function	Signature	Description
`rag_retrieve$(eng#, query$)`	`rag_retrieve$@#$`	Search → formatted text (ready for display or prompt injection)
`rag_retrieve_json$(eng#, query$)`	`rag_retrieve_json$@#$`	Search → JSON array with scores, categories, token counts
`rag_retrieve_budget$(eng#, q$, maxTok)`	`rag_retrieve_budget$@#$n`	Search with token budget — stops when budget is exhausted

retrieval.bas
let eng# = rag#("knowledge/")

' Simple text retrieval — returns formatted content for display
let docs$ = rag_retrieve$(eng#, "how do I return a product?")
println docs$

' JSON retrieval — returns scores and metadata for each match
let json$ = rag_retrieve_json$(eng#, "delivery times")
let results# = json_parse#(json$)
for i = 0 to json_len(results#) - 1
    let doc# = json_item#(results#, i)
    println json_gets$(doc#, "title") + " (score: " + str$(json_getn(doc#, "score")) + ")"
next

' Budget-controlled retrieval — limits result size to fit AI context windows
let docs$ = rag_retrieve_budget$(eng#, "setup and installation", 2000)
println docs$
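
The idea behind budget-controlled retrieval can be sketched in Python as a greedy loop over ranked documents. This is a hedged illustration: the 4-characters-per-token estimate and the stop-at-first-overflow rule are assumptions, and RAGLib's own token counter may behave differently.

```python
def estimate_tokens(text):
    """Crude estimate: about 4 characters per token (an assumption, not RAGLib's counter)."""
    return max(1, len(text) // 4)


def take_within_budget(ranked_docs, max_tokens):
    """Add documents in rank order and stop at the first one that would exceed the budget."""
    chosen, used = [], 0
    for title, text in ranked_docs:
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            break
        chosen.append(title)
        used += cost
    return chosen, used


# Three ranked documents costing roughly 100, 200, and 500 tokens.
ranked = [("setup", "x" * 400), ("install", "y" * 800), ("faq", "z" * 2000)]
print(take_within_budget(ranked, 350))
```

With a budget of 350 the sketch prints `(['setup', 'install'], 300)`: the third document would push the total to 800, so selection stops before it.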

Direct Lookup

When you know exactly what you’re looking for — by document ID, specific technical identifier, or tag:

Function	Signature	Description
`rag_doc$(eng#, docId$)`	`rag_doc$@#$`	Get full content of a document by its ID
`rag_functions$(eng#, names$)`	`rag_functions$@#$`	Find docs mentioning specific identifiers or function names (comma or space separated)
`rag_tags$(eng#, tags$)`	`rag_tags$@#$`	Find docs matching specific tags (comma separated)

direct-lookup.bas
' Get a specific document by its ID
println rag_doc$(eng#, "troubleshooting")

' Find docs mentioning specific identifiers
println rag_functions$(eng#, "order-number, tracking-code")

' Find docs by tag
println rag_tags$(eng#, "shipping,returns,policy")
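
As a rough picture of what a tag lookup involves, the Python sketch below splits a comma- or space-separated argument and returns documents that carry at least one requested tag. Whether RAGLib matches any tag or requires all of them is not stated here, so the "any" rule, the `doc_tags` table, and both function names are assumptions.

```python
import re


def split_names(names):
    """Split a comma- or space-separated string into clean, lowercase names."""
    return [n.lower() for n in re.split(r"[,\s]+", names.strip()) if n]


def find_by_tags(doc_tags, tags):
    """Return IDs of documents that carry at least one of the requested tags."""
    wanted = set(split_names(tags))
    return sorted(doc_id for doc_id, have in doc_tags.items() if wanted & have)


# Made-up tag table for illustration only.
doc_tags = {
    "returns": {"returns", "policy"},
    "shipping": {"shipping"},
    "billing": {"billing"},
}
print(find_by_tags(doc_tags, "shipping,returns,policy"))
```

The sketch prints `['returns', 'shipping']`; the billing document shares no tag with the request and is left out.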

Query Analysis

Function	Signature	Description
`rag_analyze$(eng#, query$)`	`rag_analyze$@#$`	Analyze query → JSON with intent, keywords, function names, library hints

Returns a JSON object describing how the engine interprets the query — useful for debugging retrieval, building routing logic, or displaying search intent to the user.

analysis.bas
let a$ = rag_analyze$(eng#, "how do I track my order after it ships?")
println a$

' Example output:
' {"query":"how do I track my order after it ships?",
'  "intent":"information_request",
'  "is_followup":false,
'  "keywords":["track","order","ships","shipping"],
'  "function_names":[],
'  "library_hints":["shipping","faq"]}
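
One plausible way to derive the `keywords` field is stopword filtering, sketched below in Python. The stopword list and function names are assumptions. The sketch does not reproduce the related-term expansion visible in the example output (where "shipping" appears alongside "ships"), nor intent detection.

```python
import json
import re

# Small illustrative stopword list; a real engine would use a larger one.
STOPWORDS = {"how", "do", "i", "my", "after", "it", "the", "a", "an", "to", "is", "of"}


def extract_keywords(query):
    """Lowercase the query, split into words, and drop stopwords, keeping order."""
    seen, keywords = set(), []
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        if word not in STOPWORDS and word not in seen:
            seen.add(word)
            keywords.append(word)
    return keywords


def analyze(query):
    """Build a JSON string with a subset of the fields shown in the example output."""
    return json.dumps({"query": query, "keywords": extract_keywords(query)})


print(analyze("how do I track my order after it ships?"))
```

For the query above, the extracted keywords are `track`, `order`, and `ships`.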

Information

Function	Signature	Description
`rag_count(eng#)`	`rag_count@#`	Number of documents in the index
`rag_funccount(eng#)`	`rag_funccount@#`	Number of function signatures indexed
`rag_summary$(eng#)`	`rag_summary$@#`	Human-readable index summary (counts, categories, stats)

Complete Examples

AI-Powered Q&A Assistant

rag-codegen.bas
println "=== Product Support Assistant ==="

' Create the AI client (replace the placeholder with your own API key)
let ai# = ai_client#("anthropic", "sk-ant-xxxxx")
ai_model#(ai#, "claude-sonnet-4-20250514")
ai_maxtokens#(ai#, 512)

let eng# = rag#("knowledge/")

' Retrieve content relevant to the user question
let question$ = "How long does standard shipping take?"
let docs$ = rag_retrieve_budget$(eng#, question$, 2000)

' Build system prompt: instruct AI to answer from the retrieved content
let sys$ = "You are a helpful customer support assistant." + chr$(10)
sys$ = sys$ + "Answer the user's question using only the information below." + chr$(10)
sys$ = sys$ + "If the answer is not in the content, say so clearly." + chr$(10)
sys$ = sys$ + chr$(10) + docs$

' Ask the AI
let answer$ = ai_completesystem$(ai#, sys$, question$)
if ai_ok(ai#) = 1 then
    println answer$
else
    println "Error: " + ai_errormsg$()
end if

rag_free(eng#)
ai_free(ai#)

ⓘ The core RAG pattern: retrieve relevant content → add it to the system prompt → call AI. The AI is instructed to answer only from your documents, which tends to make responses more accurate and domain-specific than relying on the model's training alone.

Query Analysis Inspector

analyze.bas
println "=== Query Analysis Inspector ==="

let eng# = rag#("knowledge/")

let a$ = rag_analyze$(eng#, "how do I return a damaged item?")
let a# = json_parse#(a$)
println "Intent: " + json_gets$(a#, "intent")

let kw# = json_get#(a#, "keywords")
let s$ = ""
for i = 0 to json_len(kw#) - 1
    if i > 0 then s$ = s$ + ", "
    s$ = s$ + json_items$(kw#, i)
next
println "Keywords: " + s$

let hints# = json_get#(a#, "library_hints")
let h$ = ""
for i = 0 to json_len(hints#) - 1
    if i > 0 then h$ = h$ + ", "
    h$ = h$ + json_items$(hints#, i)
next
println "Categories: " + h$

rag_free(eng#)

Best Practices

Practice	Why
Call `rag_rebuild#` after adding/modifying docs	New documents aren’t searchable until the index is rebuilt
Use `rag_retrieve_budget$` when passing results to AI	Prevents exceeding the AI model’s context window
Budget of 2000–4000 tokens is usually enough	Provides good context for most Q&A and information tasks
Use `rag_retrieve_json$` for custom scoring logic	Gives you scores, token counts, and match reasons
Use `rag_functions$` when you know exact identifiers	More precise than keyword search for named entities, SKUs, or API names
Organize docs into thematic categories	Improves search relevance and tag-based filtering

Quick Reference — All 13 Functions

Function	Signature	Description
ENGINE LIFECYCLE (3)
`rag#(path$)`	`rag#@$`	Create RAG engine
`rag_free(eng#)`	`rag_free@#`	Free engine
`rag_rebuild#(eng#)`	`rag_rebuild#@#`	Rebuild index
CORE RETRIEVAL (3)
`rag_retrieve$(eng#, q$)`	`rag_retrieve$@#$`	Search (formatted text)
`rag_retrieve_json$(eng#, q$)`	`rag_retrieve_json$@#$`	Search (JSON + scores)
`rag_retrieve_budget$(eng#, q$, tok)`	`rag_retrieve_budget$@#$n`	Search (token-limited)
DIRECT LOOKUP (3)
`rag_doc$(eng#, id$)`	`rag_doc$@#$`	Get document by ID
`rag_functions$(eng#, names$)`	`rag_functions$@#$`	Find by function names
`rag_tags$(eng#, tags$)`	`rag_tags$@#$`	Find by tags
QUERY ANALYSIS (1)
`rag_analyze$(eng#, q$)`	`rag_analyze$@#$`	Analyze intent (JSON)
INFORMATION (3)
`rag_count(eng#)`	`rag_count@#`	Document count
`rag_funccount(eng#)`	`rag_funccount@#`	Function count
`rag_summary$(eng#)`	`rag_summary$@#`	Index summary

13 functions. Pairs with AILib to build knowledge-grounded AI features in any Plan9Basic applet.