RAGLib — Knowledge Retrieval Library

Provides functions to index, search, and retrieve content from a local document collection. Given any free-text query, RAGLib scores and ranks the most relevant documents, returning them as plain text or structured JSON. Use the results however your applet needs: display them directly, feed them into an AI prompt via AILib, or build search features over your own data. RAGLib handles all the indexing, scoring, and formatting — your program just provides a query and gets relevant content back. 13 functions.

| Category | Count | Functions |
|---|---|---|
| Engine Lifecycle | 3 | rag#, rag_free, rag_rebuild# |
| Core Retrieval | 3 | rag_retrieve$ (text), rag_retrieve_json$ (JSON), rag_retrieve_budget$ (token-limited) |
| Direct Lookup | 3 | rag_doc$ (by ID), rag_functions$ (by function names), rag_tags$ (by tags) |
| Query Analysis | 1 | rag_analyze$ (intent, keywords, category hints as JSON) |
| Information | 3 | rag_count, rag_funccount, rag_summary$ |

How It Works

Your Query → RAGLib (score & rank) → Relevant Docs → AI Prompt → Better Answer

The RAG engine reads markdown documents from a directory tree, builds a search index, and scores documents by keyword matching, identifier overlap, and tag relevance. Results are returned as formatted text ready for display or prompt injection, or as structured JSON with scores and metadata.
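
The scoring described above can be illustrated with a small Python sketch. This is not RAGLib's actual implementation: the `Doc` structure, the weights, and the `score` and `rank` helpers are all hypothetical, chosen only to show how keyword matches, identifier overlap, and tag relevance might combine into a single ranking.

```python
import re
from dataclasses import dataclass, field

# Hypothetical weights; RAGLib's real values are not documented here.
W_KEYWORD, W_IDENT, W_TAG = 1.0, 3.0, 2.0


@dataclass
class Doc:
    doc_id: str
    text: str
    identifiers: set = field(default_factory=set)
    tags: set = field(default_factory=set)


def tokenize(s):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9_$#-]+", s.lower())


def score(doc, query):
    """Combine keyword, identifier, and tag matches into one score."""
    terms = set(tokenize(query))
    keyword_hits = sum(1 for w in tokenize(doc.text) if w in terms)
    ident_hits = len(terms & {i.lower() for i in doc.identifiers})
    tag_hits = len(terms & {t.lower() for t in doc.tags})
    return W_KEYWORD * keyword_hits + W_IDENT * ident_hits + W_TAG * tag_hits


def rank(docs, query, top_k=3):
    """Return (score, doc_id) pairs for matching docs, best first."""
    scored = [(score(d, query), d.doc_id) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    return sorted(scored, key=lambda p: (-p[0], p[1]))[:top_k]


docs = [
    Doc("returns", "Items can be returned within 30 days.", tags={"returns", "policy"}),
    Doc("shipping", "Standard shipping takes five days.", tags={"shipping"}),
    Doc("billing", "Invoices are sent monthly.", tags={"billing"}),
]
print(rank(docs, "shipping policy for returns"))
```

With these made-up weights, the "returns" document ranks first because two of its tags match the query, while "billing" scores zero and is dropped.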

ⓘ Works with any document collection. RAGLib is domain-agnostic — point it at a folder of markdown files and it becomes a search engine for that content. Combine it with AILib to build AI assistants grounded in your own data: product manuals, FAQs, recipes, travel guides, personal notes, or any structured text.

Knowledge Base Structure

A knowledge base is simply a directory tree of markdown files. The structure and subject matter can be anything:

knowledge/
  ├─ products/
  │   ├─ modelx-overview.md
  │   ├─ modelx-setup.md
  │   └─ troubleshooting.md
  ├─ faq/
  │   ├─ billing.md
  │   └─ shipping.md
  ├─ guides/
  │   └─ getting-started.md
  └─ policies/
      └─ returns.md

Each document contains free-text content and can include tags and categories. Technical documents may also list identifiers (such as function names) that the engine indexes for precise lookup. The broader the collection, the richer the search results.

Engine Lifecycle

| Function | Signature | Description |
|---|---|---|
| rag#(path$) | rag#@$ | Create RAG engine from a knowledge base directory |
| rag_free(eng#) | rag_free@# | Free RAG engine and release resources |
| rag_rebuild#(eng#) | rag_rebuild#@# | Rebuild the search index from source documents (saved to disk) |

╯ lifecycle
let eng# = rag#("knowledge/")
println "Documents: " + str$(rag_count(eng#))
println "Functions: " + str$(rag_funccount(eng#))

' Rebuild after adding/changing docs
rag_rebuild#(eng#)
println "Index rebuilt!"

rag_free(eng#)
ⓘ Index persistence. The index is saved to disk as JSON after rag_rebuild#. On subsequent runs, rag# loads the cached index instantly. Only rebuild when documents change.
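
Since a rebuild is only needed when documents change, one way to decide is to compare file modification times. The Python sketch below is an illustration only: the cache file name `index.json` and its location are assumptions (the note above does not say where RAGLib stores its index), and `needs_rebuild` is a hypothetical helper, not a RAGLib function.

```python
import os
import tempfile
import time


def needs_rebuild(kb_dir, index_path):
    """Return True if the index is missing or older than any .md file."""
    if not os.path.exists(index_path):
        return True
    index_mtime = os.path.getmtime(index_path)
    for root, _dirs, files in os.walk(kb_dir):
        for name in files:
            if name.endswith(".md"):
                if os.path.getmtime(os.path.join(root, name)) > index_mtime:
                    return True
    return False


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as kb:
    doc = os.path.join(kb, "billing.md")
    index = os.path.join(kb, "index.json")  # hypothetical cache location
    with open(doc, "w") as f:
        f.write("# Billing\n")
    missing = needs_rebuild(kb, index)       # no index file yet

    with open(index, "w") as f:
        f.write("{}")
    now = time.time()
    os.utime(doc, (now - 60, now - 60))      # document older than the index
    os.utime(index, (now, now))
    fresh = needs_rebuild(kb, index)

    os.utime(doc, (now + 60, now + 60))      # document edited after indexing
    stale = needs_rebuild(kb, index)

print(missing, fresh, stale)
```

In an applet, the equivalent decision would simply gate the call to rag_rebuild#.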

Core Retrieval

Three ways to search the knowledge base, differing only in output format and budget control. All three accept the same free-text query:

| Function | Signature | Description |
|---|---|---|
| rag_retrieve$(eng#, query$) | rag_retrieve$@#$ | Search → formatted text (ready for display or prompt injection) |
| rag_retrieve_json$(eng#, query$) | rag_retrieve_json$@#$ | Search → JSON array with scores, categories, token counts |
| rag_retrieve_budget$(eng#, q$, maxTok) | rag_retrieve_budget$@#$n | Search with token budget; stops when budget is exhausted |

╯ retrieval.bas
let eng# = rag#("knowledge/")

' Simple text retrieval — returns formatted content for display
let docs$ = rag_retrieve$(eng#, "how do I return a product?")
println docs$

' JSON retrieval — returns scores and metadata for each match
let json$ = rag_retrieve_json$(eng#, "delivery times")
let results# = json_parse#(json$)
for i = 0 to json_len(results#) - 1
    let doc# = json_item#(results#, i)
    println json_gets$(doc#, "title") + " (score: " + str$(json_getn(doc#, "score")) + ")"
next

' Budget-controlled retrieval — limits result size to fit AI context windows
let docs$ = rag_retrieve_budget$(eng#, "setup and installation", 2000)
println docs$
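
The "stops when budget is exhausted" behaviour can be pictured as a greedy loop over ranked results. The following Python sketch shows that general technique under stated assumptions and is not RAGLib's code: the token estimate (roughly four characters per token) and the `pack_within_budget` helper are hypothetical.

```python
def estimate_tokens(text):
    """Rough token estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


def pack_within_budget(ranked_docs, max_tokens):
    """Take documents in rank order until the next one would exceed the budget."""
    chosen, used = [], 0
    for title, text in ranked_docs:
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            break  # budget exhausted: keep the earlier, higher-ranked docs
        chosen.append(title)
        used += cost
    return chosen, used


ranked = [
    ("returns", "x" * 400),    # about 100 tokens
    ("shipping", "x" * 800),   # about 200 tokens
    ("billing", "x" * 400),    # about 100 tokens
]
print(pack_within_budget(ranked, 250))
```

This sketch stops at the first document that does not fit, which matches the wording in the table above. A variant could skip an oversized document and keep trying smaller ones, at the cost of including lower-ranked content.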

Direct Lookup

Use these functions when you know exactly what you are looking for: a document ID, a specific technical identifier, or a tag.

| Function | Signature | Description |
|---|---|---|
| rag_doc$(eng#, docId$) | rag_doc$@#$ | Get full content of a document by its ID |
| rag_functions$(eng#, names$) | rag_functions$@#$ | Find docs mentioning specific identifiers or function names (comma or space separated) |
| rag_tags$(eng#, tags$) | rag_tags$@#$ | Find docs matching specific tags (comma separated) |

╯ direct-lookup.bas
' Get a specific document by its ID
println rag_doc$(eng#, "troubleshooting")

' Find docs mentioning specific identifiers
println rag_functions$(eng#, "order-number, tracking-code")

' Find docs by tag
println rag_tags$(eng#, "shipping,returns,policy")

Query Analysis

| Function | Signature | Description |
|---|---|---|
| rag_analyze$(eng#, query$) | rag_analyze$@#$ | Analyze query → JSON with intent, keywords, function names, library hints |

Returns a JSON object describing how the engine interprets the query — useful for debugging retrieval, building routing logic, or displaying search intent to the user.

╯ analysis.bas
let a$ = rag_analyze$(eng#, "how do I track my order after it ships?")
println a$

' Example output:
' {"query":"how do I track my order after it ships?",
'  "intent":"information_request",
'  "is_followup":false,
'  "keywords":["track","order","ships","shipping"],
'  "function_names":[],
'  "library_hints":["shipping","faq"]}
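
As an illustration of the routing use case mentioned above, the following Python sketch parses the sample output and picks a handler from its `library_hints`. The `ROUTES` table, the handler names, and the `route` helper are hypothetical. Only the JSON shape is taken from the example.

```python
import json

# Example analysis output, copied from the sample above.
analysis_json = """
{"query":"how do I track my order after it ships?",
 "intent":"information_request",
 "is_followup":false,
 "keywords":["track","order","ships","shipping"],
 "function_names":[],
 "library_hints":["shipping","faq"]}
"""

# Hypothetical mapping from category hints to handlers in an applet.
ROUTES = {"shipping": "shipping_desk", "billing": "billing_desk"}


def route(analysis, default="general_search"):
    """Pick the first route whose category appears in library_hints."""
    for hint in analysis.get("library_hints", []):
        if hint in ROUTES:
            return ROUTES[hint]
    return default


analysis = json.loads(analysis_json)
print(route(analysis))
```

Queries whose hints match no known category fall back to the default route, so an unexpected hint never leaves the applet without a handler.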

Information

| Function | Signature | Description |
|---|---|---|
| rag_count(eng#) | rag_count@# | Number of documents in the index |
| rag_funccount(eng#) | rag_funccount@# | Number of function signatures indexed |
| rag_summary$(eng#) | rag_summary$@# | Human-readable index summary (counts, categories, stats) |

Complete Examples

AI-Powered Q&A Assistant

╯ rag-codegen.bas
println "=== Product Support Assistant ==="

let ai# = ai_client#("anthropic", "sk-ant-xxxxx")
ai_model#(ai#, "claude-sonnet-4-20250514")
ai_maxtokens#(ai#, 512)

let eng# = rag#("knowledge/")

' Retrieve content relevant to the user question
let question$ = "How long does standard shipping take?"
let docs$ = rag_retrieve_budget$(eng#, question$, 2000)

' Build system prompt: instruct AI to answer from the retrieved content
let sys$ = "You are a helpful customer support assistant." + chr$(10)
sys$ = sys$ + "Answer the user's question using only the information below." + chr$(10)
sys$ = sys$ + "If the answer is not in the content, say so clearly." + chr$(10)
sys$ = sys$ + chr$(10) + docs$

' Ask the AI
let answer$ = ai_completesystem$(ai#, sys$, question$)
if ai_ok(ai#) = 1 then
    println answer$
else
    println "Error: " + ai_errormsg$()
end if

rag_free(eng#)
ai_free(ai#)
ⓘ The core RAG pattern: retrieve relevant content → add it to the system prompt → call the AI. Because the AI is instructed to answer only from your documents, responses tend to be more accurate and domain-specific than those that rely on model training alone.

Query Analysis Inspector

╯ analyze.bas
println "=== Query Analysis Inspector ==="

let eng# = rag#("knowledge/")

let a$ = rag_analyze$(eng#, "how do I return a damaged item?")
let a# = json_parse#(a$)
println "Intent: " + json_gets$(a#, "intent")

let kw# = json_get#(a#, "keywords")
let s$ = ""
for i = 0 to json_len(kw#) - 1
    if i > 0 then s$ = s$ + ", "
    s$ = s$ + json_items$(kw#, i)
next
println "Keywords: " + s$

let hints# = json_get#(a#, "library_hints")
let h$ = ""
for i = 0 to json_len(hints#) - 1
    if i > 0 then h$ = h$ + ", "
    h$ = h$ + json_items$(hints#, i)
next
println "Categories: " + h$

rag_free(eng#)

Best Practices

| Practice | Why |
|---|---|
| Call rag_rebuild# after adding/modifying docs | New documents aren’t searchable until the index is rebuilt |
| Use rag_retrieve_budget$ when passing results to AI | Prevents exceeding the AI model’s context window |
| Budget of 2000–4000 tokens is usually enough | Provides good context for most Q&A and information tasks |
| Use rag_retrieve_json$ for custom scoring logic | Gives you scores, token counts, and match reasons |
| Use rag_functions$ when you know exact identifiers | More precise than keyword search for named entities, SKUs, or API names |
| Organize docs into thematic categories | Improves search relevance and tag-based filtering |

Quick Reference — All 13 Functions

| Function | Signature | Description |
|---|---|---|
| ENGINE LIFECYCLE (3) | | |
| rag#(path$) | rag#@$ | Create RAG engine |
| rag_free(eng#) | rag_free@# | Free engine |
| rag_rebuild#(eng#) | rag_rebuild#@# | Rebuild index |
| CORE RETRIEVAL (3) | | |
| rag_retrieve$(eng#, q$) | rag_retrieve$@#$ | Search (formatted text) |
| rag_retrieve_json$(eng#, q$) | rag_retrieve_json$@#$ | Search (JSON + scores) |
| rag_retrieve_budget$(eng#, q$, tok) | rag_retrieve_budget$@#$n | Search (token-limited) |
| DIRECT LOOKUP (3) | | |
| rag_doc$(eng#, id$) | rag_doc$@#$ | Get document by ID |
| rag_functions$(eng#, names$) | rag_functions$@#$ | Find by function names |
| rag_tags$(eng#, tags$) | rag_tags$@#$ | Find by tags |
| QUERY ANALYSIS (1) | | |
| rag_analyze$(eng#, q$) | rag_analyze$@#$ | Analyze intent (JSON) |
| INFORMATION (3) | | |
| rag_count(eng#) | rag_count@# | Document count |
| rag_funccount(eng#) | rag_funccount@# | Function count |
| rag_summary$(eng#) | rag_summary$@# | Index summary |

13 functions. Pairs with AILib to build knowledge-grounded AI features in any Plan9Basic applet.

See Also

  • AILib — AI client for making API calls; use together with RAGLib for context-enriched completions
  • JsonLib — Parse JSON returned by rag_retrieve_json$ and rag_analyze$