AI knowledge bases over your private documents.

Retrieval-augmented generation (RAG) over your business document library. Staff query in plain language and get cited answers in seconds. Built on pgvector or Pinecone, with Anthropic Claude or OpenAI GPT for the answer layer.

01 What it does

What a private knowledge base does.

A private knowledge base lets staff query, in plain English, everything the business has ever written and get cited answers in seconds. Tax firms query IRS publications. Insurance agencies query policy documents. Clinics query treatment protocols. For any business with a document library, it is the single highest-leverage use of AI.

Most small businesses have a document library that nobody can search. PDFs in folders. Procedures in Google Docs. Policies in Word. Past memos in email. The information exists; finding it is the bottleneck. A private knowledge base indexes all of that content into a searchable vector database, and an LLM writes plain-language answers with citations to the source documents.

The interface is a search box that returns answers, not documents. The staff member asks "what is the procedure for handling a return after 90 days?" and gets a paragraph answer that cites your returns policy memo from 2024. Every answer carries a citation, so the staff member can verify before acting on it.

02 How it works

The architecture, end to end.

Source documents are ingested through a chunking pipeline. Each chunk is converted to a high-dimensional vector by an embedding model. Vectors are stored in a vector database with the source citation attached. When a user submits a query, the query is embedded with the same model and the closest matching chunks are retrieved (typically the top 5 to 20). Those chunks, the query, and a system prompt are sent to an LLM that writes the answer with citations.
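The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production pipeline: `embed` is a toy word-count stand-in for a real embedding model, the in-memory list stands in for pgvector or Pinecone, and the names `Chunk`, `retrieve`, and `build_prompt` are hypothetical. The LLM call itself is omitted; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # citation stored alongside the vector at ingest time


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a sparse vector of word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, index: list, k: int = 5) -> list:
    # Embed the query the same way as the chunks, then keep the k closest.
    query_vec = embed(query)
    ranked = sorted(index, key=lambda row: cosine(query_vec, row[0]), reverse=True)
    return [chunk for _, chunk in ranked[:k]]


def build_prompt(query: str, chunks: list) -> str:
    # Number each retrieved chunk so the answer can cite it.
    sources = "\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the numbered sources and cite them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


# Hypothetical two-chunk corpus.
documents = [
    Chunk("Returns after 90 days require manager approval.", "returns-policy.pdf"),
    Chunk("Invoices are issued on the first business day of each month.", "billing-memo.docx"),
]
index = [(embed(c.text), c) for c in documents]

question = "What is the procedure for handling a return after 90 days?"
top_chunks = retrieve(question, index, k=1)
prompt = build_prompt(question, top_chunks)
```

Keeping the source name next to each chunk is what makes the citation possible: the model only ever sees text that already carries its origin.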

The chunking pipeline is where most of the engineering lives. Chunk too big and retrieval becomes coarse; chunk too small and context is lost. We tune chunk size and overlap per document type. PDFs with tables need different chunking than memos. Treatment protocols need different chunking than IRS publications.
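To make the size and overlap trade-off concrete, here is a minimal sliding-window chunker. It is a sketch under simplifying assumptions: it splits on whitespace rather than on sentences, headings, or table boundaries, and the function name `chunk_words` and the values in `CHUNK_PROFILES` are illustrative placeholders, not tuned settings.

```python
def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be between 0 and size - 1")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical per-document-type settings: (size, overlap) in words.
CHUNK_PROFILES = {"memo": (200, 20), "protocol": (400, 80)}

sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_words(sample, size=4, overlap=1)
```

With ten words, a window of four, and an overlap of one, each chunk repeats the last word of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.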

Sources: Lewis et al. RAG paper, Anthropic embeddings, pgvector, Pinecone learning center.

03 Stack

What we build with.

The default knowledge base stack we ship.

Ingest

document layer
PyPDF, unstructured.io, docling

Embeddings

vector layer
OpenAI text-embedding-3-large, Voyage, BGE (local)

Vector DB

storage
pgvector, Pinecone, Qdrant

Answer LLM

generation
Claude Opus, Claude Sonnet, GPT-4o

05 Pricing and timeline

What a knowledge base costs.

A production knowledge base ships in 3 to 5 weeks at the $2,497 Business OS tier. Larger document corpora (10,000+ documents) are quoted from the $2,997 Enterprise tier.

The cost of a knowledge base scales with the size of the document corpus and the desired retrieval quality. Embedding is a one-time cost for each indexed document. Storage runs $10 to $100 per month. Queries are billed individually, typically at under one cent each.
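As a rough worked example of the running costs, the arithmetic is storage plus query volume times per-query cost. The inputs below are assumptions chosen for illustration: $50 sits inside the stated $10 to $100 storage range, $0.01 is used as an upper bound on per-query cost, and the volume of 2,000 queries per month is hypothetical.

```python
def monthly_cost(storage_usd: float, queries_per_month: int, cost_per_query_usd: float) -> float:
    """Recurring monthly cost; the one-time embedding cost is excluded."""
    return storage_usd + queries_per_month * cost_per_query_usd


# Illustrative inputs only; substitute figures from an actual quote.
estimate = monthly_cost(storage_usd=50.0, queries_per_month=2000, cost_per_query_usd=0.01)
print(f"Estimated monthly running cost: ${estimate:.2f}")
```

Under these assumptions the running cost comes to about $70 per month, with storage rather than queries as the larger share.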

06 FAQ

Knowledge base FAQ.

Will the knowledge base see PII or PHI?

It depends on the corpus. For corpora that contain PII or PHI, we deploy on BAA-compliant infrastructure and use private inference. We do not send PII to public LLM APIs without consent and a Data Processing Agreement.

How fresh are the answers?

The knowledge base reflects whatever has been indexed. We can re-index on a schedule (nightly, weekly) or whenever a document changes.
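One common way to re-index only on change is to store a content hash per document and compare it on each run. The sketch below assumes documents are available as bytes keyed by an ID; the function names `fingerprint` and `plan_reindex` and the sample file names are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    # Stable hash of the document bytes; changes whenever the content changes.
    return hashlib.sha256(content).hexdigest()


def plan_reindex(current: dict[str, bytes], indexed: dict[str, str]) -> tuple[list[str], list[str]]:
    """Compare current documents with stored hashes.

    Returns (to_embed, to_delete): IDs that are new or changed, and IDs
    that were indexed earlier but no longer exist in the library.
    """
    to_embed = sorted(
        doc_id for doc_id, content in current.items()
        if indexed.get(doc_id) != fingerprint(content)
    )
    to_delete = sorted(doc_id for doc_id in indexed if doc_id not in current)
    return to_embed, to_delete


# Hashes recorded at the last indexing run (hypothetical documents).
indexed_hashes = {
    "returns.pdf": fingerprint(b"Returns accepted within 30 days."),
    "billing.docx": fingerprint(b"Invoices go out monthly."),
    "old-memo.txt": fingerprint(b"Superseded memo."),
}
# The library as it looks now: one edit, one unchanged, one new, one removed.
library = {
    "returns.pdf": b"Returns accepted within 90 days.",
    "billing.docx": b"Invoices go out monthly.",
    "onboarding.md": b"Welcome to the team.",
}
to_embed, to_delete = plan_reindex(library, indexed_hashes)
```

Unchanged documents are skipped, so a nightly run pays embedding cost only for what actually moved.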

Can I delete a document and have it removed from answers?

Yes. Documents are individually removable from the vector database. The next query will not return content from removed documents.
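Removal works cleanly when every stored chunk carries the ID of the document it came from. The following is a minimal in-memory sketch of that idea, not the actual pgvector or Pinecone calls: `ChunkStore` is a hypothetical class, and its keyword `search` stands in for vector similarity search.

```python
class ChunkStore:
    """In-memory stand-in for a vector database keyed by source document."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, str]] = []  # (doc_id, chunk_text)

    def add(self, doc_id: str, chunks: list[str]) -> None:
        self._rows.extend((doc_id, chunk) for chunk in chunks)

    def delete_document(self, doc_id: str) -> int:
        # Drop every chunk that came from this document; return how many.
        before = len(self._rows)
        self._rows = [row for row in self._rows if row[0] != doc_id]
        return before - len(self._rows)

    def search(self, term: str) -> list[tuple[str, str]]:
        # Keyword match used here in place of vector similarity.
        return [row for row in self._rows if term.lower() in row[1].lower()]


store = ChunkStore()
store.add("returns-2023", ["Returns accepted within 30 days.", "Refunds go to the original card."])
store.add("returns-2024", ["Returns accepted within 90 days."])

removed = store.delete_document("returns-2023")
hits = store.search("returns")
```

After the delete, a search for "returns" can only surface the 2024 policy, because no chunk from the removed document remains in the store.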

What about access control?

Documents can be tagged with access groups. Queries are filtered to the requesting user's permissions before retrieval.
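Filtering before retrieval means restricted chunks never enter the candidate set, so they cannot reach the LLM prompt. A minimal sketch of that pre-filter follows; the `TaggedChunk` type, the `allowed_chunks` function, and the group names are hypothetical, and a real deployment would typically push this filter into the vector database query itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedChunk:
    text: str
    source: str
    groups: frozenset  # access groups allowed to see this chunk


def allowed_chunks(chunks: list, user_groups: set) -> list:
    # Keep a chunk only if the user belongs to at least one of its groups.
    wanted = frozenset(user_groups)
    return [chunk for chunk in chunks if chunk.groups & wanted]


corpus = [
    TaggedChunk("Salary bands for 2024.", "hr-comp.pdf", frozenset({"hr"})),
    TaggedChunk("Returns after 90 days need approval.", "returns-policy.pdf", frozenset({"ops", "support"})),
    TaggedChunk("Holiday schedule.", "handbook.pdf", frozenset({"hr", "ops", "support"})),
]

# Similarity search would then run only over this filtered list.
visible = allowed_chunks(corpus, {"ops"})
visible_sources = [chunk.source for chunk in visible]
```

A user in the "ops" group sees the returns policy and the handbook, while the compensation document is excluded before any ranking happens.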

How does it compare to ChatGPT with a custom GPT?

A custom GPT is good for low-scale internal use. A custom RAG system gives you control over the chunking, embedding model, retrieval algorithm, and access control. For business-critical document retrieval we recommend the custom path.

Ready to scope a knowledge base project?

A free audit comes first. We confirm scope, lock the timeline, and quote any add-ons before you sign.