AI-Native
File Database
cat + grep for any file format. Parse once, query forever. Built for LLM agents that need to read, search, and discover files without writing parsing scripts.
Without YakDB
# Agent writes inline parsing script
# ~500 tokens, often fails
run_command("""python -c "
import fitz
doc = fitz.open('report.pdf')
for page in doc:
    print(page.get_text())
" """)
With YakDB
# One call, always works
read("report.pdf")
# With options
read("report.pdf", pages="1-3")
search("quarterly revenue")
glob("**/*.xlsx")
3 Tools Replace Everything
Read. Search. Discover.
yakdb_read
Read any file — PDF, DOCX, XLSX, images, code. One call, structured output.
yakdb_search
Full-text search across all indexed documents. Regex grep for code files.
yakdb_glob
Discover files with glob patterns. Instant metadata and type inference.
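To make the discovery step concrete, here is a minimal sketch of glob-based file discovery with metadata and extension-based type inference, using only the Python standard library. It does not show how yakdb_glob is implemented; the `discover` function and its return shape are hypothetical and exist only for illustration.

```python
import mimetypes
from pathlib import Path


def discover(root, pattern):
    """Return basic metadata for files under root that match a glob pattern.

    Hypothetical helper for illustration; not part of YakDB's API.
    """
    results = []
    for path in sorted(Path(root).glob(pattern)):
        if not path.is_file():
            continue
        # Guess the type from the file extension only; contents are not read.
        mime, _ = mimetypes.guess_type(path.name)
        results.append({
            "path": path.relative_to(root).as_posix(),
            "size": path.stat().st_size,
            "type": mime or "application/octet-stream",
        })
    return results
```

Calling `discover("docs", "**/*.txt")` would list every `.txt` file under `docs`, including nested folders, without opening any of them.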
Universal Parsing
Every Format, One API
PDF, Word, PowerPoint, Excel, CSV, images, code files — YakDB parses them all server-side so your agent doesn't have to.
Parse Once, Query Forever
Files are parsed and indexed on ingest. Every subsequent read is instant — no repeated parsing, no wasted tokens.
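The parse-once idea can be sketched as a cache keyed by content hash: the expensive parser runs the first time a file's bytes are seen, and later reads come from storage. This is an illustrative sketch under assumptions, not YakDB's actual schema or code; the `ParseCache` class, the `parsed` table, and the `parser` callable are all hypothetical names.

```python
import hashlib
import sqlite3


class ParseCache:
    """Cache parsed text by content hash so identical bytes are parsed once.

    Hypothetical illustration; YakDB's real storage layout may differ.
    """

    def __init__(self, parser, db_path=":memory:"):
        self.parser = parser  # callable taking bytes and returning str
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS parsed "
            "(digest TEXT PRIMARY KEY, text TEXT)"
        )

    def read(self, data):
        digest = hashlib.sha256(data).hexdigest()
        row = self.db.execute(
            "SELECT text FROM parsed WHERE digest = ?", (digest,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: the parser is not called again
        text = self.parser(data)
        self.db.execute("INSERT INTO parsed VALUES (?, ?)", (digest, text))
        self.db.commit()
        return text
```

Keying on the content hash rather than the file name means a renamed or copied file reuses the stored result, while any change to the bytes triggers a fresh parse.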
Dual-Mode Storage
SQLite for zero-config local use. PostgreSQL for shared team deployments. Same API, same tools.
Real-Time Directory Watch
Point YakDB at a folder and it auto-indexes new and modified files. No manual re-ingestion.
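One simple way to detect new and modified files is to compare directory snapshots taken at intervals. The sketch below assumes a polling approach with the standard library; the page does not say how YakDB watches directories, and production watchers often rely on OS file-change notifications instead. The `snapshot` and `changed_files` helpers are hypothetical.

```python
import os


def snapshot(root):
    """Map each file path under root to a (mtime_ns, size) signature."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_files(previous, current):
    """Return paths that are new or whose signature differs from before."""
    return sorted(
        path for path, sig in current.items() if previous.get(path) != sig
    )
```

A watcher loop would take a snapshot, sleep, take another, and pass the result of `changed_files` to the indexer. Deleted files would need a separate check for paths present in `previous` but missing from `current`.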
Open Source (AGPL)
Fully open source. Inspect, extend, self-host. Community contributions welcome.
Get Started in 30 Seconds
Three Commands
Benchmarks
Tested Across 4 Models,
24 Document Tasks
Token Usage
55–73% saved
Task Speed
42–64% faster
Success Rate
+21% tasks
Built for OpenYak.
Works Everywhere.
YakDB ships as OpenYak's native file layer — but it's a standalone tool. Use it with any MCP-compatible agent, via REST API, or as a Python library.