`table.search(query, ...)` runs retrieval against a table. Control its behavior with `strategy` and related keyword arguments.
Basic search
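A minimal sketch of the call shape, using only the `search` method and the `strategy` and `top_k` keyword arguments documented on this page. How a table handle is obtained is not covered here, so `RecordingTable` is a hypothetical stand-in that records calls instead of searching. The helper names `basic_search` and `lexical_search` are illustrative as well.

```python
class RecordingTable:
    """Hypothetical stand-in for a ToraDB table; it only records calls."""

    def __init__(self):
        self.calls = []

    def search(self, query, **kwargs):
        # A real table would run retrieval here; this stub returns no hits.
        self.calls.append((query, kwargs))
        return []


def basic_search(table, query, top_k=5):
    # No strategy given, so the table's default behavior applies.
    return table.search(query, top_k=top_k)


def lexical_search(table, query, top_k=5):
    # "bm25" is one of the lexical aliases in the strategy reference.
    return table.search(query, strategy="bm25", top_k=top_k)


table = RecordingTable()
basic_search(table, "vector index compaction")
lexical_search(table, "vector index compaction")
for query, kwargs in table.calls:
    print(query, kwargs)
```

Replacing `RecordingTable()` with a real table handle should leave the two helpers unchanged, assuming the handle exposes `search` as described above.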
Strategy reference
| strategy | Behavior |
|---|---|
| (default) | Hybrid-style: sparse + dense when vectors exist |
| "sparse", "bm25", "text" | Lexical BM25 only |
| "dense", "vector", "hnsw" | Dense ANN (HNSW); needs embeddings or proxy vector |
| "diskann" | On-disk ANN graph |
| "hyde" | HYDE expansion path |
| "crag" | CRAG filtering path |
| "distributed" | Parallel segment scan |
| "graph", "hybrid" | Enables graph expansion (see below) |
Parameters
| Parameter | Default | Description |
|---|---|---|
| top_k | 20 | Max results returned |
| offset | 0 | Skip first N hits (pagination) |
| explain | false | Execute with provenance; returns a structured DAG of candidate flow |
| graph_expand | false | Graph neighbor expansion |
| depth | 2 | Graph expansion depth |
| query_vector | None | Query embedding for dense / DiskANN |
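As a sketch of how `top_k` and `offset` could combine for pagination: the helper below computes the keyword arguments for a zero-based page number. It assumes that `offset` hits are skipped first and that `top_k` then caps the size of the returned page, which is how the descriptions above read but is not spelled out. `page_kwargs` is an illustrative name.

```python
def page_kwargs(page, page_size=20):
    """Build top_k/offset keyword arguments for a zero-based page number."""
    if page < 0:
        raise ValueError("page must be >= 0")
    if page_size <= 0:
        raise ValueError("page_size must be > 0")
    # Skip all hits on earlier pages, then ask for one page worth of results.
    return {"top_k": page_size, "offset": page * page_size}


print(page_kwargs(0))
print(page_kwargs(2, page_size=10))
```

The result can be unpacked into a call, for example `table.search(query, **page_kwargs(2, page_size=10))`.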
Retrieval provenance (explain=True)
ToraDB is the only retrieval database that shows its work. When explain=True, every search returns a structured provenance trace alongside results — showing exactly which documents were considered and why each was kept or dropped at every tier.
Traces are also written to `<table>/_search_log.ndjson` on disk for cross-query analysis.
See Retrieval provenance for the full schema and use cases.
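For cross-query analysis, the on-disk log can be loaded with the standard library alone. This sketch assumes that `<table>` refers to the table's directory on disk and that the file follows the usual NDJSON layout of one JSON object per line. It makes no assumption about the fields inside each record; the full schema is described in Retrieval provenance. `read_search_log` is an illustrative name.

```python
import json
from pathlib import Path


def read_search_log(table_dir):
    """Load every record from <table_dir>/_search_log.ndjson as a list."""
    log_path = Path(table_dir) / "_search_log.ndjson"
    records = []
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records
```

Calling `len(read_search_log(path))` gives the number of logged traces, which is a quick check that logging is active before any deeper analysis.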
Examples
Metadata-style query (field filters in query string):

Performance (large segment-only tables)
After bulk ingest, the first search may mmap many segment indexes (cold start). Mitigations:

- Use serving profile settings: `TORADB_CACHE_AUTO=1` or set `TORADB_CACHE_INDEX_BYTES` high enough to hold hot segment sidecars (see Production serving profiles).
- Enable `TORADB_WARMUP_ON_START=1` so the API warms indexes in the background.
- Run `toradb-ingest resume` after ingest so TBM3 block-max sidecars, lexicons, and `bm25.route.bin` exist (required once after upgrading index format).
- Use `explain=True` to inspect the provenance trace and identify which tier is slowest.
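The environment settings above can be assembled programmatically before starting the API process. This is a sketch under assumptions: the variable names come from the list above, but the helper `serving_env` is hypothetical, and it assumes `TORADB_CACHE_INDEX_BYTES` takes a plain integer byte count. The command that starts the API is not shown because it is not documented in this section.

```python
import os


def serving_env(cache_index_bytes=None, base=None):
    """Return an environment mapping with the cold-start mitigations set.

    With no explicit size, automatic cache sizing is requested; otherwise the
    index cache size is pinned (assumed here to be an integer byte count).
    """
    env = dict(os.environ if base is None else base)
    if cache_index_bytes is None:
        env["TORADB_CACHE_AUTO"] = "1"
    else:
        env["TORADB_CACHE_INDEX_BYTES"] = str(int(cache_index_bytes))
    # Ask the API to warm indexes in the background on startup.
    env["TORADB_WARMUP_ON_START"] = "1"
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` for whichever command starts the API in a given deployment.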
