SQL analytics - ToraDB

GROUP BY

SELECT tag, COUNT(*) FROM articles GROUP BY tag

GROUP BY id is supported and groups by document id. On large tables, that can produce one group per row.

Global count

SELECT COUNT(*) FROM articles

Use this for total rows. It is faster than grouping by id.

WHERE

SELECT tag, COUNT(*) FROM articles
WHERE tag = 'patent'
GROUP BY tag

Comparison operators: =, !=, <, <=, >, >=.

IN, BETWEEN, LIKE

SELECT tag, COUNT(*) FROM articles
WHERE tag IN ('patent', 'science')
GROUP BY tag

SELECT tag, COUNT(*) FROM articles
WHERE year BETWEEN 2000 AND 2010
GROUP BY tag

SELECT tag, COUNT(*) FROM articles
WHERE title LIKE '%Tesla%'
GROUP BY tag

NOT BETWEEN and NOT LIKE are supported. LIKE uses % (any run) and _ (single char) and is case-sensitive.

Aggregates

Function	Example
`COUNT(*)`	`SELECT tag, COUNT(*) FROM t GROUP BY tag`
`SUM(col)`	`SELECT tag, SUM(score) FROM t GROUP BY tag`
`AVG(col)`	`SELECT tag, AVG(score) FROM t GROUP BY tag`
`MIN(col)`	`SELECT tag, MIN(score) FROM t GROUP BY tag`
`MAX(col)`	`SELECT tag, MAX(score) FROM t GROUP BY tag`

Aggregate over a function

Aggregates can wrap a scalar function:

SELECT tag, SUM(abs(amount)) AS net FROM ledger GROUP BY tag

Aliases

Name the output value column with AS:

SELECT tag, COUNT(*) AS n, AVG(score) AS mean FROM articles GROUP BY tag

GROUP BY a function

Group by a computed key, e.g. bucket timestamps by month:

SELECT date_trunc('month', created) AS month, COUNT(*)
FROM events
GROUP BY date_trunc('month', created)

HAVING

Filter groups after aggregation. Reference the aggregate value column by its output name (count, sum_<col>, avg_<col>, …, or an alias) or the group key column:

SELECT tag, COUNT(*) FROM articles
GROUP BY tag
HAVING count > 2

Comparison operators, IN, BETWEEN, and LIKE are supported in HAVING. Use the value column name (count, sum_score) — not the COUNT(*)/SUM(...) expression.

Retrieval + analytics

Apply search first, then aggregate:

SELECT tag, COUNT(*) FROM articles
SPARSE SEARCH body BM25('Nikola Tesla')
GROUP BY tag

Hybrid example:

SELECT tag, COUNT(*) FROM papers
SPARSE SEARCH title BM25('Tesla coil')
VECTOR SEARCH embedding ANN([0.9, 0.1, 0.0, 0.0])
GROUP BY tag

​GROUP BY

​Global count

​WHERE

​IN, BETWEEN, LIKE

​Aggregates

​Aggregate over a function

​Aliases

​GROUP BY a function

​HAVING

​Retrieval + analytics