Skip to main content
This guide will walk you through a few queries to give you a feel for ParadeDB.

Create Example Table

ParadeDB comes with a helpful procedure that creates a table populated with mock data to help you get started. Once connected with psql, run the following commands to create and inspect this table.
CALL paradedb.create_bm25_test_table(
  schema_name => 'public',
  table_name => 'mock_items'
);

SELECT description, rating, category
FROM mock_items
LIMIT 3;
Expected Response
       description        | rating |  category
--------------------------+--------+-------------
 Ergonomic metal keyboard |      4 | Electronics
 Plastic Keyboard         |      4 | Electronics
 Sleek running shoes      |      5 | Footwear
(3 rows)
Next, let’s create a BM25 index called search_idx on this table. A BM25 index is a covering index, which means that multiple columns can be included in the same index.
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, category, rating, in_stock, created_at, metadata, weight_range)
WITH (key_field='id');
As a general rule of thumb, any columns that you want to filter, GROUP BY, ORDER BY, or aggregate as part of a full text query should be added to the index for faster performance.
Note the mandatory key_field option. See choosing a key field for more details.

Match Query

We’re now ready to execute a basic text search query. We’ll look for matches where description matches running shoes where rating is greater than 2.
SELECT description, rating, category
FROM mock_items
WHERE description ||| 'running shoes' AND rating > 2
ORDER BY rating
LIMIT 5;
Expected Response
     description     | rating | category
---------------------+--------+----------
 White jogging shoes |      3 | Footwear
 Generic shoes       |      4 | Footwear
 Sleek running shoes |      5 | Footwear
(3 rows)
||| is ParadeDB’s custom match disjunction operator, which means “find me all documents containing running OR shoes. If we want all documents containing running AND shoes, we can use ParadeDB’s &&& match conjunction operator.
SELECT description, rating, category
FROM mock_items
WHERE description &&& 'running shoes' AND rating > 2
ORDER BY rating
LIMIT 5;
Expected Response
     description     | rating | category
---------------------+--------+----------
 Sleek running shoes |      5 | Footwear
(1 row)

BM25 Scoring

Next, let’s add BM25 scoring to the results, which sorts matches by relevance. To do this, we’ll use pdb.score.
SELECT description, pdb.score(id)
FROM mock_items
WHERE description ||| 'running shoes' AND rating > 2
ORDER BY score DESC
LIMIT 5;
Expected Response
     description     |   score
---------------------+-----------
 Sleek running shoes |  6.817111
 Generic shoes       | 3.8772602
 White jogging shoes | 3.4849067
(3 rows)

Highlighting

Finally, let’s also highlight the relevant portions of the documents that were matched. To do this, we’ll use pdb.snippet.
SELECT description, pdb.snippet(description), pdb.score(id)
FROM mock_items
WHERE description ||| 'running shoes' AND rating > 2
ORDER BY score DESC
LIMIT 5;
Expected Response
     description     |              snippet              |   score
---------------------+-----------------------------------+-----------
 Sleek running shoes | Sleek <b>running</b> <b>shoes</b> |  6.817111
 Generic shoes       | Generic <b>shoes</b>              | 3.8772602
 White jogging shoes | White jogging <b>shoes</b>        | 3.4849067
(3 rows)

Top N

ParadeDB is highly optimized for quickly returning the Top N results out of the index. In SQL, this means queries that contain an ORDER BY...LIMIT:
SELECT description, rating, category
FROM mock_items
WHERE description ||| 'running shoes'
ORDER BY rating
LIMIT 5;
Expected Response
     description     | rating | category
---------------------+--------+----------
 White jogging shoes |      3 | Footwear
 Generic shoes       |      4 | Footwear
 Sleek running shoes |      5 | Footwear
(3 rows)

Facets

Faceted queries allow a single query to return both the Top N results and an aggregate value, which is more CPU-efficient than issuing two separate queries. For example, the following query returns the top 3 results as well as the total number of results matched.
SELECT
     description, rating, category,
     pdb.agg('{"value_count": {"field": "id"}}') OVER ()
FROM mock_items
WHERE description ||| 'running shoes'
ORDER BY rating
LIMIT 5;
Expected Response
     description     | rating | category |      agg
---------------------+--------+----------+----------------
 White jogging shoes |      3 | Footwear | {"value": 3.0}
 Generic shoes       |      4 | Footwear | {"value": 3.0}
 Sleek running shoes |      5 | Footwear | {"value": 3.0}
(3 rows)
That’s it! Next, let’s load your data to start running real queries.