
PG-Vector

PG-Vector is a PostgreSQL extension for vector similarity search. It stores vectors in database tables and lets you search them by similarity.

How to run

Running PG-Vector on CGC takes two steps: create a new database, then enable the PG-Vector extension in it. To create the database:

cgc db create -n <database_name> pg-vector

After creating the database, you will receive an app token for your database.

How to connect to the database

You can connect to the database from any notebook running in your namespace using the CGC SDK:

import cgc.sdk as cgc
postgresql = cgc.postgresql_client("<database_name>","<app_token>")

For more information about the CGC SDK and how to connect to the database, please refer to our docs.

Installation

Enable the PG-Vector extension (do this once in each database where you want to use it):

CREATE EXTENSION vector;

Example usage

Create a table with a 3-dimensional vector column:

CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

Insert vectors:

INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
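
In a notebook, embeddings usually start out as Python lists and need to be turned into the bracketed text form used in the INSERT above. The following is a minimal sketch of that conversion; `to_vector_literal` is a hypothetical helper name chosen for illustration and is not part of PG-Vector or the CGC SDK.

```python
import math


def to_vector_literal(values):
    """Format a sequence of numbers as vector text, e.g. [1.0,2.0,3.0].

    Hypothetical helper for illustration; not part of PG-Vector or the CGC SDK.
    """
    floats = [float(v) for v in values]
    if not floats:
        raise ValueError("a vector needs at least one dimension")
    # Reject NaN and infinity up front so the mistake surfaces in Python.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("vector elements must be finite numbers")
    return "[" + ",".join(repr(v) for v in floats) + "]"


print(to_vector_literal([1, 2, 3]))  # [1.0,2.0,3.0]
```

The resulting string is what goes inside the quoted literal in the SQL statement. The number of elements has to match the dimension declared for the column (3 in this example).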

Get the nearest neighbors by L2 distance:

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
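
To make the ordering in this query concrete, here is a small pure-Python sketch that computes the same L2 (Euclidean) distances for the two rows inserted above. It assumes a fresh table, so the rows have ids 1 and 2. It only illustrates the arithmetic and is not how the extension is implemented.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The two example rows, keyed by id (assumes ids 1 and 2 on a fresh table).
items = {1: [1, 2, 3], 2: [4, 5, 6]}
query = [3, 1, 2]

# Same idea as ORDER BY embedding <-> query LIMIT 5: sort by distance, keep 5.
nearest = sorted(items, key=lambda item_id: l2_distance(items[item_id], query))[:5]
print(nearest)  # [1, 2]
```

Row 1 is at distance sqrt(6) ≈ 2.45 from the query and row 2 at sqrt(33) ≈ 5.74, so row 1 comes first.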

PG-Vector also supports:

  • inner product (<#>)
  • cosine distance (<=>)
  • L1 distance (<+>)

Note: <#> returns the negative inner product because PostgreSQL only supports ascending-order index scans on operators. Multiply the result by -1 to get the actual inner product.
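
The values the three operators above return can be sketched in plain Python as follows. The function names are illustrative only, and the code shows the arithmetic rather than the extension's implementation.

```python
import math


def negative_inner_product(a, b):
    """Value returned by <#>: the inner product multiplied by -1."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """Value returned by <=>: 1 minus the cosine similarity.

    Undefined for zero vectors; this sketch raises ZeroDivisionError there.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def l1_distance(a, b):
    """Value returned by <+>: the sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


a, q = [1, 2, 3], [3, 1, 2]
print(negative_inner_product(a, q))  # -11
print(l1_distance(a, q))             # 4
print(round(cosine_distance(a, q), 4))
```

Because a larger inner product means a closer match, sorting the negated value in ascending order puts the best matches first, consistent with the other distance operators.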

For more information about PG-Vector and its options, please refer to the GitHub repository.