PG-Vector

PG-Vector is a PostgreSQL extension for vector similarity search. It stores vectors in the database alongside the rest of your data and lets you query them by similarity.

How to run

To run PG-Vector on CGC, create a new database of type pg-vector. The extension itself is enabled afterwards, as described under Installation.

cgc db create -n <database_name> pg-vector

After creating the database, you will receive an app token for your database.

How to connect to the database

You can connect to the database from any notebook running in your namespace using the CGC SDK.

import cgc.sdk as cgc
postgresql = cgc.postgresql_client("<database_name>","<app_token>")

For more information about the CGC SDK and how to connect to the database, please refer to our docs.

Installation

Enable the PG-Vector extension (do this once in each database where you want to use it):

CREATE EXTENSION vector;

Example usage

Create a table with a 3-dimensional vector column.

CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

Insert vectors

INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
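When vectors come from application code rather than being typed by hand, they have to be converted into the text literal used in the INSERT statement. The following is a minimal Python sketch of that conversion. The helper name `to_vector_literal` is hypothetical and is not part of PG-Vector or the CGC SDK.

```python
import math


def to_vector_literal(values):
    """Format a sequence of numbers as a vector text literal such as '[1.0,2.0,3.0]'.

    Hypothetical helper for illustration only.
    """
    floats = [float(v) for v in values]
    if not floats:
        raise ValueError("a vector needs at least one dimension")
    if not all(math.isfinite(f) for f in floats):
        raise ValueError("vector elements must be finite numbers")
    # repr() keeps full float precision in the generated text
    return "[" + ",".join(repr(f) for f in floats) + "]"


print(to_vector_literal([1, 2, 3]))  # [1.0,2.0,3.0]
```

The resulting string can be passed as a query parameter in place of the hand-written literal. The placeholder syntax depends on the database driver you use.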

Get the nearest neighbors by L2 distance

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
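To make the ordering concrete, the query can be mirrored in plain Python. This sketch only illustrates what L2 (Euclidean) distance means for the two sample rows. It does not show how the extension computes or indexes distances, and the ids assume a freshly created table.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance, the quantity the <-> operator orders by."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Same rows as the INSERT example; ids assume bigserial started at 1
items = {1: [1, 2, 3], 2: [4, 5, 6]}
query = [3, 1, 2]

# In-memory equivalent of: ORDER BY embedding <-> '[3,1,2]' LIMIT 5
nearest = sorted(items, key=lambda item_id: l2_distance(items[item_id], query))[:5]
print(nearest)  # [1, 2]
```

The row `[1,2,3]` comes first because its distance to the query is sqrt(6), while `[4,5,6]` is at distance sqrt(33).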

PG-Vector also supports the following distance operators:

inner product (<#>)
cosine distance (<=>)
L1 distance (<+>)

Note: <#> returns the negative inner product because Postgres only supports ASC order index scans on operators. Multiply the result by -1 to get the actual inner product.
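As a reference for what each operator measures, here is a minimal Python sketch of the three quantities. The function names are illustrative and are not part of any library. The code reflects the standard mathematical definitions, not the extension's internal implementation.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def negative_inner_product(a, b):
    """Value returned by <#>: smaller means more similar."""
    return -inner_product(a, b)


def cosine_distance(a, b):
    """Value returned by <=>: 1 minus cosine similarity.

    Assumes neither vector is all zeros.
    """
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1 - inner_product(a, b) / (norm_a * norm_b)


def l1_distance(a, b):
    """Value returned by <+>: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


a, b = [1, 2, 3], [3, 1, 2]
print(negative_inner_product(a, b))      # -11
print(l1_distance(a, b))                 # 4
print(round(cosine_distance(a, b), 4))   # 0.2143
```

In all three cases a smaller value means a closer match, which is why each operator can be used directly in an ascending ORDER BY.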

For more information about PG-Vector and its options, please refer to the GitHub repo.
