PG-Vector
PG-Vector is a PostgreSQL extension for vector similarity search. It stores vectors in the database alongside the rest of your data and lets you query them by similarity.
How to run
Running on CGC is very simple. You just need to create a new database and load the PG-Vector extension.
cgc db create -n <database_name> pg-vector
After creating the database, you will receive an app token for your database.
How to connect to the database
You can connect to the database from any notebook running in your namespace using CGC SDK.
import cgc.sdk as cgc
postgresql = cgc.postgresql_client("<database_name>","<app_token>")
For more information about CGC SDK and how to connect to the database, please refer to our docs
Installation
Enable the PG-Vector extension (do this once in each database where you want to use it)
CREATE EXTENSION vector;
Example usage
Create a vector column with 3 dimensions.
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
Insert vectors
INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
Get the nearest neighbors by L2 distance
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
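To see what ordering by `<->` means for the two rows inserted above, the distances can be reproduced outside the database. This is a minimal Python sketch for illustration, not pgvector's implementation; the helper name `l2_distance` is hypothetical.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# The two rows inserted in the example and the query vector from the SELECT.
items = [[1, 2, 3], [4, 5, 6]]
query = [3, 1, 2]

# Sort ascending by distance, mirroring ORDER BY embedding <-> '[3,1,2]'.
ranked = sorted(items, key=lambda v: l2_distance(v, query))
for v in ranked:
    print(v, round(l2_distance(v, query), 3))
```

Here `[1, 2, 3]` is ranked first (distance about 2.449), ahead of `[4, 5, 6]` (about 5.745), which is the order the query above is expected to return.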
PGVector also supports:
- inner product (<#>)
- cosine distance (<=>)
- L1 distance (<+>)
Note: <#> returns the negative inner product since Postgres only supports ASC order index scans on operators.
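The three additional measures, and the reason for the negated inner product, can be sketched in plain Python. This is an illustration under the usual mathematical definitions, not pgvector's code; the function names are hypothetical and the data reuses the example rows and query vector.

```python
import math

def negative_inner_product(a, b):
    """What <#> returns: the inner product multiplied by -1."""
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """What <=> returns: 1 minus the cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def l1_distance(a, b):
    """What <+> returns: the sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

items = [[1, 2, 3], [4, 5, 6]]
query = [3, 1, 2]

# Ascending order on the negated value puts the LARGEST inner product first.
by_inner_product = sorted(items, key=lambda v: negative_inner_product(v, query))
for v in by_inner_product:
    print(v, negative_inner_product(v, query))
```

With this data, `[4, 5, 6]` (value -29) sorts ahead of `[1, 2, 3]` (value -11), so an ascending `ORDER BY` on `<#>` yields the rows with the highest inner product first. To get the actual inner product in SQL, multiply the result of `<#>` by -1.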
For more information about PGVector and its options, please refer to the GitHub repo