PG-Vector
PG-Vector is a PostgreSQL extension for vector similarity search. It stores vectors in the database and lets you query them by similarity.
How to run
Running PG-Vector on CGC takes two steps: create a new database, then enable the PG-Vector extension in it. To create the database, run:
cgc db create -n <database_name> pg-vector
After creating the database, you will receive an app token for your database.
Default configuration
The default configuration for PG-Vector is set to use the PostgreSQL database engine. The database will be created with the following parameters:
- POSTGRES_PASSWORD: The password for the PostgreSQL user. This is set to the CGC-specific app_token that you receive after creating the database.
- POSTGRES_USER=admin: The user for the PostgreSQL database.
- POSTGRES_DB=db: The name of the PostgreSQL database. This is set to the name you provided when creating the database.
- POSTGRES_HOST_AUTH_METHOD=trust: Configures PostgreSQL to skip password verification for connections, trusting that authentication has already occurred at the system level. This is convenient for development but creates a significant security risk in production environments.
How to connect to the database
For more information about the CGC SDK and how to connect to the database, please refer to our docs.
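As a rough illustration of how the default parameters above fit together, the sketch below assembles a standard PostgreSQL connection URI from them. The function name `build_postgres_uri`, the host `pg-host.example`, and the port 5432 (PostgreSQL's default) are assumptions made for this example only; the actual host and connection method for a CGC database are described in the docs referenced above.

```python
from urllib.parse import quote


def build_postgres_uri(host, app_token, database="db", user="admin", port=5432):
    """Build a postgresql:// URI from the default parameters.

    host and port are hypothetical placeholders, not CGC-provided values.
    app_token is the token received after creating the database.
    """
    # Percent-encode each part so special characters in the token stay valid.
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote(user, safe=""),
        quote(app_token, safe=""),
        host,
        port,
        quote(database, safe=""),
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_postgres_uri("pg-host.example", "my-app-token", "mydb"))
```

The resulting string can be passed to any PostgreSQL client that accepts connection URIs.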
Installation
Enable the PG-Vector extension (do this once in each database where you want to use it)
CREATE EXTENSION vector;
Example usage
Create a vector column with 3 dimensions.
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
Insert vectors
INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
Get the nearest neighbors by L2 distance
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
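To see what this query returns for the two rows inserted above, the following sketch reproduces the L2 (Euclidean) ordering in plain Python, without a database. It assumes a fresh table, so the bigserial ids of the two rows are 1 and 2; the helper names `l2_distance` and `nearest_by_l2` are made up for this illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the quantity the <-> operator orders by."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_by_l2(items, query, limit=5):
    """Mimic ORDER BY embedding <-> query LIMIT limit on an in-memory dict."""
    ranked = sorted(items.items(), key=lambda kv: l2_distance(kv[1], query))
    return [(item_id, l2_distance(vec, query)) for item_id, vec in ranked[:limit]]


# The two rows from the INSERT above, keyed by their assumed ids.
items = {1: [1, 2, 3], 2: [4, 5, 6]}
query = [3, 1, 2]

if __name__ == "__main__":
    for item_id, dist in nearest_by_l2(items, query):
        print(item_id, round(dist, 3))
    # prints:
    # 1 2.449
    # 2 5.745
```

The row [1,2,3] comes first because its distance to [3,1,2] is sqrt(6), while [4,5,6] is at sqrt(33).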
PGVector also supports:
- inner product (<#>)
- cosine distance (<=>)
- L1 distance (<+>)
Note: <#> returns the negative inner product since Postgres only supports ASC order index scans on operators.
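The three additional operators can be sketched in plain Python to show what each one computes, including the sign flip on the inner product. This is an illustration of the math only, not the extension's implementation; the function names are invented for this example.

```python
import math


def neg_inner_product(a, b):
    """What <#> returns: the inner product, negated."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """What <=> returns: 1 minus the cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def l1_distance(a, b):
    """What <+> returns: the sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    query = [3, 1, 2]
    for vec in ([1, 2, 3], [4, 5, 6]):
        print(vec, neg_inner_product(vec, query),
              cosine_distance(vec, query), l1_distance(vec, query))
```

With the query [3,1,2], the row [4,5,6] has the larger inner product (29 versus 11), so its negated value is smaller and it sorts first under ascending <#> order, even though [1,2,3] is closer by L1 and L2 distance.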
For more information about PGVector and its options, please refer to the GitHub repo.