Weaviate
Weaviate is an open-source, AI-native vector database that helps developers create intuitive and reliable AI-powered applications.
Weaviate comes with a custom vectorizer powered by the intfloat/multilingual-e5-large model.
More about the model can be found in its official Hugging Face repository.
The vectorizer has to be run separately, as a compute resource.
How to run it
If you want to run a basic database, you just need to create a new database through cgc.
$ cgc db create --name weaviate01 -c 4 -m 8 -v weaviate_volume weaviate
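To use the external vectorizer, start it first as a compute resource:
$ cgc compute create --name vectorizer -c 4 -m 24 -g 1 -gt A5000 t2v-transformers
For more information about the vectorizer, please refer to its documentation page.
Then create the database, passing additional parameters that enable the module and point it at the vectorizer. Replace <VECTORIZER_RESOURCE_NAME> with the name you gave the compute resource (vectorizer in the command above):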
$ cgc db create --name weaviate01 -c 4 -m 8 -v weaviate_volume weaviate -d weaviate_enable_modules=text2vec-transformers -d weaviate_transformers_inference_api=http://<VECTORIZER_RESOURCE_NAME>:8080
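Because the vectorizer's resource name appears in both commands, it can help to generate them from a single value. The following is a minimal sketch, not part of cgc itself: the helper name `build_cgc_commands` and its parameters are hypothetical, and the flags are copied verbatim from the commands above.

```python
import shlex
from typing import Tuple


def build_cgc_commands(
    vectorizer_name: str = "vectorizer",
    db_name: str = "weaviate01",
    volume: str = "weaviate_volume",
) -> Tuple[str, str]:
    """Return (compute command, database command) as shell strings.

    Hypothetical helper: it only assembles the commands shown in this
    document so that the vectorizer name stays consistent between them.
    """
    compute_cmd = [
        "cgc", "compute", "create",
        "--name", vectorizer_name,
        "-c", "4", "-m", "24", "-g", "1", "-gt", "A5000",
        "t2v-transformers",
    ]
    # The database reaches the vectorizer by its resource name on port 8080.
    inference_api = f"http://{vectorizer_name}:8080"
    db_cmd = [
        "cgc", "db", "create",
        "--name", db_name,
        "-c", "4", "-m", "8",
        "-v", volume,
        "weaviate",
        "-d", "weaviate_enable_modules=text2vec-transformers",
        "-d", f"weaviate_transformers_inference_api={inference_api}",
    ]
    # shlex.join quotes arguments where needed, so the result is safe to paste.
    return shlex.join(compute_cmd), shlex.join(db_cmd)


if __name__ == "__main__":
    for command in build_cgc_commands():
        print(command)
```

Running the script prints both commands in the order they should be executed: the vectorizer first, then the database.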
How to connect
We are working on incorporating the Weaviate client into the CGC SDK, but for now, please use the official client installed with pip.
In your notebook environment, you can connect to the database as follows.
First, install the Weaviate client:
!pip install weaviate-client
Then obtain your Weaviate token from the output of:
cgc db list -d
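Next, import the client and make a connection:
import weaviate

# Note: weaviate.Client is the v3-style client API. Newer weaviate-client
# releases may not provide it; pinning an older release
# (for example weaviate-client<4) may be necessary.
# Adjust the host if your database resource has a different name.
WEAVIATE_URL = "http://weaviate:8080"
# Replace <WEAVIATE_TOKEN> with the token obtained from cgc db list -d.
auth_client_secret = weaviate.AuthApiKey(api_key="<WEAVIATE_TOKEN>")
client = weaviate.Client(
    url=WEAVIATE_URL,
    auth_client_secret=auth_client_secret,
)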
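Before running queries, it can be useful to confirm that the database is reachable from the notebook. The sketch below uses only the Python standard library and polls Weaviate's REST readiness endpoint (`/v1/.well-known/ready`). The helper names `readiness_url` and `is_weaviate_ready` are illustrative, and the base URL is assumed to be the same one used for the client.

```python
import urllib.error
import urllib.request
from typing import Optional


def readiness_url(base_url: str) -> str:
    """Build the readiness endpoint URL from a Weaviate base URL."""
    return base_url.rstrip("/") + "/v1/.well-known/ready"


def is_weaviate_ready(
    base_url: str,
    token: Optional[str] = None,
    timeout: float = 3.0,
) -> bool:
    """Return True if the readiness endpoint answers with HTTP 200.

    Hypothetical helper: any connection failure, timeout, or HTTP error
    is treated as "not ready" instead of being raised.
    """
    request = urllib.request.Request(readiness_url(base_url))
    if token:
        # API keys are sent as a bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

For example, `is_weaviate_ready("http://weaviate:8080", token="<WEAVIATE_TOKEN>")` should return `True` once the database has started, assuming the host name matches your database resource.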