Working with Databases in Notebooks

This page explains how to integrate databases into your notebook workflows.

Qdrant

Qdrant is a vector database for storing and searching embeddings. In a notebook, you can use it to manage collections of vectors generated by LLMs and to run tasks such as semantic search or similarity matching over them. Combining LLM outputs with Qdrant lets you build applications that pair natural language understanding with vector-based retrieval.

Requirements

Before you start, make sure you have the necessary dependencies installed in your notebook environment.

Python - Qdrant SDK

pip install qdrant-client

Listing collections

You can list all collections available in your Qdrant instance. This is useful to check which datasets are already stored.

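import os

from qdrant_client import AsyncQdrantClient

# Connection settings come from environment variables.
# os.environ[...] raises KeyError if a variable is missing, instead of
# passing None to the client or failing on None.rstrip("/").
q = AsyncQdrantClient(
    url=os.environ["FATHOM_SDK_BASE_URL"],
    check_compatibility=False,  # skip the client/server version check
    prefix=os.environ["FATHOM_SDK_SERVICE_PATH_VECTOR_DATABASE"].rstrip("/"),
    timeout=30,  # seconds
    headers={
        "authorization": os.environ["FATHOM_SDK_AUTHORIZATION"]
    },
)

# Top-level await works in notebook cells.
all_collections = await q.get_collections()

print(all_collections)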

The call returns a CollectionsResponse object whose collections field lists every collection currently stored in Qdrant by name.
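The client setup reads three environment variables, and a missing one is easy to overlook until the cell fails. The following is a minimal sketch of a helper that gathers those settings in one place and reports every missing variable at once. The function name qdrant_client_kwargs is hypothetical and not part of the Qdrant SDK; the variable names and argument values are taken from the example on this page.

```python
import os


def qdrant_client_kwargs(env=os.environ):
    """Build keyword arguments for AsyncQdrantClient from a mapping.

    Hypothetical helper (not part of qdrant_client). Raises KeyError
    listing all missing or empty variables.
    """
    required = (
        "FATHOM_SDK_BASE_URL",
        "FATHOM_SDK_SERVICE_PATH_VECTOR_DATABASE",
        "FATHOM_SDK_AUTHORIZATION",
    )
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "url": env["FATHOM_SDK_BASE_URL"],
        "check_compatibility": False,
        # Drop a trailing slash so the prefix joins cleanly with API paths.
        "prefix": env["FATHOM_SDK_SERVICE_PATH_VECTOR_DATABASE"].rstrip("/"),
        "timeout": 30,
        "headers": {"authorization": env["FATHOM_SDK_AUTHORIZATION"]},
    }
```

With this helper defined, the client could be created as AsyncQdrantClient(**qdrant_client_kwargs()), assuming the three variables are set in the notebook environment.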

Creating a collection

You can create a new collection to store vectors. When creating a collection, you need to specify the vector size and distance metric.

import os

from qdrant_client import AsyncQdrantClient
from qdrant_client.models import Distance, VectorParams

# Same connection setup as in the listing example.
q = AsyncQdrantClient(
    url=os.environ["FATHOM_SDK_BASE_URL"],
    check_compatibility=False,  # skip the client/server version check
    prefix=os.environ["FATHOM_SDK_SERVICE_PATH_VECTOR_DATABASE"].rstrip("/"),
    timeout=30,  # seconds
    headers={
        "authorization": os.environ["FATHOM_SDK_AUTHORIZATION"]
    },
)

result = await q.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(
        size=128,  # number of dimensions of each stored vector
        distance=Distance.COSINE,
    ),
)

# create_collection returns True when the collection was created.
print(result)

This example creates a collection named my_collection with vectors of size 128 and cosine similarity as the distance metric.
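To make the choice of distance metric more concrete, here is a plain-Python sketch of cosine similarity between two vectors. It only illustrates what the metric measures (the angle between vectors, independent of their length); it is not how Qdrant computes it internally, and the function name cosine_similarity is chosen for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # orthogonal: 0.0
```

The length check mirrors the size setting of a collection: every vector stored in my_collection must have exactly 128 components.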