Working with LLMs in Notebooks
This documentation explains how to communicate with Large Language Models (LLMs) directly from a notebook environment.
Requirements
Before you start, make sure you have the necessary dependencies installed in your notebook environment.
Install the OpenAI SDK
The OpenAI SDK (openai) is only required if you want to call LLMs through the SDK from Python; the direct HTTP examples below use only the requests library.
pip install openai
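If you are installing from inside a notebook cell rather than a terminal, the %pip magic installs the package into the kernel's environment:

%pip install openai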
Listing models
Before starting, you may want to see which models are available in your environment. This helps you choose the right model for your task.
Using the OpenAI SDK:

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1",
    api_key="",
    default_headers={
        "Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")
    },
)

models = client.models.list()
print(models)
Using the REST API directly with requests:

import requests
import os
import json
response = requests.get(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/models",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)
if response.status_code == 200:
    print("Models list:")
    print(json.dumps(response.json(), indent=4))
else:
    print("Error:", response)

This will output a list of model identifiers (e.g., gpt-4.1, gpt-4o-mini, etc.) that you can use in subsequent calls.
Creating chats
Chats allow you to interact with an LLM in a conversational style. You can provide a sequence of messages, and the model will respond accordingly.
Using the OpenAI SDK:

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1",
    api_key="",
    default_headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-3B-Instruct",
    messages=[
        {"role": "developer", "content": "Talk like a pirate."},
        {
            "role": "user",
            "content": "How do I check if a Python object is an instance of a class?",
        },
    ],
)
print(response)
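The returned object is a full chat completion. To read just the assistant's reply text, a minimal sketch using the OpenAI SDK response shape (choices[0].message.content):

# Read the assistant's reply text from the SDK response object.
reply = response.choices[0].message.content
print(reply)

To continue the conversation, append {"role": "assistant", "content": reply} followed by your next user message to the messages list in your next call.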
Using the REST API directly with requests:

import requests
import os
import json
data = {
    "model": "google/gemma-3-12b-it",
    "messages": [{"role": "user", "content": "What time is it in Poland"}],
}
response = requests.post(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/chat/completions",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
    json=data,
)
if response.status_code == 200:
    print("Success:")
    print(json.dumps(response.json(), indent=4))
else:
    print("Error:", response)

Direct Communication with a Custom LLM Endpoint
In some cases, you may want to communicate with an LLM that is not OpenAI-compatible. This usually means the model is hosted on a custom server or API endpoint. Instead of using the built-in chat.completions.create or completions.create methods, you can send requests directly to your endpoint using standard HTTP libraries such as requests.
Important
When listing models, each model entry also contains a uris.base property. Example value:
/v1/backends/gemini/
This property is the base path you must use to construct the URL for direct communication with the backend. It is only relevant when you want to bypass the SDK and talk directly to the LLM server.
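For example, you could look up the uris.base value for a specific model from the models listing and build the backend URL from it. A sketch, assuming the listing response has the shape {"data": [{"id": ..., "uris": {"base": ...}}]}; adjust the field access if your deployment returns a different structure:

import requests
import os

# Fetch the models list and find the backend base path for one model (sketch).
listing = requests.get(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/models",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
).json()

model_id = "models/gemini-2.5-flash"  # hypothetical lookup target
backend_uri = next(m["uris"]["base"] for m in listing["data"] if m["id"] == model_id)
print(backend_uri)  # e.g. /v1/backends/gemini/

With the backend path in hand, the full request looks like the following example.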
import requests
import os
import json

backend_uri = "/v1/backends/gemini/"  # uris.base value retrieved from the models list
data = {
    "model": "models/gemini-2.5-flash",
    "messages": [
        {"role": "user", "content": "What time is it in Poland"}
    ],
}
response = requests.post(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms" + backend_uri + "chat/completions",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
    json=data,
)
if response.status_code == 200:
    print("Success:")
    print(json.dumps(response.json(), indent=4))
else:
    print("Error:", response)
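If the backend returns an OpenAI-compatible chat completion payload, the assistant's reply is typically found at choices[0].message.content. A sketch under that assumption; a custom backend may use a different response shape:

# Pull the assistant's reply out of an OpenAI-compatible response body (assumed shape).
if response.status_code == 200:
    body = response.json()
    reply = body["choices"][0]["message"]["content"]
    print(reply)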