Working with LLMs in Notebooks

Getting started guide for integrating Large Language Models (LLMs) into your notebook workflows.

This documentation explains how to communicate with LLMs directly from a notebook environment: installing the OpenAI SDK, listing available models, creating chat completions, and calling custom (non-OpenAI-compatible) endpoints directly.

Requirements

Before you start, make sure you have the necessary dependencies installed in your notebook environment.
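All of the examples in this guide read two environment variables, FATHOM_SDK_BASE_URL and FATHOM_SDK_AUTHORIZATION. A quick sanity check at the top of your notebook can fail early with a clear message instead of a confusing connection error later (a minimal sketch; the helper name is ours):

```python
import os

# The examples in this guide read these two variables; fail early if missing.
REQUIRED_VARS = ["FATHOM_SDK_BASE_URL", "FATHOM_SDK_AUTHORIZATION"]

def missing_env(required=REQUIRED_VARS):
    """Return the names of required environment variables that are not set."""
    return [name for name in required if not os.environ.get(name)]

missing = missing_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Environment looks good.")
```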

Install the OpenAI SDK

The OpenAI SDK (openai) is only required if you want to call the LLM endpoints through the SDK; the requests-based examples below do not need it.

pip install openai

Listing models

Before starting, you may want to see which models are available in your environment. This helps you choose the right model for your task.

import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1",
    api_key="",
    default_headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)

models = client.models.list()

print(models)

Alternatively, you can call the REST endpoint directly with requests:

import requests
import os
import json

response = requests.get(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/models",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)

if response.status_code == 200:
    print("Models list:")
    print(json.dumps(response.json(), indent=4))

else:
    print("Error:", response.status_code, response.text)

This will output a list of model identifiers (e.g., Qwen/Qwen2.5-VL-3B-Instruct, google/gemma-3-12b-it) that you can use in subsequent calls.
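The JSON payload follows the OpenAI-compatible list shape, with the models under a data key. To pull out just the identifiers, you might do something like this (a sketch using a hard-coded sample payload in place of a live response):

```python
# Sample payload in the OpenAI-compatible "list" shape; a live call to
# /llms/v1/models returns the same structure.
sample = {
    "object": "list",
    "data": [
        {"id": "Qwen/Qwen2.5-VL-3B-Instruct", "object": "model"},
        {"id": "google/gemma-3-12b-it", "object": "model"},
    ],
}

def model_ids(payload):
    """Extract model identifiers from an OpenAI-style model list payload."""
    return [entry["id"] for entry in payload.get("data", [])]

print(model_ids(sample))
```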

Creating chats

Chats allow you to interact with an LLM in a conversational style. You can provide a sequence of messages, and the model will respond accordingly.

import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1",
    api_key="",
    default_headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-3B-Instruct",
    messages=[
        {"role": "system", "content": "Talk like a pirate."},
        {
            "role": "user",
            "content": "How do I check if a Python object is an instance of a class?",
        },
    ],
)

print(response)

Alternatively, using requests directly:

import requests
import os
import json

data = {
    "model": "google/gemma-3-12b-it",
    "messages": [{"role": "user", "content": "What time is it in Poland?"}],
}

response = requests.post(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/chat/completions",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
    json=data,
)

if response.status_code == 200:
    print("Success:")
    print(json.dumps(response.json(), indent=4))

else:
    print("Error:", response.status_code, response.text)

Direct Communication with a Custom LLM Endpoint

In some cases, you may want to communicate with an LLM that is not OpenAI-compatible. This usually means the model is hosted on a custom server or API endpoint. Instead of using the built-in chat.completions.create or completions.create methods, you can send requests directly to your endpoint using standard HTTP libraries such as requests.

import requests
import os
import json

backend_uri = "/v1/backends/gemini/"  # URI retrieved from the models list

data = {
    "model": "models/gemini-2.5-flash",
    "messages": [
        {"role": "user", "content": "What time is it in Poland?"}
    ]
}

response = requests.post(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms" + backend_uri + "chat/completions",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
    json=data,
)

if response.status_code == 200:
    print("Success:")
    print(json.dumps(response.json(), indent=4))

else:
    print("Error:", response.status_code, response.text)
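Concatenating the base URL, backend URI, and path by hand is easy to get wrong (double or missing slashes). A small helper that normalizes the slashes, sketched below, keeps the URL construction in one place; the base URL here is a hypothetical placeholder:

```python
def join_url(*parts):
    """Join URL fragments with exactly one slash between each pair."""
    return "/".join(p.strip("/") for p in parts if p)

# Mirrors the request above: base URL + "llms" + backend URI + "chat/completions".
# "https://example.fathom.host" stands in for FATHOM_SDK_BASE_URL.
url = join_url("https://example.fathom.host", "llms", "/v1/backends/gemini/", "chat/completions")
print(url)
```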