Working with LLMs in Notebooks
This documentation explains how to communicate with Large Language Models (LLMs) directly from a notebook environment.
Requirements
Before you start, make sure you have the necessary dependencies installed in your notebook environment.
Install the OpenAI SDK
The OpenAI SDK (openai) is only required if you want to call LLMs through the SDK from Python; the direct HTTP examples below use only the requests library.
pip install openai
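If you are installing from inside a notebook cell rather than a terminal, the %pip magic installs the package into the kernel's environment:

%pip install openai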
Listing models
Before starting, you may want to see which models are available in your environment. This helps you choose the right model for your task.
Using the OpenAI SDK:

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1",
    api_key="",
    default_headers={
        "Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")
    },
)

models = client.models.list()
print(models)
Using the REST API directly with requests:

import requests
import os
import json
response = requests.get(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/models",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)
if response.status_code == 200:
    print("Models list:")
    print(json.dumps(response.json(), indent=4))
else:
    print("Error:", response)

This will output a list of model identifiers (e.g., gpt-4.1, gpt-4o-mini, etc.) that you can use in subsequent calls.
Creating chats
Chats allow you to interact with an LLM in a conversational style. You can provide a sequence of messages, and the model will respond accordingly.
Using the OpenAI SDK:

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1",
    api_key="",
    default_headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-3B-Instruct",
    messages=[
        {"role": "developer", "content": "Talk like a pirate."},
        {
            "role": "user",
            "content": "How do I check if a Python object is an instance of a class?",
        },
    ],
)
print(response)
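The returned object is a full chat completion. To read just the assistant's reply text, a minimal sketch using the OpenAI SDK response shape (choices[0].message.content):

# Read the assistant's reply text from the SDK response object.
reply = response.choices[0].message.content
print(reply)

To continue the conversation, append {"role": "assistant", "content": reply} followed by your next user message to the messages list in your next call.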
Using the REST API directly with requests:

import requests
import os
import json
data = {
    "model": "google/gemma-3-12b-it",
    "messages": [{"role": "user", "content": "What time is it in Poland"}],
}
response = requests.post(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/chat/completions",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
    json=data,
)
if response.status_code == 200:
    print("Success:")
    print(json.dumps(response.json(), indent=4))
else:
    print("Error:", response)

Direct Communication with a Custom LLM Endpoint
In some cases, you may want to communicate with an LLM that is not OpenAI-compatible. This usually means the model is hosted on a custom server or API endpoint. Instead of using the built-in chat.completions.create or completions.create methods, you can send requests directly to your endpoint using standard HTTP libraries such as requests.
Important
When listing models, each model entry also contains a uris.base property. Example value:
/v1/backends/gemini/
This property is the base path you must use to construct the URL for direct communication with the backend. It is only relevant when you want to bypass the SDK and talk directly to the LLM server.
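For example, you could look up the uris.base value for a specific model from the models listing and build the backend URL from it. A sketch, assuming the listing response has the shape {"data": [{"id": ..., "uris": {"base": ...}}]}; adjust the field access if your deployment returns a different structure:

import requests
import os

# Fetch the models list and find the backend base path for one model (sketch).
listing = requests.get(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms/v1/models",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
).json()

model_id = "models/gemini-2.5-flash"  # hypothetical lookup target
backend_uri = next(m["uris"]["base"] for m in listing["data"] if m["id"] == model_id)
print(backend_uri)  # e.g. /v1/backends/gemini/

With the backend path in hand, the full request looks like the following example.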
import requests
import os
import json

backend_uri = "/v1/backends/gemini/"  # uris.base value retrieved from the models list
data = {
    "model": "models/gemini-2.5-flash",
    "messages": [
        {"role": "user", "content": "What time is it in Poland"}
    ],
}
response = requests.post(
    os.environ.get("FATHOM_SDK_BASE_URL") + "/llms" + backend_uri + "chat/completions",
    headers={"Authorization": os.environ.get("FATHOM_SDK_AUTHORIZATION")},
    json=data,
)
if response.status_code == 200:
    print("Success:")
    print(json.dumps(response.json(), indent=4))
else:
    print("Error:", response)
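If the backend returns an OpenAI-compatible chat completion payload, the assistant's reply is typically found at choices[0].message.content. A sketch under that assumption; a custom backend may use a different response shape:

# Pull the assistant's reply out of an OpenAI-compatible response body (assumed shape).
if response.status_code == 200:
    body = response.json()
    reply = body["choices"][0]["message"]["content"]
    print(reply)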