Managed LLMs

This section describes how to access and configure Large Language Models (LLMs) managed directly by the platform. It focuses on rapid integration, prompt engineering via API, and cost-efficient scaling without the overhead of infrastructure management.

The Managed LLMs service provides a unified interface to models hosted natively by the platform, as well as external models served by third-party providers (such as OpenAI, Claude, or Gemini).

Access and Permissions

To ensure security and cost management, access to these models is governed by API keys managed at two levels:

  • Platform Level: Global models provided by the infrastructure.
  • Organization Level: Custom integrations where organization administrators can plug in their own provider keys.

This architecture allows teams to use state-of-the-art models without managing individual credentials, while administrators maintain full control over which models are available to specific organizations.
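The precedence implied by this two-level design (an organization-level key, when plugged in, takes priority over the platform default) can be sketched in Python. This is purely illustrative; the function and key names are assumptions, not the platform's actual implementation.

```python
def resolve_api_key(model: str, org_keys: dict, platform_keys: dict):
    """Illustrative credential lookup: organization-level keys
    (plugged in by org admins) take precedence over platform-level ones."""
    if model in org_keys:
        return org_keys[model], "organization"
    if model in platform_keys:
        return platform_keys[model], "platform"
    raise KeyError(f"No credential configured for model '{model}'")

# Example: the organization has plugged in its own provider key for gpt-4o,
# so it shadows the platform-level credential for that model.
org = {"gpt-4o": "org-key"}
plat = {"gpt-4o": "platform-key", "google/gemma-3-12b-it": "platform-key"}

key, level = resolve_api_key("gpt-4o", org, plat)                      # organization wins
key2, level2 = resolve_api_key("google/gemma-3-12b-it", org, plat)     # platform fallback
```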

Listing Available Models

Before interacting with an LLM, you can list all models currently available in your active context. This list includes both native and third-party models (e.g., gpt-4o, claude-3-5-sonnet).

fathom intelligence llms model list

Chat Completions

The chat command is the primary way to interact with Managed LLMs via the CLI. It is an excellent tool for testing connectivity, validating model behavior, or quickly generating content.

fathom intelligence llms model chat <MODEL_NAME> --prompt <PROMPT_TEXT> [OPTIONS]
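Services like this typically translate chat options into an OpenAI-style chat-completions request body. The sketch below shows such a mapping as an illustration only; the platform's actual wire format is an assumption, not documented here.

```python
def build_chat_payload(model, prompt, system=None, temperature=0.7,
                       max_tokens=None, stream=True):
    """Illustrative mapping from CLI options to a chat-completions-style body."""
    messages = []
    if system:
        # --system / -s becomes a leading system message
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    payload = {"model": model, "messages": messages,
               "temperature": temperature, "stream": stream}
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens  # --max-tokens / -n
    return payload

payload = build_chat_payload("gpt-4o", "Hello",
                             system="Be terse", temperature=0.2, stream=False)
```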

Key Options

  • <MODEL_NAME> (required): The model ID to use (e.g., gpt-4o, gemini-1.5-pro).
  • --prompt, -p (required): The message to send to the model.
  • --system, -s (default: "You are a helpful…"): Sets the behavior/persona of the assistant.
  • --temperature, -t (default: 0.7): Controls creativity (0.0 = deterministic, 1.0 = creative).
  • --max-tokens, -n: Limits the length of the generated response.
  • --no-stream: Disables real-time streaming of the response to the terminal.
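The --temperature option controls sampling by scaling the model's output distribution before a token is drawn. The snippet below is a minimal, generic sketch of that effect (standard temperature-scaled softmax), not platform-specific code:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax: low temperatures sharpen the distribution
    toward the top logit; high temperatures flatten it toward uniform."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 (0.0 conventionally means greedy decoding)")
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.2)  # near one-hot on the top token
warm = softmax_with_temperature(logits, 1.0)  # noticeably flatter
```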

Example: Basic Interactive Chat

To send a simple query to google/gemma-3-12b-it:

fathom intelligence llms model chat google/gemma-3-12b-it --prompt 'Explain quantum entanglement in one sentence.'

Example: Advanced System Behavior

You can override the default assistant behavior to act as a specific persona:

fathom intelligence llms model chat Qwen/Qwen2.5-VL-3B-Instruct --system 'You are a senior Rust developer. Provide code examples only.' --prompt 'How do I implement a trait in Rust?' --temperature 0.2
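When scripting calls like the one above, it can help to assemble the argument vector programmatically before handing it to a process runner. The helper below is a sketch that only builds the argv list from the documented options; it does not execute anything, and the helper name is illustrative.

```python
def build_chat_argv(model, prompt, system=None, temperature=None,
                    max_tokens=None, stream=True):
    """Build the argv for `fathom intelligence llms model chat`
    from the options documented above."""
    argv = ["fathom", "intelligence", "llms", "model", "chat",
            model, "--prompt", prompt]
    if system is not None:
        argv += ["--system", system]
    if temperature is not None:
        argv += ["--temperature", str(temperature)]
    if max_tokens is not None:
        argv += ["--max-tokens", str(max_tokens)]
    if not stream:
        argv.append("--no-stream")
    return argv

argv = build_chat_argv(
    "Qwen/Qwen2.5-VL-3B-Instruct",
    "How do I implement a trait in Rust?",
    system="You are a senior Rust developer. Provide code examples only.",
    temperature=0.2,
)
# e.g. subprocess.run(argv) would then issue the same call as the example above
```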