Hugging Face Models
For Large Language Models (LLMs) and text embedding models, Fathom Intelligence provides a direct integration with Hugging Face. This lets you skip the Model Registry and deploy industry-standard models with a single CLI command.
You can deploy any supported model by providing its Hugging Face repository ID (e.g., mistralai/Mistral-7B-v0.1). The platform automatically handles downloading the weights, setting up the environment, and wrapping the model in an API.
Chat Model
For conversational AI, Fathom Intelligence supports Instruct models. Unlike base models that simply “complete” text, Instruct models are fine-tuned to follow directions and maintain a dialogue. When deployed, these models expose an OpenAI-compatible API, allowing you to use them as a drop-in replacement for existing AI integrations.
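Because the endpoint is OpenAI-compatible, any OpenAI-style client can talk to a deployed chat model. The sketch below builds a minimal /chat/completions request body; the base URL placeholder is hypothetical, and the parameter values are illustrative defaults rather than documented Fathom settings.

```python
import json

# Hypothetical placeholder; substitute your deployment's actual endpoint URL.
BASE_URL = "https://<your-fathom-endpoint>/v1"

def build_chat_request(user_message: str,
                       system_prompt: str = "You are a helpful assistant.") -> dict:
    """Build an OpenAI-compatible chat-completions request body."""
    return {
        "model": "default",  # deployments serve a single model under this name
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 256,
        "temperature": 0.7,
    }

payload = build_chat_request("Explain 'Open Source' in one sentence.")
print(json.dumps(payload, indent=2))
```

Any existing integration that speaks the OpenAI chat format can send this payload to the deployment unchanged.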
To deploy a chat-optimized model:
fathom intelligence machine-learning deployment create hugging-face --model-id "Qwen/Qwen2.5-0.5B-Instruct" --name "qwen-tiny-chat" --description "Fast 0.5B parameter chat model" -s large-high-mem
Once the deployment status reaches Running / Hot, you can interact with the model using the chat command. This command automatically handles the complex formatting (roles like user and assistant) required by the model’s internal chat template.
fathom intelligence machine-learning deployment chat <DEPLOYMENT_ID> --prompt "Explain the concept of 'Open Source' in one sentence."
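Under the hood, the chat command renders your messages through the model's chat template before inference. Qwen-family models use the ChatML format; the function below is an illustrative re-implementation of that rendering, not the platform's actual code, which applies the template bundled with the model automatically.

```python
def apply_chatml_template(messages: list) -> str:
    """Render role-tagged messages in ChatML, the template used by Qwen models.

    Illustrative sketch only: the serving stack normally applies the chat
    template shipped with the model's tokenizer configuration.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to begin its reply
    return "".join(parts)

prompt = apply_chatml_template([
    {"role": "user",
     "content": "Explain the concept of 'Open Source' in one sentence."},
])
print(prompt)
```

This is why you can pass a plain --prompt string: the CLI wraps it in the role structure the model was fine-tuned to expect.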
Embedding Model
Embedding models follow the same workflow: deploy directly from the Hugging Face Hub using the repository ID, with no manual model registration.
To start, we will deploy a lightweight but high-performance embedding model that converts text into 384-dimensional vectors.
fathom intelligence machine-learning deployment create hugging-face --model-id "sentence-transformers/all-MiniLM-L6-v2" --name "tiny-embed" --description "Small embedding model for testing" --serving-size large
The embed command is designed for models that perform Feature Extraction (e.g., BERT, RoBERTa, BGE). It converts raw text into high-dimensional numerical vectors (embeddings), which are essential for semantic search, clustering, and Retrieval-Augmented Generation (RAG).
fathom intelligence machine-learning deployment embed <DEPLOYMENT_ID> --input "The quick brown fox jumps over the lazy dog"
The command returns a JSON object containing the vector (embedding) for your input. For a standard model like all-MiniLM-L6-v2, the output will look like this:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        0.0125, -0.0456, 0.0892, ... 384 dimensions total
      ]
    }
  ],
  "model": "default",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
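Embeddings become useful when you compare them. Assuming a response with the shape shown above, the sketch below extracts the vector and scores two embeddings with cosine similarity, the standard metric for semantic search. The sample vectors are made-up 3-dimensional stand-ins for real 384-dimensional output.

```python
import json
import math

def extract_embedding(response_json: str) -> list:
    """Pull the first embedding vector out of an OpenAI-style response."""
    return json.loads(response_json)["data"][0]["embedding"]

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy responses standing in for two calls to the embed command.
resp_a = json.dumps({"data": [{"embedding": [0.1, 0.2, 0.3]}]})
resp_b = json.dumps({"data": [{"embedding": [0.1, 0.2, 0.25]}]})

vec_a = extract_embedding(resp_a)
vec_b = extract_embedding(resp_b)
print(round(cosine_similarity(vec_a, vec_b), 4))
```

In a semantic-search or RAG pipeline, you would embed a query the same way and rank stored document vectors by this score.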