REST API Comparison β€” Ollama vs LM Studio vs OpenCode


Private Chat Hub talks to three self-hosted AI backends (Ollama, LM Studio, and OpenCode), each with its own REST API. This page documents the key differences and similarities as they apply to the features used in this app: model selection, system prompts, tool/function calling, streaming, and authentication.

πŸ—ΊοΈ Quick Reference

| Aspect | Ollama | LM Studio | OpenCode |
|---|---|---|---|
| Default port | 11434 | 1234 | 8080 |
| Chat endpoint | POST /api/chat | POST /api/v1/chat | POST /session/{id}/message |
| Model field | "model": "string" | "model": "string" | "model": {"providerID":…,"modelID":…} |
| System prompt | Message with role: "system" | Top-level "system_prompt" field | Embedded in the prompt text |
| Tool calling | ✅ Native (tools array) | ❌ Not in proprietary API | ✅ Via provider (OpenAI/Anthropic) |
| Streaming format | Newline-delimited JSON | Server-Sent Events (SSE) | SSE (separate GET /event) |
| Vision / images | Base64 in images array on message | data_url in input array | Provider-dependent |
| Authentication | None (local) | Optional Bearer token | Optional HTTP Basic Auth |
| Session management | Stateless (history in messages) | Stateful via previous_response_id | Stateful via named sessions |

1. Specifying the Model

Ollama

Pass the model name as a plain string in the model field. The format is <name>:<tag> (tag defaults to latest).

POST /api/chat
{
  "model": "mistral",
  "messages": [...]
}

Other examples: "llama3.2:8b", "gemma3:4b", "deepseek-r1:7b"

List available models: GET /api/tags

LM Studio

Pass the model identifier as a plain string in the model field. The exact ID depends on what is loaded in LM Studio's server (visible in the server tab).

POST /api/v1/chat
{
  "model": "lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF",
  "input": "Hello"
}

List available models: GET /api/v1/models

OpenCode

OpenCode routes to cloud providers, so the model is identified by a two-part object with a providerID and a modelID:

POST /session/{sessionId}/message
{
  "parts": [{"type": "text", "text": "Hello"}],
  "model": {
    "providerID": "openai",
    "modelID": "gpt-4o"
  }
}

Other examples: {"providerID":"anthropic","modelID":"claude-sonnet-4-5"}, {"providerID":"google","modelID":"gemini-2.5-flash"}

List all providers and their models: GET /provider

Summary: Ollama and LM Studio both use a flat string; OpenCode uses a structured object because it routes to different cloud providers.
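The branching can be captured in a small helper. This is an illustrative sketch only (the function name and backend labels are our own, not part of any of the three APIs):

```python
def model_field(backend, name, provider=None):
    """Return the value for the request's model field, per backend.

    Ollama and LM Studio take a flat string; OpenCode needs a
    {providerID, modelID} object because it routes to cloud providers.
    """
    if backend in ("ollama", "lmstudio"):
        return name
    if backend == "opencode":
        if provider is None:
            raise ValueError("OpenCode requires a providerID")
        return {"providerID": provider, "modelID": name}
    raise ValueError(f"unknown backend: {backend}")
```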

2. System Prompt

Ollama

The system prompt is sent as a message in the messages array with "role": "system". It must appear before the first user message.

POST /api/chat
{
  "model": "mistral",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant. Prefer Python."
    },
    {
      "role": "user",
      "content": "Write a function to reverse a string."
    }
  ]
}

Note: The older POST /api/generate endpoint also supports a top-level "system" field (separate from the prompt), but /api/chat uses the messages array exclusively.

LM Studio

LM Studio's proprietary API uses a top-level system_prompt field; it is not part of the messages array. This maps to the model's system role internally.

POST /api/v1/chat
{
  "model": "...",
  "input": "Write a function to reverse a string.",
  "system_prompt": "You are a helpful coding assistant. Prefer Python.",
  "stream": false
}

Note: LM Studio also exposes an OpenAI-compatible endpoint at /v1/chat/completions where the system prompt follows the standard OpenAI messages format (role: "system" in the array). Private Chat Hub uses the proprietary /api/v1/chat endpoint which has its own format.

OpenCode

OpenCode does not have a dedicated system prompt field in the message request. The app embeds the system instructions directly into the conversation text before sending to the OpenCode session. OpenCode then forwards the full text to the underlying cloud provider (which handles the system role natively).

POST /session/{sessionId}/message
{
  "parts": [
    {
      "type": "text",
      "text": "[System: You are a helpful coding assistant.]\n\nWrite a function to reverse a string."
    }
  ],
  "model": {"providerID": "openai", "modelID": "gpt-4o"}
}

Summary: All three support system prompts, but through different mechanisms. Ollama uses the standard OpenAI-style role: "system" message. LM Studio uses a dedicated top-level field. OpenCode requires embedding the instruction in the prompt text.
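To make the three mechanisms concrete, here is a sketch of how a client could attach a system prompt per backend. The function name is ours, and the "[System: ...]" wrapper for OpenCode is the app-level convention shown above, not an API field:

```python
def apply_system_prompt(backend, payload, system):
    """Attach a system prompt to a request payload, per backend convention.

    Mutates and returns the payload dict.
    """
    if backend == "ollama":
        # role: "system" message, inserted before the first user message
        payload["messages"].insert(0, {"role": "system", "content": system})
    elif backend == "lmstudio":
        # dedicated top-level field
        payload["system_prompt"] = system
    elif backend == "opencode":
        # no dedicated field: embed into the first text part
        part = payload["parts"][0]
        part["text"] = f"[System: {system}]\n\n{part['text']}"
    return payload
```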

3. Tool / Function Calling

Ollama — Native Tool Calling

Ollama has first-class tool calling support. Tools are defined in an OpenAI-compatible format and passed in the top-level tools array. When a model decides to call a tool, it returns a tool_calls array instead of a plain text response.

Request (with tools):

POST /api/chat
{
  "model": "llama3.1",
  "messages": [
    {"role": "user", "content": "What's the weather in Paris right now?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "web_search",
        "description": "Search the web for current information",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {
              "type": "string",
              "description": "The search query"
            }
          },
          "required": ["query"]
        }
      }
    }
  ]
}

Response when model calls a tool:

{
  "model": "llama3.1",
  "done": true,
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "id": "call_abc123",
        "function": {
          "name": "web_search",
          "arguments": {
            "query": "current weather in Paris"
          }
        }
      }
    ]
  }
}

Returning the tool result (next request):

POST /api/chat
{
  "model": "llama3.1",
  "messages": [
    {"role": "user",      "content": "What's the weather in Paris right now?"},
    {"role": "assistant", "content": "",
     "tool_calls": [{"id": "call_abc123", "function": {"name":"web_search","arguments":{"query":"current weather in Paris"}}}]},
    {
      "role": "tool",
      "tool_name": "web_search",
      "tool_id": "call_abc123",
      "content": "Paris: 18°C, partly cloudy. (source: weather.com)"
    }
  ],
  "tools": [...]
}

This loop continues until the model returns a plain text response without tool_calls. Private Chat Hub's agentic loop handles this automatically.
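The loop described above can be sketched in a few lines. This is a minimal illustration, not Private Chat Hub's actual implementation; chat stands in for a caller-supplied function that POSTs to /api/chat and returns the parsed JSON response:

```python
def run_tool_loop(chat, messages, tools, handlers, max_turns=5):
    """Minimal agentic loop over Ollama-style tool calling.

    `chat(messages, tools)` returns the parsed /api/chat response dict;
    `handlers` maps tool names to Python callables. Loops until the
    model replies with plain text (no tool_calls).
    """
    for _ in range(max_turns):
        reply = chat(messages, tools)["message"]
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:
            return reply["content"]  # final text answer
        for call in calls:
            fn = call["function"]
            result = handlers[fn["name"]](**fn["arguments"])
            messages.append({
                "role": "tool",
                "tool_name": fn["name"],
                "tool_id": call.get("id", ""),
                "content": str(result),
            })
    raise RuntimeError("tool loop did not converge")
```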

LM Studio — No Tool Calling in Proprietary API

LM Studio's proprietary /api/v1/chat API does not expose a tools parameter, so tool calling is not available through this endpoint.

Alternative: LM Studio also provides an OpenAI-compatible /v1/chat/completions endpoint which does support the standard OpenAI tools parameter for models that have been fine-tuned for function calling. Private Chat Hub currently uses the proprietary endpoint and therefore does not support tool calls on LM Studio.

OpenCode — Provider-Level Tool Calling

OpenCode itself does not expose a tools parameter in its message API. Tool calling is handled transparently by the underlying cloud provider (OpenAI, Anthropic, or Google), depending on which model is selected. The tool definitions and the multi-turn tool call loop happen inside OpenCode's server, invisible to the client. Private Chat Hub receives the final text response after all tool calls have been resolved.

Summary:

| Backend | Client-side tool definitions? | Agentic loop? |
|---|---|---|
| Ollama | ✅ Yes (pass tools array in request) | ✅ Handled by app |
| LM Studio (proprietary) | ❌ Not supported | ❌ Not applicable |
| OpenCode | ❌ Not exposed to client | ✅ Handled server-side |

4. Streaming Responses

Ollama — Newline-Delimited JSON

Set "stream": true. The server sends a series of JSON objects, each on its own line. Each chunk contains a partial message.content. The last chunk has "done": true.

{"model":"mistral","message":{"role":"assistant","content":"The"},"done":false}
{"model":"mistral","message":{"role":"assistant","content":" capital"},"done":false}
{"model":"mistral","message":{"role":"assistant","content":" of France is Paris."},"done":false}
{"model":"mistral","message":{"role":"assistant","content":""},"done":true,"eval_count":23}
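Consuming this stream amounts to reading response lines, parsing each as JSON, and concatenating message.content until a chunk with "done": true arrives. A minimal sketch (the function name is ours):

```python
import json

def accumulate_ndjson(lines):
    """Assemble the full reply from a newline-delimited JSON stream.

    `lines` is an iterable of raw response lines. Returns the
    concatenated text once the "done": true chunk is seen.
    """
    text = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        text.append(chunk["message"]["content"])
        if chunk.get("done"):
            break
    return "".join(text)
```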

LM Studio — Server-Sent Events (SSE)

Set "stream": true and add Accept: text/event-stream. The server sends SSE events. The message.delta event carries each text chunk; chat.end signals completion.

event: message.delta
data: {"content": "The"}

event: message.delta
data: {"content": " capital of France is Paris."}

event: chat.end
data: {"result": {"output": [{"type":"text","content":"The capital of France is Paris."}]}}
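A client can reassemble the reply by pairing each event: line with its following data: line. A minimal sketch, assuming the event names shown above (the function name is ours):

```python
import json

def accumulate_sse(lines):
    """Collect text deltas from an LM Studio-style SSE stream.

    `lines` is an iterable of decoded SSE lines. message.delta events
    carry text chunks; chat.end terminates the stream.
    """
    text, event = [], None
    for line in lines:
        if line.startswith("event: "):
            event = line[len("event: "):].strip()
        elif line.startswith("data: "):
            data = json.loads(line[len("data: "):])
            if event == "message.delta":
                text.append(data["content"])
            elif event == "chat.end":
                break
    return "".join(text)
```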

OpenCode — SSE via Separate Event Endpoint

OpenCode streaming is handled differently. You send the message via POST /session/{id}/prompt_async (which returns immediately with 204), then subscribe to a persistent SSE stream at GET /event to receive events as they arrive.

// Step 1 — send message async (returns 204 immediately)
POST /session/{sessionId}/prompt_async
{"parts":[{"type":"text","text":"Hello"}],"model":{...}}

// Step 2 — listen for events
GET /event
Accept: text/event-stream

event: message.part
data: {"sessionID":"...","part":{"type":"text","text":"The capital"}}

event: message.part
data: {"sessionID":"...","part":{"type":"text","text":" of France is Paris."}}

event: message.complete
data: {"sessionID":"..."}

Summary: All three support streaming, but the wire formats differ significantly. Ollama uses the simplest format (plain newline-delimited JSON). LM Studio and OpenCode both use SSE, but OpenCode separates the send and receive into two HTTP connections.

5. Authentication

Ollama — No authentication by default. All requests to localhost:11434 are accepted. When exposed over a network, access control is managed at the network/firewall level.

LM Studio — Optional Bearer token. If an API key is configured in LM Studio's server settings, include it in the Authorization header:

Authorization: Bearer lm-studio-api-key

OpenCode — Optional HTTP Basic Auth. If a username/password is configured on the opencode server, encode the credentials in the Authorization header:

Authorization: Basic base64(username:password)
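Building the right Authorization header is mechanical. A sketch (the helper name is ours; credentials are attached only when the server is configured to expect them):

```python
import base64

def auth_header(backend, token=None, user=None, password=None):
    """Return the Authorization header (if any) for each backend.

    Ollama: none. LM Studio: Bearer token. OpenCode: HTTP Basic.
    """
    if backend == "lmstudio" and token:
        return {"Authorization": f"Bearer {token}"}
    if backend == "opencode" and user is not None:
        raw = f"{user}:{password}".encode()
        return {"Authorization": "Basic " + base64.b64encode(raw).decode()}
    return {}
```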

6. Sending Images (Vision)

Ollama

Images are attached to a user message as a list of base64-encoded strings in the images field:

{
  "model": "llava",
  "messages": [
    {
      "role": "user",
      "content": "What is in this image?",
      "images": ["iVBORw0KGgoAAAANSUhEUg..."]
    }
  ]
}

LM Studio

Images are included in the input field as an array of typed parts. Text and images are separate objects in the array:

{
  "model": "...",
  "input": [
    {"type": "message", "content": "What is in this image?"},
    {"type": "image",   "data_url": "data:image/jpeg;base64,iVBOR..."}
  ]
}

OpenCode

Image input is handled by the underlying cloud provider. The format depends on which provider/model is selected (e.g. OpenAI vision uses image_url content parts). OpenCode may forward the image data transparently depending on provider support.

Summary: Ollama and LM Studio both embed images as base64 data, but in different fields and formats. Ollama uses a simple array on the message; LM Studio uses a typed parts array as the input.
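A sketch of building the two payload shapes from raw image bytes (the helper name is illustrative; OpenCode is omitted because its format is provider-dependent):

```python
import base64

def image_payload(backend, text, image_bytes, mime="image/jpeg"):
    """Build the message/input portion of a vision request.

    Ollama: base64 strings in the message's images array.
    LM Studio: typed parts in input, with a data: URL.
    """
    b64 = base64.b64encode(image_bytes).decode()
    if backend == "ollama":
        return {"role": "user", "content": text, "images": [b64]}
    if backend == "lmstudio":
        return [
            {"type": "message", "content": text},
            {"type": "image", "data_url": f"data:{mime};base64,{b64}"},
        ]
    raise ValueError("vision format is provider-dependent for OpenCode")
```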

7. Conversation State & History

Ollama — Stateless

Ollama is fully stateless. To maintain conversation context, the client must send the full message history on every request (all previous user, assistant, and tool messages). The app manages this array locally.
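A minimal sketch of the client-side bookkeeping this implies (the class name is ours, not part of Ollama's API):

```python
class OllamaHistory:
    """Client-side history for Ollama's stateless /api/chat.

    The full messages list is resent on every request; the
    server keeps nothing between calls.
    """
    def __init__(self, system=None):
        self.messages = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def request_body(self, model, user_text):
        """Append the new user turn and build the next request body."""
        self.messages.append({"role": "user", "content": user_text})
        return {"model": model, "messages": list(self.messages)}

    def record_reply(self, assistant_text):
        """Store the assistant's answer so the next turn includes it."""
        self.messages.append({"role": "assistant", "content": assistant_text})
```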

LM Studio — Stateful via Response ID

LM Studio's /api/v1/chat endpoint is stateful. Each chat response returns a response_id. To continue the conversation, pass the previous response ID as previous_response_id in the next request; you don't need to resend the full history.

// First turn
POST /api/v1/chat  →  {"response_id": "resp_abc"}

// Second turn (continues the same conversation)
POST /api/v1/chat
{"model": "...", "input": "And what about Lyon?",
 "previous_response_id": "resp_abc"}

OpenCode — Stateful via Sessions

OpenCode manages conversation state as named sessions. Create a session once (POST /session), then send all messages to that session. The server maintains the history; you only send the new user message each time.

// Create a session
POST /session  →  {"id": "sess_xyz", "title": "My conversation"}

// All messages go to the same session
POST /session/sess_xyz/message
{"parts": [{"type":"text","text":"Tell me about Paris"}], "model": {...}}

POST /session/sess_xyz/message
{"parts": [{"type":"text","text":"And Lyon?"}], "model": {...}}

Summary: Ollama requires the client to track and send full history; LM Studio uses a response-chaining pattern; OpenCode uses a persistent server-side session.

8. Other Notable Endpoints

| Operation | Ollama | LM Studio | OpenCode |
|---|---|---|---|
| List models | GET /api/tags | GET /api/v1/models | GET /provider |
| Pull / download model | POST /api/pull | Via LM Studio GUI | N/A (cloud) |
| Delete model | DELETE /api/delete | Via LM Studio GUI | N/A |
| Health check | GET /api/tags (200 = healthy) | GET /api/v1/models (200 = healthy) | GET /global/health |
| Model info | POST /api/show | Included in models list | Included in provider list |
| Embeddings | POST /api/embeddings | N/A | Via provider |

9. What Is Similar Across All Three

  • All use HTTP REST over a local (or LAN) server
  • All use JSON for request and response bodies
  • All require Content-Type: application/json
  • All support streaming (though in different formats)
  • All return partial text tokens during streaming, not full messages
  • All support temperature or equivalent sampling controls
  • Tool definitions use the same JSON Schema structure (type: "object", properties, required)
  • The logical concept of user / assistant turns is present in all

10. Key Differences at a Glance

  • Model identifier format: Ollama/LM Studio use a flat string; OpenCode requires a structured object (providerID + modelID)
  • System prompt placement: Ollama puts it in the messages array (role: "system"); LM Studio has a dedicated top-level field (system_prompt); OpenCode embeds it in the text
  • Tool calling: Ollama exposes a first-class tools API; LM Studio's proprietary endpoint has none; OpenCode delegates to the cloud provider
  • Streaming wire format: Ollama = newline JSON; LM Studio = SSE on the chat endpoint; OpenCode = SSE on a separate persistent event endpoint
  • Conversation state: Ollama = stateless (client sends full history); LM Studio = response-ID chaining; OpenCode = server-managed sessions
  • Image attachment format: Ollama = base64 in images[] on message; LM Studio = typed parts array in input
  • Authentication: Ollama has none; LM Studio uses Bearer; OpenCode uses Basic Auth
  • Model management: Ollama supports pull/delete via API; LM Studio and OpenCode manage models outside the chat API

📚 Related Pages

All Features • Installation Guide • GitHub Repository