Skip to main content

LLM Providers

ORQO supports multiple LLM providers out of the box — from cloud APIs to fully local inference. You can mix providers within a single team, assigning different models to different agents based on cost, capability, or privacy requirements.

Cloud Providers

OpenRouter

The recommended provider for most users. A single API key gives you access to 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, and more.

Engine provideropenai (OpenAI-compatible API)
Base URLhttps://openrouter.ai/api/v1
CredentialOPENROUTER_API_KEY
Models200+ from all major providers
tip

OpenRouter is the recommended single-key solution — one API key gives you access to all major providers without managing separate accounts.

OpenAI

Direct access to OpenAI's models.

Engine provideropenai
Base URLhttps://api.openai.com/v1
CredentialOPENAI_API_KEY
ModelsGPT-4.1, GPT-4o, o1, o3, o4-mini, and more

Anthropic

Direct access to Anthropic's Claude models.

Engine provideranthropic
Base URLhttps://api.anthropic.com
CredentialANTHROPIC_API_KEY
ModelsClaude Opus, Sonnet, Haiku

Google AI

Direct access to Google's Gemini models.

Engine providergoogle
Base URLhttps://generativelanguage.googleapis.com/v1beta
CredentialGOOGLE_API_KEY
ModelsGemini 2.5 Flash, Gemini 2.5 Pro, and more

Open-Source & Self-Hosted Models

ORQO supports open-source models via Ollama, LM Studio, and any OpenAI-compatible inference server. This enables privacy-sensitive deployments, cost control, and mixed-provider teams where some agents use cloud models and others use your own.

Important: Base URL Must Be Reachable by ORQO

The base URL you configure is where the ORQO server connects to — not your browser. On managed ORQO (the default), the server runs in the cloud. This means:

  • Cloud-hosted model endpoints (Ollama Cloud, your own server at https://llm.yourcompany.com) work out of the box.
  • localhost only works in self-hosted deployments where the ORQO server and model server run on the same machine (Enterprise split-plane solution).
  • If you run a model server in your own infrastructure, expose it at a reachable address (e.g., https://ollama.yourcompany.com:11434) and configure that as the base URL.

Ollama

Ollama runs open-source models with a single command. ORQO connects to Ollama via its native API — either your own Ollama instance or Ollama Cloud.

Engine providerollama
CredentialOLLAMA_API_KEY (optional — only needed for Ollama Cloud)
Models400+ open-source models (Gemma, Llama, Qwen, Mistral, and more)

The easiest way to use open-source models — no infrastructure to manage.

  1. Sign up at ollama.com and create an API key at ollama.com/settings/keys.
  2. In ORQO, go to Settings → Credentials and add an OLLAMA_API_KEY credential with your key.
  3. Set the custom base URL on your Ollama model to https://ollama.com.
  4. Assign the credential to your LLM config.

Ollama Cloud offers a free tier, a Pro plan ($20/month), and a Max plan ($100/month). All plans include access to every model in the Ollama library.

Your Own Ollama Instance

Run Ollama on your own server for full data sovereignty.

  1. Install Ollama on a server reachable by the ORQO server.
  2. Pull models: ollama pull gemma4 or ollama pull llama3.1:8b.
  3. Expose the Ollama port (default 11434) at a reachable address — e.g., https://ollama.yourcompany.com:11434.
  4. In ORQO, set the custom base URL on the Ollama model to that address. No credential needed.
ModelParametersSizeContextBest for
gemma4:latest4B MoE~9.6 GB128KBest balance of speed and capability
gemma4:26b26B MoE~18 GB256KHighest capability (needs 32GB+ RAM)
llama3.1:8b8B~4.7 GB128KFast, reliable tool calling
qwen2.5:7b-instruct7B~4.7 GB128KStrong multilingual performance
qwen2.5-coder:7b7B~4.7 GB128KSpecialized for code tasks

LM Studio

LM Studio is a desktop application for running local models with a visual interface. It exposes an OpenAI-compatible API — a different protocol from Ollama.

Engine provideropenai (OpenAI-compatible API)
CredentialNone required

To use LM Studio with ORQO:

  1. Run LM Studio on a server reachable by the ORQO server and start the local server.
  2. In ORQO, go to Settings → LLM Configs and click Add Custom Model.
  3. Set:
    • Display name — e.g., "Gemma 4 (LM Studio)"
    • Model ID — The exact model identifier shown in LM Studio
    • Engine provideropenai
    • Base URL — The address where LM Studio is reachable (e.g., https://lmstudio.yourcompany.com:1234/v1)
  4. Create an LLM config selecting your custom model.
Ollama vs LM Studio

These tools use different API protocols. Ollama uses its own format (/api/chat), while LM Studio speaks the OpenAI format (/v1/chat/completions). ORQO handles both — just select the correct engine provider: ollama for Ollama, openai for LM Studio.

Custom / OpenAI-Compatible Endpoints

Any service that implements the OpenAI chat completions API can be added as a Custom provider. This includes inference servers like vLLM, TGI, or corporate API gateways.

  1. Go to Settings → LLM Configs and click Add Custom Model.
  2. Set the engine provider to match the API format:
    • openai — for OpenAI-compatible APIs (most common)
    • anthropic — for Anthropic-compatible APIs
    • ollama — for Ollama-compatible APIs
  3. Set the base URL to your endpoint (e.g., https://llm-gateway.yourcompany.com/v1).
  4. Set the model ID to match what your server expects.
  5. If your server requires authentication, create a credential and assign it to the LLM config.

Mixing Providers in a Team

One of ORQO's key capabilities is mixed-provider teams. Each agent in a team can use a different LLM configuration, letting you optimize for cost, speed, capability, or privacy per role.

Example team setup:

AgentRoleProviderModelWhy
ResearcherWeb research, data gatheringOpenRouterGPT-4.1 MiniFast, cheap
AnalystDeep analysis, reasoningAnthropicClaude Sonnet 4Best reasoning
WriterContent generationOllama CloudGemma 4 26BOpen-source, private

To set this up:

  1. Create LLM configs for each provider/model combination.
  2. Set the team's default LLM config (used by agents without an override).
  3. On individual agents, select a different LLM config to override the team default.

Full details: LLM Assignment


Provider Comparison

ProviderAuthCostData SovereigntyTool CallingBest For
OpenRouterAPI keyPer-tokenProvider-hostedYesMost users — widest model selection
OpenAIAPI keyPer-tokenProvider-hostedYesDirect access, higher rate limits
AnthropicAPI keyPer-tokenProvider-hostedYesClaude models specifically
GoogleAPI keyPer-tokenProvider-hostedYesGemini models, long context
Ollama CloudAPI keySubscriptionOllama-hostedYesEasy access to open-source models
Ollama (self-hosted)NoneYour hardwareFull controlYesPrivacy, air-gapped, compliance
LM StudioNoneYour hardwareFull controlYesSelf-hosted with visual management
CustomOptionalVariesVariesDependsCorporate gateways, vLLM, TGI