AI Completion

Run LLM completions from multiple providers within a flow. Supports custom OpenAI-compatible endpoints, tool calling via MCP servers, streaming, RAG via static context, and sandboxed agent execution.

Configuration

- ai_completion:
    name: analyze
    provider: google
    model: gemini-2.5-flash-lite
    credentials_path: /path/to/credentials.json
    prompt: "Analyze this data: {{event.data}}"

Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Task name. |
| provider | string | required | LLM provider (see Providers below). |
| model | string | required | Model identifier (e.g. gpt-4, claude-3-5-sonnet-20241022, gemini-2.5-flash-lite). |
| prompt | string/resource | required | User prompt template. Supports handlebars templating and resource files. |
| credentials_path | string | | Path to provider credentials JSON. Required for hosted providers; optional for custom and ollama (local providers without auth). |
| endpoint | string | | Base URL of a custom OpenAI-compatible endpoint. Required when provider: custom or provider: ollama. Example: https://llm-gateway.internal.example.com/v1. |
| system_prompt | string/resource | | System prompt. |
| max_tokens | int | | Maximum tokens in response. |
| temperature | float | | Sampling temperature (0.0 to 1.0). |
| stream | bool | false | Stream response as separate events instead of one final event. |
| static_context | list | | RAG documents (inline strings or resource: references) always available to the agent. |
| max_turns | int | unlimited | Maximum recursive agent turns. Prevents runaway tool-calling loops. |
| mcp_servers | list | | MCP server URLs for tool discovery. The agent connects to each, discovers tools, and makes them callable during completion. |
| sandbox | object | | Sandbox config for tool execution (nsjail). |
| depends_on | list | | Upstream task names. |
| retry | object | | Retry configuration. |
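
To illustrate how the optional tuning fields combine, here is a minimal sketch. The task name, prompt text, and numeric values are hypothetical, and it assumes a task named analyze exists earlier in the same flow. The retry object is left out because its fields are not described here.

```yaml
- ai_completion:
    name: summarize              # hypothetical task name
    provider: openai
    model: gpt-4
    credentials_path: /etc/flowgen/credentials/openai.json
    system_prompt: "You are a concise technical summarizer."
    prompt: "Summarize: {{event.data}}"
    max_tokens: 256              # cap on response length
    temperature: 0.2             # low value for more repeatable output
    depends_on:
      - analyze                  # upstream task name, assumed to exist in this flow
```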

Providers

| Provider | Auth | Required credentials fields |
| --- | --- | --- |
| openai | API key | api_key |
| anthropic | API key | api_key |
| google | API key (Gemini API) | api_key |
| cohere | API key | api_key |
| mistral | API key | api_key |
| groq | API key | api_key |
| together | API key | api_key |
| xai | API key | api_key |
| openrouter | API key | api_key |
| perplexity | API key | api_key |
| huggingface | API key | api_key |
| vertexai | Google ADC | project_id, region (optional, can come from env) |
| ollama | Optional API key | optional api_key |
| custom | Optional API key | optional api_key |

For providers not listed above (cloud-native services with non-api_key auth, on-prem model servers, multi-tenant routing), set provider: custom and point endpoint at any service speaking the OpenAI chat-completions wire format. The flowgen AI gateway (llm_proxy task) can itself act as that endpoint, letting you route, log, and rate-limit across upstream models from one place.

Credentials file format

For API-key providers, the credentials file is a JSON object with a single api_key field:

{ "api_key": "sk-..." }

For vertexai, the JSON sets only the project and region. Authentication itself goes through Google Application Default Credentials (ADC), so the pod must have access to a service account via the GOOGLE_APPLICATION_CREDENTIALS env var or workload identity:

{ "project_id": "my-gcp-project", "region": "us-central1" }
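
A task using that file might look like the sketch below. The task name and credentials path are hypothetical, and the model is assumed to be enabled for Vertex AI in your project.

```yaml
- ai_completion:
    name: vertex_summary         # hypothetical task name
    provider: vertexai
    model: gemini-2.5-flash-lite # assumed to be available in your Vertex AI project
    # Hypothetical path; the file holds project_id and region, not an API key.
    credentials_path: /etc/flowgen/credentials/vertexai.json
    prompt: "{{event.data}}"
```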

Example: custom OpenAI-compatible endpoint

Point at any service speaking the OpenAI chat-completions wire format: an internal model server, a self-hosted gateway, or the flowgen AI gateway itself routing to one or more upstream models.

- ai_completion:
    name: internal_llm
    provider: custom
    model: my-internal-model
    endpoint: "https://llm-gateway.internal.example.com/v1"
    credentials_path: /etc/flowgen/credentials/internal_llm.json
    prompt: "{{event.data.question}}"

For endpoints without auth, omit credentials_path:

- ai_completion:
    name: unauth_endpoint
    provider: custom
    model: my-model
    endpoint: "http://llm.cluster.local/v1"
    prompt: "{{event.data}}"
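
Because provider: ollama also requires endpoint, a local Ollama setup can be sketched the same way. The task and model names here are hypothetical. The URL assumes Ollama's default port (11434) and its OpenAI-compatible /v1 path; whether flowgen expects the /v1 suffix for this provider is an assumption to verify.

```yaml
- ai_completion:
    name: local_llm              # hypothetical task name
    provider: ollama
    model: llama3.2              # hypothetical; use a model you have pulled locally
    endpoint: "http://localhost:11434/v1"  # assumed default Ollama address
    prompt: "{{event.data}}"
    # credentials_path omitted: api_key is optional for ollama
```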

Example: RAG with static context

static_context documents are injected into every completion. Mix inline strings and resource files.

- ai_completion:
    name: support_bot
    provider: openai
    model: gpt-4-turbo
    credentials_path: /etc/flowgen/credentials/openai.json
    system_prompt: "Answer using the provided context."
    prompt: "{{event.data.question}}"
    static_context:
      - resource: "docs/product_manual.md"
      - resource: "docs/pricing.md"
      - "Support contact: support@example.com"
    max_turns: 3

Example: MCP tool calling

The agent connects to MCP servers, discovers tools, and can call them during completion. Use flowgen's own MCP server (mcp_tool task) or any external MCP-compatible server.

- ai_completion:
    name: agent
    provider: anthropic
    model: claude-3-5-sonnet-20241022
    credentials_path: /etc/flowgen/credentials/anthropic.json
    prompt: "{{event.data.task}}"
    max_turns: 5
    mcp_servers:
      - url: "http://localhost:3001/mcp"
      - url: "http://external-tools:8080/mcp"

Optional per-server auth uses the same credentials JSON format as http_request (bearer_auth or basic_auth):

    mcp_servers:
      - url: "https://tools.example.com/mcp"
        credentials_path: /etc/flowgen/credentials/mcp_tools.json

Sandbox

Optional sandboxing for MCP tool execution via nsjail. Rhai scripts do not need sandboxing (safe by design).

- ai_completion:
    name: agent
    provider: google
    model: gemini-2.5-flash-lite
    credentials_path: /path/to/credentials.json
    prompt: "{{event.data}}"
    sandbox:
      memory_limit_mb: 512
      time_limit_seconds: 30
      max_pids: 10
      allow_network: false
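
Since the sandbox applies to MCP tool execution, it is typically paired with mcp_servers. The sketch below combines the two using only fields shown on this page. The task name is hypothetical, and how allow_network: false interacts with tools served over HTTP is not specified here, so treat that setting as something to test in your environment.

```yaml
- ai_completion:
    name: sandboxed_agent        # hypothetical task name
    provider: anthropic
    model: claude-3-5-sonnet-20241022
    credentials_path: /etc/flowgen/credentials/anthropic.json
    prompt: "{{event.data.task}}"
    max_turns: 5                 # bound the tool-calling loop
    mcp_servers:
      - url: "http://localhost:3001/mcp"
    sandbox:
      memory_limit_mb: 512
      time_limit_seconds: 30
      max_pids: 10
      allow_network: false       # effect on remote MCP calls is an open assumption
```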

Output

Format: JSON

| Field | Type | Description |
| --- | --- | --- |
| text | string | Generated completion text. |
| model | string | Model used. |
| provider | string | Provider name. |
| usage | object | Token counts (prompt_tokens, completion_tokens, total_tokens). |

In stream: true mode, each chunk is emitted as a separate event with partial text content.
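
A minimal sketch of a streaming task follows. The task name is hypothetical, and the exact shape of each chunk event beyond the partial text is not documented here.

```yaml
- ai_completion:
    name: live_answer            # hypothetical task name
    provider: openai
    model: gpt-4-turbo
    credentials_path: /etc/flowgen/credentials/openai.json
    prompt: "{{event.data.question}}"
    stream: true                 # emit one event per chunk instead of one final event
```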