AI Completion
Run LLM completions from multiple providers within a flow. Supports custom OpenAI-compatible endpoints, tool calling via MCP servers, streaming, RAG via static context, and sandboxed agent execution.
Configuration

    - ai_completion:
        name: analyze
        provider: google
        model: gemini-2.5-flash-lite
        credentials_path: /path/to/credentials.json
        prompt: "Analyze this data: {{event.data}}"

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | required | Task name. |
| provider | string | required | LLM provider; see Providers below. |
| model | string | required | Model identifier (e.g. gpt-4, claude-3-5-sonnet-20241022, gemini-2.5-flash-lite). |
| prompt | string/resource | required | User prompt template. Supports handlebars templating and resource files. |
| credentials_path | string | | Path to provider credentials JSON. Required for hosted providers, optional for custom/ollama (local providers without auth). |
| endpoint | string | | Base URL of a custom OpenAI-compatible endpoint. Required when provider: custom or provider: ollama. Example: https://llm-gateway.internal.example.com/v1. |
| system_prompt | string/resource | | System prompt. |
| max_tokens | int | | Maximum tokens in response. |
| temperature | float | | Sampling temperature (0.0 to 1.0). |
| stream | bool | false | Stream response as separate events instead of one final event. |
| static_context | list | | RAG documents (inline strings or resource: references) always available to the agent. |
| max_turns | int | unlimited | Maximum recursive agent turns. Prevents runaway tool-calling loops. |
| mcp_servers | list | | MCP server URLs for tool discovery. The agent connects to each, discovers tools, and makes them callable during completion. |
| sandbox | object | | Sandbox config for tool execution (nsjail). |
| depends_on | list | | Upstream task names. |
| retry | object | | Retry configuration. |
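
As a rough illustration of how the optional fields combine, the sketch below sets a system prompt, caps the response length, lowers the temperature, and waits for an upstream task. The task names (summarize_ticket, fetch_ticket), the prompt text, and the numeric values are placeholders. It also assumes that fetch_ticket is defined elsewhere in the same flow and that its output arrives as event.data.

```yaml
# Hypothetical task: names and values are placeholders, not defaults.
- ai_completion:
    name: summarize_ticket
    provider: openai
    model: gpt-4
    credentials_path: /etc/flowgen/credentials/openai.json
    system_prompt: "You are a concise support assistant."
    prompt: "Summarize this ticket: {{event.data}}"
    max_tokens: 512        # cap on response length
    temperature: 0.2       # low value for more deterministic output
    depends_on:
      - fetch_ticket       # assumed upstream task, defined elsewhere in the flow
```
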
Providers
| Provider | Auth | Required credentials fields |
|---|---|---|
| openai | API key | api_key |
| anthropic | API key | api_key |
| google | API key (Gemini API) | api_key |
| cohere | API key | api_key |
| mistral | API key | api_key |
| groq | API key | api_key |
| together | API key | api_key |
| xai | API key | api_key |
| openrouter | API key | api_key |
| perplexity | API key | api_key |
| huggingface | API key | api_key |
| vertexai | Google ADC | project_id, region (optional, can come from env) |
| ollama | Optional API key | optional api_key |
| custom | Optional API key | optional api_key |
For providers not listed above (cloud-native services with non-api_key auth, on-prem model servers, multi-tenant routing), set provider: custom and point endpoint at any service speaking the OpenAI chat-completions wire format. The flowgen AI gateway (llm_proxy task) can itself act as that endpoint, letting you route, log, and rate-limit across upstream models from one place.
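
For a local Ollama server, a configuration might look like the sketch below. It assumes Ollama is running on its default port (11434) with its OpenAI-compatible API under /v1, and that a model tagged llama3.1 has already been pulled. The task name is a placeholder. Whether flowgen expects the /v1 suffix for provider: ollama is an assumption based on the endpoint example in the Fields table.

```yaml
# Sketch only: no credentials_path, since a local Ollama server needs no auth.
- ai_completion:
    name: local_summary
    provider: ollama
    model: llama3.1                          # assumed to be pulled locally
    endpoint: "http://localhost:11434/v1"    # Ollama default port; /v1 suffix is an assumption
    prompt: "Summarize: {{event.data}}"
```
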
Credentials file format
For API-key providers, the credentials file is a simple JSON object:

    { "api_key": "sk-..." }

For vertexai, the JSON sets the project and region; authentication itself goes through Google Application Default Credentials (ADC). The pod must have access to a service account via the GOOGLE_APPLICATION_CREDENTIALS env var or workload identity:

    { "project_id": "my-gcp-project", "region": "us-central1" }

Example: custom OpenAI-compatible endpoint
Point at any service speaking the OpenAI chat-completions wire format: an internal model server, a self-hosted gateway, or the flowgen AI gateway itself routing to one or more upstream models.

    - ai_completion:
        name: internal_llm
        provider: custom
        model: my-internal-model
        endpoint: "https://llm-gateway.internal.example.com/v1"
        credentials_path: /etc/flowgen/credentials/internal_llm.json
        prompt: "{{event.data.question}}"

For endpoints without auth, omit credentials_path:

    - ai_completion:
        name: unauth_endpoint
        provider: custom
        model: my-model
        endpoint: "http://llm.cluster.local/v1"
        prompt: "{{event.data}}"

Example: RAG with static context
static_context documents are injected into every completion. Mix inline strings and resource files.

    - ai_completion:
        name: support_bot
        provider: openai
        model: gpt-4-turbo
        credentials_path: /etc/flowgen/credentials/openai.json
        system_prompt: "Answer using the provided context."
        prompt: "{{event.data.question}}"
        static_context:
          - resource: "docs/product_manual.md"
          - resource: "docs/pricing.md"
          - "Support contact: support@example.com"
        max_turns: 3

Example: MCP tool calling
The agent connects to MCP servers, discovers tools, and can call them during completion. Use flowgen's own MCP server (mcp_tool task) or any external MCP-compatible server.

    - ai_completion:
        name: agent
        provider: anthropic
        model: claude-3-5-sonnet-20241022
        credentials_path: /etc/flowgen/credentials/anthropic.json
        prompt: "{{event.data.task}}"
        max_turns: 5
        mcp_servers:
          - url: "http://localhost:3001/mcp"
          - url: "http://external-tools:8080/mcp"

Optional per-server auth uses the same credentials JSON format as http_request (bearer_auth or basic_auth):

    mcp_servers:
      - url: "https://tools.example.com/mcp"
        credentials_path: /etc/flowgen/credentials/mcp_tools.json

Sandbox
Optional sandboxing for MCP tool execution via nsjail. Rhai scripts do not need sandboxing (safe by design).

    - ai_completion:
        name: agent
        provider: google
        model: gemini-2.5-flash-lite
        prompt: "{{event.data}}"
        sandbox:
          memory_limit_mb: 512
          time_limit_seconds: 30
          max_pids: 10
          allow_network: false

Output
Format: JSON
| Field | Type | Description |
|---|---|---|
| text | string | Generated completion text. |
| model | string | Model used. |
| provider | string | Provider name. |
| usage | object | Token counts (prompt_tokens, completion_tokens, total_tokens). |
In stream: true mode, each chunk is emitted as a separate event with partial text content.
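
A minimal sketch of a streaming task, using only fields documented on this page; the task name and prompt are placeholders. With this setting, anything consuming the task's output should expect several events carrying partial text rather than a single event with the full completion.

```yaml
# Hypothetical streaming task; only the stream flag differs from a non-streaming setup.
- ai_completion:
    name: live_answer
    provider: anthropic
    model: claude-3-5-sonnet-20241022
    credentials_path: /etc/flowgen/credentials/anthropic.json
    prompt: "{{event.data.question}}"
    stream: true   # emit each chunk as its own event instead of one final event
```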