
Tokens & API keys

InferenceKey draws a hard line between provisioning workloads and calling them. Each side has its own token, and neither can do the other’s job. This is the single most important security property of the SDK: a leaked inference key can never reconfigure your infrastructure, and a management token can never run up an inference bill.

Before you start

You need an InferenceKey account and the SDK installed.

Install (Python ≥ 3.9)
pip install inferencekey

Then grab two tokens from the dashboard — that’s what the rest of this page covers.

The two tokens

ik_sdk_ — Control plane

Provisions and reconciles workloads. Held by your ManagementClient and scoped to one project. Use it from deploy scripts, CI, or a backend that declares what should exist. Cannot call inference.

ik_live_ — Data plane

Calls inference against a workload’s OpenAI-compatible endpoint. Passed per workload — one app can hold many of them, one per workload, each with its own scope. Cannot provision.

The prefix tells you which is which: control-plane tokens start with ik_sdk_, data-plane keys start with ik_live_. They are not interchangeable, and using one where the other is expected returns a 403 (see Least privilege below).
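
Because the prefix alone identifies the plane, a deploy script or app can reject a misplaced credential before any request is sent. The following is a minimal sketch, not part of the SDK: the helper names credential_plane and require_plane are hypothetical, and only the two prefixes come from this page.

```python
SDK_PREFIX = "ik_sdk_"    # control-plane tokens
LIVE_PREFIX = "ik_live_"  # data-plane keys


def credential_plane(token: str) -> str:
    """Return "control" or "data" based on the documented prefix."""
    if token.startswith(SDK_PREFIX):
        return "control"
    if token.startswith(LIVE_PREFIX):
        return "data"
    raise ValueError("unrecognized credential prefix")


def require_plane(token: str, expected: str) -> str:
    """Return the token unchanged, or raise if it belongs to the other plane."""
    plane = credential_plane(token)
    if plane != expected:
        raise ValueError(
            f"expected a {expected}-plane credential, got a {plane}-plane one"
        )
    return token
```

A check like this only catches mix-ups early; the API still enforces the split on its side.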

Where to get each one

Both live in the dashboard, but on different screens because they have different blast radii.

  1. ik_sdk_ — the Tokens screen. Open your project in the dashboard and go to Tokens. Create an SDK token; it is bound to that project. This is the value your ManagementClient reads.

  2. ik_live_ — the API Keys screen. Go to API Keys and mint a key. Choose its access scope: either specific endpoints (one or more workloads — the narrowest, recommended choice) or the entire project. Hand each app the narrowest key it needs. This is the value you pass to data.endpoint(...).

Project-wide keys

When you pick “Entire project” for a key’s scope, the key is granted a projectSlug/* wildcard and can call every workload in the project, including ones created later. This is handy when you don’t want to re-issue or reassign a key every time you add a workload: for example, you can mint the key before the workload exists and then create the workload from code with ensure(). It is also the broadest data-plane scope, so prefer endpoint-scoped keys wherever a key only needs specific workloads.
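
To make the difference between the two scope choices concrete, here is a small illustration of how a projectSlug/* wildcard could be evaluated. It is a sketch under assumptions: the function scope_allows is hypothetical, the "project/workload" form for endpoint-scoped entries is assumed for illustration, and the service's actual matching logic is not described on this page.

```python
from typing import Iterable


def scope_allows(scopes: Iterable[str], project: str, workload: str) -> bool:
    """Return True if any scope entry covers project/workload.

    Assumed entry forms: "project/*" (entire project) or
    "project/workload" (one specific endpoint).
    """
    target = f"{project}/{workload}"
    wildcard = f"{project}/*"
    return any(scope in (target, wildcard) for scope in scopes)


# A project-wide key covers workloads that did not exist when it was minted.
project_wide = ["acme/*"]
# An endpoint-scoped key covers only the workloads it names.
narrow = ["acme/support-bot"]
```

Under this model, scope_allows(project_wide, "acme", "added-later") is true, while the narrow key is refused for any workload other than support-bot.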

Environment variables

The SDK loads tokens from the environment so secrets never have to appear in your source.

Variable                 Token       Used by            Default for
INFERENCEKEY_SDK_TOKEN   ik_sdk_…    ManagementClient   The control plane
INFERENCEKEY_API_KEY     ik_live_…   DataClient         The default ik_live_ key
INFERENCEKEY_PROJECT     -           both clients       The project slug
INFERENCEKEY_BASE_URL    -           both clients       The API base URL

Precedence

When a value can come from more than one place, the SDK resolves it in this order:

  1. Explicit argument — anything you pass in code (e.g. project="acme", api_key="ik_live_…") always wins.
  2. Environment variable — the INFERENCEKEY_* vars above.

So an api_key passed straight to data.endpoint(...) overrides INFERENCEKEY_API_KEY, and a project passed to from_env/fromEnv overrides INFERENCEKEY_PROJECT.
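
The precedence rule amounts to "use the explicit argument if one was given, otherwise read the environment". A minimal sketch of that rule, assuming None means "not passed"; the resolve helper is hypothetical and is not how the SDK is necessarily implemented:

```python
import os
from typing import Mapping, Optional


def resolve(
    explicit: Optional[str],
    env_var: str,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the explicit value if given, else the environment variable."""
    if explicit is not None:
        return explicit          # 1. explicit argument always wins
    return environ.get(env_var)  # 2. fall back to INFERENCEKEY_* variable


env = {"INFERENCEKEY_PROJECT": "acme", "INFERENCEKEY_API_KEY": "ik_live_default"}
project = resolve("staging", "INFERENCEKEY_PROJECT", env)  # explicit wins
api_key = resolve(None, "INFERENCEKEY_API_KEY", env)       # env fallback
```

Passing the environment as a parameter keeps the rule easy to unit-test without touching real secrets.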

Per-language setup

Export the tokens, then construct each client with from_env / fromEnv. The management client reads INFERENCEKEY_SDK_TOKEN; the data client reads INFERENCEKEY_API_KEY unless you pass an api_key per endpoint.

Set the tokens (shell)
export INFERENCEKEY_SDK_TOKEN="ik_sdk_your_control_plane_token"
export INFERENCEKEY_API_KEY="ik_live_your_default_inference_key"
export INFERENCEKEY_PROJECT="acme"
# optional: export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"

Load both clients from the environment (Python)
from inferencekey import ManagementClient, DataClient

# Reads INFERENCEKEY_SDK_TOKEN (control plane).
mgmt = ManagementClient.from_env(project="acme")

# Reads INFERENCEKEY_API_KEY by default (data plane).
data = DataClient.from_env(project="acme")

# Override per workload with an explicit ik_live_ key; this wins over the env var.
ep = data.endpoint("support-bot", api_key="ik_live_...")
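
When one app holds several ik_live_ keys, it needs some way to pick the right one per workload. One possible approach is sketched below. The INFERENCEKEY_API_KEY_<WORKLOAD> naming convention and the key_for_workload helper are assumptions made for this example; the SDK itself only reads the variables listed in the table above.

```python
import os
from typing import Mapping

DEFAULT_VAR = "INFERENCEKEY_API_KEY"


def key_for_workload(
    workload: str, environ: Mapping[str, str] = os.environ
) -> str:
    """Return the workload's own key if set, else the default key.

    Hypothetical convention: "support-bot" maps to
    INFERENCEKEY_API_KEY_SUPPORT_BOT.
    """
    suffix = workload.upper().replace("-", "_")
    key = environ.get(f"{DEFAULT_VAR}_{suffix}") or environ.get(DEFAULT_VAR)
    if not key:
        raise LookupError(f"no ik_live_ key configured for workload {workload!r}")
    return key
```

The result can then be passed as the explicit api_key argument, for example data.endpoint("support-bot", api_key=key_for_workload("support-bot")).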

Least privilege: why two tokens

The split is enforced on the wire, not just by convention. Hand a token to the wrong client and the API rejects it with a 403:

Error code               What happened
wrong_credential_type    An ik_live_ key was used to provision, or an ik_sdk_ token was used to call inference.
project_scope_mismatch   The token is valid but scoped to a different project than the one you addressed.
scope_insufficient       The token’s scope does not cover the requested action or workload.
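
Each of the three codes points at a different fix, so it can help to translate them into an actionable message in logs or CI output. The sketch below assumes you have already extracted the error code string from the 403 response; how the SDK surfaces that code (exception type, attribute, or response body) is not covered here, and explain_403 is a hypothetical helper.

```python
# Suggested next steps, keyed by the documented 403 error codes.
REMEDIES = {
    "wrong_credential_type": (
        "Swap the credential: use an ik_sdk_ token to provision "
        "and an ik_live_ key to call inference."
    ),
    "project_scope_mismatch": (
        "Use a token created for the project you are addressing, "
        "or correct the project slug."
    ),
    "scope_insufficient": (
        "Mint or select a key whose scope covers this workload or action."
    ),
}


def explain_403(error_code: str) -> str:
    """Map a documented error code to a suggested next step."""
    return REMEDIES.get(error_code, f"Unrecognized 403 error code: {error_code}")
```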

Keep it simple:

  • Provisioning code (deploy scripts, CI, reconcilers) holds only ik_sdk_.
  • Application code that serves users holds only the ik_live_ keys for the workloads it actually calls.

If a piece of code needs to do both, that’s a sign it should be split into a deploy step and a runtime step.
