InferenceKey draws a hard line between provisioning workloads and calling them. Each side has its own token, and neither can do the other’s job. This is the single most important security property of the SDK: a leaked inference key can never reconfigure your infrastructure, and a management token can never run up an inference bill.
You need an InferenceKey account and the SDK installed.
    pip install inferencekey
    npm install @inferencekey/sdk

Then grab two tokens from the dashboard, which is what the rest of this page covers.
ik_sdk_ — Control plane
Provisions and reconciles workloads. Held by your ManagementClient and scoped to one project. Use it from deploy scripts, CI, or a backend that declares what should exist. Cannot call inference.
ik_live_ — Data plane
Calls inference against a workload’s OpenAI-compatible endpoint. Passed per workload — one app can hold many of them, one per workload, each with its own scope. Cannot provision.
The prefix tells you which is which: control-plane tokens start with ik_sdk_, data-plane keys start with ik_live_. They are not interchangeable, and using one where the other is expected returns a 403 (see Least privilege below).
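Because the prefix alone identifies the plane, a script can catch a misplaced credential before any request is sent. The following is a minimal sketch, not part of the SDK: `credential_plane` and `require_plane` are hypothetical helper names, and only the two prefixes come from this page.

```python
SDK_PREFIX = "ik_sdk_"    # control-plane tokens
LIVE_PREFIX = "ik_live_"  # data-plane keys


def credential_plane(token: str) -> str:
    """Return "control", "data", or "unknown" based on the token prefix."""
    if token.startswith(SDK_PREFIX):
        return "control"
    if token.startswith(LIVE_PREFIX):
        return "data"
    return "unknown"


def require_plane(token: str, expected: str) -> str:
    """Return the token unchanged, or raise if it belongs to another plane."""
    actual = credential_plane(token)
    if actual != expected:
        raise ValueError(f"expected a {expected}-plane credential, got {actual}")
    return token
```

A local check like this only saves a round trip; the API remains the authority on whether a credential is accepted.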
Both live in the dashboard, but on different screens because they have different blast radii.
ik_sdk_ — the Tokens screen.
Open your project in the dashboard and go to Tokens. Create an SDK token; it is bound to that project. This is the value your ManagementClient reads.
ik_live_ — the API Keys screen.
Go to API Keys and mint a key. Choose its access scope: either specific endpoints (one or more workloads — the narrowest, recommended choice) or the entire project. Hand each app the narrowest key it needs. This is the value you pass to data.endpoint(...).
When you pick “Entire project” for a key’s scope, it is granted a
projectSlug/* wildcard and can call every workload in the project, including
ones created later. This is handy when you don’t want to re-issue or reassign
a key every time you add a workload — for example minting the key before the
workload exists, then ensure()-ing the workload from code.
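To make the wildcard concrete, here is a small sketch of how a scope string could be compared against a workload. It illustrates the semantics described above and is not the service's actual matching logic. The `projectSlug/workloadSlug` form for a single-endpoint scope is an assumption (only `projectSlug/*` appears on this page), and `scope_covers` and `key_covers` are hypothetical names.

```python
def scope_covers(scope: str, project: str, workload: str) -> bool:
    """True if one scope string covers the given workload.

    "project/*" covers every workload in the project, including ones
    created later. "project/workload" (assumed format) covers only that one.
    """
    if scope == f"{project}/*":
        return True
    return scope == f"{project}/{workload}"


def key_covers(scopes: list[str], project: str, workload: str) -> bool:
    """True if any scope held by the key covers the workload."""
    return any(scope_covers(s, project, workload) for s in scopes)
```

Under these assumptions, a key scoped to `acme/*` covers a workload that did not exist when the key was minted, while a key scoped to `acme/support-bot` covers nothing else.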
The SDK loads tokens from the environment so secrets never have to appear in your source.
| Variable | Token | Used by | Default for |
|---|---|---|---|
| INFERENCEKEY_SDK_TOKEN | ik_sdk_… | ManagementClient | The control plane |
| INFERENCEKEY_API_KEY | ik_live_… | DataClient | The default ik_live_ key |
| INFERENCEKEY_PROJECT | — | both clients | The project slug |
| INFERENCEKEY_BASE_URL | — | both clients | The API base URL |
When a value can come from more than one place, the SDK resolves it in this order:
1. An explicit argument (project="acme", api_key="ik_live_…") always wins.
2. Otherwise, the INFERENCEKEY_* vars above.

So an api_key passed straight to data.endpoint(...) overrides INFERENCEKEY_API_KEY, and a project passed to from_env/fromEnv overrides INFERENCEKEY_PROJECT.
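The precedence rule can be sketched in a few lines. This is an illustration of the ordering only, not the SDK's internal code: `resolve_setting` is a hypothetical helper, and the environment is passed in as a mapping so the behavior is easy to check.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_var: str,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the explicit value if one was given, else the env var, else None."""
    if explicit is not None:
        return explicit  # an explicit argument always wins
    return env.get(env_var)  # fall back to the INFERENCEKEY_* variable


# Example with a stand-in environment:
fake_env = {"INFERENCEKEY_PROJECT": "acme"}
project = resolve_setting("other-project", "INFERENCEKEY_PROJECT", fake_env)
```

Here `project` ends up as `"other-project"`, because the explicit argument takes priority over the environment variable.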
Export the tokens, then construct each client with from_env / fromEnv. The management client reads INFERENCEKEY_SDK_TOKEN; the data client reads INFERENCEKEY_API_KEY unless you pass an api_key per endpoint.
    export INFERENCEKEY_SDK_TOKEN="ik_sdk_your_control_plane_token"
    export INFERENCEKEY_API_KEY="ik_live_your_default_inference_key"
    export INFERENCEKEY_PROJECT="acme"
    # optional: export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"

Python:

    from inferencekey import ManagementClient, DataClient

    # Reads INFERENCEKEY_SDK_TOKEN (control plane).
    mgmt = ManagementClient.from_env(project="acme")

    # Reads INFERENCEKEY_API_KEY by default (data plane).
    data = DataClient.from_env(project="acme")

    # Override per workload with an explicit ik_live_ key; this wins over the env var.
    ep = data.endpoint("support-bot", api_key="ik_live_...")

TypeScript:

    import { ManagementClient, DataClient } from "@inferencekey/sdk";

    // Reads INFERENCEKEY_SDK_TOKEN (control plane).
    const mgmt = ManagementClient.fromEnv({ project: "acme" });

    // Reads INFERENCEKEY_API_KEY by default (data plane).
    const data = DataClient.fromEnv({ project: "acme" });

    // Override per workload with an explicit ik_live_ key; this wins over the env var.
    const ep = data.endpoint("support-bot", { apiKey: process.env.SUPPORT_IK_LIVE });

The split is enforced on the wire, not just by convention. Hand a token to the wrong client and the API rejects it with a 403:
| Error code | What happened |
|---|---|
| wrong_credential_type | An ik_live_ key was used to provision, or an ik_sdk_ token was used to call inference. |
| project_scope_mismatch | The token is valid but scoped to a different project than the one you addressed. |
| scope_insufficient | The token’s scope does not cover the requested action or workload. |
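When one of these codes turns up in logs or CI output, a lookup from code to next step can shorten the debugging loop. The sketch below is pure Python and makes no assumption about how the SDK surfaces the error, since this page does not describe exception types or the response body. `REMEDIATION` and `explain_403` are hypothetical names, and the hints paraphrase the table above.

```python
REMEDIATION = {
    "wrong_credential_type": (
        "Use an ik_sdk_ token with ManagementClient to provision, "
        "and an ik_live_ key with DataClient to call inference."
    ),
    "project_scope_mismatch": (
        "The token belongs to a different project; check the project "
        "you addressed against the one the token was created in."
    ),
    "scope_insufficient": (
        "The token's scope does not cover this action or workload; "
        "widen the scope or use a key that covers it."
    ),
}


def explain_403(error_code: str) -> str:
    """Map a 403 error code from the table above to a short hint."""
    return REMEDIATION.get(error_code, f"Unrecognized error code: {error_code}")
```

Unknown codes fall through to a generic message rather than raising, so the helper stays safe to call from error-handling paths.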
Keep it simple:
- Deploy scripts and CI hold the ik_sdk_ token.
- Each app holds only the ik_live_ keys for the workloads it actually calls.

If a piece of code needs to do both, that’s a sign it should be split into a deploy step and a runtime step.
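One way to hold that line at runtime is a startup check that flags a control-plane token in an app's environment. This is a sketch under assumptions: `check_runtime_env` is a hypothetical helper, not an SDK feature, and it relies only on the variable names and prefixes documented on this page.

```python
from typing import Mapping


def check_runtime_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of least-privilege problems found in a runtime environment."""
    problems = []
    # A runtime app should not carry the control-plane token at all.
    if env.get("INFERENCEKEY_SDK_TOKEN"):
        problems.append(
            "INFERENCEKEY_SDK_TOKEN is set; keep ik_sdk_ tokens in deploy/CI only"
        )
    # If a default inference key is present, it should be a data-plane key.
    api_key = env.get("INFERENCEKEY_API_KEY", "")
    if api_key and not api_key.startswith("ik_live_"):
        problems.append("INFERENCEKEY_API_KEY does not look like an ik_live_ key")
    return problems
```

An app could call `check_runtime_env(os.environ)` at startup and refuse to boot if the list is non-empty. Whether to fail hard or just log is a deployment choice.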