InferenceKey uses two kinds of tokens, one per plane, so that the code that provisions workloads never holds the credential that can call them — and vice versa. This is least privilege by construction: a leaked data token cannot create or delete workloads, and a leaked control token cannot run inference or read prompts.
ik_sdk_ — control plane
Provisions and reconciles workloads. Held by your ManagementClient, scoped to one project. Cannot call inference.
ik_live_ — data plane
Calls inference. Passed per workload to the DataClient. One app can
use many ik_live_ keys. Cannot provision.
| | ik_live_ | ik_sdk_ |
|---|---|---|
| Purpose | Call inference (generate, embed, …) | Provision & reconcile workloads |
| Plane | Data | Control |
| Scope | Per workload, or whole project (projectSlug/*) | Per project |
| Stored in | INFERENCEKEY_API_KEY (default) or passed per call | INFERENCEKEY_SDK_TOKEN |
| Can call inference? | Yes | No |
| Can provision? | No | Yes |
| SDK client | DataClient | ManagementClient |
| SDK placement | data.endpoint(slug, api_key=…) | ManagementClient.from_env(...) |
| Env var | INFERENCEKEY_API_KEY | INFERENCEKEY_SDK_TOKEN |
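
The prefix-to-plane split in the table can be checked at startup, before a token reaches either client. The following is a minimal sketch and not part of the SDK: `Plane`, `plane_for_token`, and `require_plane` are hypothetical helper names, and the only facts taken from this page are the two prefixes and the plane each one belongs to.

```python
from enum import Enum


class Plane(Enum):
    CONTROL = "control"
    DATA = "data"


# Prefixes come from the table above; the helpers are hypothetical.
_PREFIX_TO_PLANE = {
    "ik_sdk_": Plane.CONTROL,
    "ik_live_": Plane.DATA,
}


def plane_for_token(token: str) -> Plane:
    """Return the plane a token belongs to, judged only by its prefix."""
    for prefix, plane in _PREFIX_TO_PLANE.items():
        if token.startswith(prefix):
            return plane
    raise ValueError("unrecognized token prefix")


def require_plane(token: str, expected: Plane) -> str:
    """Fail fast if a token for the wrong plane was wired into a client."""
    actual = plane_for_token(token)
    if actual is not expected:
        raise ValueError(
            f"expected a {expected.value}-plane token, got a {actual.value}-plane token"
        )
    return token
```

A check like this only catches configuration mix-ups early; the server-side enforcement described on this page is what actually keeps the planes apart.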
The control token is read once, from the environment, when you build the
ManagementClient. The data token is passed per workload when you resolve
an endpoint — it is never baked into the management client.
```python
from inferencekey import ManagementClient, DataClient, WorkloadSpec, Backend

# Control plane: ik_sdk_ comes from INFERENCEKEY_SDK_TOKEN.
mgmt = ManagementClient.from_env(project="acme")
ref = mgmt.ensure(WorkloadSpec(
    name="support-bot",
    slug="support-bot",
    model="meta-llama/Llama-3.1-8B-Instruct",
    backend=Backend.VLLM,
    command="vllm serve meta-llama/Llama-3.1-8B-Instruct --max-model-len 8192",
))

# Data plane: ik_live_ is passed per workload.
data = DataClient.from_env(project="acme")
ep = data.endpoint(ref.workload_slug, api_key="ik_live_...")
out = ep.generate_text(prompt="Hola", temperature=0.2, max_tokens=300)
print(out.text)
```

```ts
import { ManagementClient, DataClient, Backend } from "@inferencekey/sdk";

// Control plane: ik_sdk_ comes from INFERENCEKEY_SDK_TOKEN.
const mgmt = ManagementClient.fromEnv({ project: "acme" });
const ref = await mgmt.ensure({
  name: "support-bot",
  slug: "support-bot",
  model: "meta-llama/Llama-3.1-8B-Instruct",
  backend: Backend.Vllm,
  command: "vllm serve meta-llama/Llama-3.1-8B-Instruct --max-model-len 8192",
});

// Data plane: ik_live_ is passed per workload.
const data = DataClient.fromEnv({ project: "acme" });
const ep = data.endpoint(ref.workloadSlug, { apiKey: process.env.SUPPORT_IK_LIVE });
const out = await ep.generateText({ prompt: "Hola", temperature: 0.2, maxTokens: 300 });
console.log(out.text);
```

Resolution order is explicit argument > environment variable.
```bash
export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"
export INFERENCEKEY_PROJECT="acme"
export INFERENCEKEY_SDK_TOKEN="ik_sdk_..."   # control plane (ManagementClient)
export INFERENCEKEY_API_KEY="ik_live_..."    # data plane default (DataClient)
```

INFERENCEKEY_API_KEY is the default ik_live_ key. When an app calls several workloads with different keys, pass each one explicitly to data.endpoint(slug, api_key=…) and the explicit value wins over the env default.
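
The "explicit argument > environment variable" rule can be sketched as a small standalone function. This illustrates the precedence only and is not the SDK's internal code: `resolve_api_key` is a hypothetical name, and raising `LookupError` when neither source supplies a key is an assumption made for the example.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick the ik_live_ key: explicit argument first, then the env default."""
    if explicit:
        return explicit
    from_env = env.get("INFERENCEKEY_API_KEY")
    if from_env:
        return from_env
    # Assumption: a missing key is treated as a configuration error.
    raise LookupError("no ik_live_ key: pass api_key or set INFERENCEKEY_API_KEY")
```

Taking `env` as a parameter keeps the function easy to test with a plain dict instead of the real process environment.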
The ik_sdk_ control token is project-scoped: it can act only inside the single project it was issued for, and only through the control routes. Its capabilities map to the workload lifecycle:

- POST /api/projects/:project_id/workloads
- PATCH /api/workloads/:id
- DELETE /api/workloads/:id
- GET the workload list

What an ik_sdk_ token cannot do:

- Call inference (/endpoint/...): that is the data plane's job.
- Act outside its project: a token for acme cannot touch beta.

1. Create in the dashboard. Both ik_sdk_ and ik_live_ tokens are minted in the dashboard. Control tokens are issued for a project; data tokens are issued for a workload.
2. Copy it once. The full secret is shown only at creation time. The dashboard stores a hash, not the raw token: if you lose it, you cannot read it back, so you rotate it.
3. Store it as a secret. Put ik_sdk_ in INFERENCEKEY_SDK_TOKEN and your ik_live_ keys in your secret manager or INFERENCEKEY_API_KEY (or pass them per workload). Never commit a token to source control.
4. Rotate or revoke. Generate a replacement and update the environment, then revoke the old token in the dashboard. Revoking takes effect immediately: control calls start returning 403, and data calls start raising AuthError.
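
To see why a lost token cannot be read back and why revocation can take effect immediately, here is a toy model of hash-only token storage. It is a sketch under assumptions and not a description of InferenceKey's backend: the page says only that a hash is stored, so the choice of SHA-256, the in-memory set, and the `TokenStore` class are all hypothetical.

```python
import hashlib
import hmac
import secrets


class TokenStore:
    """Toy in-memory store that keeps only SHA-256 digests of tokens."""

    def __init__(self) -> None:
        self._digests = set()

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def mint(self, prefix: str) -> str:
        """Create a token, remember its digest, and return the raw value once."""
        token = prefix + secrets.token_urlsafe(32)
        self._digests.add(self._digest(token))
        return token  # the raw token is never stored

    def is_valid(self, token: str) -> bool:
        """Check a presented token against the stored digests."""
        candidate = self._digest(token)
        return any(hmac.compare_digest(candidate, d) for d in self._digests)

    def revoke(self, token: str) -> None:
        """Drop the digest; the token is rejected on the very next check."""
        self._digests.discard(self._digest(token))


store = TokenStore()
old = store.mint("ik_sdk_")
new = store.mint("ik_sdk_")  # rotate: issue the replacement first
store.revoke(old)            # then revoke the old token
```

Because only digests are kept, the store has nothing to show a user who lost the raw value, which is why the recovery path is rotation.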