Your workload exists on the platform (see Your first ensure). Now call it.
Inference happens on the data plane. You reach a workload through its OpenAI-compatible endpoint with the DataClient, authenticated by a per-workload ik_live_ key — never your ik_sdk_ control token.
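Because the two credential types are distinguishable by prefix, a small pre-flight check can catch a misplaced control token before any request is sent. The following is a minimal sketch, not part of the SDK: `require_live_key` is a hypothetical helper, and the only facts it relies on are the `ik_live_` and `ik_sdk_` prefixes named above.

```python
import os
from typing import Mapping

LIVE_PREFIX = "ik_live_"     # per-workload data-plane key
CONTROL_PREFIX = "ik_sdk_"   # control token, never valid for the DataClient


def require_live_key(env_var: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the key stored in env_var, or raise if it is not an ik_live_ key.

    Hypothetical helper for illustration; the SDK itself may validate differently.
    """
    key = environ.get(env_var, "")
    if key.startswith(CONTROL_PREFIX):
        raise ValueError(
            f"{env_var} holds an ik_sdk_ control token; use the workload's ik_live_ key"
        )
    if not key.startswith(LIVE_PREFIX):
        raise ValueError(f"{env_var} is unset or does not hold an ik_live_ key")
    return key
```

Calling the helper with the name of the environment variable that holds a workload's key, and passing the result as `api_key`, fails fast locally instead of surfacing later as a 403 from the data plane.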
A workload that already exists — you have its workload_slug (for example support-bot) from ensure().
An ik_live_ key scoped to that workload. Generate one per workload in the dashboard; pass a different key for each workload your app calls.
The SDK installed and your environment configured:
Python:

    pip install inferencekey
    export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"
    export INFERENCEKEY_PROJECT="acme"
    export SUPPORT_IK_LIVE="ik_live_your_workload_key"

JavaScript / TypeScript:

    npm install @inferencekey/sdk
    export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"
    export INFERENCEKEY_PROJECT="acme"
    export SUPPORT_IK_LIVE="ik_live_your_workload_key"

Build a DataClient from the environment, open an endpoint for the workload slug with its ik_live_ key, then call generate_text.

Python:

    import os
    from inferencekey import DataClient

    data = DataClient.from_env(project="acme")
    ep = data.endpoint("support-bot", api_key=os.environ["SUPPORT_IK_LIVE"])
    out = ep.generate_text(
        prompt="Hola, ¿cómo puedo cancelar mi pedido?",
        temperature=0.2,
        max_tokens=300,
    )

    print(out.text)   # the completion
    print(out.model)  # the model that served it

JavaScript / TypeScript:

    import { DataClient } from "@inferencekey/sdk";

    const data = DataClient.fromEnv({ project: "acme" });
    const ep = data.endpoint("support-bot", {
      apiKey: process.env.SUPPORT_IK_LIVE,
    });
    const out = await ep.generateText({
      prompt: "Hola, ¿cómo puedo cancelar mi pedido?",
      temperature: 0.2,
      maxTokens: 300,
    });

    console.log(out.text);  // the completion
    console.log(out.model); // the model that served it

The result carries the generated text, the model that served the request, and a finish_reason. DataClient.from_env reads INFERENCEKEY_BASE_URL and INFERENCEKEY_PROJECT; the explicit project argument wins over the environment.
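The precedence rule (explicit argument first, environment second) can be illustrated with a short sketch. This is not the SDK's implementation: `resolve_project` is a hypothetical function, and raising when neither source provides a value is an assumption, since the behavior in that case is not stated here.

```python
import os
from typing import Mapping, Optional


def resolve_project(
    explicit: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Pick the project name: an explicit argument wins over INFERENCEKEY_PROJECT."""
    if explicit:
        return explicit
    project = environ.get("INFERENCEKEY_PROJECT")
    if not project:
        # ASSUMPTION: the real SDK's behavior when nothing is configured is not documented here.
        raise ValueError("no project: pass project=... or set INFERENCEKEY_PROJECT")
    return project


env = {"INFERENCEKEY_PROJECT": "acme"}
print(resolve_project(environ=env))             # acme
print(resolve_project("staging", environ=env))  # staging
```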
The same DataClient reaches an embedding workload. Open its endpoint with that workload’s key and call embed with one string or a list of strings. The result’s embeddings attribute holds one vector per input.
Python:

    import os
    from inferencekey import DataClient

    data = DataClient.from_env(project="acme")
    emb = data.endpoint(
        "billing",
        api_key=os.environ["BILLING_IK_LIVE"],
    ).embed(input=["first document", "second document"])

    print(len(emb.embeddings))     # 2 vectors
    print(len(emb.embeddings[0]))  # dimensionality of each vector
    print(emb.model)               # the embedding model

JavaScript / TypeScript:

    import { DataClient } from "@inferencekey/sdk";

    const data = DataClient.fromEnv({ project: "acme" });
    const emb = await data
      .endpoint("billing", { apiKey: process.env.BILLING_IK_LIVE })
      .embed({ input: ["first document", "second document"] });

    console.log(emb.embeddings.length);    // 2 vectors
    console.log(emb.embeddings[0].length); // dimensionality of each vector
    console.log(emb.model);                // the embedding model

generate_text returns a single completed result. To stream tokens as they’re produced, use generate_text_stream / generateTextStream instead; it takes the same parameters but yields one TextChunk at a time. Concatenate chunk.text to rebuild the full reply.

Python:

    for chunk in ep.generate_text_stream(prompt="Hola"):
        print(chunk.text, end="", flush=True)

JavaScript / TypeScript:

    for await (const chunk of ep.generateTextStream({ prompt: "Hola" })) {
      process.stdout.write(chunk.text);
    }

Under the hood the endpoint speaks server-sent events, terminated by data: [DONE]; the SDK parses those frames into TextChunks for you. For the raw wire contract see the references below.
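To make the framing concrete, here is a minimal sketch of what parsing such a stream by hand could look like. It is not the SDK's parser. The `data:` prefix and the `data: [DONE]` terminator come from the description above; the JSON shape inside each frame (`choices[0].delta.content`) is an assumption borrowed from the OpenAI streaming convention and may differ on the actual wire. `iter_sse_text` is a hypothetical name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the data: [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream; anything after the sentinel is ignored
        event = json.loads(payload)
        # ASSUMPTION: OpenAI-style chunk shape; verify against the wire reference.
        for choice in event.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"delta": {"content": "Ho"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "la"}}]}',
    "",
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
reply = "".join(iter_sse_text(sample))
print(reply)  # Hola
```

Joining the yielded fragments mirrors concatenating chunk.text in the SDK loop: both rebuild the full reply from incremental pieces.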
403 wrong_credential_type
You passed an ik_sdk_ control token to the DataClient. Use the workload’s ik_live_ key instead.
403 project_scope_mismatch
The key belongs to a different project than the DataClient. Check INFERENCEKEY_PROJECT and the key’s scope.
403 scope_insufficient
The key isn’t scoped to this workload. Generate a key for this workload in the dashboard.
The SDK raises typed errors — AuthError, PermissionDenied, ValidationError, ApiError — all subclasses of InferenceKeyError. See Common errors.
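Since each 403 code above maps to one specific fix, an application can attach that fix to its own logs or error messages. The sketch below is a plain lookup table built from the descriptions in this section. `remediation_for` is a hypothetical helper, and because how the SDK exposes the error code on its exceptions is not stated here, the function simply takes the code as a string.

```python
# Fixes copied from the 403 error descriptions in this section.
REMEDIATION = {
    "wrong_credential_type": (
        "An ik_sdk_ control token was passed to the DataClient; "
        "use the workload's ik_live_ key."
    ),
    "project_scope_mismatch": (
        "The key belongs to a different project; "
        "check INFERENCEKEY_PROJECT and the key's scope."
    ),
    "scope_insufficient": (
        "The key isn't scoped to this workload; "
        "generate a key for this workload in the dashboard."
    ),
}


def remediation_for(code: str) -> str:
    """Return a human-readable fix for a known 403 error code."""
    return REMEDIATION.get(code, "Unrecognized error code; see Common errors.")


print(remediation_for("scope_insufficient"))
```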