Your workload exists on the platform (see Your first ensure). Now call it.
Inference happens on the data plane. You reach a workload through its OpenAI-compatible endpoint with the DataClient, authenticated by a per-workload ik_live_ key — never your ik_sdk_ control token.
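Because the two credential types differ by prefix, a cheap guard before opening an endpoint can catch a mix-up early. This is a minimal sketch, not part of the SDK: `credential_kind` and `require_live_key` are hypothetical helpers, and the only facts they rely on are the `ik_live_` and `ik_sdk_` prefixes named above.

```python
def credential_kind(key: str) -> str:
    """Classify a credential by its prefix (hypothetical helper, not an SDK call)."""
    if key.startswith("ik_live_"):
        return "workload_key"
    if key.startswith("ik_sdk_"):
        return "control_token"
    return "unknown"


def require_live_key(key: str) -> str:
    """Return the key unchanged if it looks like a per-workload key, else raise."""
    kind = credential_kind(key)
    if kind != "workload_key":
        raise ValueError(f"expected an ik_live_ workload key, got: {kind}")
    return key
```

A prefix check only catches the wrong credential type. Whether the key is scoped to the right project and workload is still decided by the platform.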
A workload that already exists — you have its workload_slug (for example support-bot) from ensure().
An ik_live_ key scoped to that workload. Generate one per workload in the dashboard; pass a different key for each workload your app calls.
The SDK installed and your environment configured:
Python:

    pip install inferencekey
    export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"
    export INFERENCEKEY_PROJECT="acme"
    export SUPPORT_IK_LIVE="ik_live_your_workload_key"

TypeScript:

    npm install @inferencekey/sdk
    export INFERENCEKEY_BASE_URL="https://api.inferencekey.com"
    export INFERENCEKEY_PROJECT="acme"
    export SUPPORT_IK_LIVE="ik_live_your_workload_key"

Build a DataClient from the environment, open an endpoint for the workload slug with its ik_live_ key, then call generate_text.
Python:

    import os
    from inferencekey import DataClient

    data = DataClient.from_env(project="acme")
    ep = data.endpoint("support-bot", api_key=os.environ["SUPPORT_IK_LIVE"])

    out = ep.generate_text(
        prompt="Hola, ¿cómo puedo cancelar mi pedido?",
        temperature=0.2,
        max_tokens=300,
    )

    print(out.text)   # the completion
    print(out.model)  # the model that served it

TypeScript:

    import { DataClient } from "@inferencekey/sdk";

    const data = DataClient.fromEnv({ project: "acme" });
    const ep = data.endpoint("support-bot", {
      apiKey: process.env.SUPPORT_IK_LIVE,
    });

    const out = await ep.generateText({
      prompt: "Hola, ¿cómo puedo cancelar mi pedido?",
      temperature: 0.2,
      maxTokens: 300,
    });

    console.log(out.text);  // the completion
    console.log(out.model); // the model that served it

The result carries the generated text, the model that served the request, and a finish_reason. DataClient.from_env reads INFERENCEKEY_BASE_URL and INFERENCEKEY_PROJECT; the explicit project argument wins over the environment.
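The precedence rule (explicit argument first, environment second) can be sketched in a few lines. This only illustrates the rule as stated; it is not the SDK's internal code. `resolve_project` is a hypothetical name, and raising when neither source is set is an assumption, since the page does not say what the SDK does in that case.

```python
import os
from typing import Mapping, Optional


def resolve_project(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the project: an explicit argument wins over INFERENCEKEY_PROJECT."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    project = env.get("INFERENCEKEY_PROJECT")
    if not project:
        # ASSUMPTION: failing loudly here is a choice made for this sketch.
        raise ValueError("no project given and INFERENCEKEY_PROJECT is not set")
    return project
```

Passing `env` as a plain mapping keeps the function easy to exercise without touching the real process environment.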
The same DataClient reaches an embedding workload. Open its endpoint with that workload’s key and call embed with one string or a list of strings — you get one vector per input back on embeddings.
Python:

    import os
    from inferencekey import DataClient

    data = DataClient.from_env(project="acme")

    emb = data.endpoint(
        "billing",
        api_key=os.environ["BILLING_IK_LIVE"],
    ).embed(input=["first document", "second document"])

    print(len(emb.embeddings))     # 2 vectors
    print(len(emb.embeddings[0]))  # dimensionality of each vector
    print(emb.model)               # the embedding model

TypeScript:

    import { DataClient } from "@inferencekey/sdk";

    const data = DataClient.fromEnv({ project: "acme" });

    const emb = await data
      .endpoint("billing", { apiKey: process.env.BILLING_IK_LIVE })
      .embed({ input: ["first document", "second document"] });

    console.log(emb.embeddings.length);    // 2 vectors
    console.log(emb.embeddings[0].length); // dimensionality of each vector
    console.log(emb.model);                // the embedding model

generate_text returns a single completed result. To stream tokens as they’re produced, use generate_text_stream / generateTextStream instead: it takes the same parameters but yields one TextChunk at a time. Concatenate chunk.text to rebuild the full reply.
Python:

    for chunk in ep.generate_text_stream(prompt="Hola"):
        print(chunk.text, end="", flush=True)

TypeScript:

    for await (const chunk of ep.generateTextStream({ prompt: "Hola" })) {
      process.stdout.write(chunk.text);
    }

Under the hood the endpoint speaks server-sent events, terminated by data: [DONE]; the SDK parses those frames into TextChunks for you. For the raw wire contract see the references below.
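To make the framing concrete, here is a rough sketch of what parsing such a stream by hand might look like. It is not the SDK's parser. `iter_text_chunks` is a hypothetical name, and the JSON layout (`choices[0].delta.content`) is an assumption borrowed from OpenAI-style chat streaming; the actual wire contract is whatever the references define. Only the `data:` prefix and the `data: [DONE]` terminator come from the text above.

```python
import json
from typing import Iterable, Iterator


def iter_text_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end of stream; anything after it is ignored
        event = json.loads(payload)
        # ASSUMPTION: OpenAI-style chunk shape; adjust to the real contract.
        for choice in event.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample frames, not captured from a real endpoint.
SAMPLE_FRAMES = [
    'data: {"choices": [{"delta": {"content": "Hola"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", mundo"}}]}',
    "",
    "data: [DONE]",
]

reply = "".join(iter_text_chunks(SAMPLE_FRAMES))
```

Joining the yielded fragments mirrors concatenating `chunk.text` from the SDK's stream.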
403 wrong_credential_type
You passed an ik_sdk_ control token to the DataClient. Use the workload’s ik_live_ key instead.
403 project_scope_mismatch
The key belongs to a different project than the DataClient. Check INFERENCEKEY_PROJECT and the key’s scope.
403 scope_insufficient
The key isn’t scoped to this workload. Generate a key for this workload in the dashboard.
The SDK raises typed errors — AuthError, PermissionDenied, ValidationError, ApiError — all subclasses of InferenceKeyError. See Common errors.
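The three 403 codes above each map to one fix, so a small lookup table can turn a failure into an actionable message in logs. This is a sketch under stated assumptions: `FORBIDDEN_HINTS` and `hint_for` are hypothetical names, and because this page does not say how the SDK's exceptions expose the status and error code, the function takes them as plain arguments.

```python
# Remediation hints for the documented 403 codes (wording condensed from above).
FORBIDDEN_HINTS = {
    "wrong_credential_type": "Use the workload's ik_live_ key, not an ik_sdk_ control token.",
    "project_scope_mismatch": "Check INFERENCEKEY_PROJECT and the key's project scope.",
    "scope_insufficient": "Generate a key scoped to this workload in the dashboard.",
}


def hint_for(status: int, code: str) -> str:
    """Return a remediation hint for a known 403 code, else a generic pointer."""
    if status == 403 and code in FORBIDDEN_HINTS:
        return FORBIDDEN_HINTS[code]
    return "Unrecognized error; see Common errors."
```

For example, `hint_for(403, "scope_insufficient")` returns the dashboard hint, while any other status or code falls through to the generic message.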