Workshop · Beginner

OpenClaw + Token Factory — Agent-Grade Inference

Open Models. Structured Outputs. Production-Ready Tool Calling.

Deploy OpenClaw on Nebius Serverless (CPU-only) and connect it to Token Factory for inference. No Mac Minis, no local installs, no security headaches — a single serverless endpoint replaces $1,000 in hardware, and you can run 100 agents for the same cost. You'll choose the right open model, configure structured outputs for reliable tool calling, and optimize cost and latency.

Jump to Step-by-Step Guide

Who This Is For

Developers, founders, and AI engineers getting started with agents

Key Value

Production agent running on serverless infra with optimized inference

You'll Say

"I deployed an agent with tool calling in under 30 minutes — and it costs pennies per run"

What You'll Build

1

An OpenClaw agent deployed on Nebius Serverless (CPU-only, no GPU needed)

2

Connected to Token Factory endpoints for fast, cheap open-model inference

3

A working tool-calling workflow with structured outputs (JSON mode)

What We'll Cover

  • Why serverless beats local: a Mac Mini costs $1,000, exposes your network, and runs one agent — a serverless endpoint costs pennies and scales to hundreds
  • How Nebius Serverless works: deploy containers without managing VMs or clusters
  • Token Factory basics: OpenAI-compatible API, model catalog, per-token pricing
  • Choosing the right model: Llama 3.1, Mistral, DeepSeek — tradeoffs for agents
  • Structured outputs and JSON mode for reliable tool calling
  • Cost and latency optimization: batching, model selection, endpoint configuration

Schedule

12:00 PM – 12:30 PM

Architecture Overview: Serverless + Token Factory

How Nebius Serverless and Token Factory fit together for agent workloads

  • Why not local? Mac Minis cost $1,000+, expose your home network, and run a single agent. Serverless scales to hundreds for less.
  • Serverless CPU-only containers for hosting OpenClaw (no GPU needed for the orchestrator)
  • Token Factory for inference: OpenAI-compatible API, open models, per-token billing
12:30 PM – 1:15 PM

Hands-On: Deploy OpenClaw on Serverless

Set up the Nebius CLI, create a serverless endpoint, and deploy OpenClaw

  • Install and configure the Nebius CLI
  • Create a serverless endpoint with the OpenClaw container image
  • Verify the endpoint is live and accepting requests
  • Connect to Token Factory for inference with your API key
1:15 PM – 2:00 PM

Build a Tool-Calling Workflow

Configure structured outputs and build a real agent workflow

  • Choose your model: compare Llama 3.1, Mistral, DeepSeek for tool calling
  • Set up JSON mode and structured outputs for reliable function calls
  • Build a business workflow: email triage, lead scoring, or data extraction
  • Test end-to-end and iterate on prompts
2:00 PM – 2:30 PM

Optimization & Cost Analysis

Tune latency, compare model costs, and plan for production

  • Benchmark latency across different models and presets
  • Cost breakdown: per-token pricing vs. closed API alternatives
  • When to use dedicated endpoints vs. shared Token Factory
  • Q&A and next steps for production deployment

Prerequisites

  • Laptop with a browser and terminal access
  • A Nebius AI Cloud account (we'll help you set one up if needed)
  • Basic comfort with command-line tools

You'll Leave With

A live OpenClaw agent running on Nebius Serverless
Token Factory API keys and configured endpoints
A tool-calling workflow with structured JSON outputs
Cost estimates for running agents 24/7 on open models
CLI commands to redeploy, update, and monitor your agent

Step-by-Step Guide

Follow these steps during the workshop. Each step includes commands you can copy, tips from our mentors, and a checkpoint to verify before moving on.

Step 1 · ~5 min

Install the Nebius CLI

Install and configure the Nebius AI Cloud CLI, which is the primary way to manage serverless endpoints.

Instructions

  1. Download and install the Nebius CLI from the official docs
  2. Run the login flow to authenticate with your Nebius account
  3. Verify your project ID is set correctly

Commands

# Install Nebius CLI (macOS)
curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh | bash
# Login to your account
nebius auth login
# Verify your config
cat ~/.nebius/config.yaml

Tips

If you don't have a Nebius account yet, sign up at nebius.com — mentors can help you get set up

Checkpoint

Running 'nebius iam whoami' returns your user info without errors.

Step 2 · ~3 min

Get Your Network and Subnet IDs

Look up the network and subnet IDs you'll need to deploy your serverless endpoint.

Instructions

  1. List your VPC networks to find the network ID
  2. Get the default subnet ID within that network
  3. Save both IDs — you'll use them in the next step

Commands

# List networks
nebius vpc network list
# Get default subnet ID
nebius vpc subnet get-by-name --name default-subnet \
  --format jsonpath='{.metadata.id}'

Checkpoint

You have a network ID and subnet ID saved for the next step.

Step 3 · ~10 min

Deploy OpenClaw on Serverless

Create a serverless endpoint running OpenClaw. This is a CPU-only container — no GPU needed for the agent orchestrator.

Instructions

  1. Generate an auth token for your endpoint
  2. Create the serverless endpoint using the Nebius CLI
  3. Wait ~30 seconds for the endpoint to become active
  4. Retrieve the public IP to test connectivity

Commands

# Generate auth token
export AUTH_TOKEN=$(openssl rand -hex 32)
# Deploy OpenClaw on Serverless
nebius msp serverless v1alpha1 endpoint create \
  --name openclaw-agent \
  --container-image openclaw:latest \
  --container-template-resources-platform cpu-d3 \
  --container-template-resources-preset 4vcpu-16gb \
  --port 8080 \
  --username admin \
  --password "$AUTH_TOKEN" \
  --network-id <your-network-id> \
  --parent-id <your-project-id>
# Get endpoint ID
export ENDPOINT_ID=$(nebius msp serverless v1alpha1 endpoint get-by-name \
  --name openclaw-agent --format jsonpath='{.metadata.id}')
# Get public IP
export ENDPOINT_IP=$(nebius msp serverless v1alpha1 endpoint get $ENDPOINT_ID \
  --format jsonpath='{.status.public_endpoints[0]}')

Tips

CPU-only is intentional — OpenClaw is the orchestrator, not the model. Token Factory handles inference on GPUs.
The 4vcpu-16gb preset is plenty for most agent workloads. Scale up later if needed.
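
Rather than sleeping a fixed ~30 seconds, you can poll until the endpoint answers. An optional sketch in Python, assuming only the /health route and the ENDPOINT_IP variable used in the checkpoint below:

```python
# Poll the endpoint's /health route until it returns HTTP 200,
# instead of waiting a fixed interval. Uses only the stdlib.
import os
import time
import urllib.request
import urllib.error

def wait_for_healthy(ip: str, port: int = 8080, timeout_s: int = 120) -> bool:
    """Return True once GET /health answers with HTTP 200, False on timeout."""
    url = f"http://{ip}:{port}/health"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # endpoint not up yet; retry
        time.sleep(3)
    return False

if __name__ == "__main__":
    ip = os.environ.get("ENDPOINT_IP", "")
    print("healthy" if wait_for_healthy(ip) else "timed out")
```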

Checkpoint

curl http://$ENDPOINT_IP:8080/health returns a 200 response.

Step 4 · ~5 min

Connect to Token Factory

Get your Token Factory API key and configure OpenClaw to use it for inference.

Instructions

  1. Go to Token Factory in the Nebius console and create an API key
  2. Set the API key and endpoint URL in your OpenClaw config
  3. Choose a model from the catalog (we recommend starting with Llama 3.1 70B)
  4. Test a basic completion to verify the connection

Commands

# Set Token Factory credentials in OpenClaw
export TF_API_KEY=<your-token-factory-api-key>
# Test the connection directly
curl https://api.studio.nebius.com/v1/chat/completions \
  -H "Authorization: Bearer $TF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "max_tokens": 100
  }'

Tips

Token Factory uses an OpenAI-compatible API — if your code works with OpenAI, it works with TF
Start with Llama 3.1 70B for a good balance of quality and cost. Try DeepSeek for coding tasks.

Checkpoint

The curl command returns a valid chat completion response from Token Factory.

Step 5 · ~10 min

Configure Structured Outputs

Set up JSON mode and structured outputs so your agent can reliably call tools and parse responses.

Instructions

  1. Enable JSON mode in your Token Factory requests
  2. Define a tool schema for your use case (e.g., email actions, database queries)
  3. Configure OpenClaw to use structured outputs for tool calling
  4. Test that the model returns valid JSON matching your schema

Commands

# Example: structured output for tool calling
curl https://api.studio.nebius.com/v1/chat/completions \
  -H "Authorization: Bearer $TF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Classify this email: Meeting tomorrow at 3pm"}],
    "response_format": {"type": "json_object"},
    "max_tokens": 200
  }'

Tips

JSON mode forces the model to output valid JSON — no more parsing errors in your agent loop
For complex tool schemas, test with a few examples before wiring into the full workflow
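
Even with JSON mode on, it pays to validate replies before your agent acts on them. A minimal stdlib sketch; the category/urgency fields are an illustrative schema for the email example above, not part of any API:

```python
# Check that a model reply is valid JSON and carries the fields your
# tool expects. EXPECTED_FIELDS is a made-up example schema.
import json

EXPECTED_FIELDS = {"category": str, "urgency": str}

def parse_tool_output(raw: str) -> dict:
    """Parse a JSON-mode reply and verify the expected fields exist."""
    data = json.loads(raw)  # raises ValueError on invalid JSON
    for field, ftype in EXPECTED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

# Example with a well-formed reply:
reply = '{"category": "meeting", "urgency": "high"}'
print(parse_tool_output(reply))  # → {'category': 'meeting', 'urgency': 'high'}
```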

Checkpoint

Your agent returns valid JSON that matches your tool schema on 5 consecutive test runs.

Step 6 · ~15 min

Build Your Agent Workflow

Wire everything together into a real business workflow — email triage, lead scoring, data extraction, or your own use case.

Instructions

  1. Pick a use case from the examples or bring your own
  2. Configure the agent loop: receive input → call model → execute tool → return result
  3. Test with real-world data (sample emails, documents, etc.)
  4. Iterate on prompts and tool schemas until results are reliable
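
The loop in step 2 can be sketched in a few lines of Python. Everything here is illustrative: call_model stands in for a JSON-mode Token Factory request, and the tool name and arguments are made up for the email-triage example.

```python
# Minimal agent loop: receive input -> call model -> execute tool -> return result.
# call_model is a stub for a JSON-mode chat completion; TOOLS is a toy registry.
import json

def call_model(user_input: str) -> str:
    """Stub: in the workshop this is a JSON-mode chat completion request."""
    return json.dumps({"tool": "triage_email", "args": {"text": user_input}})

TOOLS = {
    "triage_email": lambda args: {
        "label": "meeting" if "meeting" in args["text"].lower() else "other"
    },
}

def run_agent(user_input: str) -> dict:
    decision = json.loads(call_model(user_input))  # model picks a tool
    tool = TOOLS[decision["tool"]]                 # look up the tool
    return tool(decision["args"])                  # execute and return result

print(run_agent("Meeting tomorrow at 3pm"))  # → {'label': 'meeting'}
```

Swapping the stub for a real completion call (and the lambda for your actual tool) turns this skeleton into the workflow you'll build in this step.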

Tips

Start with a simple workflow and add complexity. A 3-tool agent beats a 10-tool agent that breaks.
Use the mentor chat for prompt engineering help — we've seen what works across hundreds of use cases.

Checkpoint

Your agent handles 3 different inputs correctly end-to-end without manual intervention.

Step 7 · ~10 min

Cost Analysis & Production Planning

Benchmark costs, compare models, and plan your production deployment.

Instructions

  1. Run your workflow 10 times and note the token usage
  2. Calculate monthly cost at your expected volume
  3. Compare with GPT-4 / Claude API pricing for the same workload
  4. Document your configuration for production deployment
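
The arithmetic in step 2 is simple enough to script: tokens per run times runs per month times the per-token price. The prices in the example are placeholders; substitute the current Token Factory catalog rates for your model.

```python
# Back-of-envelope monthly cost, with prices quoted per million tokens.
# The example rates below are illustrative, not real catalog prices.
def monthly_cost(input_tokens: int, output_tokens: int, runs_per_month: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost in dollars for the given per-run token usage and volume."""
    per_run = (input_tokens * price_in_per_m
               + output_tokens * price_out_per_m) / 1_000_000
    return per_run * runs_per_month

# Example: 2k input + 500 output tokens, 10k runs/month, $0.50/$1.50 per 1M tokens
cost = monthly_cost(2000, 500, 10_000, 0.50, 1.50)
print(f"${cost:.2f}/month")  # → $17.50/month
```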

Commands

# Check endpoint logs
nebius msp serverless v1alpha1 endpoint logs $ENDPOINT_ID
# View endpoint status
nebius msp serverless v1alpha1 endpoint get $ENDPOINT_ID

Tips

Most agent workflows cost 60-80% less on Token Factory vs. closed APIs
Save your CLI commands in a deploy script for easy redeployment

Checkpoint

You have a working agent, a cost estimate, and a deploy script you can run again anytime.

Ready to Build?

RSVP required. Spots are limited since we provide hands-on support for every attendee.

Register Now