Open Models. Structured Outputs. Production-Ready Tool Calling.
Deploy OpenClaw on Nebius Serverless (CPU-only) and connect it to Token Factory for inference. No Mac Minis, no local installs, no security headaches — a single serverless endpoint replaces $1,000 in hardware, and you can run 100 agents for the same cost. You'll choose the right open model, configure structured outputs for reliable tool calling, and optimize cost and latency.
Developers, founders, and AI engineers getting started with agents
Production agent running on serverless infra with optimized inference
"I deployed an agent with tool calling in under 30 minutes — and it costs pennies per run"
An OpenClaw agent deployed on Nebius Serverless (CPU-only, no GPU needed)
Connected to Token Factory endpoints for fast, cheap open-model inference
A working tool-calling workflow with structured outputs (JSON mode)
How Nebius Serverless and Token Factory fit together for agent workloads
Set up the Nebius CLI, create a serverless endpoint, and deploy OpenClaw
Configure structured outputs and build a real agent workflow
Tune latency, compare model costs, and plan for production
Follow these steps during the workshop. Each step includes commands you can copy, tips from our mentors, and a checkpoint to verify before moving on.
Install and configure the Nebius AI Cloud CLI, which is the primary way to manage serverless endpoints.
# Install Nebius CLI (macOS)
curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh | bash

# Log in to your account
nebius auth login

# Verify your config
cat ~/.nebius/config.yaml
Running 'nebius iam whoami' returns your user info without errors.
Look up the network and subnet IDs you'll need to deploy your serverless endpoint.
# List networks
nebius vpc network list

# Get default subnet ID
nebius vpc subnet get-by-name --name default-subnet \
  --format jsonpath='{.metadata.id}'
You have a network ID and subnet ID saved for the next step.
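A small guard can make this checkpoint explicit before you move on to deployment. The sketch below is one way to do it in plain shell; the function name `require_vars` and the variable names `NETWORK_ID`, `SUBNET_ID`, and `PROJECT_ID` are illustrative assumptions, not names the Nebius CLI requires.

```shell
# Fail early if any named environment variable is unset or empty.
# The variable names you pass in (e.g. NETWORK_ID, SUBNET_ID) are your own choice.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, after saving the IDs with `export NETWORK_ID=...` and `export SUBNET_ID=...`, running `require_vars NETWORK_ID SUBNET_ID || echo "fix the IDs first"` reports which value is still missing.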
Create a serverless endpoint running OpenClaw. This is a CPU-only container — no GPU needed for the agent orchestrator.
# Generate auth token
export AUTH_TOKEN=$(openssl rand -hex 32)

# Deploy OpenClaw on Serverless
nebius msp serverless v1alpha1 endpoint create \
  --name openclaw-agent \
  --container-image openclaw:latest \
  --container-template-resources-platform cpu-d3 \
  --container-template-resources-preset 4vcpu-16gb \
  --port 8080 \
  --username admin \
  --password "$AUTH_TOKEN" \
  --network-id <your-network-id> \
  --parent-id <your-project-id>

# Get endpoint ID
export ENDPOINT_ID=$(nebius msp serverless v1alpha1 endpoint get-by-name \
  --name openclaw-agent --format jsonpath='{.metadata.id}')

# Get public IP
export ENDPOINT_IP=$(nebius msp serverless v1alpha1 endpoint get $ENDPOINT_ID \
  --format jsonpath='{.status.public_endpoints[0]}')
curl http://$ENDPOINT_IP:8080/health returns a 200 response.
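A freshly created endpoint may take a little while before the health check passes, so a retry loop can save you from re-running curl by hand. This is a minimal sketch in plain shell; the helper name `retry` and the attempt count and delay in the usage line are assumptions to adjust, not values from Nebius.

```shell
# Run a command until it succeeds, up to a maximum number of attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed" >&2
    # No need to wait after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

A possible call for this checkpoint is `retry 30 5 curl -fsS "http://$ENDPOINT_IP:8080/health"`, which polls every 5 seconds for up to 30 attempts; the `-f` flag makes curl exit non-zero on an HTTP error status.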
Get your Token Factory API key and configure OpenClaw to use it for inference.
# Set Token Factory credentials in OpenClaw
export TF_API_KEY=<your-token-factory-api-key>

# Test the connection directly
curl https://api.studio.nebius.com/v1/chat/completions \
  -H "Authorization: Bearer $TF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "max_tokens": 100
  }'
The curl command returns a valid chat completion response from Token Factory.
Set up JSON mode and structured outputs so your agent can reliably call tools and parse responses.
# Example: structured output for tool calling
curl https://api.studio.nebius.com/v1/chat/completions \
  -H "Authorization: Bearer $TF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Classify this email: Meeting tomorrow at 3pm"}],
    "response_format": {"type": "json_object"},
    "max_tokens": 200
  }'
Your agent returns valid JSON that matches your tool schema on 5 consecutive test runs.
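The "5 consecutive test runs" check is easier to trust when it is scripted rather than eyeballed. The sketch below repeats any check command a fixed number of times and stops at the first failure; the helper name `all_pass` is an assumption, and the check you pass in (for example, a function of your own that sends the request and validates the reply) is up to you.

```shell
# Run a check N times in a row; stop at the first failure.
# Usage: all_pass RUNS COMMAND [ARGS...]
all_pass() {
  runs=$1
  shift
  run=1
  while [ "$run" -le "$runs" ]; do
    if ! "$@"; then
      echo "run $run/$runs failed" >&2
      return 1
    fi
    run=$((run + 1))
  done
  echo "$runs/$runs runs passed"
}
```

Assuming you wrap the structured-output request in a hypothetical function `classify_once` that exits non-zero when the reply is not valid JSON (for instance by piping the response through `python3 -m json.tool` or `jq -e .`), the checkpoint becomes `all_pass 5 classify_once`.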
Wire everything together into a real business workflow — email triage, lead scoring, data extraction, or your own use case.
Your agent handles 3 different inputs correctly end-to-end without manual intervention.
Benchmark costs, compare models, and plan your production deployment.
# Check endpoint logs
nebius msp serverless v1alpha1 endpoint logs $ENDPOINT_ID

# View endpoint status
nebius msp serverless v1alpha1 endpoint get $ENDPOINT_ID
You have a working agent, a cost estimate, and a deploy script you can run again anytime.
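For the cost estimate, a rough per-run figure can be computed from token counts and per-million-token prices. This is a back-of-the-envelope sketch, assuming pricing is quoted per million input and output tokens; the helper name `estimate_cost` is hypothetical, and any prices you plug in should come from the current Token Factory price list rather than from this example.

```shell
# Estimate the cost in USD of one run from token counts and per-million-token prices.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS PRICE_IN_PER_M PRICE_OUT_PER_M
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v p_in="$3" -v p_out="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * p_in + out_tok * p_out) / 1000000 }'
}
```

With placeholder prices of 0.10 and 0.40 USD per million tokens, `estimate_cost 1000000 500000 0.10 0.40` prints `0.3000`. Running the same call with each candidate model's prices gives a quick side-by-side comparison.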
RSVP required. Spots are limited since we provide hands-on support for every attendee.
Register Now