A private OpenClaw setup on your own server, with local Gemma 4 models via Ollama — accessed securely through Tailscale.
No public ports. No domain required.
$0/month.
Three things to prepare before you SSH into your instance. All are free.
Sign up at cloud.oracle.com. A credit card is required for verification but the Always Free ARM A1 Flex instance (4 vCPU, 24 GB RAM, 200 GB storage) is genuinely $0/month, no expiry.
Sign up at tailscale.com — free for up to 3 users and 100 devices. Tailscale replaces a public domain and SSL certificate entirely. Your OpenClaw instance lives on your private Tailscale network.
Follow the steps in order. Total time is approximately 40–55 minutes. Tailscale installs early so you can switch to its SSH before finishing. OpenClaw runs as a native systemd service; Ollama runs in Docker.
ARM A1 Flex instances are frequently out of capacity. The OCI Console silently fails with no retry option. This script uses the OCI CLI to cycle through all three Availability Domains every 60 seconds until provisioning succeeds. Requires the OCI CLI — install it first: docs.oracle.com/iaas/Content/API/SDKDocs/cliinstall.htm, then run oci setup config.
Edit and run the retry script
#!/bin/bash
# retry_oci.sh — Retries VM.Standard.A1.Flex provisioning across all ADs
# Fill in the four variables below, then: chmod +x retry_oci.sh && ./retry_oci.sh
# ── Fill these in ─────────────────────────────────────────────────────────
COMPARTMENT_ID="YOUR_COMPARTMENT_OCID"
# Ubuntu 24.04 Minimal aarch64 image OCID for your region
IMAGE_ID="YOUR_UBUNTU_2404_ARM_IMAGE_OCID"
# Subnet OCID from your VCN (Networking → VCN → Subnets)
SUBNET_ID="YOUR_SUBNET_OCID"
# Availability Domains — format: "<tenancy-prefix>:<REGION>-AD-<N>"
# Get the exact strings with: oci iam availability-domain list --query 'data[*].name'
ADS=(
"YOUR-PREFIX:YOUR-REGION-AD-1"
"YOUR-PREFIX:YOUR-REGION-AD-2"
"YOUR-PREFIX:YOUR-REGION-AD-3"
)
# ── Instance config (no changes needed) ──────────────────────────────────────
SHAPE="VM.Standard.A1.Flex"
SSH_KEY_FILE="$HOME/.ssh/id_ed25519.pub"
DISPLAY_NAME="openclaw-server"
OCPUS=4
MEMORY_GB=24
BOOT_VOLUME_SIZE_GB=200
# ── Retry loop ────────────────────────────────────────────────────────────────
attempt=0
while true; do
  attempt=$((attempt + 1))
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] Attempt $attempt"
  for AD in "${ADS[@]}"; do
    echo "  Trying $AD..."
    if result=$(oci compute instance launch \
        --compartment-id "$COMPARTMENT_ID" \
        --availability-domain "$AD" \
        --image-id "$IMAGE_ID" \
        --subnet-id "$SUBNET_ID" \
        --shape "$SHAPE" \
        --shape-config "{\"ocpus\": $OCPUS, \"memoryInGBs\": $MEMORY_GB}" \
        --display-name "$DISPLAY_NAME" \
        --ssh-authorized-keys-file "$SSH_KEY_FILE" \
        --assign-public-ip true \
        --boot-volume-size-in-gbs "$BOOT_VOLUME_SIZE_GB" \
        --is-pv-encryption-in-transit-enabled true \
        2>&1); then
      echo ""
      echo "SUCCESS on $AD!"
      echo "$result" | python3 -c "
import sys, json
d = json.load(sys.stdin)['data']
print(f'  Instance ID : {d[\"id\"]}')
print(f'  State       : {d[\"lifecycle-state\"]}')
print(f'  Public IP   : (visible in OCI Console in ~30s)')
" 2>/dev/null || echo "$result" | grep -m1 '"id"'
      exit 0
    else
      echo "  Failed: $(echo "$result" | grep -oi '"message":"[^"]*"' | head -1)"
    fi
  done
  echo "  All ADs exhausted — waiting 60 seconds..."
  sleep 60
done
In your VCN security list, add a single ingress rule for Tailscale: Source 0.0.0.0/0 | Protocol: UDP | Destination Port: 41641. Do not open 80 or 443 — the OpenClaw gateway binds to 127.0.0.1:18789 only and is never exposed directly to the internet. Tailscale Serve (Step 8) creates the HTTPS tunnel privately; opening 80/443 would expose an unprotected HTTP endpoint.
# Update system
sudo apt update && sudo apt upgrade -y
sudo apt install -y ufw fail2ban unattended-upgrades
# UFW: allow SSH only (Tailscale handles everything else)
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw --force enable
sudo ufw status verbose
# Fail2Ban: protect SSH from brute-force
sudo tee /etc/fail2ban/jail.local > /dev/null << 'EOF'
[sshd]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
bantime = 3600
findtime = 600
EOF
sudo systemctl enable --now fail2ban
# Harden SSH: key-only auth, no root login
sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo systemctl restart sshd
# Auto security updates
sudo dpkg-reconfigure -plow unattended-upgrades
# 16 GB swap — prevents OOM kills when loading large models
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
free -h
# Install Tailscale
curl -fsSL https://tailscale.com/install.sh | sh
# Connect to your Tailnet and enable Tailscale SSH.
# This prints an auth URL — open it on your local machine to approve.
sudo tailscale up --ssh --hostname=openclaw
# Confirm it is connected and note the Tailscale IP
tailscale status
You can now run ssh ubuntu@openclaw (or just ssh openclaw) from any device in your Tailnet instead of using the public IP. You can optionally disable port 22 in UFW now: sudo ufw delete allow 22/tcp && sudo ufw reload. The tailscale serve configuration that exposes OpenClaw happens in Step 8, once OpenClaw is running.
# Install Docker Engine via official script (ARM64 compatible)
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
newgrp docker
sudo systemctl enable docker
# Verify
docker --version
docker compose version
# Create Ollama project directory
mkdir -p ~/ollama && cd ~/ollama
Create ~/ollama/docker-compose.yml:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0:11434
      - OLLAMA_FLASH_ATTENTION=1
      - OLLAMA_NUM_PARALLEL=2
      - OLLAMA_KEEP_ALIVE=5m
    ports:
      - "127.0.0.1:11434:11434"
    deploy:
      resources:
        limits:
          memory: 20G

volumes:
  ollama_data:
    # Pin the volume name so the backup commands in the Maintenance
    # section (-v ollama_data:/data) match; Compose would otherwise
    # prefix it with the project name (ollama_ollama_data).
    name: ollama_data
# Start Ollama (first run pulls the image — ~2 GB)
docker compose up -d
docker compose ps
# Verify the API is reachable
curl http://localhost:11434/api/tags
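If you script any later steps (model pulls, health checks, post-reboot automation), it helps to wait until the API actually answers before continuing. A minimal sketch against the endpoint configured above:

```shell
# Block until the Ollama API responds (e.g. right after `docker compose up -d`)
until curl -fsS http://localhost:11434/api/tags >/dev/null 2>&1; do
  echo "Waiting for Ollama..."
  sleep 2
done
echo "Ollama is up"
```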
Pull models — see the Models section for the full list with RAM requirements:
# Recommended: start with these three
docker exec -it ollama ollama pull gemma4:e4b # ~3.0 GB — multimodal, recommended
docker exec -it ollama ollama pull gemma4:e2b # ~1.5 GB — fastest variant
docker exec -it ollama ollama pull llama3.1:8b # ~4.7 GB — general purpose workhorse
# Confirm they're available
docker exec -it ollama ollama list
# Quick smoke test
docker exec -it ollama ollama run gemma4:e4b "Respond in one sentence: what are you?"
Ollama binds to 127.0.0.1:11434 — only reachable from the same machine. OpenClaw connects at http://localhost:11434. It is never exposed to the internet.
# Install Node 24 (OpenClaw requires Node 22.14+ or Node 24)
curl -fsSL https://deb.nodesource.com/setup_24.x | sudo -E bash -
sudo apt-get install -y nodejs
node --version # should print v24.x.x
# Install OpenClaw
curl -fsSL https://openclaw.ai/install.sh | bash
# Reload shell to pick up the openclaw binary
source ~/.bashrc
# Verify
openclaw --version
openclaw doctor
Run onboarding — configures the Ollama provider interactively:
openclaw onboard
# When prompted for a provider, select: Ollama
# When prompted for the endpoint, enter: http://localhost:11434
# OpenClaw queries /api/tags and auto-discovers all pulled models
Use http://localhost:11434 with no path suffix. Do NOT use http://localhost:11434/v1 — the OpenAI-compatible /v1 endpoint breaks OpenClaw's tool calling, and models output raw JSON instead of executing tools.
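To preview exactly which model names OpenClaw will auto-discover, you can parse /api/tags yourself. A sketch using a canned sample response (the model names and sizes here are illustrative); in practice, pipe `curl -s http://localhost:11434/api/tags` into the same parser:

```shell
# Sample of the /api/tags response shape; replace with live curl output
cat > /tmp/tags.json << 'EOF'
{"models":[{"name":"gemma4:e4b","size":3000000000},{"name":"llama3.1:8b","size":4700000000}]}
EOF
# Print each discoverable model name with its approximate size
python3 - << 'EOF'
import json
with open("/tmp/tags.json") as f:
    for m in json.load(f)["models"]:
        print(f'{m["name"]:<16} {m["size"]/1e9:.1f} GB')
EOF
```

Every name printed here is selectable as a model inside OpenClaw after onboarding.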
Configure the gateway and start as a systemd service:
# Bind gateway to loopback only (Tailscale Serve proxies to it)
openclaw config set gateway.bind loopback
openclaw config set gateway.trustedProxies '["127.0.0.1"]'
# Auth mode: none is fine behind Tailscale. Use token for stricter access control.
openclaw config set gateway.auth.mode none
openclaw config set gateway.controlUi.dangerouslyAllowHostHeaderOriginFallback true
# Enable Tailscale Serve integration (wired up in Step 8)
openclaw config set gateway.tailscale.mode serve
# Start as a user systemd service. Lingering lets the user service
# start at boot without an active login session (auto-restarts on reboot)
sudo loginctl enable-linger $USER
systemctl --user enable --now openclaw-gateway.service
openclaw gateway status
# Health checks — both should return "ok"
curl -fsS http://127.0.0.1:18789/healthz
curl -fsS http://127.0.0.1:18789/readyz
Pair your first device:
# After Step 8, open https://openclaw.<tailnet-name>.ts.net in a browser.
# A pairing request will appear — approve it here on the server:
openclaw devices approve <requestId>
# Confirm the device is connected
openclaw devices list
Alternatively, set OLLAMA_API_KEY="ollama-local" in your environment to skip the onboarding flow — OpenClaw will auto-discover all models from /api/tags.
# Proxy HTTPS → OpenClaw gateway (Tailscale provisions the cert automatically)
sudo tailscale serve https / http://127.0.0.1:18789
# Confirm the serve rule is active
tailscale serve status
# Confirm Tailscale is fully connected
tailscale status
Open https://openclaw.<tailnet-name>.ts.net from any device in your Tailnet. Tailscale automatically provisions and renews the HTTPS certificate — no Let's Encrypt or domain registration needed.
# Drop the public SSH port — Tailscale SSH handles all access from here on
sudo ufw delete allow 22/tcp
sudo ufw reload
sudo ufw status verbose # should show no remaining ingress rules
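A quick way to audit what is still reachable: list TCP listeners and flag anything not bound to loopback. After the steps above, Ollama (11434) and the OpenClaw gateway (18789) should appear only on 127.0.0.1 (assumes `ss` from iproute2, which Ubuntu ships by default; Tailscale itself uses UDP, which `-t` excludes):

```shell
# Print any TCP listener NOT bound to loopback — ideally this prints nothing
ss -tln | awk 'NR>1 && $4 !~ /^(127\.|\[::1\])/ {print $4}'
```

If something unexpected shows up (e.g. 0.0.0.0:22 after you dropped the UFW rule — the daemon still listens, UFW just blocks it), trace it with `sudo ss -tlnp` to see the owning process.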
Update Ollama and running containers:
# Pull latest Ollama image and restart
cd ~/ollama && docker compose pull && docker compose up -d
# Update an existing model to its latest version
docker exec -it ollama ollama pull gemma4:e4b
# List all models and their sizes
docker exec -it ollama ollama list
# Check resource usage
docker stats ollama
Update OpenClaw:
# Re-run the installer to get the latest version
curl -fsSL https://openclaw.ai/install.sh | bash
# Restart the systemd service to apply the update
systemctl --user restart openclaw-gateway.service
openclaw gateway status
openclaw --version
Backup and restore:
# Backup Ollama model data (Docker volume)
mkdir -p "$HOME/backups"
docker run --rm \
  -v ollama_data:/data \
  -v "$HOME/backups":/backup \
  ubuntu tar czf "/backup/ollama-$(date +%Y%m%d).tar.gz" /data

# Backup OpenClaw config and sessions
tar czf "$HOME/backups/openclaw-$(date +%Y%m%d).tar.gz" ~/.openclaw

# Restore Ollama data
docker run --rm \
  -v ollama_data:/data \
  -v "$HOME/backups":/backup \
  ubuntu tar xzf /backup/ollama-YYYYMMDD.tar.gz -C /
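The dated archive names make retention trivial — a small sketch that prunes anything older than 14 days (the $HOME/backups path matches the commands above; adjust the window to taste):

```shell
# Delete backup archives older than 14 days, keep the rest
mkdir -p "$HOME/backups"
find "$HOME/backups" -name '*.tar.gz' -mtime +14 -delete
ls -lh "$HOME/backups"
```

Schedule it nightly with `crontab -e` if you automate the backup commands above.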
All models listed run on the OCI A1 Flex (24 GB RAM) via Ollama and are available as providers within OpenClaw. RAM figures are for 4-bit quantised (q4_K_M) variants.
Multimodal by default — all variants accept text and images. Designed for reasoning, coding, and agentic workflows. OpenClaw's tool calling works correctly via the native Ollama API endpoint.
# Balanced multimodal (recommended start)
docker exec -it ollama ollama pull gemma4:e4b
# Ultra-fast edge variant
docker exec -it ollama ollama pull gemma4:e2b
# High-quality MoE (~9 GB)
docker exec -it ollama ollama pull gemma4:27b
# OpenClaw uses the native API:
# http://localhost:11434 ← correct
# http://localhost:11434/v1 ← WRONG, breaks tools
| Model | Pull Command | RAM (q4) | Context | Best For |
|---|---|---|---|---|
| Gemma 4 E2B | gemma4:e2b | ~1.5 GB | 128K | Multimodal |
| Gemma 4 E4B | gemma4:e4b | ~3.0 GB | 128K | Multimodal |
| Phi-3.5 Mini | phi3.5:3.8b | ~2.3 GB | 128K | Coding |
| Llama 3.2 3B | llama3.2:3b | ~1.9 GB | 128K | Speed |
| Mistral 7B | mistral:7b | ~4.1 GB | 32K | General |
| Qwen 2.5 7B | qwen2.5:7b | ~4.4 GB | 128K | Multilingual |
| Llama 3.1 8B | llama3.1:8b | ~4.7 GB | 128K | General |
| DeepSeek-R1 7B | deepseek-r1:7b | ~4.7 GB | 64K | Reasoning |
| Gemma 3 12B | gemma3:12b | ~7.3 GB | 128K | Quality |
| Gemma 4 27B MoE | gemma4:27b | ~9 GB | 256K | Multimodal MoE |
The three Ollama environment flags in the docker-compose significantly impact performance on ARM64. Here's what each does and how to tune context length vs RAM.
- OLLAMA_FLASH_ATTENTION=1 — enables flash attention, which reduces memory use for longer contexts.
- OLLAMA_NUM_PARALLEL — set to 1 if you only need single-user performance and want all RAM dedicated to one request.
- OLLAMA_KEEP_ALIVE — set to -1 to never unload (faster subsequent requests), or 1m if you switch models frequently.

The default num_ctx of 2048 tokens is conservative. Increase it based on your use case — but each doubling costs roughly 1–4 GB of additional RAM depending on model size.
| num_ctx | Extra RAM | Use When | Set via Modelfile |
|---|---|---|---|
| 2048 (default) | baseline | Short chats, fast responses | No change needed |
| 8192 | +1–2 GB | Documents, longer conversations | See below |
| 32768 | +4–6 GB | Long code files, research | See below |
| 131072 | +10–16 GB | Maximum (Gemma 4 / Llama 3.1 only) | See below |
# Create a custom Modelfile to set num_ctx for gemma4:e4b
docker exec -it ollama bash -c "
cat > /tmp/Modelfile << 'EOF'
FROM gemma4:e4b
PARAMETER num_ctx 32768
EOF
ollama create gemma4-32k -f /tmp/Modelfile"
# Use the custom model in OpenClaw by selecting gemma4-32k as provider
docker exec -it ollama ollama list
Ollama pulls q4_K_M by default. You can pull specific quantizations by appending the tag. Higher quantization = better quality, more RAM.
By default, ollama pull gemma4:e4b downloads the q4_K_M build. To pull a specific quantization, append its tag — e.g. ollama pull gemma4:e4b-q8_0. Reasoning models (e.g. deepseek-r1) are auto-detected by OpenClaw via name heuristics — no extra config needed.
The /v1 path exposes Ollama's OpenAI-compatible API. OpenClaw uses its own tool-calling protocol that relies on Ollama's native API format. When routed through /v1, models may output raw tool JSON as plain text instead of executing tools, breaking agentic workflows. Always configure the Ollama provider in OpenClaw with the endpoint http://localhost:11434 — no path suffix.

OLLAMA_NUM_PARALLEL=2 in the docker-compose allows 2 concurrent inference streams. Ollama automatically unloads idle models after OLLAMA_KEEP_ALIVE (5 minutes by default), freeing RAM for other models. On the 24 GB instance you can comfortably hold two 4–5 GB models in memory simultaneously.

All OpenClaw configuration and session data lives in ~/.openclaw/.
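The two-models-in-memory claim is easy to sanity-check against the table's RAM figures. A back-of-envelope sketch in shell arithmetic (the 3 GB system overhead is an assumption, not a measured figure):

```shell
# Rough RAM budget for holding two q4 models at once on the A1 Flex
TOTAL_GB=24
SYSTEM_GB=3         # OS + OpenClaw gateway overhead (assumption)
LLAMA31_GB=5        # llama3.1:8b q4 ~= 4.7 GB, rounded up
GEMMA4_E4B_GB=3     # gemma4:e4b ~= 3.0 GB
HEADROOM=$((TOTAL_GB - SYSTEM_GB - LLAMA31_GB - GEMMA4_E4B_GB))
echo "Headroom with both models loaded: ${HEADROOM} GB"   # prints 13 GB
```

That headroom is what the KV cache eats into as you raise num_ctx, which is why the context table above matters when running models side by side.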