Saykai | Safety & Change OS for Autonomous Systems

Gate every behavior change behind real tests.

Saykai is a safety and change OS that plugs into your CI for agents, robots, and autonomous systems. When models, policies, or prompts change, Saykai reruns scenarios and log replays. If a change fails the Safety Spec, it does not ship.

In plain terms, Saykai is the safety gate in front of production: it runs behavior changes through real scenarios and logs, and blocks anything that does not meet your own safety standard.

Warehouse and logistics robots
Agents and copilots
Industrial and AV

Most teams start with one warehouse or industrial fleet, then expand the same safety and change control to agents and other autonomy stacks.

Why teams use Saykai

A real gate between change and production.

Today, behavior changes rely on checklists and long chats. Saykai gives you one consistent place where changes are tested, compared, and either approved or blocked.

Single decision point

All behavior changes go through the same gate.

Any change that can affect behavior, including models, prompts, policies, and tools, must pass the Safety Spec before it can deploy.

Grounded in reality

Scenarios and log replays, not toy examples.

Capture incidents and edge cases as scenarios. Saykai runs them every time, so new releases do not repeat old mistakes.

Evidence by default

A Safety Pack attached to each release.

Every approved change ships with a Safety Pack that explains what changed, what was tested, how it performed, and why it was allowed.

Product

Safety and change control as part of the stack.

Saykai plugs into your CI, simulation, and logging. You keep your existing models, agents, and infrastructure. We add a safety layer that is consistent and repeatable.

There is no new runtime to build on and no model lock-in. Saykai is the gate you use for behavior changes and the record you keep for what you shipped.

At a glance

Inputs

New models, prompts, policies, tools, configs, and firmware that control behavior.

Saykai does

Runs scenarios and log replays, compares against the Safety Spec and previous versions, and returns a clear pass or fail decision.

Outputs

Approved deploys, blocked changes with reasons, and a Safety Pack for every release.

How it works

Three parts that fit into your pipeline.

Saykai does not replace your tools. It connects them and gives everyone the same view of what "safe enough" means.

Step 1

Connect Saykai to your stack.

We integrate with your CI, simulation harness, and log store. Saykai runs where you already run tests and analysis.

Step 2

Define a Safety Spec.

Together we write a Safety Spec for one system. It describes scenarios, metrics, thresholds, and drift limits for that system.

Step 3

Gate every behavior change.

Every change runs through Saykai. If it passes the Safety Spec, it ships with a Safety Pack. If not, the gate stays closed and the team sees why.
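
As a rough illustration of Step 3, the sketch below turns a gate decision into a process exit code, which is how most CI systems decide whether a pipeline continues. The `GateResult` type, the `exit_code` function, and the `example-change` identifier are hypothetical names chosen for this example; they are not Saykai's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class GateResult:
    """Hypothetical outcome of one gate run (not Saykai's real schema)."""
    change_id: str
    violations: list[str] = field(default_factory=list)

    @property
    def decision(self) -> str:
        # Any violation of the Safety Spec keeps the gate closed.
        return "block" if self.violations else "approve"


def exit_code(result: GateResult) -> int:
    """Map the decision to an exit code; a non-zero code fails the CI step."""
    for reason in result.violations:
        # Print each reason so the team sees why the gate stayed closed.
        print(f"[{result.change_id}] blocked: {reason}")
    return 0 if result.decision == "approve" else 1


passing = GateResult(change_id="pr-4829")
failing = GateResult(
    change_id="example-change",
    violations=["collisions_per_10k above limit"],
)
```

In a pipeline, a wrapper script would typically end with `sys.exit(exit_code(result))` so that a blocked change stops the deploy job.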

Use cases

For AI that touches real operations.

If a mistake can move money, move equipment, or move people, changes should go through a safety gate rather than rely on ad hoc checks alone.

Agents and operations copilots

Gate tool and policy changes for agents that can move cash, edit production plans, or book freight. Run full workflows in sandboxes before rollout.

Fintech and trading · Supply chain · Internal tools

Warehouse and mobile robots

Block autonomy updates that cause regressions in collisions, near misses, or throughput in warehouse, logistics, or inspection fleets.

Warehouse robots · Inspection robots · Mixed human traffic

Industrial and AV programs

Connect your Safety Spec to your safety case and risk process. Generate consistent Safety Packs for leadership and regulators.

Cobots · Industrial cells · AV and driver assist

Safety Spec and Safety Pack

Make "safe enough" something you can write down.

Instead of ad hoc rules and private notes, Saykai uses a Safety Spec and a Safety Pack. That way everyone sees the same standard and the same record.

A typical Safety Spec includes:

  • Scenarios that must pass on every change, both simulated and replayed from real incidents.
  • Metrics like collisions, near misses, unsafe actions, overrides, and latency bands.
  • Thresholds that define pass or fail and when human sign-off is required.
  • Drift limits that cap how much behavior can move from the last approved version.
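
One way to read the drift limits in the list above is as a cap on the relative change of each metric against the last approved version. The sketch below is a minimal illustration under that assumption. The function name, the choice of relative change in either direction, and all numbers are hypothetical and are not taken from Saykai.

```python
def drift_violations(prev, current, limits):
    """Return {metric: drift} for metrics whose relative change exceeds its limit.

    Drift is measured as |current - prev| / |prev|, so movement in either
    direction counts. This definition is an assumption made for the sketch.
    """
    violations = {}
    for name, limit in limits.items():
        baseline = prev[name]
        if baseline == 0:
            # Relative change is undefined at zero; treat any movement as unbounded drift.
            drift = 0.0 if current[name] == 0 else float("inf")
        else:
            drift = abs(current[name] - baseline) / abs(baseline)
        if drift > limit:
            violations[name] = drift
    return violations


# Illustrative values only: last approved version, candidate version, and caps.
prev = {"near_miss_25cm": 20, "human_override_pct": 0.10}
current = {"near_miss_25cm": 15, "human_override_pct": 0.25}
limits = {"near_miss_25cm": 0.5, "human_override_pct": 0.5}

flagged = drift_violations(prev, current, limits)
```

Here near misses moved by 25 percent and stay inside the cap, while the override rate moved by about 150 percent and would be flagged for review.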

Example · saykai.yml

    spec:
      system_id: "warehouse_amr_prod"
      targets:
        max_collisions_per_10k: 0
        max_near_miss_25cm: 25
        max_human_override_pct: 0.25
        max_latency_p95_ms: 1200
      scenarios:
        sim:
          - tight_aisle_crossing
          - human_in_aisle
          - forklift_cross_traffic
        replay:
          - 2025-11-21-night-shift
          - 2025-11-23-weekend-peak
    on_change:
      triggers:
        - "github:main:apps/warehouse-bot/**"
      actions:
        - run_safety_sweep
        - compare_to_baseline
        - gate_if_violates_spec

Example · Safety Pack (shortened)

    {
      "system": "warehouse_amr_prod",
      "change_id": "pr-4829",
      "spec_version": "scl-95-amr-v3",
      "decision": "approve",
      "metrics": {
        "collisions_per_10k": 0,
        "near_miss_25cm": { "prev": 18, "current": 12 },
        "human_override_pct": { "prev": 0.11, "current": 0.19 },
        "completion_time_p95_sec": { "prev": 74, "current": 68 }
      },
      "scenarios_executed": [
        "tight_aisle_crossing",
        "human_in_aisle",
        "forklift_cross_traffic",
        "2025-11-21-night-shift"
      ]
    }
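
To make the link between the two examples above concrete, the sketch below checks the Safety Pack's current metrics against the targets from saykai.yml. It assumes that a target named `max_<metric>` caps the metric named `<metric>`; that naming rule and the function names are assumptions for illustration, not Saykai's documented behavior.

```python
# Targets copied from the saykai.yml example. max_latency_p95_ms is left out
# because the shortened Safety Pack does not report a latency metric.
TARGETS = {
    "max_collisions_per_10k": 0,
    "max_near_miss_25cm": 25,
    "max_human_override_pct": 0.25,
}

# Metrics copied from the Safety Pack example.
PACK_METRICS = {
    "collisions_per_10k": 0,
    "near_miss_25cm": {"prev": 18, "current": 12},
    "human_override_pct": {"prev": 0.11, "current": 0.19},
}


def current_value(entry):
    """A metric is either a bare number or a {"prev", "current"} pair."""
    return entry["current"] if isinstance(entry, dict) else entry


def check_targets(targets, metrics):
    """Return one message per target that the current metrics exceed."""
    violations = []
    for target_name, limit in targets.items():
        # Assumed naming rule: target "max_<metric>" caps metric "<metric>".
        metric_name = target_name.removeprefix("max_")
        value = current_value(metrics[metric_name])
        if value > limit:
            violations.append(
                f"{metric_name}={value} exceeds {target_name}={limit}"
            )
    return violations


def decide(targets, metrics):
    """Approve only when no target is violated."""
    return "block" if check_targets(targets, metrics) else "approve"


print(decide(TARGETS, PACK_METRICS))  # prints "approve"
```

With the example numbers every metric sits at or under its target, which matches the `"decision": "approve"` field in the pack. Raising `near_miss_25cm` above 25 would flip the result to `block`.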

Each Safety Pack answers three questions for every release:

  • What changed in the system.
  • What we tested before shipping it.
  • Why it was safe to put into production at that time.

You can attach Safety Packs to deploy notes, incident reviews, and audit documents without reconstructing the story by hand.

Start with one system and one clear before and after.

We usually begin with a single agent workflow or autonomy stack. In a focused pilot we define a Safety Spec, wire it into CI, and run your next few changes through Saykai.

If you lead autonomy, safety, or reliability for a system like this, a pilot is the fastest way to see Saykai running inside your stack.

To start one, email hello@saykai.com