Claude Sonnet 4.6 achieves new benchmarks /// OpenClaw hits 50K stars /// Vibe Coding surges 340%

Speaking

Talks on agents in the real world: reliability, security, evaluation, and shipping.

  • Agent reliability: why demos lie
  • Action safety: permissions, audit trails, guardrails
  • What good evals look like