Documentation Β· Releases Β· Antikythera

Antikythera

v.G.1.3 β€” Antikythera

The hub measures CPU against the actual chainweb-node hot path, not an abstract sysbench number.

Antikythera is the cpu-benchmark-chainweb-tuning spec. The pre-v.G.1.3 bench ran sysbench (single + multi-thread) and called it done β€” a generic CPU number that didn't say much about whether your box would run chainweb well. The v.G.1.3 bench adds six workloads that mirror chainweb-node's hot path (Blake2s hashing, secp256k1 ECDSA verify, Ed25519 verify, random-memory access, single-thread Pact-proxy, hash-and-verify pipeline), promotes the IPC instructions-per-cycle counter from diagnostic to a scored signal at ~10% weight, blends the lot into a chainweb-aggregate score, and labels the result with one of four buckets β€” strong / good / adequate / weak β€” so operators can read at a glance whether a box is chainweb-friendly. The slice bench reaches full structural parity with the host bench: single-arbiter scoring formula via `computeServerScore` for both bench types. And along the way the rehaul caught a pre-existing bash-redirection bug that had been silently nulling IPC fleet-wide since the perf-stat code shipped β€” a detail that itself justified the entire rehaul.

What landed

Six chainweb-correlated workloads

  • Blake2s throughput β€” mirrors chainweb's hash-tree construction. Measured via openssl speed -evp blake2s256. Reported as MB/s.
  • secp256k1 ECDSA verify β€” mirrors transaction-signature verification. Measured via openssl speed ecdsap256. Reported as verifications/sec.
  • Ed25519 verify β€” mirrors block-signature verification. Measured via openssl speed ed25519. Reported as verifications/sec.
  • Random-memory access β€” mirrors RocksDB random-read latency. Measured via sysbench memory --memory-access-mode=rnd --memory-block-size=4K. Reported as MB/s.
  • Pact-proxy single-thread β€” mirrors Pact contract evaluation latency. Measured via sysbench cpu --threads=1 --cpu-max-prime=20000. Reported as events/sec.
  • Hash-and-verify pipeline β€” mirrors chainweb's block-validation hot path. Custom shell loop combining Blake2s hash + Ed25519 verify in a tight inner loop. Reported as ops/sec.

Total bench-time addition β‰ˆ 30 seconds. No new apt dependencies: every workload uses openssl, sysbench, and b2sum β€” all already part of the existing bench deps phase.

Chainweb-aggregate score + bucket label

The new chainwebScore blends:

  • 50% multi-thread sysbench events/sec (the legacy signal, retained)
  • 10% IPC ratio (instructions-per-cycle from the perf hardware counter)
  • 40% averaged chainweb-workload ratios (the six workloads above, equally weighted)

Each ratio is the measured value divided by the calibrated baseline for a reference 6c/12t modern desktop class (Ryzen 5 5600 / i5-12400). The blended score then maps to a four-state bucket per REQ-08:

  • β˜… Strong β€” score β‰₯ 1.0, at or above reference.
  • βœ“ Goodβ€” 0.75 ≀ score < 1.0, solid chainweb performance, slightly below reference.
  • ~ Adequateβ€” 0.5 ≀ score < 0.75, usable but not optimal.
  • ! Weakβ€” score < 0.5, significantly below reference; chainweb-node may struggle under load.

IPC promotion + the silent-null bug

The pre-v.G.1.3 perf-stat invocation contained a Bash redirection order-of-operations bug: (perf stat ... >/dev/null 2>&1) 2>&1 | tail -30 discarded perf's stderr (where the hardware-counter table lives) inside the subshell before the outer 2>&1 could capture it. Result: the IPC value has been silently NULL fleet-wide since this code shipped. v.G.1.3 fixes it via a capture-then-detect rewrite (no subshell-internal silencing) plus a sudoers refresh that adds /usr/bin/perf + /usr/lib/linux-tools-*/perf so sudo -n perf stat works on Ubuntu 22.04+ paranoid-3+ kernels. First re-bench after the upgrade may show a step-change in IPC for boxes that were silently failing before.

Slice bench full mirror β€” single-arbiter scoring

The pre-v.G.1.3 segregated-slice bench had its own parallel scoring path that bypassed computeServerScore β€” slice rows scored differently from host rows under the same hardware. v.G.1.3 retires the inline shortcut and routes both bench types through a single canonical formula. Score-formula bug fixes from now on affect both bench types simultaneously. The slice bench also gets the same six chainweb workloads (executed inside the slice's systemd-run --scope cgroup), the same IPC capture (with graceful fallback when perf-stat fails inside the cgroup), and the same phase markers that drive the new substep UI.

Operator UX β€” live substep list

The pre-v.G.1.3 bench-progress card showed a single CPU progress bar. v.G.1.3 replaces it with a vertical substep list (mirroring the v.G.1.2e bootstrap-stage UX): each of the nine substeps β€” cpu_single, cpu_multi, perf_stress, plus the six chainweb workloads β€” transitions β—‹ pending β†’ ● in-progress β†’ βœ“ complete with the measured value rendered inline. Workloads that are unavailable (older OpenSSL without Ed25519, missing b2sum, etc.) render as β—‹ skipped (unavailable) rather than blocking the bench.

Bucket-label visibility

  • Operator's own server score card β€” bucket-label tile next to the CPU score, with a hover hint explaining the bucket semantics.
  • Admin cluster overview / hub nodes index β€” bucket pill in the ServerScore column.
  • Public node listings β€” only the raw chainweb score is visible, NOT the bucket label (audited and locked with a regression test).

Operator notes

  • No forced re-bench. Existing bench rows survive the migration with NULL on the new chainweb fields β†’ legacy formula path keeps scoring them correctly. Operators who want the new bucket label run a fresh bench.
  • First re-bench surfaces the IPC fix. Boxes that were silently failing perf-stat capture before will now report real IPC values, potentially shifting the chainweb score by a few percent.
  • Calibrated baselines, not absolute thresholds. The seven baselines (sysbench, IPC, plus six chainweb workloads) are starting points calibrated against a reference modern desktop. The v.G.1.3 spec ships a runtime-configurable getBaseline(key) mechanism so recalibration is possible without redeploy via the system_state table.
  • REQ-11 reduced contention penalty. The IPC term and the six chainweb workload terms are single-thread by design; the contention penalty applied to them uses only steal- time + run-variance (drops the multi-thread scaling-efficiency component). The legacy multi-thread sysbench term keeps the full penalty.

Out of scope (deliberately)

  • No Geekbench / PassMark calibration. Early discussion explored translating the existing sysbench number into Geekbench/PassMark units. The operator explicitly rejected this mid-discovery: the goal is to measure chainweb's hot path directly, not to translate a generic CPU number into another generic CPU number.
  • No 7-Zip LZMA proxy and no Pact-evaluation runtime bench. Both were considered but deferred to later releases β€” the six workloads above cover chainweb's actual hot path; adding more would push the bench-time budget without proportionate signal gain.
  • No forced operator action. Re-benching is encouraged but never blocking.

Background β€” why "Antikythera"

The Antikythera mechanism is the world's first known analog computer. Constructed around 100 BCE, recovered in 1901 from a Greek shipwreck off the island of Antikythera, it used some 30 bronze gears to predict astronomical positions, eclipse cycles, and the timing of the ancient Olympic games β€” with a precision that wasn't matched until the mechanical clocks of ~1500 years later.

It is also literally the first device known to have measured computational behaviour: its gear-train computed celestial mechanics with the same kind of precision-against-a-known-reference contract that the v.G.1.3 chainweb bench applies to CPU hot paths. Multiple workloads, blended into a single output, calibrated against a reference. Two thousand years apart, the same underlying idea.

It also has an uncomfortable echo for this spec: the mechanism's true capability was hidden for centuries before being properly understood β€” not unlike the perf-stat IPC capture that v.G.1.3 finally surfaced as having been silently broken since the code shipped.

Further reading