Reliability semantics¶
This document defines the public contract for timeouts, retries, cache TTLs, and connection/readiness state. Web UI and rigctld documentation can reference this section for consistent behavior.
Timeouts¶
Default values and where they apply¶
| Context | Default | Where it applies | Override |
|---|---|---|---|
| Connect / general | 5 s | CLI --timeout; LanBackendConfig.timeout, SerialBackendConfig.timeout; IcomTransport.receive_packet() default | CLI: --timeout N; backend config: timeout=N |
| Discovery | 1 s per attempt | transport.py: DISCOVERY_TIMEOUT; each discovery wait uses this; up to DISCOVERY_RETRIES (10) attempts | Not configurable (constants) |
| Control-phase steps | 2 s total, 0.3 s per read | _control_phase.py: discovery/status steps use short receive timeouts (e.g. 0.1 s, 0.3 s) and 2 s deadlines | Not configurable |
| CI-V GET | min(connection timeout, 2 s) | Single CI-V request/response: _civ_get_timeout in radio.py is min(timeout, 2.0) | Set via backend timeout (capped at 2 s for GETs) |
| CI-V recovery wait | 12 s | _wait_for_civ_transport_recovery(): max wait before giving up | ICOM_CIV_RECOVERY_WAIT_TIMEOUT_S (env) |
| CI-V data watchdog | 2 s | If no CI-V data for this long, open_close is sent to restart stream (_civ_rx.py: _CIV_DATA_WATCHDOG_TIMEOUT) | Not configurable (constant) |
| Rigctld client idle | 300 s | TCP client disconnected after no activity for this long | RigctldConfig.client_timeout |
| Rigctld command | 2 s | Per-command execution; CI-V round-trip must complete within this | RigctldConfig.command_timeout |
| Scope assembly | 5 s | Incomplete scope frame discarded after this (scope.py: _DEFAULT_ASSEMBLY_TIMEOUT) | ScopeAssembler(assembly_timeout=...) |
| Capture (CLI) | 10 s (spectrum) / 15 s (waterfall) | capture_scope_frame / capture_scope_frames when invoked from CLI | --capture-timeout N |
| Watchdog (LAN) | 30 s | No control-packet activity for this long triggers reconnect (LanBackendConfig.watchdog_timeout) | Backend config: watchdog_timeout=N |
| Circuit breaker recovery | 5 s | After circuit opens, wait this long before one probe (HALF_OPEN) | CircuitBreaker(recovery_timeout=N) |
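The CI-V GET row is the one entry whose effective value depends on another setting. The following is a minimal sketch of that cap, assuming only what the table states (min(timeout, 2.0)); the helper and constant names are hypothetical and are not taken from radio.py.

```python
CIV_GET_CAP_S = 2.0  # cap from the table above; the constant name is hypothetical


def civ_get_timeout(connection_timeout: float) -> float:
    """Return the effective wait for a single CI-V GET (hypothetical helper)."""
    return min(connection_timeout, CIV_GET_CAP_S)


print(civ_get_timeout(5.0))  # default connection timeout: the 2 s cap applies
print(civ_get_timeout(0.5))  # a shorter connection timeout is used as-is
```

Raising the backend timeout above 2 s therefore lengthens connect-style waits but leaves GET waits unchanged.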
Sync API¶
The sync wrapper (sync.py) uses a single operation timeout when running async code via asyncio.run; default is 5.0 seconds.
Environment variables (CI-V tuning)¶
- ICOM_CIV_RECOVERY_WAIT_TIMEOUT_S: max wait for CI-V transport recovery (default 12.0).
- ICOM_CIV_READY_IDLE_TIMEOUT_S: max idle time since last CI-V data for radio_ready to stay true (default 5.0).
- ICOM_CIV_RETRY_SLICE_MS: slice used for retry/backoff (default 150 ms).
- ICOM_CIV_ACK_SINK_GRACE_MS: grace time for ACK sink (default 120 ms).
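As an illustration of how such tuning variables are typically consumed, here is a small sketch that reads them with their documented defaults. The env_float helper is hypothetical, and falling back to the default on an unparsable value is an assumption; the library's actual parsing may differ.

```python
import os


def env_float(name: str, default: float) -> float:
    """Read a float from the environment (hypothetical helper).

    Falls back to the default when the variable is unset or not a number;
    that fallback is an assumption, not documented library behavior.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError:
        return default


# Seconds-based settings are used directly; millisecond settings are converted.
recovery_wait_s = env_float("ICOM_CIV_RECOVERY_WAIT_TIMEOUT_S", 12.0)
ready_idle_s = env_float("ICOM_CIV_READY_IDLE_TIMEOUT_S", 5.0)
retry_slice_s = env_float("ICOM_CIV_RETRY_SLICE_MS", 150.0) / 1000.0
ack_grace_s = env_float("ICOM_CIV_ACK_SINK_GRACE_MS", 120.0) / 1000.0
```

Note the unit difference: the first two variables are in seconds, the last two in milliseconds.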
Cache TTLs¶
State cache (shared layer)¶
The shared state cache used by both the Web UI and rigctld is defined in rigctld/state_cache.py and consumed via _shared_state_runtime.is_cache_fresh().
- Default TTL: DEFAULT_STATE_CACHE_TTL in _shared_state_runtime.py is 0.2 seconds. This is the same as RigctldConfig.cache_ttl and is used so that web and rigctld share the same freshness semantics for frequency, mode, and related fields.
- “Stale” meaning: A cache field is considered stale when either:
    - its timestamp is missing (never written), or
    - (time.monotonic() - field_ts) >= max_age_s.

    So “stale” = not fresh: too old or never set.
- Fallback behavior: When the cache is not fresh, the handler (e.g. rigctld) performs a CI-V round-trip to the radio and then updates the cache. If that round-trip times out, the command returns an error (e.g. Hamlib ETIMEOUT); the previous cached value is not returned as a fallback for that command.
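The staleness rule can be sketched as a single predicate. This is a stand-in written from the definition above, not the signature of _shared_state_runtime.is_cache_fresh(); the function name is_fresh and the now parameter (added so the check can be tested deterministically) are assumptions.

```python
import time
from typing import Optional

DEFAULT_STATE_CACHE_TTL = 0.2  # seconds, as stated above


def is_fresh(
    field_ts: Optional[float],
    max_age_s: float = DEFAULT_STATE_CACHE_TTL,
    now: Optional[float] = None,
) -> bool:
    """Return True only if the field was written and is younger than max_age_s."""
    if field_ts is None:
        # Never written: stale by definition.
        return False
    if now is None:
        now = time.monotonic()
    # Stale when (now - field_ts) >= max_age_s, so fresh is the strict inverse.
    return (now - field_ts) < max_age_s
```

A handler following the fallback rule would call the radio whenever this returns False and surface a timeout as an error instead of returning the old value.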
Internal radio cache (freq/mode/rf_power)¶
IcomRadio keeps an internal _DEFAULT_CACHE_TTL dict (e.g. freq: 10 s, mode: 10 s, rf_power: 30 s) used for internal caching of last-known values. This is separate from the shared state cache TTL (0.2 s) used by the server layers.
radio_ready and connection states¶
When the radio is “ready”¶
radio_ready is a property on the radio protocol and IcomRadio. The radio is considered ready when:

- Connected: conn_state == RadioConnectionState.CONNECTED and the CI-V transport is present and not in a UDP error state.
- Not recovering: _civ_recovering is false and _civ_stream_ready is true.
- Recent CI-V data: _last_civ_data_received is set and (time.monotonic() - _last_civ_data_received) <= _civ_ready_idle_timeout (default 5 s).
So “ready” means: connection is up, CI-V stream is up and not in recovery, and the radio has sent CI-V data within the idle timeout window.
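The three conditions combine into one boolean check. The sketch below restates them over a hypothetical snapshot object; the ReadinessSnapshot class, its field names, and the enum values are illustrative assumptions, not the attributes of IcomRadio.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RadioConnectionState(Enum):
    # Member names follow this document; the string values are assumptions.
    DISCONNECTED = "disconnected"
    CONNECTING = "connecting"
    CONNECTED = "connected"
    DISCONNECTING = "disconnecting"
    RECONNECTING = "reconnecting"


@dataclass
class ReadinessSnapshot:  # hypothetical container, for illustration only
    conn_state: RadioConnectionState
    transport_ok: bool  # CI-V transport present and not in a UDP error state
    civ_recovering: bool
    civ_stream_ready: bool
    last_civ_data_received: Optional[float]  # time.monotonic() timestamp or None
    idle_timeout_s: float = 5.0


def is_radio_ready(s: ReadinessSnapshot, now: Optional[float] = None) -> bool:
    """Apply the three readiness conditions in order."""
    if now is None:
        now = time.monotonic()
    if s.conn_state is not RadioConnectionState.CONNECTED or not s.transport_ok:
        return False
    if s.civ_recovering or not s.civ_stream_ready:
        return False
    if s.last_civ_data_received is None:
        return False
    return (now - s.last_civ_data_received) <= s.idle_timeout_s
```

Note that the idle comparison is inclusive (<=), unlike the cache staleness check, which treats an age equal to the TTL as stale.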
Relation to RadioConnectionState¶
RadioConnectionState (in _connection_state.py) is the high-level connection state:
- DISCONNECTED: not connected or cleanly disconnected.
- CONNECTING: connect() in progress.
- CONNECTED: authenticated and operational.
- DISCONNECTING: disconnect() in progress.
- RECONNECTING: connection lost; auto-reconnect is waiting to retry (e.g. after watchdog timeout).
radio_ready is true only when state is CONNECTED and the CI-V stream is healthy and recently active. So you can be CONNECTED but not ready (e.g. CI-V data stalled or recovering).
What consumers (web, rigctld) should expect¶
- Web: The Web UI and API expose radio_ready (and optionally connection_state). Consumers should treat radio_ready === false as “do not rely on live CI-V” (e.g. scope or tuning may be deferred or show last-known state).
- Rigctld: Commands that hit the radio will get ETIMEOUT or connection errors if the radio is not responding; the server does not gate commands on radio_ready, but the circuit breaker will open after consecutive timeouts and fail commands quickly until recovery.
The helper radio_ready(radio) in web/runtime_helpers.py normalizes readiness: it uses radio.radio_ready if it is a boolean, otherwise falls back to radio.connected; None is treated as not ready.
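The normalization rule can be sketched as follows. This is written from the description above, not copied from web/runtime_helpers.py; the function name radio_ready_sketch is hypothetical, and treating a missing connected attribute as not ready is an assumption.

```python
from types import SimpleNamespace


def radio_ready_sketch(radio: object) -> bool:
    """Normalize readiness to a plain bool (illustrative stand-in)."""
    ready = getattr(radio, "radio_ready", None)
    if isinstance(ready, bool):
        return ready
    # Fall back to the coarser connected flag; None or a missing
    # attribute counts as not ready.
    return bool(getattr(radio, "connected", False))


# A backend that reports readiness directly.
print(radio_ready_sketch(SimpleNamespace(radio_ready=False, connected=True)))
# A backend without radio_ready: the connected flag decides.
print(radio_ready_sketch(SimpleNamespace(connected=True)))
# No radio object at all.
print(radio_ready_sketch(None))
```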
Fire-and-forget and retries¶
Fire-and-forget (CI-V)¶
- Commands that do not expect a response (e.g. some set operations) can be sent in a fire-and-forget way: the sender registers an ACK sink so that the corresponding ACK does not block the request queue. There is no automatic retry for a single fire-and-forget send; if the transport fails, the caller sees the exception.
- GET-style commands (request/response) are single-attempt per call: one send, wait up to _civ_get_timeout; on timeout, TimeoutError is raised (and the rigctld handler maps this to Hamlib ETIMEOUT). The library does not retry GETs internally.
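Because GETs are single-attempt, any retry policy belongs to the caller. The sketch below shows one way a caller might wrap a GET coroutine; get_with_retries, its parameters, and the pause between attempts are all hypothetical and not part of the library.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def get_with_retries(
    get: Callable[[], Awaitable[T]],
    attempts: int = 3,
    pause_s: float = 0.15,
) -> T:
    """Call a single-attempt GET up to `attempts` times (caller-side policy)."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return await get()
        except (TimeoutError, asyncio.TimeoutError):
            if attempt == attempts:
                raise  # out of attempts: surface the last timeout
            await asyncio.sleep(pause_s)
    raise AssertionError("unreachable")
```

A caller would pass a zero-argument callable that issues one GET, for example a lambda wrapping the relevant radio method. Keep in mind that each failed attempt still counts as a timeout from the radio's point of view.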
Discovery¶
- Discovery uses retries: up to DISCOVERY_RETRIES (10) attempts, each with DISCOVERY_TIMEOUT (1 s) wait. If no response after that, connect fails with TimeoutError.
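A bounded retry loop of this shape can be sketched as below. The constants mirror the values in the text; the discover function, its callback parameters, and the choice to re-send a probe on every attempt are assumptions for illustration and do not describe transport.py.

```python
import asyncio
from typing import Awaitable, Callable

DISCOVERY_TIMEOUT = 1.0  # seconds per attempt (from the text above)
DISCOVERY_RETRIES = 10   # maximum attempts (from the text above)


async def discover(
    send_probe: Callable[[], Awaitable[None]],
    wait_reply: Callable[[], Awaitable[bytes]],
    timeout: float = DISCOVERY_TIMEOUT,
    retries: int = DISCOVERY_RETRIES,
) -> bytes:
    """Probe until a reply arrives or the attempt budget is exhausted."""
    for _ in range(retries):
        await send_probe()
        try:
            return await asyncio.wait_for(wait_reply(), timeout)
        except (TimeoutError, asyncio.TimeoutError):
            continue  # no reply within this attempt's window; try again
    raise TimeoutError(f"no discovery reply after {retries} attempts")
```

With the default values, the worst case before connect fails is roughly 10 attempts of 1 s each, so about 10 s.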
Circuit breaker (rigctld)¶
- After failure_threshold consecutive timeouts (default 3), the circuit opens: subsequent commands fail immediately with ETIMEOUT without calling the radio.
- After recovery_timeout seconds (default 5) in the OPEN state, the circuit goes to HALF_OPEN and one probe command is allowed. Success → CLOSED; failure → OPEN again.
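The state transitions above can be sketched as a small state machine. This is an illustrative model only: the class CircuitBreakerSketch and its method names are hypothetical, it is not thread-safe, and it does not limit HALF_OPEN to exactly one in-flight probe the way a production breaker would.

```python
import time
from enum import Enum
from typing import Callable


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreakerSketch:
    def __init__(
        self,
        failure_threshold: int = 3,
        recovery_timeout: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    @property
    def state(self) -> State:
        # OPEN becomes HALF_OPEN once the recovery timeout has elapsed.
        if (
            self._state is State.OPEN
            and self._clock() - self._opened_at >= self.recovery_timeout
        ):
            self._state = State.HALF_OPEN
        return self._state

    def allow(self) -> bool:
        """True if a command may be sent to the radio right now."""
        return self.state is not State.OPEN

    def record_success(self) -> None:
        self._failures = 0
        self._state = State.CLOSED

    def record_timeout(self) -> None:
        self._failures += 1
        # A failed probe reopens immediately; otherwise open at the threshold.
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self._state = State.OPEN
            self._opened_at = self._clock()
```

A handler would check allow() before each command, fail fast when it returns False, and report the outcome through record_success() or record_timeout().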
Watchdog and reconnect¶
- If the control connection has no activity for watchdog_timeout (default 30 s for LAN), the runtime transitions to RECONNECTING and will try to reconnect. This is not a per-command retry but a connection-level recovery.
Summary table¶
| Concept | Default / value | Config / override |
|---|---|---|
| Connect timeout | 5 s | CLI --timeout, backend timeout |
| CI-V GET timeout | min(5, 2) = 2 s | Backend timeout (capped at 2 s for GETs) |
| Rigctld command timeout | 2 s | RigctldConfig.command_timeout |
| Rigctld client idle timeout | 300 s | RigctldConfig.client_timeout |
| State cache TTL (web/rigctld) | 0.2 s | RigctldConfig.cache_ttl, CLI --cache-ttl |
| radio_ready idle window | 5 s | ICOM_CIV_READY_IDLE_TIMEOUT_S |
| CI-V data watchdog | 2 s | Constant |
| Circuit breaker recovery | 5 s | CircuitBreaker(recovery_timeout=...) |