Skip to content

Reliability semantics

This document defines the public contract for timeouts, retries, cache TTLs, and connection/readiness state. Web UI and rigctld documentation can reference this section for consistent behavior.


Timeouts

Default values and where they apply

Context Default Where it applies Override
Connect / general 5 s CLI --timeout; LanBackendConfig.timeout, SerialBackendConfig.timeout; IcomTransport.receive_packet() default CLI: --timeout N; backend config: timeout=N
Discovery 1 s per attempt transport.py: DISCOVERY_TIMEOUT; each discovery wait uses this; up to DISCOVERY_RETRIES (10) attempts Not configurable (constants)
Control-phase steps 2 s total, 0.3 s per read _control_phase.py: discovery/status steps use short receive timeouts (e.g. 0.1 s, 0.3 s) and 2 s deadlines Not configurable
CI-V GET min(connection timeout, 2 s) Single CI-V request/response: _civ_get_timeout in radio.py is min(timeout, 2.0) Set via backend timeout (capped at 2 s for GETs)
CI-V recovery wait 12 s _wait_for_civ_transport_recovery(): max wait before giving up ICOM_CIV_RECOVERY_WAIT_TIMEOUT_S (env)
CI-V data watchdog 2 s If no CI-V data for this long, open_close is sent to restart stream (_civ_rx.py: _CIV_DATA_WATCHDOG_TIMEOUT) Not configurable (constant)
Rigctld client idle 300 s TCP client disconnected after no activity for this long RigctldConfig.client_timeout
Rigctld command 2 s Per-command execution; CI-V round-trip must complete within this RigctldConfig.command_timeout
Scope assembly 5 s Incomplete scope frame discarded after this (scope.py: _DEFAULT_ASSEMBLY_TIMEOUT) ScopeAssembler(assembly_timeout=...)
Capture (CLI) 10 s (spectrum) / 15 s (waterfall) capture_scope_frame / capture_scope_frames when invoked from CLI --capture-timeout N
Watchdog (LAN) 30 s No control-packet activity for this long triggers reconnect (LanBackendConfig.watchdog_timeout) Backend config: watchdog_timeout=N
Circuit breaker recovery 5 s After circuit opens, wait this long before one probe (HALF_OPEN) CircuitBreaker(recovery_timeout=N)

Sync API

The sync wrapper (sync.py) uses a single operation timeout when running async code via asyncio.run; default is 5.0 seconds.

Environment variables (CI-V tuning)

  • ICOM_CIV_RECOVERY_WAIT_TIMEOUT_S — max wait for CI-V transport recovery (default 12.0).
  • ICOM_CIV_READY_IDLE_TIMEOUT_S — max idle time since last CI-V data for radio_ready to stay true (default 5.0).
  • ICOM_CIV_RETRY_SLICE_MS — slice used for retry/backoff (default 150 ms).
  • ICOM_CIV_ACK_SINK_GRACE_MS — grace time for ACK sink (default 120 ms).

Cache TTLs

State cache (shared layer)

The shared state cache used by both the Web UI and rigctld is defined in rigctld/state_cache.py and consumed via _shared_state_runtime.is_cache_fresh().

  • Default TTL: DEFAULT_STATE_CACHE_TTL in _shared_state_runtime.py is 0.2 seconds. This is the same as RigctldConfig.cache_ttl and is used so that web and rigctld share the same freshness semantics for frequency, mode, and related fields.
  • “Stale” meaning: A cache field is considered stale when either:
  • its timestamp is missing (never written), or
  • (time.monotonic() - field_ts) >= max_age_s. So “stale” = not fresh: too old or never set.
  • Fallback behavior: When the cache is not fresh, the handler (e.g. rigctld) performs a CI-V round-trip to the radio and then updates the cache. If that round-trip times out, the command returns an error (e.g. Hamlib ETIMEOUT); the previous cached value is not returned as a fallback for that command.

Internal radio cache (freq/mode/rf_power)

IcomRadio keeps an internal _DEFAULT_CACHE_TTL dict (e.g. freq: 10 s, mode: 10 s, rf_power: 30 s) used for internal caching of last-known values. This is separate from the shared state cache TTL (0.2 s) used by the server layers.


radio_ready and connection states

When the radio is “ready”

  • radio_ready is a property on the radio protocol and IcomRadio. The radio is considered ready when:
  • Connected: conn_state == RadioConnectionState.CONNECTED and the CI-V transport is present and not in a UDP error state.
  • Not recovering: _civ_recovering is false and _civ_stream_ready is true.
  • Recent CI-V data: _last_civ_data_received is set and (time.monotonic() - _last_civ_data_received) <= _civ_ready_idle_timeout (default 5 s).

So “ready” means: connection is up, CI-V stream is up and not in recovery, and the radio has sent CI-V data within the idle timeout window.

Relation to RadioConnectionState

RadioConnectionState (in _connection_state.py) is the high-level connection state:

  • DISCONNECTED — not connected or cleanly disconnected.
  • CONNECTINGconnect() in progress.
  • CONNECTED — authenticated and operational.
  • DISCONNECTINGdisconnect() in progress.
  • RECONNECTING — connection lost; auto-reconnect is waiting to retry (e.g. after watchdog timeout).

radio_ready is true only when state is CONNECTED and the CI-V stream is healthy and recently active. So you can be CONNECTED but not ready (e.g. CI-V data stalled or recovering).

What consumers (web, rigctld) should expect

  • Web: The Web UI and API expose radio_ready (and optionally connection_state). Consumers should treat radio_ready === false as “do not rely on live CI-V” (e.g. scope or tuning may be deferred or show last-known state).
  • Rigctld: Commands that hit the radio will get ETIMEOUT or connection errors if the radio is not responding; the server does not gate commands on radio_ready, but the circuit breaker will open after consecutive timeouts and fail commands quickly until recovery.

The helper radio_ready(radio) in web/runtime_helpers.py normalizes readiness: it uses radio.radio_ready if it is a boolean, otherwise falls back to radio.connected; None is treated as not ready.


Fire-and-forget and retries

Fire-and-forget (CI-V)

  • Commands that do not expect a response (e.g. some set operations) can be sent in a fire-and-forget way: the sender registers an ACK sink so that the corresponding ACK does not block the request queue. There is no automatic retry for a single fire-and-forget send; if the transport fails, the caller sees the exception.
  • GET-style commands (request/response) are single-attempt per call: one send, wait up to _civ_get_timeout; on timeout, TimeoutError is raised (and the rigctld handler maps this to Hamlib ETIMEOUT). The library does not retry GETs internally.

Discovery

  • Discovery uses retries: up to DISCOVERY_RETRIES (10) attempts, each with DISCOVERY_TIMEOUT (1 s) wait. If no response after that, connect fails with TimeoutError.

Circuit breaker (rigctld)

  • After failure_threshold consecutive timeouts (default 3), the circuit opens: subsequent commands fail immediately with ETIMEOUT without calling the radio.
  • After recovery_timeout seconds (default 5) in the OPEN state, the circuit goes to HALF_OPEN and one probe command is allowed. Success → CLOSED; failure → OPEN again.

Watchdog and reconnect

  • If the control connection has no activity for watchdog_timeout (default 30 s for LAN), the runtime transitions to RECONNECTING and will try to reconnect. This is not a per-command retry but a connection-level recovery.

Summary table

Concept Default / value Config / override
Connect timeout 5 s CLI --timeout, backend timeout
CI-V GET timeout min(5, 2) = 2 s Backend timeout (capped at 2 s for GETs)
Rigctld command timeout 2 s RigctldConfig.command_timeout
Rigctld client idle timeout 300 s RigctldConfig.client_timeout
State cache TTL (web/rigctld) 0.2 s RigctldConfig.cache_ttl, CLI --cache-ttl
radio_ready idle window 5 s ICOM_CIV_READY_IDLE_TIMEOUT_S
CI-V data watchdog 2 s Constant
Circuit breaker recovery 5 s CircuitBreaker(recovery_timeout=...)