RaceLink Developer Guide¶
Checklists for the recurring "I want to add X" tasks. Each checklist
walks you through every file that needs an update so a feature
addition doesn't accidentally land half-implemented (the
sendGroupControl ghost-method incident, where a renamed method
hid behind a broad except for over a year, is the cautionary
tale here).
For the why and the wire format, see:
- architecture.md — package layout + threading model.
- wire-protocol.md — wire-format reference.
- ui-conventions.md — button vocabulary, toast/confirm rules.
Adding a new scene-action kind¶
Scene actions are the building blocks of a scene (e.g. wled_preset,
startblock, delay, sync, offset_group). The kind name is the
canonical identifier across the validator, runner, and editor.
Files to touch (in dependency order):
1. Constant in racelink/services/scenes_service.py: add the KIND_* constant for the new kind.
2. Validator in the same file: add a _canonical_my_new_kind_action helper if the action has a non-trivial shape, and dispatch to it from _canonical_action. If your kind requires a target, validate it via the existing _canonical_target (group / device).
3. Editor schema in get_action_kinds_metadata(): declare the kind with its vars (UI inputs), supports_flags_override, etc. The WebUI consumes this to render the action body.
4. Dispatch plan in racelink/services/dispatch_planner.py: add a branch in plan_action_dispatch (or extend _plan_effect) that produces one WireOp per wire packet the action would emit. Each op carries sender (the symbolic adapter key, e.g. "send_control"), payload (kwargs ready to spread into the named sender), and body_bytes sized via the canonical builder in racelink/protocol/packets.py. This is the single source of truth: the runner and the cost estimator both consume the resulting plan, so a kind is "done" once its planner branch is correct.
5. Runner adapter in racelink/services/scene_runner_service.py: if your kind needs a new symbolic sender (uncommon; most kinds re-use send_control / send_wled_preset / send_offset / send_sync), add the mapping in _dispatch_op. Otherwise just register the per-kind shim that delegates to _plan_and_execute. The cost estimator picks up the new kind automatically, so no changes are needed there.
6. Capability mapping in racelink/static/scenes.js (requiredCapForKind): if the new kind requires a device capability (WLED / STARTBLOCK / etc.), return the cap string. Without this entry the editor will not filter target dropdowns for the new kind, and you'll re-introduce the silent-success bug class C5 closed.
7. Frontend rendering in scenes.js:
   - Add KIND_MY_NEW_KIND (or just the string literal) to SCENE_KIND_LABELS and SCENE_KINDS_ORDER.
   - If the kind has parameters, add them to the editor schema in step 3; the generic buildVarsRow will render them. Custom widgets (e.g. the offset_group config panel) need their own buildXyz function.
   - defaultActionForKind(kind): return the seed shape for the editor's "+ Add" button.
8. Tests in tests/test_scenes_service.py (validator round-trip, edge cases) and tests/test_scene_runner_service.py (dispatch happy path + transport-missing degraded path). If the kind has cost characteristics worth pinning, also add a test_scene_cost_estimator.py test.
9. Plan-file note if the addition is significant: append to the active plan at the maintainer's internal engineering ledger so the rationale stays linked to the change.
Checklist:
[ ] KIND_* constant in scenes_service.py
[ ] _canonical_*_action validator (if non-trivial shape)
[ ] get_action_kinds_metadata entry
[ ] dispatch_planner.py branch (the single source of truth — runner + estimator both consume it)
[ ] scene_runner_service.py: _dispatch_op mapping if a new sender is needed; per-kind shim that delegates to _plan_and_execute
[ ] scenes.js requiredCapForKind entry (if cap-gated)
[ ] scenes.js SCENE_KIND_LABELS + SCENE_KINDS_ORDER
[ ] scenes.js defaultActionForKind seed
[ ] tests/test_dispatch_planner.py — pin the planner output for the new kind
[ ] tests/test_dispatch_parity.py — runner + estimator agree on packet count & per-op sender
[ ] tests for validator + runner-side adapter (degraded paths, etc.)
[ ] plan-file note (if significant)
[ ] manual smoke: editor renders the kind, save+load round-trips, run produces the expected wire trace
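The "planner is the single source of truth" item can be sketched as follows. WireOp here is an illustrative stand-in carrying the three fields the guide names (sender, payload, body_bytes); the body layout, the build_my_new_kind_body helper, and the execute / estimate_bytes functions are assumptions made for the example, not the real dispatch_planner.py API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WireOp:
    """Illustrative stand-in: one planned wire packet."""
    sender: str                                   # symbolic adapter key
    payload: dict = field(default_factory=dict)   # kwargs for the named sender
    body_bytes: int = 0                           # body size from the builder


def build_my_new_kind_body(group_id, duration_ms):
    """Hypothetical body: 1-byte group + 2-byte little-endian duration."""
    return bytes([group_id & 0xFF]) + duration_ms.to_bytes(2, "little")


def plan_my_new_kind(action):
    """Planner branch: one WireOp per packet the action would emit."""
    payload = {
        "group_id": action["target"]["value"],
        "duration_ms": action["duration_ms"],
    }
    body = build_my_new_kind_body(**payload)
    return [WireOp(sender="send_control", payload=payload, body_bytes=len(body))]


def execute(ops, senders):
    """Runner side: spread each payload into the named sender."""
    return [senders[op.sender](**op.payload) for op in ops]


def estimate_bytes(ops):
    """Estimator side: consumes the same plan without any I/O."""
    return sum(op.body_bytes for op in ops)
```

Because both consumers read the same list of ops, a parity test only has to compare packet count and per-op sender between the two paths.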
Adding a new wire opcode¶
Adding an opcode means changing the wire format — coordinate across
all three repos (Host, Gateway, WLED). The tests/test_proto_header_drift.py
test will fail otherwise.
The catalog headers ride with racelink_proto.h. Two WLED-neutral catalog headers, racelink_headless.h (scene catalog) and racelink_indicators.h (indicator catalog), are distributed alongside racelink_proto.h and must stay byte-identical across all four component repos. Drift in any of them counts as a wire-format break the same way racelink_proto.h drift does, since the symbolic ids carried by OPC_HEADLESS and OPC_INDICATE are looked up against the receiver's local copy of the catalog. Both files should be added to the drift-test equivalence list as the Host repo grows them.
Files to touch:
1. C header racelink_proto.h:
   - Add the value to the LP enum (OPC_*).
   - Document the body layout and response policy in a comment block above the matching struct (see OPC_OFFSET for the reference style).
   - Add the matching static const uint8_t MAX_P_* for any variable-length body, plus a static_assert(MAX_P_* <= BODY_MAX).
   - Add a PacketRule entry in RULES[] (direction + response policy + max body length).
2. Mirror in Gateway + WLED firmware repos: copy the updated racelink_proto.h byte-identically to ../RaceLink_Gateway/src/racelink_proto.h and ../RaceLink_WLED/racelink_proto.h. Verify with pytest tests/test_proto_header_drift.py.
3. Auto-generated Python mirror racelink/racelink_proto_auto.py: re-run python gen_racelink_proto_py.py to regenerate. Don't hand-edit the generated file.
4. Body builder in racelink/protocol/packets.py: add build_my_new_opc_body(...). Return the body bytes (without the Header7); the framing code wraps it.
5. Reply parser in racelink/protocol/codec.py: if the opcode has a reply (RESP_ACK or RESP_SPECIFIC), add the parse path. The dict shape returned is the event the listeners see.
6. Per-opcode rule in racelink/protocol/rules.py: if you didn't include this in step 1's regen, add manually.
7. Transport entry-point in racelink/transport/gateway_serial.py: add send_my_new_opc(...) that calls _send_m2n with the matching LP.make_type(LP.DIR_M2N, LP.OPC_MY_NEW) and the body from step 4.
8. Service wrapper if the opcode needs orchestration (retries, reply collection, post-ACK state mutation): add a method to the appropriate service in racelink/services/, typically gateway_service.py (high-level dispatch) or a dedicated service if the surface is large enough.
9. Tests in tests/test_protocol.py for the body builder + parser, and in the matching service test file for the orchestration.
10. Documentation in wire-protocol.md: add the opcode to the table and a body-layout subsection. The header is the source of truth, but the doc is what people read.
Checklist:
[ ] racelink_proto.h: OPC_* + PacketRule + struct/comment
[ ] Mirror to Gateway + WLED repo (byte-identical)
[ ] Re-run gen_racelink_proto_py.py
[ ] build_*_body in protocol/packets.py
[ ] reply parse path in protocol/codec.py (if reply expected)
[ ] transport.send_* in transport/gateway_serial.py
[ ] service wrapper (orchestration, retries, post-ACK)
[ ] tests/test_protocol.py round-trip
[ ] tests/test_<service>.py orchestration
[ ] tests/test_proto_header_drift.py passes (no manual change needed; just run it)
[ ] wire-protocol.md: opcode table + body layout
[ ] firmware-side handlers in Gateway + WLED
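A minimal sketch of the body-builder step, assuming a hypothetical opcode whose body is one group byte followed by a little-endian 16-bit duration. The field layout, the BODY_MAX value, and the parse helper are illustrative assumptions; the real limits come from racelink_proto.h.

```python
import struct

BODY_MAX = 32            # assumed value; the real limit lives in racelink_proto.h
MAX_P_MY_NEW_OPC = 3     # hypothetical body size for this opcode
assert MAX_P_MY_NEW_OPC <= BODY_MAX  # Python analogue of the header's static_assert


def build_my_new_opc_body(group_id: int, duration_ms: int) -> bytes:
    """Return the body bytes only; the framing code adds the header."""
    if not 0 <= group_id <= 0xFF:
        raise ValueError("group_id must fit in one byte")
    if not 0 <= duration_ms <= 0xFFFF:
        raise ValueError("duration_ms must fit in two bytes")
    body = struct.pack("<BH", group_id, duration_ms)  # u8 + u16, little-endian
    assert len(body) <= MAX_P_MY_NEW_OPC
    return body


def parse_my_new_opc_body(body: bytes) -> dict:
    """Inverse of the builder, handy for a round-trip test."""
    group_id, duration_ms = struct.unpack("<BH", body)
    return {"group_id": group_id, "duration_ms": duration_ms}
```

Range checks in the builder raise ValueError up front, so an out-of-range argument never reaches the transport as a truncated field.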
Adding a new Headless scene to the catalog¶
The Headless-Mode scene catalog lives in
racelink_headless.h::SCENE_CATALOG[] and
is consumed by every receiver — adding a row means flashing every
node that needs to display it. Older firmware silently drops unknown
scene ids via findSceneById() == nullptr, so a partial roll-out is
safe (mixed-firmware fleets just see the new scene only on
up-to-date nodes).
Files to touch:
- Catalog header racelink_headless.h:
  - Append a new SCENE_* value to the HeadlessSceneId enum. Append-only: never reuse or renumber existing ids; they are wire-stable.
  - Append a row to SCENE_CATALOG[] carrying the visual spec (fxMode, speed, intensity, color1) plus the offset formula if the scene staggers across groups (SCENE_FLAG_USE_OFFSET + offsetMode, offsetBase, offsetStep).
- Mirror to all three component repos byte-identically: Host, Gateway, WLED. Drift here is as serious as racelink_proto.h drift (the receiver expects to look up the id against its local table).
- Firmware-side expansion in usermods/racelink_wled/racelink_wled.cpp::applyLocalScene: usually no code change needed; the catalog row drives the segment writes generically. Only add a code path when the scene needs a non-standard semantic (e.g. SCENE_RESTORE_BOOT_COLOR uses a per-device boot snapshot that doesn't live in the catalog row).
- Headless-master broadcast: none needed; the WLED's single-click cycles the catalog by index, so a new row is picked up automatically.
- Doc in RaceLink_WLED/operator-setup.md §"Headless Mode" → Scenes table: add a row.
Persisting Headless-Master state across reboots¶
The Headless Master keeps a small amount of per-master state in
cfg.json so a power-cycle (or battery swap) does not lose the
pairing context. All fields live under RaceLink.overrides; the
operator-visible reference is
RaceLink_WLED/headless-mode.md §"Persistence".
Cardinal rules for changing or adding a persistence field:
- One save path per concern. A new persistent field should either (a) trigger configNeedsWrite = true synchronously (rare operator action, "save now is the right UX") or (b) plug into the debounce pump. Mixing is a bug.
- Debounce pairing-burst writes. Anything that can mutate tens of times per minute during normal use (most notably the Headless Slaves registry mutations) MUST funnel through markHeadlessPersistDirty() → serviceHeadlessPersist(now) instead of setting configNeedsWrite directly. The 5-second debounce window (HEADLESS_PERSIST_DEBOUNCE_MS) collapses a 40-slave pairing burst into a single save. The LittleFS partition has ~120 000 saves of headroom; without the debounce a heavy-event-day operator burns through that budget in months.
- exitHeadlessMode() writes synchronously and wipes everything. The operator gesture for exit must survive an immediate battery pull, so the function clears headlessPersistDirty, mutates the relevant overrides (counter → 0, registry → empty, current.groupId → 0, headlessPersistedActive → false), and sets configNeedsWrite = true in one synchronous pass. Runtime-override paths (Gateway takeover, autosync detection) leave the registry intact: they are involuntary demotions where later manual re-promotion benefits from the preserved data.
- Proactive use of the registry. On promotion (whether by 5-click or by auto-resume), enterHeadlessMode() calls startHeadlessReassign(), which arms the cursor in the RaceLinkHeadless::ReassignState struct; the loop pump serviceHeadlessReassign(now) sweeps the registry with one OPC_SET_GROUP per HEADLESS_REASSIGN_INTERVAL_MS (currently 500 ms, tuned to give the addressed slave time to CAD + ACK before the next master TX). A 40-slave sweep takes ~20 s; the operator sees discrete IND_PAIRING_TX flashes per slave plus IND_PAIR_CONFIRMED on the receiving end. If the TX queue is busy (scheduleSend returns false, typically because the post-promotion scene/SYNC broadcast is still in flight), the sweep retries the same slot on the next interval (deferReassignRetry) instead of advancing, so it never silently drops a slave.
- Scene rebroadcast after pairing. After a successful SET_GROUP (proactive boot-burst OR individual reactive pairing), headlessAssignGroupTo() and the end of serviceHeadlessReassign() both call scheduleSceneRebroadcast(), which arms RaceLinkHeadless::SceneRebroadcastState with a 1 s debounce. Successive arms within the window collapse to one OPC_HEADLESS packet; the loop pump serviceSceneRebroadcast(now) fires it once the deadline elapses. No-op when the master has no current scene yet (currentSceneIdx == 0xFF).
- Group-id discipline. HEADLESS_MASTER_GROUP_ID = 1 is the master's own group while active; HEADLESS_FIRST_GROUP_ID = 2 is the first id ever handed to a slave; 0 is the unconfigured pool; 255 is the broadcast pseudo-group. Use the header helper RaceLinkHeadless::reserveNextGroupId(counter) to pull the next free id; it clamps the counter and handles exhaustion correctly. Passing groupId = 0 to buildSetGroupPacket() is a bug.
- Master self-sync invariant. The master's own strip.timebase must equal -activePhaseOffsetMs for its own segment effects to render in the same logical-time frame the slaves derive from incoming OPC_SYNC packets. Without this invariant the master drifts on offset scenes while slaves stay synchronised with each other. Re-asserted at every headlessBroadcastSync() and once at enterHeadlessMode().
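The debounce rule can be modelled in a few lines of Python. This is a host-side sketch of the idea only, assuming the window restarts on every mutation (a trailing debounce); the class and method names are hypothetical and the firmware's actual state machine may differ.

```python
HEADLESS_PERSIST_DEBOUNCE_MS = 5_000  # window named in the rules above


class PersistPump:
    """Toy model: many dirty marks, one save after the burst goes quiet."""

    def __init__(self):
        self.dirty = False
        self.last_dirty_at_ms = 0
        self.saves = 0

    def mark_dirty(self, now_ms):
        # Assumption: each mutation restarts the debounce window.
        self.dirty = True
        self.last_dirty_at_ms = now_ms

    def service(self, now_ms):
        """Call from the loop; returns True when a save was issued."""
        quiet_for = now_ms - self.last_dirty_at_ms
        if self.dirty and quiet_for >= HEADLESS_PERSIST_DEBOUNCE_MS:
            self.dirty = False
            self.saves += 1
            return True
        return False


def simulate_pairing_burst(slaves=40, interval_ms=500, tick_ms=100):
    """One mutation per interval, loop tick every tick_ms; count the saves."""
    pump = PersistPump()
    burst_end = slaves * interval_ms
    for now in range(0, burst_end + 2 * HEADLESS_PERSIST_DEBOUNCE_MS, tick_ms):
        if now < burst_end and now % interval_ms == 0:
            pump.mark_dirty(now)
        pump.service(now)
    return pump.saves
```

Under these assumptions a 40-slave burst at one mutation per 500 ms yields a single save, five seconds after the last mutation.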
Where the headless state lives¶
All WLED-neutral headless state structs + helper functions live in
racelink_headless.h
under the RaceLinkHeadless:: namespace — reusable byte-identically
by external Gateway-side software (e.g. FPVGate). Notable members:
| Header export | Purpose |
|---|---|
| HeadlessSlaveRec + findSlaveIdx / upsertSlave / clearSlaveTable | persistent slave registry, pure data ops on a caller-owned array |
| PersistState + markPersistDirty / persistDebounceElapsed / persistConsumed | debounced flash-write pump state machine |
| SceneRebroadcastState + scheduleSceneRebroadcast / sceneRebroadcastReady / sceneRebroadcastConsumed | post-pairing rebroadcast scheduler |
| ReassignState + startReassign / pickReassignTarget / reassignSweepCompleted / confirmReassignSent / deferReassignRetry / abortReassign | re-bind cursor state machine |
| shouldFirePairingBlip(lastAtMs, now, throttleMs) | indicator-throttle decision |
| reserveNextGroupId(counter) | counter clamp + bump + exhaustion check |
| HEADLESS_* constants | timing, interval, throttle, debounce parameters |
The WLED-coupled side (enterHeadlessMode / exitHeadlessMode /
headlessBroadcastSync / headlessBroadcastCurrentScene /
headlessSendTx / headlessAssignGroupTo / serviceHeadless
probe state machine) stays in
racelink_wled.{h,cpp}
because it touches strip.timebase, bri, applyLocalIndicator,
configNeedsWrite, and the segment write API. The loop-pump methods
on UsermodRaceLink (serviceHeadlessPersist,
serviceSceneRebroadcast, serviceHeadlessReassign) are thin
wrappers — they consult the header decision helpers and execute the
WLED-side action.
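The split between a pure decision helper and a thin loop pump can be illustrated with the re-bind cursor. The sketch below is a Python model with hypothetical names; it assumes the pump attempts at most one send per interval and only advances the cursor when the send was accepted.

```python
HEADLESS_REASSIGN_INTERVAL_MS = 500  # interval quoted in the persistence rules


class ReassignSweep:
    """Toy cursor over a slave registry: advance on success, retry on busy."""

    def __init__(self, slaves):
        self.slaves = list(slaves)
        self.cursor = 0
        self.next_at_ms = 0

    @property
    def done(self):
        return self.cursor >= len(self.slaves)

    def service(self, now_ms, schedule_send):
        """Loop pump: schedule_send(addr) returns False when the queue is busy."""
        if self.done or now_ms < self.next_at_ms:
            return
        self.next_at_ms = now_ms + HEADLESS_REASSIGN_INTERVAL_MS
        if schedule_send(self.slaves[self.cursor]):
            self.cursor += 1  # confirmed: move on to the next slot
        # Busy: leave the cursor alone so the same slot is retried next interval.
```

With 40 registry entries and no busy retries this model needs 40 intervals, which matches the ~20 s sweep quoted above (40 × 500 ms).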
Time-critical TX via scheduleSend(rl, buf, len, jitterMaxMs=0)¶
RaceLinkTransport::scheduleSend() in
racelink_transport_core.h
is the single TX-queue entry point shared byte-identically by Gateway,
Host and WLED. Its jitterMaxMs parameter has three modes:
| jitterMaxMs | lbtEnable | Behavior |
|---|---|---|
| == 0 | (any) | Time-critical bypass: fire immediately, no random delay, no CAD scan. Use for low-frequency broadcasts where the in-packet timestamp must reflect the actual TX moment within single-digit ms. |
| > 0 | true | LBT-polite: 50..300 ms random pre-delay (capped) + CAD scan, retries with backoff if busy. |
| > 0 | false | Caller-controlled jitter, no CAD. Gateway-style: sole TXer, no spectrum-sharing needed, but a small skew helps host-driven burst timing. |
Canonical use cases for the jitterMaxMs=0 bypass:
- OPC_SYNC keepalive: the slaves' drift-correction quality is dominated by the precision of the ts24 timestamp in the SYNC body vs. the actual TX moment. With LBT's 50..300 ms random delay between the caller's millis() sample and the actual TX, slaves' lastSyncTbErrMs inflates to ~250 ms. With the bypass, the same metric stays in the Gateway-baseline range (~15 ms). Trade-off: skips collision avoidance; occasional loss is tolerable (the next SYNC re-anchors).
- Stream fragments: existing Gateway behaviour. Fragments must go out back-to-back to avoid the receiver de-fragmenter timing out, so inter-packet jitter would break the stream.
The trade-off is collision avoidance: with jitterMaxMs=0 the TX
fires the moment the radio leaves Standby, regardless of whether another
node is mid-transmission on the channel. Reserve this path for sends
where either occasional loss is acceptable (SYNC retries every 30 s
anyway) or the sender knows it's the only TXer on its side (Gateway).
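The three modes reduce to a small decision function. The Python below only models the branching described in the table; the mode labels are made up for the example and nothing here reflects how the C++ implementation is structured internally.

```python
def tx_mode(jitter_max_ms: int, lbt_enable: bool) -> str:
    """Classify a scheduleSend() call by the mode table above."""
    if jitter_max_ms == 0:
        # Time-critical bypass wins regardless of lbtEnable.
        return "bypass"
    if lbt_enable:
        # Random pre-delay plus CAD scan, retry with backoff if busy.
        return "lbt"
    # Caller-controlled jitter, no CAD (Gateway-style sole transmitter).
    return "jitter-only"
```

Checking jitterMaxMs first is what makes the lbtEnable column read "(any)" for the bypass row.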
Migrated 2026-05-19: the bypass branch lives directly in
scheduleSend() itself, replacing the older Gateway-side
rl_queueTxNoCad() toggle workaround that flipped lbtEnable to
false around the call. The previous WLED-side scheduleSendNoLbt()
parallel function (introduced briefly) was also removed in the same
unification pass. Cross-repo invariant: any change to scheduleSend()
must be replayed byte-identically into the Gateway and Host copies of
racelink_transport_core.h so tests/test_proto_header_drift.py
stays green.
Adding a new Indicator to the catalog¶
The status-indicator catalog lives in
racelink_indicators.h::INDICATOR_CATALOG[].
Same drift-discipline as the scene catalog. Existing receivers
silently drop unknown indicator ids — forward-compatible.
Files to touch:
- Catalog header racelink_indicators.h:
  - Append a new IND_* value to IndicatorType. Append-only.
  - Append a row to INDICATOR_CATALOG[]. Animated only: fxMode must be BREATH (3), STROBE (23), or another animated mode. STATIC (0) violates the project's animation rule for indicators. Avoid pure RGB / W for the same reason.
- Mirror to all three component repos byte-identically.
- Trigger: either local (applyLocalIndicator(IND_*, dur) in firmware) or wire (Host / Gateway emits OPC_INDICATE(type=IND_*, durationSec=...)). The duration is per-trigger, not in the catalog row, so the same indicator can run for 3 s in one context and 30 s in another.
- Sub-second triggers: for indicators that need finer than 1 s resolution (e.g. IND_PAIRING_TX, which fires per SET_GROUP send with a 1500 ms display window), call the millisecond variant applyLocalIndicatorMs(IND_*, durationMs) instead. Same semantics otherwise; the only difference is the deadline math.
- High-frequency triggers must throttle. A trigger that fires more often than once per ~200 ms should gate itself with a lastTriggerAtMs timestamp so consecutive triggers do not re-extend the indicator deadline into a sustained overlay. See headlessSendTx() in racelink_wled.cpp for the canonical pattern (used to drive IND_PAIRING_TX on the Headless Master for SET_GROUP sends only).
- Doc in RaceLink_WLED/operator-setup.md §"Indicators" → Catalog table: add a row. Update the trigger column with the new locally-fired site (if any) or note that the indicator is "wire-only" if no local code path fires it.
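The throttle rule for high-frequency triggers can be sketched as a pure decision function. This is a Python model, not the firmware helper: the None-for-never-fired convention and the 32-bit wrap handling (mirroring a millis()-style counter) are assumptions made for the example.

```python
UINT32_MASK = 0xFFFFFFFF


def should_fire(last_at_ms, now_ms, throttle_ms):
    """True when at least throttle_ms elapsed since the last trigger."""
    if last_at_ms is None:  # assumption: None means "never fired"
        return True
    # Masked subtraction stays correct across a 32-bit counter wrap.
    elapsed = (now_ms - last_at_ms) & UINT32_MASK
    return elapsed >= throttle_ms


class ThrottledIndicator:
    """Gate repeated triggers so they do not keep extending the deadline."""

    def __init__(self, throttle_ms=200):
        self.throttle_ms = throttle_ms
        self.last_trigger_at_ms = None
        self.fired = 0

    def trigger(self, now_ms):
        if not should_fire(self.last_trigger_at_ms, now_ms, self.throttle_ms):
            return False
        self.last_trigger_at_ms = now_ms
        self.fired += 1
        return True
```

Only accepted triggers update the timestamp, so a steady stream of rejected triggers cannot push the next allowed fire further out.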
Adding a new service¶
A service is a stateless or small-stateful module under
racelink/services/ that owns one
coherent piece of host logic.
Files to touch:
- Module at racelink/services/my_service.py:
  - Module docstring (5–15 lines): purpose, public API, dependencies, threading expectations. Use gateway_service.py as the template.
  - Module logger: logger = logging.getLogger(__name__).
  - Class MyService with __init__(self, controller, gateway_service) (or whatever dependencies it needs).
  - Public methods that return useful values (bool for send-style operations, dicts for query operations, raise ValueError for bad input).
- Service init in racelink/services/__init__.py: re-export the class.
- Wire-up in controller.py::__init__: instantiate the new service there.
- Web routes in racelink/web/api.py if the service is operator-facing: route handler that validates input via request_helpers.require_int (or similar), calls the service, returns the response. Match the existing try / except RequestParseError → 400 and try / except Exception → 500 with type+traceback log patterns.
- Tests at tests/test_my_service.py:
  - Unit tests with a fake controller / fake transport.
  - Coverage for the boolean return contract (transport-missing returns False; happy path returns True).
  - Coverage for any error paths.
- Architecture doc at architecture.md: add a row to the Service Layer table.
Checklist:
[ ] racelink/services/my_service.py with module docstring + logger
[ ] services/__init__.py re-export
[ ] controller.py wiring
[ ] web/api.py route(s) (if operator-facing)
[ ] tests/test_my_service.py
[ ] ARCHITECTURE.md service-table entry
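A skeleton that ticks the module-level boxes might look like the sketch below. The send_ping method, the controller.transport attribute, and the transport's send_ping call are all hypothetical; only the constructor signature, the module logger, and the bool / ValueError conventions come from the guide.

```python
"""my_service: hypothetical example service.

Purpose: send a ping to one group.
Public API: MyService.send_ping(group_id) -> bool.
Dependencies: controller (for the transport), gateway_service.
Threading: called from web request threads; holds no locks.
"""
import logging

logger = logging.getLogger(__name__)


class MyService:
    def __init__(self, controller, gateway_service):
        self._controller = controller
        self._gateway_service = gateway_service

    def send_ping(self, group_id) -> bool:
        """True if the transport accepted the frame, False if nothing went out."""
        valid = isinstance(group_id, int) and not isinstance(group_id, bool)
        if not valid or not 0 <= group_id <= 255:
            raise ValueError("group_id must be an int in 0..255")
        # Assumption: the controller exposes the transport as an attribute.
        transport = getattr(self._controller, "transport", None)
        if transport is None:
            logger.warning("send_ping: transport not ready")
            return False
        transport.send_ping(group_id)  # hypothetical transport method
        return True
```

A unit test then only needs a fake controller with a transport attribute: one case with a recording transport (expects True) and one with None (expects False).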
Adding a new task-manager-driven workflow¶
Long-running ops (multi-second, multi-stage) live in
racelink/web/tasks.py so the web
request returns immediately and the UI can subscribe to SSE
task events for progress.
Files to touch:
- Service method that does the work (likely a new service per "Adding a new service" above, or a method on an existing one). The method must accept a task_manager parameter and call task_manager.update(meta={"stage": "...", "message": "...", ...}) at every stage transition. The meta shape is free-form but the existing operator-facing UI expects:
  - stage: short uppercase tag (e.g. HOST_WIFI_ON, UPLOAD_FW).
  - message: one-line operator-readable description.
  - index / total: for per-device fan-outs.
  - addr: current MAC if applicable.
- Web route in web/api.py: validate input, then ctx.tasks.start("my_task_name", target_fn, meta={...}) where target_fn is a closure that calls the service method with the task manager. Return {"ok": True, "task": ctx.tasks.snapshot()}.
- Frontend handler in racelink/static/racelink.js::updateTask: add a branch for name === "my_task_name" that updates the UI from the meta. Long-running ops with their own dialog (FW update is the reference) keep the dialog open and render the progress in-dialog; simpler ops can rely on the masterbar's taskDetail span.
- Tests: at minimum verify the route returns immediately ({"ok": True, "task": {...running...}}); deeper integration tests can exercise the meta-update path with a mock task manager.
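A minimal sketch of a service method that reports progress at every stage transition, exercised with a recording test double. The blink_devices workflow, the BLINK / DONE stage tags, and the RecordingTaskManager class are hypothetical; only the update(meta=...) call shape and the stage / message / index / total / addr keys come from the list above.

```python
class RecordingTaskManager:
    """Test double that records every meta update."""

    def __init__(self):
        self.updates = []

    def update(self, meta):
        self.updates.append(dict(meta))


def blink_devices(addrs, task_manager, blink_one):
    """Hypothetical per-device fan-out that reports each stage transition."""
    total = len(addrs)
    ok = 0
    for index, addr in enumerate(addrs, start=1):
        task_manager.update(meta={
            "stage": "BLINK",
            "message": f"Blinking {addr}",
            "index": index,
            "total": total,
            "addr": addr,
        })
        if blink_one(addr):
            ok += 1
    task_manager.update(meta={
        "stage": "DONE",
        "message": f"{ok}/{total} devices blinked",
    })
    return {"ok": ok, "total": total}
```

Passing the per-device operation in as a callable keeps the method testable without a transport; the web route's closure would bind the real sender.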
Making the workflow cancellable¶
The TaskManager exposes a cooperative-cancel API: a worker thread
polls task_manager.is_cancel_requested() at safe-to-stop points, the
web layer flips the flag via POST /api/task/cancel. The pattern is
shipped end-to-end for the firmware-update and presets-download flows
(see racelink/services/ota_workflow_service.py); follow the same
shape for new workflows that are >5 s long or touch operator-affecting
state (host Wi-Fi, multi-device fan-outs).
- Pick the cancel granularity. Two flavours work:
  - After-current-unit (preferred for multi-device fan-outs): check the flag at loop-entry only. The currently-running unit finishes cleanly; remaining units are skipped. Never produces a half-completed unit. Used by run_firmware_update: the per-device flash + verify + reconnect always runs to completion once it has started.
  - Mid-step (only when half-state is benign): check the flag before every long sleep / network call. Used by download_presets for the AP-connect step, where cancelling before the HTTP GET costs the operator nothing.
- Always run cleanup in finally. The cancel flag must not short-circuit Wi-Fi restore, device-state reset, or any other "we changed external state, now put it back" step. run_firmware_update is the reference pattern.
- Extend the result shape. Add cancelled: bool and (for fan-outs) cancelled_after: int|None so the dialog's summary panel can render an honest "stopped after device N of M" line. Existing consumers ignore unknown keys, so this is additive.
- Frontend: cancel button + summary phase. See the next section ("Modal-locked dialogs"). The button calls gateway.cancelTask() from the Pinia store; the dialog flips to a summary phase when the task lands in done / error.
- Tests. Mirror tests/test_ota_workflow_service.py::FwUpdateCancelTests: a _RecordingTaskManager with a programmable cancel-after-N counter, three tests pinning before-first, after-first, and after-all (no-op) behaviour. Verify result["cancelled"] and that the finally cleanup ran.
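The after-current-unit flavour with cleanup in finally can be sketched as below. This is not the run_firmware_update code: run_fanout, the restore callback, and the CancelAfterN double are hypothetical names, and only is_cancel_requested() and the cancelled / cancelled_after keys are taken from the text.

```python
class CancelAfterN:
    """Test double: reports a cancel request after n negative polls."""

    def __init__(self, n):
        self._remaining = n

    def is_cancel_requested(self):
        if self._remaining <= 0:
            return True
        self._remaining -= 1
        return False


def run_fanout(addrs, task_manager, process_one, restore):
    """Process devices one by one; honour cancel only between units."""
    result = {"done": [], "cancelled": False, "cancelled_after": None}
    try:
        for index, addr in enumerate(addrs):
            # Loop-entry check: a unit that has started always finishes.
            if task_manager.is_cancel_requested():
                result["cancelled"] = True
                result["cancelled_after"] = index  # units completed so far
                break
            process_one(addr)
            result["done"].append(addr)
    finally:
        # Runs on normal completion, on cancel, and on exceptions alike.
        restore()
    return result
```

The three pinned cases map onto CancelAfterN(0) (before-first), CancelAfterN(1) (after-first), and a counter at least as large as the device list (no-op).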
Modal-locked dialogs (Cancel-with-summary pattern)¶
Long-running operations that change host state the operator cannot recover on their own (host Wi-Fi switched to a device AP, multi-device flash in flight, …) need to keep their dialog visible until the work either finishes naturally or is cancelled with a summary. The infrastructure is shipped — three call sites use it today: firmware-update, presets-download, discovery.
Components¶
| Layer | Where | What it does |
|---|---|---|
| Dialog prop | frontend/src/components/ui/dialog/DialogContent.vue | lockClose: boolean. When true: @interactOutside.prevent, @escapeKeyDown.prevent, X button hidden. |
| Composable | frontend/src/composables/useTaskNavigationGuard.ts | Wraps useBeforeUnloadGuard + onBeforeRouteLeave. Native confirm with caller's reason string while the task runs. |
| Store action | frontend/src/stores/gateway.ts (cancelTask) | POST /api/task/cancel + optimistic local cancel_requested = true. |
| Store computeds | frontend/src/stores/gateway.ts (fwBusy, presetsBusy, discoverBusy) | Per-task-name busy flags. Add one for each new long-running workflow. |
Wiring a new long-op dialog¶
- Add a busy computed to the gateway store keyed on your task.name.
- Lock the dialog while busy by binding the lockClose prop to that computed.
- Install the navigation guard (useTaskNavigationGuard) in <script setup>.
- Cancel button: visible only while myBusy, disabled when task.cancel_requested (server has accepted the cancel; we're waiting for the worker to wind down). Label flips to "Cancelling…".
- Summary phase: a phase ref with 'config' | 'progress' | 'summary'. A watcher on task.state flips to 'summary' when the running task lands in done or error. The summary panel reads task.result and renders success / failure / skipped counts plus any workflow-level side-effect status (e.g. hostWifi.restored). Only the summary phase exposes a real Close button.
- Defensive close-on-Close: the in-dialog Close button must be :disabled="myBusy" so the operator cannot dismiss the dialog even via the Close button while the task is running. The lockClose prop guards outside-click / Esc; the disabled Close button guards the obvious other path.
When to use the full pattern vs. the lighter variant¶
- Full pattern (lock + Cancel button + summary): long-running workflows that change reversible-but-operator-affecting state. Firmware update, presets download, future "bulk reflash group" or multi-device migration. Anything that touches host Wi-Fi qualifies automatically because Wi-Fi-restore failure strands the operator.
- Lighter variant (lock only, no Cancel): short ops (<30 s) with no Wi-Fi / no half-state risk where Cancel is overkill but outside-click dismiss is still annoying. Discovery sweep is the reference. The footer Close button is :disabled="<busy>" so the operator either waits for natural completion or accepts the navigation-guard prompt to leave.
Modifying threading-sensitive code¶
Anything that touches the gateway, the device repository, or the SSE layer crosses a thread boundary. Before submitting:
- Read architecture.md §Threading Model. Confirm which thread your code runs on.
- Confirm the lock contract: if you're mutating shared state, use the existing locks (state_repository.lock, _pending_config_lock, _pending_expect_lock, _tx_lock). If you're adding a new shared field, add a matching lock, and add a regression test in tests/test_state_concurrency.py pinning the contract.
- Never hold state_repository.lock across RF I/O; see the locking-rule note in ARCHITECTURE.md. The reference pattern is _apply_device_meta_updates in api.py.
- Name your threads (name="rl-<purpose>"). This is a project convention; new threads without a name pollute threading.enumerate() output and make py-spy traces illegible.
- Daemon threads only via ThreadPoolExecutor when you can bound concurrency (see gateway_service._auto_restore_executor for the reference). One-shot daemons are still acceptable for truly singleton tasks (the RX reader, the reconnect worker).
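Three of these rules (copy under the lock and do I/O outside it, name every thread, bound concurrency with an executor) can be sketched with the standard library alone. DeviceRepo, push_all, and the rl-example names are hypothetical stand-ins, not the project's real classes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DeviceRepo:
    """Illustrative lock-guarded repository (not the real state repository)."""

    def __init__(self):
        self.lock = threading.Lock()
        self._devices = {}

    def upsert(self, addr, meta):
        with self.lock:
            self._devices[addr] = dict(meta)

    def snapshot(self):
        # Copy under the lock so callers can work on the data lock-free.
        with self.lock:
            return {addr: dict(meta) for addr, meta in self._devices.items()}


def push_all(repo, send):
    """Snapshot under the lock, then do the slow I/O without holding it."""
    pending = repo.snapshot()
    return [send(addr, meta) for addr, meta in pending.items()]


def run_in_named_pool(fn, workers=2):
    """Bounded concurrency with a recognisable thread-name prefix."""
    with ThreadPoolExecutor(max_workers=workers,
                            thread_name_prefix="rl-example") as pool:
        return pool.submit(fn).result()


def start_named_daemon(target):
    """One-shot daemon thread following the rl-<purpose> naming rule."""
    thread = threading.Thread(target=target, name="rl-example-worker", daemon=True)
    thread.start()
    return thread
```

A regression test for the lock contract can assert inside the fake send callback that the repository lock is not held at that moment.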
Common patterns¶
Adding a request_helpers.require_int-style validator¶
Cross-cut input validation lives in
racelink/web/request_helpers.py.
The pattern: a helper raises RequestParseError (a ValueError
subclass) on bad input; the route catches it once and translates
to a 400. Adding a new helper:
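```python
import re

# Assumed module-level pattern if request_helpers.py does not define it yet.
_MAC12_RE = re.compile(r"^[0-9A-F]{12}$")


def require_mac(body, key, *, label=None):
    """Return body[key] as an upper-case 12-char hex MAC, or raise."""
    name = label or key
    raw = require_str(body, key, label=name)
    raw = raw.strip().upper()
    if not _MAC12_RE.match(raw):
        raise RequestParseError(f"{name} must be a 12-char hex MAC")
    return raw
```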
Then add a test in tests/test_web_request_helpers.py matching
the existing RequireIntTests style.
Adding a # swallow-ok: annotation¶
The exception-hygiene test (tests/test_exception_hygiene.py)
requires every except Exception block to either log, re-raise,
or carry a # swallow-ok: <reason> comment. The reason should be
substantive — "best-effort fallback; caller proceeds with safe
default" is the bare minimum, but a one-line why is better.
If you're tempted to swallow at an RF/persistence boundary,
prefer a logger.warning(..., exc_info=True) over a silent pass.
A previous project-wide sweep went through every
broad except in the project; aim to match that quality on new
code.
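The two most common acceptable shapes can be sketched side by side. The function names and scenarios are invented for illustration, and placing the annotation on the except line is an assumption about what the hygiene test looks for.

```python
import logging

logger = logging.getLogger(__name__)


def read_cached_rssi(cache, addr):
    """Advisory lookup: a miss or a bad value is not worth a log line."""
    try:
        return int(cache[addr])
    except Exception:  # swallow-ok: cache is advisory; caller treats None as "unknown"
        return None


def persist_state(save):
    """Persistence boundary: log with the traceback instead of a silent pass."""
    try:
        save()
        return True
    except Exception:
        logger.warning("state save failed; keeping in-memory copy", exc_info=True)
        return False
```

The first form documents why silence is safe; the second keeps the failure visible while still letting the caller continue.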
Returning a boolean from a send_* method¶
Every send_* method on control_service returns bool. True
means "the transport accepted the frame for queueing"; False
means "transport not ready / no target / nothing went out". A
project-wide review traced silent-success bugs back to methods that
returned None instead. New send-style methods follow the
contract:
```python
def send_my_new_opc(self, ...) -> bool:
    transport = self._require_transport("sendMyNewOpc")
    if transport is None:
        return False  # transport not ready: nothing went out
    transport.send_my_new_opc(...)
    return True  # frame accepted for queueing
```
Regenerating WLED metadata after a firmware bump¶
Three RaceLink modules under racelink/domain/ are
fully auto-generated from the WLED checkout by
gen_wled_metadata.py. They must never be
hand-edited; the file headers say so and git blame will land on the
generator script, not a human commit.
| Generated file | Source in WLED checkout | What it carries |
|---|---|---|
| wled_effects.py | wled00/FX.h (effect IDs) + wled00/FX.cpp (_data_FX_MODE_*[] strings) | Per-effect slot metadata: which sliders/toggles/colors/palette an effect uses, plus custom labels ("Bg", "Duty cycle", …). |
| wled_palettes.py | wled00/FX_fcn.cpp (JSON_palette_names[]) | Palette id → display name. |
| wled_palette_color_rules.py | wled00/data/index.js (updateSelectedPalette()) | The palette-conditional color slot rule: which built-in * Color… palettes (ids 2..5 in stock WLED) force-show extra color pickers regardless of effect metadata. |
The generator parses each source file with regexes pinned to the upstream
shape; if WLED ever reshapes one of them (e.g. moves
updateSelectedPalette or changes its if (s > 1 && s < 6) guard), the
generator raises RuntimeError with a pointer to the file/function it
failed on, rather than silently producing wrong output. The rule
extraction is also unit-tested in
tests/test_wled_effect_metadata.py
under ParsePaletteColorRuleTests.
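A shape-pinned extraction of the palette guard could look like the sketch below. The function name, the regex, and the error wording are assumptions for illustration and are not copied from gen_wled_metadata.py; the guard text and the resulting ids 2..5 are the ones quoted above.

```python
import re

# Pinned to the upstream guard shape: if (s > 1 && s < 6)
_GUARD_RE = re.compile(r"if\s*\(\s*s\s*>\s*(\d+)\s*&&\s*s\s*<\s*(\d+)\s*\)")


def parse_palette_color_rule(js_source: str) -> range:
    """Return the palette ids covered by the guard, or fail loudly."""
    match = _GUARD_RE.search(js_source)
    if match is None:
        # Fail with a pointer to the source instead of emitting wrong data.
        raise RuntimeError(
            "wled00/data/index.js: updateSelectedPalette() guard not found; "
            "update the regex to match the new upstream shape"
        )
    low, high = int(match.group(1)), int(match.group(2))
    # Both bounds are exclusive in the JS guard, so ids run low+1 .. high-1.
    return range(low + 1, high)
```

Raising on a missing match is the important part: a regex that quietly returns nothing would regenerate a module with an empty rule and no warning.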
When to regenerate¶
- You bumped the bundled WLED firmware (the checkout under ../WLED LoRa/WLED).
- A WLED contributor added/renamed/removed an effect (changes FX.h + FX.cpp).
- A WLED contributor added/renamed/removed a built-in palette (changes FX_fcn.cpp).
- A WLED contributor reshaped updateSelectedPalette() (changes the JS rule).
How¶
- Make sure the WLED checkout path matches what the generator expects. The default points at a maintainer-local path; always pass --wled <path> to override, pointing at the root of your RaceLink_WLED checkout (the directory containing wled00/).
- Run the generator (gen_wled_metadata.py). It prints one line per output file, e.g.:

  ```
  Wrote racelink\domain\wled_effects.py (188 effects)
  Wrote racelink\domain\wled_palettes.py (72 palettes)
  Wrote racelink\domain\wled_palette_color_rules.py (palette-color rule: ...)
  ```

  If any source-file shape check fails the script aborts with RuntimeError; read the message, update the relevant regex in gen_wled_metadata.py, and rerun.
- Run the parser tests (tests/test_wled_effect_metadata.py). The ParsePaletteColorRuleTests::test_generated_module_matches_stock_thresholds pin will fire if the new firmware ships different palette thresholds; update the pin to match the new values and note the change in the WebUI smoke checklist below.
- (Optional but recommended) Smoke-test the RL-preset editor in a browser: open it, pick effect "Traffic Light", walk the palette dropdown, and confirm that color-slot visibility still matches WLED's own webui (* Color 1 → 1+Bg, * Color Gradient → 1+Bg+3, etc.).
How the generated data reaches the UI¶
WLED source ──► gen_wled_metadata.py ──► racelink/domain/wled_*.py
                                                     │
                                                     ▼
                       racelink.domain.specials.serialize_rl_preset_editor_schema()
                                                     │
                                    GET /racelink/api/rl-presets/schema
                                                     │
                                                     ▼
                          racelink.static.racelink.js :: ensureRlPresetUiSchema()
                                                     │
                                                     ▼
                             buildRlPresetForm() consumes options[].slots and
                               schema.paletteColorRules to drive the editor
No JS-side hardcoding remains: paletteForcesSlot reads the rule from
the schema (with a small literal fallback for the case where an old
backend hasn't shipped the field yet, intentionally matching the stock
values so behaviour is preserved during a rolling upgrade).
The deterministic-effects catalogue in
wled_deterministic.py is
the only WLED-derived module that is not auto-extracted — it
encodes a hand-audited subset of FX.cpp per the workflow below.
WLED OTA gate matrix¶
The four gates that WLED's /update handler enforces — same-subnet,
settings-PIN, OTA-lock, release-name — plus the five firmware-side
options to ship same-subnet=false live in
../reference/wled-ota-gates.md.
The recommendation is unchanged: ship the racelink_wled usermod
override (Option 1) on new firmware images and keep the host-side
auto-unlock (OTAService._wled_attempt_unlock, Option 5) as the
safety net.
Host-side per-device cleanup contract¶
Three load-bearing semantics in
racelink/services/ota_workflow_service.py
are easy to break when refactoring the per-device loop: the
AP-Enable retry shape (1.5 s × 2), conditional AP-Close (only on the
error-after-AP-open path), and the two-track per-device error
surface (dev_res["error"] + device_messages[addr_key]). The
canonical wording lives in the module docstring at the top of
ota_workflow_service.py so it travels with the code; the cleanup
contract is the kind of constraint a refactor commit author needs to
see while editing that file, not later via doc cross-reference.
If you're adding a step between AP-Enable and the success-path
dev_res["ok"] = True that opens any other long-lived host state
(a held lock, an external connection, etc.), the same
try/except/finally pattern must clean it up. The reference
implementation does this for the host's nmcli connection via
_restore_host_wifi in the outer finally.
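The shape of that contract can be sketched as follows. This is not the code in ota_workflow_service.py: `run_device_step` and every callable passed to it are hypothetical stand-ins, and reading "1.5 s × 2" as two attempts with a 1.5 s pause between them is an assumption. The module docstring remains the authority.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger(__name__)


def run_device_step(
    addr_key: str,
    enable_ap: Callable[[], bool],
    do_update: Callable[[], None],
    close_ap: Callable[[], None],
    release_extra: Callable[[], None],
    device_messages: dict,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Hypothetical per-device step; all callables are illustrative."""
    dev_res = {"ok": False}
    ap_open = False
    try:
        # Assumed retry shape: two attempts, 1.5 s pause between them.
        for attempt in range(2):
            if enable_ap():
                ap_open = True
                break
            if attempt == 0:
                sleep(1.5)
        if not ap_open:
            raise RuntimeError("AP-Enable failed after 2 attempts")
        do_update()
        dev_res["ok"] = True
    except Exception as exc:
        # Logged, then surfaced on both error tracks.
        logger.warning("device %s failed: %s", addr_key, exc)
        dev_res["error"] = str(exc)
        device_messages[addr_key] = str(exc)
        if ap_open:
            close_ap()  # only on the error-after-AP-open path
    finally:
        # Any extra long-lived host state is released on every path.
        release_extra()
    return dev_res
```

The `sleep` parameter is injectable only so the retry pause can be observed in a test without waiting.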
task_manager.snapshot() adds a top-level elapsed_s field
(max(0, (ended_ts or now) - started_ts)) so the WebUI's live timer
can anchor on the server-computed value instead of
Date.now() / 1000 - started_ts, which would otherwise expose
host-vs-browser clock skew. Any new long-running task gets the
field for free; no per-workflow opt-in.
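As a worked instance of the formula above, here is a minimal standalone sketch. The function `elapsed_s` and its `now` parameter are illustrative, not the actual task_manager.snapshot() implementation.

```python
import time
from typing import Optional


def elapsed_s(
    started_ts: float,
    ended_ts: Optional[float] = None,
    now: Optional[float] = None,
) -> float:
    """Server-side elapsed time: max(0, (ended_ts or now) - started_ts)."""
    current = time.time() if now is None else now
    # A finished task freezes at ended_ts; a running one tracks the clock.
    return max(0, (ended_ts or current) - started_ts)
```

Because both timestamps come from the host clock, the browser only has to display the value; the clamp to zero covers a host clock that steps backwards.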
Updating the WLED-deterministic effects list¶
The RL-preset editor marks "deterministic" WLED effects with a leading
* and sorts them to the top of the dropdown so operators picking
offset-mode-safe effects see them first. Deterministic = the effect's
pixel output depends only on synced inputs (strip.now + segment
params), so two nodes with synchronised strip.timebase render
identically. The audited set is in
racelink/domain/wled_deterministic.py
(currently 19 effects); the source-of-truth catalogue lives in the WLED
fork at usermods/racelink_wled/docs/effects-deterministic.md
(the same content is also available in this consolidation at
reference/deterministic-effects.md).
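The tagging and sorting described above can be pictured with a small sketch. The function name `tag_and_sort`, the option dict shape, the "* " label prefix, and the ordering by ID within each group are assumptions; the effect IDs and names in the usage example are placeholders, not entries from the audited set.

```python
def tag_and_sort(effects: dict, deterministic_ids: frozenset) -> list:
    """Mark deterministic effects with a leading '*' and sort them first."""
    options = []
    for effect_id, name in effects.items():
        is_det = effect_id in deterministic_ids
        options.append(
            {
                "id": effect_id,
                "label": f"* {name}" if is_det else name,
                "deterministic": is_det,
            }
        )
    # False sorts before True, so deterministic entries come first;
    # ordering by ID inside each group is an assumption.
    options.sort(key=lambda opt: (not opt["deterministic"], opt["id"]))
    return options


# Placeholder catalogue: IDs and names are made up for illustration.
demo = tag_and_sort({1: "Effect A", 2: "Effect B", 3: "Effect C"}, frozenset({3}))
```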
When to update: a WLED release adds/changes an effect, or the catalogue doc grows a new "✓" entry.
How:
- Read the analysis doc, especially §"How to verify a new / unlisted effect". Apply its 5-step grep checklist to the effect's body in wled00/FX.cpp.
- If it passes: add the numeric ID to WLED_DETERMINISTIC_EFFECT_IDS in wled_deterministic.py with an inline comment naming the effect + FX.cpp anchor.
- Update the pin test in tests/test_wled_effect_metadata.py::WledDeterministicTaggingTests::test_deterministic_id_set_matches_analysis: add the same ID and bump the len() assertion.
- py -m pytest tests/test_wled_effect_metadata.py -q should still pass.
- The frontend picks the change up automatically (no JS / CSS edit needed; backend ships the flag + the sort).
When removing: same flow in reverse — a WLED patch that introduces
RNG / beat*-without-timebase / per-frame SEGENV.step accumulation
into a previously-deterministic effect demotes it. Drop the ID from
both wled_deterministic.py and the pin test; update the catalogue's
table to move the effect from "✓" to "⚠ Looks deterministic but is
not" with the new failure mode.
The full step-by-step workflow (including the rationale, the
deterministic criteria, and the failure modes) lives in the module
docstring of wled_deterministic.py itself — anyone editing the file
sees it immediately.
Smoke-testing your change¶
Before opening a PR:
- py -m pytest -q: full suite must pass.
- node --check racelink/static/racelink.js and node --check racelink/static/scenes.js if you touched JS.
- py -m pytest tests/test_no_german_in_ui.py: confirms no accidental German strings in operator-facing UI.
- py -m pytest tests/test_proto_header_drift.py: if you touched racelink_proto.h.
- py -m pytest tests/test_exception_hygiene.py: confirms every except Exception you added is either logged or annotated.
For features the test suite can't fully cover (frontend behaviour, RF-level interactions), add a manual smoke checklist to your PR description. The internal engineering ledger contains good examples — every shipped batch ends with a list of "open the app, click X, confirm Y" steps.