Error Handling & Fault Confinement
CAN's Five Error Detection Mechanisms
CAN achieves a residual (undetected) error probability of less than 4.7 × 10⁻¹¹ times the message error rate through five independent mechanisms:
| Mechanism | Detected By | What It Catches |
|---|---|---|
| Bit Error | Transmitter | Transmitter reads back a different bit than it sent (a recessive bit overwritten during arbitration or in the ACK slot does not count) |
| Stuff Error | Receiver | Six consecutive bits of the same polarity in a bit-stuffed field |
| CRC Error | Receiver | Computed CRC doesn't match received CRC |
| Form Error | Receiver | Fixed-form fields contain unexpected values |
| ACK Error | Transmitter | No node drives the ACK slot dominant |
Key Concept: All five mechanisms are implemented in hardware. You don't write error detection code — the controller does it automatically for every frame.
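Although the controller performs these checks in hardware, a software model helps when you analyze captured bit streams or just want to see what the checks do. The following is a minimal sketch, not controller code: the function names `find_stuff_error` and `can_crc15` are illustrative, and bits are assumed to be stored one per byte (0 = dominant, 1 = recessive). The CRC routine uses the Classical CAN generator polynomial 0x4599 with a zero initial value.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the bit that completes a run of six identical
 * bits (a stuff error), or -1 if no such run exists.
 * bits[i] is 0 (dominant) or 1 (recessive). */
int find_stuff_error(const uint8_t *bits, size_t n)
{
    size_t run = 0;
    for (size_t i = 0; i < n; i++) {
        /* Extend the current run, or start a new one on a level change. */
        run = (i > 0 && bits[i] == bits[i - 1]) ? run + 1 : 1;
        if (run == 6) {
            return (int)i;
        }
    }
    return -1;
}

/* CRC-15 over a destuffed bit sequence: generator polynomial 0x4599,
 * register initialised to 0, bits fed in transmission order,
 * no final XOR. */
uint16_t can_crc15(const uint8_t *bits, size_t n)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < n; i++) {
        /* XOR the incoming bit with the register's top bit (bit 14). */
        uint16_t crc_next = (uint16_t)((bits[i] & 1u) ^ ((crc >> 14) & 1u));
        crc = (uint16_t)((crc << 1) & 0x7FFFu); /* keep 15 bits */
        if (crc_next) {
            crc ^= 0x4599u;
        }
    }
    return crc;
}
```

A useful property for checking a capture: running `can_crc15` over the message bits followed by the 15 received CRC bits yields 0 for an uncorrupted frame, and a non-zero value if any single bit was flipped.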
Error Signaling: The Error Frame
When any node detects an error, it transmits an Error Frame: an Error Flag of 6 dominant bits (Error Active) or 6 recessive bits (Error Passive), followed by an Error Delimiter of 8 recessive bits. The dominant Error Flag deliberately violates the bit-stuffing rule, so every other node also sees an error and discards the corrupted frame.
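To make the bit pattern concrete, here is a small sketch of what a single node contributes to an Error Frame, plus the wired-AND rule that decides the resulting bus level. All names (`build_error_frame`, `bus_level`, the enum constants) are hypothetical, and the model ignores timing and the superposition of flags from several nodes.

```c
#include <stddef.h>
#include <stdint.h>

enum { CAN_DOMINANT = 0, CAN_RECESSIVE = 1 };
enum {
    ERROR_FLAG_BITS      = 6,
    ERROR_DELIMITER_BITS = 8,
    ERROR_FRAME_BITS     = ERROR_FLAG_BITS + ERROR_DELIMITER_BITS
};

/* Fills out[] with the 14 bits one node sends as an Error Frame:
 * a 6-bit Error Flag (dominant if Error Active, recessive if
 * Error Passive) followed by 8 recessive delimiter bits. */
void build_error_frame(int error_passive, uint8_t out[ERROR_FRAME_BITS])
{
    uint8_t flag_level = error_passive ? CAN_RECESSIVE : CAN_DOMINANT;
    for (size_t i = 0; i < ERROR_FLAG_BITS; i++) {
        out[i] = flag_level;
    }
    for (size_t i = ERROR_FLAG_BITS; i < ERROR_FRAME_BITS; i++) {
        out[i] = CAN_RECESSIVE;
    }
}

/* Wired-AND bus: the level is dominant (0) if any node drives dominant. */
uint8_t bus_level(uint8_t node_a, uint8_t node_b)
{
    return (uint8_t)(node_a & node_b);
}
```

Because dominant always wins on the bus, a recessive (passive) Error Flag cannot overwrite traffic from other nodes, which is why an Error Passive node is far less disruptive than an Error Active one.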
Fault Confinement: TEC, REC, and Bus-Off
Two error counters prevent a faulty node from permanently disrupting the bus:
- TEC (Transmit Error Counter): Incremented on transmit errors
- REC (Receive Error Counter): Incremented on receive errors
| State | Condition | Behavior |
|---|---|---|
| Error Active | TEC < 128 AND REC < 128 | Normal operation. Sends dominant Error Flags. |
| Error Passive | TEC ≥ 128 OR REC ≥ 128 | Sends recessive Error Flags. Must wait an extra 8 bit times after an own transmission before transmitting again. |
| Bus Off | TEC ≥ 256 | Disconnects from bus entirely. Requires 128 occurrences of 11 consecutive recessive bits or a hardware reset to recover. |
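The state table maps directly onto a small classification function. This is a sketch under one assumption worth flagging: the counters are held in 16-bit variables so that the value 256 can be represented, whereas many real controllers expose 8-bit counters plus a separate Bus Off flag. The names `can_fc_state_t` and `can_fc_state` are illustrative.

```c
#include <stdint.h>

typedef enum {
    CAN_ERROR_ACTIVE,
    CAN_ERROR_PASSIVE,
    CAN_BUS_OFF
} can_fc_state_t;

/* Derives the fault confinement state from the two error counters,
 * using the thresholds from the table above. */
can_fc_state_t can_fc_state(uint16_t tec, uint16_t rec)
{
    if (tec >= 256) {
        return CAN_BUS_OFF;        /* only TEC can cause Bus Off */
    }
    if (tec >= 128 || rec >= 128) {
        return CAN_ERROR_PASSIVE;
    }
    return CAN_ERROR_ACTIVE;
}
```

Note that REC alone can push a node to Error Passive but never to Bus Off: a node that only receives cannot be removed from the bus by this mechanism.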
Counter rules (simplified): a transmit error adds 8 to TEC, a receive error adds 1 to REC, and each successfully transmitted or received frame subtracts 1 from the corresponding counter.
Key Concept: Because a transmit error adds 8 while a successful frame subtracts only 1, a faulty transmitter reaches Bus Off after just 32 consecutive transmit errors (256 / 8), which prevents it from flooding the bus indefinitely.
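The arithmetic can be checked with a tiny simulation of the simplified counter rules. This sketch deliberately leaves out the additional cases in the full specification, and the type and function names (`can_counters_t`, `on_tx_error`, and so on) are made up for illustration.

```c
#include <stdint.h>

typedef struct {
    uint16_t tec; /* Transmit Error Counter */
    uint16_t rec; /* Receive Error Counter  */
} can_counters_t;

/* Simplified rules: +8 per transmit error, +1 per receive error,
 * -1 per successful frame (never below zero). */
void on_tx_error(can_counters_t *c)   { c->tec = (uint16_t)(c->tec + 8); }
void on_rx_error(can_counters_t *c)   { c->rec = (uint16_t)(c->rec + 1); }
void on_tx_success(can_counters_t *c) { if (c->tec > 0) { c->tec--; } }
void on_rx_success(can_counters_t *c) { if (c->rec > 0) { c->rec--; } }

/* Counts how many back-to-back transmit errors it takes to reach
 * Bus Off (TEC >= 256) from the given starting counters. */
unsigned tx_errors_until_bus_off(can_counters_t c)
{
    unsigned n = 0;
    while (c.tec < 256) {
        on_tx_error(&c);
        n++;
    }
    return n;
}
```

Starting from zeroed counters, `tx_errors_until_bus_off` returns 32. The same rules also show the break-even point: one transmit error is cancelled only by eight successful transmissions, so a node failing more often than about one frame in nine will keep climbing toward Bus Off.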
Bus-Off Recovery in Practice
- Automatic recovery: Wait for 128 × 11 = 1408 recessive bit times (~2.8 ms at 500 kbit/s), then reinitialize the CAN controller
- Limited retry: Track recovery attempts, stop after 3–5 failures, set permanent DTC
- Power cycle dependency: Some OEMs require key-off/key-on for safety-critical ECUs
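The first two strategies can be sketched in a few lines: one helper computes the minimum recovery time for a given bit rate, and a small policy object caps the number of recovery attempts. Everything here is an assumption for illustration, including the names (`bus_off_policy_t`, `bus_off_should_recover`), the idea of resetting the attempt count after a period of healthy communication, and the representation of the DTC as a simple latched flag. Your OEM requirements decide the real limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUS_OFF_RECOVERY_BITS (128u * 11u) /* 1408 recessive bit times */

/* Minimum bus-idle time needed for Bus Off recovery, in microseconds. */
uint32_t bus_off_recovery_time_us(uint32_t bitrate_bps)
{
    assert(bitrate_bps > 0);
    return (uint32_t)(((uint64_t)BUS_OFF_RECOVERY_BITS * 1000000u) / bitrate_bps);
}

typedef struct {
    uint8_t attempts;     /* recoveries tried so far          */
    uint8_t max_attempts; /* e.g. 3 to 5, per the list above   */
    bool    dtc_latched;  /* set once retries are exhausted    */
} bus_off_policy_t;

/* Call on every Bus Off event. Returns true if another recovery
 * attempt is allowed; otherwise latches the DTC and returns false. */
bool bus_off_should_recover(bus_off_policy_t *p)
{
    if (p->dtc_latched) {
        return false;
    }
    if (p->attempts >= p->max_attempts) {
        p->dtc_latched = true;
        return false;
    }
    p->attempts++;
    return true;
}

/* Assumed policy: clear the attempt count after sustained healthy
 * communication. A latched DTC is intentionally left in place. */
void bus_off_policy_reset(bus_off_policy_t *p)
{
    p->attempts = 0;
}
```

With these definitions, `bus_off_recovery_time_us(500000)` evaluates to 2816, and a policy with `max_attempts = 3` allows three recoveries before the fourth Bus Off event latches the DTC.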
Common Mistake: A recurring Bus Off almost always indicates a physical layer problem. Check wiring, termination, transceiver, and ground offset. Don't just increase the retry count.
Exercise: Find the TEC and REC registers in your CAN controller's register map. Write a function that logs both counters periodically. On a healthy bus, both should stay at 0 or very low values.
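As a possible starting point for the exercise, the sketch below decodes both counters from a single 32-bit error status word and formats a log line. The register layout (REC in bits 31:24, TEC in bits 23:16) is a placeholder modeled on controllers that pack both counters into one status register; the shift values, `decode_error_status`, and `format_counter_log` are all hypothetical, so replace them with what your controller's reference manual specifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder layout: adjust to your controller's register map. */
#define ESR_REC_SHIFT 24u
#define ESR_TEC_SHIFT 16u
#define ESR_CNT_MASK  0xFFu

typedef struct {
    uint8_t tec;
    uint8_t rec;
} can_err_counters_t;

/* Extracts TEC and REC from a raw error status register value. */
can_err_counters_t decode_error_status(uint32_t esr)
{
    can_err_counters_t c;
    c.tec = (uint8_t)((esr >> ESR_TEC_SHIFT) & ESR_CNT_MASK);
    c.rec = (uint8_t)((esr >> ESR_REC_SHIFT) & ESR_CNT_MASK);
    return c;
}

/* Writes one log line into buf; returns snprintf's result. */
int format_counter_log(char *buf, size_t len, uint32_t timestamp_ms,
                       can_err_counters_t c)
{
    return snprintf(buf, len, "[%lu ms] CAN TEC=%u REC=%u",
                    (unsigned long)timestamp_ms,
                    (unsigned)c.tec, (unsigned)c.rec);
}
```

In firmware you would read the real register in a periodic task (for example once per second), pass the value to `decode_error_status`, and send the formatted line to your logging channel. Logging only when a value changes keeps the output quiet on a healthy bus.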