Error Handling & Fault Confinement

Module 3: CAN Data Link Layer · 40 min

CAN's Five Error Detection Mechanisms

The CAN specification puts the residual probability of an undetected corrupted frame at less than the message error rate × 4.7 × 10⁻¹¹. It gets there through five independent mechanisms:

| Mechanism | Detected By | What It Catches |
|---|---|---|
| Bit Error | Transmitter | Transmitter reads back a different bit than it sent (outside arbitration and the ACK slot) |
| Stuff Error | Receiver | More than 5 consecutive same-polarity bits in a bit-stuffed field |
| CRC Error | Receiver | Computed CRC doesn't match received CRC |
| Form Error | Receiver | Fixed-form fields (delimiters, End of Frame) contain unexpected values |
| ACK Error | Transmitter | No node drives the ACK slot dominant |

Key Concept: All five mechanisms are implemented in hardware. You don't write error detection code; the controller does it automatically for every frame.
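
Although the controller does this in silicon, it can help to see what one of the checks actually computes. The sketch below is a software model of the classical CAN CRC-15 (generator polynomial 0x4599), for illustration only. It assumes the input is the destuffed bit sequence from Start of Frame through the end of the data field, stored one bit per array element; the function names are hypothetical and not part of any driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generator polynomial x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1,
 * written without the implicit x^15 term. */
#define CAN_CRC15_POLY 0x4599u

/* Compute the 15-bit CRC over a destuffed bit sequence.
 * bits[i] holds one bit (0 or 1) in transmission order. */
uint16_t can_crc15(const uint8_t *bits, size_t nbits)
{
    assert(bits != NULL || nbits == 0);
    uint16_t crc = 0;
    for (size_t i = 0; i < nbits; i++) {
        /* XOR the incoming bit with the register's top bit (bit 14). */
        unsigned feedback = (bits[i] & 1u) ^ ((crc >> 14) & 1u);
        /* Shift left by one, keeping the register 15 bits wide. */
        crc = (uint16_t)((crc << 1) & 0x7FFFu);
        if (feedback) {
            crc ^= CAN_CRC15_POLY;
        }
    }
    return crc;
}

/* Receiver-side check: returns 1 if the CRC carried in the frame
 * matches the CRC computed over the received bits, else 0. */
int can_crc15_matches(const uint8_t *bits, size_t nbits, uint16_t received_crc)
{
    return can_crc15(bits, nbits) == (received_crc & 0x7FFFu);
}
```

A receiver that gets 0 from `can_crc15_matches` would signal a CRC Error. Because the register starts at zero and there is no final inversion, running `can_crc15` over the message bits followed by their own 15 CRC bits yields 0, which is a handy self-check when experimenting.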

Error Signaling: The Error Frame

When any node detects an error, it transmits an Error Frame: an Error Flag of 6 dominant bits (Error Active) or 6 recessive bits (Error Passive), followed by an Error Delimiter of 8 recessive bits. The dominant Error Flag deliberately violates bit stuffing, which corrupts the frame and forces all nodes to discard it. The transmitter then retransmits the frame automatically.
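
To make the "violates bit stuffing" point concrete, here is a minimal sketch of the stuff rule as a receiver would apply it. It assumes the bus is sampled into an array with 0 for dominant and 1 for recessive; `can_find_stuff_error` is a hypothetical helper, not a controller API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAN_STUFF_LIMIT 6u  /* six equal bits in a row break the stuff rule */

/* Scan sampled bus bits (0 = dominant, 1 = recessive) taken from a
 * bit-stuffed region of a frame. Returns the index of the bit that
 * completes a run of six equal bits, or -1 if the rule is never broken. */
long can_find_stuff_error(const uint8_t *bits, size_t nbits)
{
    assert(bits != NULL || nbits == 0);
    size_t run = 0;
    for (size_t i = 0; i < nbits; i++) {
        if (i > 0 && bits[i] == bits[i - 1]) {
            run++;          /* same polarity as the previous bit */
        } else {
            run = 1;        /* polarity changed: start a new run */
        }
        if (run >= CAN_STUFF_LIMIT) {
            return (long)i;
        }
    }
    return -1;
}
```

A correctly stuffed stream such as `0,0,0,0,0,1,1,0` passes, because the transmitter inserted the complementary bit after five zeros. A stream onto which another node has driven six dominant bits, such as `0,1,0,0,0,0,0,0,1`, is flagged at index 7, so every receiver sees the same violation at the same bit.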

Fault Confinement: TEC, REC, and Bus-Off

Two error counters prevent a faulty node from permanently disrupting the bus:

  • TEC (Transmit Error Counter): Incremented on transmit errors
  • REC (Receive Error Counter): Incremented on receive errors

| State | Condition | Behavior |
|---|---|---|
| Error Active | TEC < 128 AND REC < 128 | Normal operation. Sends dominant Error Flags. |
| Error Passive | TEC ≥ 128 OR REC ≥ 128 | Sends recessive Error Flags. Must wait extra 8 bits before retransmitting. |
| Bus Off | TEC ≥ 256 | Disconnects from bus entirely. Requires 128 × 11 recessive bits or hardware reset to recover. |
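
The state conditions translate directly into a small classification function. This is a sketch for reasoning about the thresholds, not driver code: the type and function names are hypothetical, and real controllers typically report the state through their own status flags rather than asking you to compute it.

```c
#include <stdint.h>

typedef enum {
    CAN_FC_ERROR_ACTIVE,
    CAN_FC_ERROR_PASSIVE,
    CAN_FC_BUS_OFF
} can_fc_state_t;

/* Classify the fault confinement state from the two counters.
 * 16-bit inputs are used so that a TEC of 256 can be represented. */
can_fc_state_t can_fc_state(uint16_t tec, uint16_t rec)
{
    if (tec >= 256u) {
        return CAN_FC_BUS_OFF;        /* only TEC can take a node off the bus */
    }
    if (tec >= 128u || rec >= 128u) {
        return CAN_FC_ERROR_PASSIVE;  /* either counter is enough */
    }
    return CAN_FC_ERROR_ACTIVE;
}

/* Human-readable name, convenient for logging. */
const char *can_fc_state_name(can_fc_state_t state)
{
    switch (state) {
    case CAN_FC_ERROR_ACTIVE:  return "Error Active";
    case CAN_FC_ERROR_PASSIVE: return "Error Passive";
    case CAN_FC_BUS_OFF:       return "Bus Off";
    }
    return "unknown";
}
```

Note the asymmetry: a high REC can push a node into Error Passive, but only TEC can push it to Bus Off.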

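Counter rules (simplified; the specification defines a few exceptions): a transmit error adds 8 to TEC, a receive error adds 1 to REC, and each successful frame decrements the corresponding counter by 1.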

Key Concept: The 8:1 increment ratio means a faulty transmitter reaches Bus Off after 32 consecutive transmit errors (256 / 8), preventing it from flooding the bus indefinitely.
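
The arithmetic behind the 8:1 ratio can be checked with a short simulation. The sketch below applies only the simplified rules stated above (+8 per transmit error, −1 per successful transmission) and counts transmit errors until TEC reaches 256. The function name and the idea of a fixed number of successes between errors are assumptions made for illustration.

```c
#include <stdint.h>

#define TEC_ERROR_STEP    8u    /* added to TEC per transmit error */
#define TEC_BUS_OFF_LIMIT 256u  /* TEC value at which the node goes Bus Off */

/* Count how many transmit errors it takes to reach Bus Off when each
 * error is followed by a fixed number of successful transmissions.
 * Returns -1 if the successes always cancel the errors out. */
int tx_errors_until_bus_off(unsigned successes_between_errors)
{
    if (successes_between_errors >= TEC_ERROR_STEP) {
        return -1;  /* TEC drops back to 0 after every error */
    }
    unsigned tec = 0;
    int errors = 0;
    for (;;) {
        tec += TEC_ERROR_STEP;
        errors++;
        if (tec >= TEC_BUS_OFF_LIMIT) {
            return errors;
        }
        /* Successful frames decrement TEC by 1 each, never below 0. */
        tec = (tec > successes_between_errors)
                  ? tec - successes_between_errors
                  : 0u;
    }
}
```

With no successes in between, the function returns 32. With one success after every error it returns 37, and a node needs at least eight successes per error to stay out of Bus Off under these rules.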

Bus-Off Recovery in Practice

  • Automatic recovery: Wait for 128 occurrences of 11 consecutive recessive bits (1,408 bit times, at least ~2.8 ms at 500 kbit/s), then reinitialize the CAN controller
  • Limited retry: Track recovery attempts, stop after 3–5 failures, set permanent DTC
  • Power cycle dependency: Some OEMs require key-off/key-on for safety-critical ECUs
Common Mistake: A recurring Bus Off almost always indicates a physical layer problem. Check wiring, termination, transceiver, and ground offset. Don't just increase the retry count.
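
The recovery options above can be combined into a small policy object. The following is a minimal sketch under stated assumptions: all names are hypothetical, the retry limit is supplied by the caller (the 3 to 5 range above is typical), and latching `dtc_latched` stands in for whatever diagnostic service your ECU uses to store a permanent DTC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_OFF_RECOVERY_BITS (128u * 11u)  /* 1408 bit times */

/* Minimum recovery time in microseconds for a given bit rate,
 * rounded up. Assumes the bus is idle (recessive) the whole time. */
uint32_t bus_off_min_recovery_us(uint32_t bitrate_bps)
{
    assert(bitrate_bps > 0u);
    uint64_t scaled = (uint64_t)BUS_OFF_RECOVERY_BITS * 1000000u;
    return (uint32_t)((scaled + bitrate_bps - 1u) / bitrate_bps);
}

typedef enum {
    BUS_OFF_ACTION_RECOVER,      /* reinitialize the controller and rejoin */
    BUS_OFF_ACTION_STAY_OFFLINE  /* give up until reset or key cycle */
} bus_off_action_t;

typedef struct {
    uint8_t attempts;      /* recoveries since the last healthy period */
    uint8_t max_attempts;  /* hypothetical limit chosen by the integrator */
    bool    dtc_latched;   /* stand-in for setting a permanent DTC */
} bus_off_policy_t;

/* Call on every Bus Off event; decides whether to try recovery. */
bus_off_action_t bus_off_on_event(bus_off_policy_t *p)
{
    assert(p != NULL);
    if (p->dtc_latched) {
        return BUS_OFF_ACTION_STAY_OFFLINE;
    }
    if (p->attempts >= p->max_attempts) {
        p->dtc_latched = true;  /* retry budget exhausted */
        return BUS_OFF_ACTION_STAY_OFFLINE;
    }
    p->attempts++;
    return BUS_OFF_ACTION_RECOVER;
}

/* Call after a sustained period of error-free communication. */
void bus_off_on_healthy(bus_off_policy_t *p)
{
    assert(p != NULL);
    if (!p->dtc_latched) {
        p->attempts = 0;
    }
}
```

For example, `bus_off_min_recovery_us(500000)` gives 2816 µs. What counts as a "sustained healthy period" before calling `bus_off_on_healthy` is a design decision this sketch leaves open; resetting the count too eagerly would hide exactly the recurring Bus Off that points to a physical layer fault.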
Exercise: Find the TEC and REC registers in your CAN controller's register map. Write a function that logs both counters periodically. On a healthy bus, both should stay at 0 or very low values.
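
As a starting point for the exercise, here is one possible shape for the logging code. The register layout is hypothetical (TEC in bits 7..0 and REC in bits 15..8 of a single 32-bit error counter register); check your controller's reference manual for the real positions and replace the shifts accordingly. Reading the register and outputting the string are left to your platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* HYPOTHETICAL layout: adjust to your controller's register map. */
#define ERRCNT_TEC_SHIFT  0u
#define ERRCNT_REC_SHIFT  8u
#define ERRCNT_FIELD_MASK 0xFFu

typedef struct {
    uint8_t tec;
    uint8_t rec;
} can_err_counters_t;

/* Split a raw error counter register value into TEC and REC. */
can_err_counters_t can_decode_err_counters(uint32_t raw)
{
    can_err_counters_t c;
    c.tec = (uint8_t)((raw >> ERRCNT_TEC_SHIFT) & ERRCNT_FIELD_MASK);
    c.rec = (uint8_t)((raw >> ERRCNT_REC_SHIFT) & ERRCNT_FIELD_MASK);
    return c;
}

/* Format a log line into buf; returns snprintf's result. */
int can_format_err_counters(char *buf, size_t len, can_err_counters_t c)
{
    assert(buf != NULL);
    return snprintf(buf, len, "CAN TEC=%u REC=%u",
                    (unsigned)c.tec, (unsigned)c.rec);
}

typedef struct {
    uint32_t period_ms;  /* logging interval */
    uint32_t last_ms;    /* timestamp of the last log */
} can_log_timer_t;

/* Returns true once per period. The unsigned subtraction keeps the
 * comparison correct when the millisecond tick counter wraps. */
bool can_log_due(can_log_timer_t *t, uint32_t now_ms)
{
    assert(t != NULL);
    if ((uint32_t)(now_ms - t->last_ms) >= t->period_ms) {
        t->last_ms = now_ms;
        return true;
    }
    return false;
}
```

In a main loop you would call `can_log_due` with your tick counter and, when it returns true, read the register, decode it, and send the formatted line to your logger.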