Introduction
Delivering consistent application performance over unreliable or congested links is a constant challenge for most networks. Even with great features like Enhanced Application Aware Routing or TCP Optimization, there are link conditions that go beyond what failover, load balancing or optimization can solve.
This is where Forward Error Correction (FEC) comes in. By adding a recovery mechanism at the packet level, FEC allows Cisco SD-WAN to mask packet loss and maintain application performance without relying on retransmissions.
In this post, we will explore how effective FEC can be in SD-WAN by simulating lossy conditions and measuring recovery rates.
If you are evaluating FEC for your deployment or are just curious about how it works, this post will walk you through both the theory and the practice.
Let’s go!
What is Forward Error Correction (FEC)?
Forward Error Correction (FEC) is a technique that improves data transmission reliability by adding redundant information to packets before they’re sent across the network. Instead of waiting for retransmissions when packets are lost, the receiver uses this redundancy to reconstruct missing data on the fly.
In the Cisco SD-WAN implementation, FEC creates blocks of 4 data packets + 1 parity packet. If one of those 4 packets is lost along the way, the receiver can fill in the gaps using the parity packet by performing an XOR operation.
Let’s see it in a diagram
The sender transmits information to the receiver, but packet 3 is lost in transit. The receiver can use the parity packet to reconstruct packet 3 and avoid retransmissions and delays that would impact application experience.
Notice that if more than one packet in a block is lost (the parity packet counts too), reconstruction is not possible. The block size is always 4 and cannot be changed. A block can contain packets from multiple flows.
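To make the XOR step concrete, here is a minimal Python sketch of the idea. It is not Cisco's implementation: the function names (`make_parity`, `recover`) are made up for illustration, and it assumes all packets in a block have the same length (a real implementation would have to deal with different sizes, for example by padding).

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet by XOR-ing every data packet in the block."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild a block in which at most one data packet (None) is missing."""
    missing = [i for i, pkt in enumerate(received) if pkt is None]
    if len(missing) > 1:
        raise ValueError("more than one packet lost: block cannot be rebuilt")
    rebuilt = list(received)
    if missing:
        survivors = [pkt for pkt in received if pkt is not None]
        # XOR of the parity with all surviving packets yields the lost one.
        rebuilt[missing[0]] = reduce(xor_bytes, survivors, parity)
    return rebuilt


block = [b"pkt1", b"pkt2", b"pkt3", b"pkt4"]    # 4 data packets
parity = make_parity(block)                      # 1 parity packet
damaged = [block[0], block[1], None, block[3]]   # packet 3 lost in transit
restored = recover(damaged, parity)
print(restored[2])  # b'pkt3'
```

Passing a block with two missing packets raises `ValueError`, which mirrors the limit described above: one parity packet can only repair one loss per block.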
Note: Because FEC adds 1 parity packet for every block of 4 data packets, it increases bandwidth consumption (roughly 25% more packets on the wire).
There are 2 modes of operation:
- Always: FEC will be applied to all traffic matching the policy sequence regardless of packet loss levels.
- Adaptive: FEC is applied only once packet loss reaches a configured threshold, for example 2% or more. The loss percentage is computed from BFD packets.
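The difference between the two modes can be summarized in a few lines of Python. This is only a conceptual model of the behavior described above, not device logic: `fec_active` and its parameters are hypothetical names, and the 2% default simply reuses the example threshold.

```python
def fec_active(mode: str, measured_loss_pct: float, threshold_pct: float = 2.0) -> bool:
    """Return True when FEC would be applied to traffic matching the policy."""
    if mode == "always":
        return True  # applied regardless of the measured loss
    if mode == "adaptive":
        return measured_loss_pct >= threshold_pct  # only at or above the threshold
    raise ValueError(f"unknown mode: {mode}")


for loss in (0.5, 2.0, 4.0):
    print(loss, fec_active("always", loss), fec_active("adaptive", loss))
```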
FEC is particularly useful in real-time applications like voice, video, or interactive sessions, where waiting for retransmissions would cause severe delays.
Importantly, FEC operates between SD-WAN edge devices, which makes it completely transparent to applications: there’s no need to modify client or server behavior. However, it only works with IPsec encapsulation; it’s not supported over GRE tunnels.
One critical implementation detail is packet size: if packets are too large and end up being fragmented, FEC’s ability to reconstruct them is significantly reduced. To get the most out of FEC, make sure the payload size stays below the path MTU to avoid fragmentation.
Configuration
Using Policy Groups, we can configure FEC through a data policy that matches interesting traffic and applies the Loss Correction action.
In my case, I matched all traffic between 172.16.10.0/24 and 172.16.100.0/24. Notice that both modes of operation are available: Always and Adaptive.
If FEC Adaptive is selected, the threshold can be set between 1% and 5% loss.
Here is the full configuration of my policy:
vsmart_1# show running-config policy
policy
 data-policy data_all_FEC
  vpn-list vpn_Corporate_Users
   sequence 1
    match
     source-ip 172.16.100.0/24
     destination-ip 172.16.10.0/24
    !
    action accept
     loss-protect fec-always
     loss-protection forward-error-correction always
    !
   !
   sequence 11
    match
     source-ip 172.16.10.0/24
     destination-ip 172.16.100.0/24
    !
    action accept
     loss-protect fec-always
     loss-protection forward-error-correction always
    !
   !
   default-action accept
  !
 !
 lists
  vpn-list vpn_Corporate_Users
   vpn 10
  !
  site-list site_10_100
   site-id 10
   site-id 100
  !
 !
apply-policy
 site-list site_10_100
  data-policy data_all_FEC from-service
 !
!
!
Verifying FEC
There aren’t many verification commands, but we can confirm FEC is operational with the following one:
Munich_DC100-1#show sdwan tunnel statistics fec
tunnel stats ipsec 21.101.0.2 21.11.0.2 12346 12346
fec-rx-data-pkts 16243
fec-rx-parity-pkts 4075
fec-tx-data-pkts 7
fec-tx-parity-pkts 1
fec-reconstruct-pkts 935
fec-capable true
fec-dynamic false
The fec-reconstruct-pkts counter indicates that 935 packets have been recovered.
Also, notice that we can easily see how many parity packets are sent and received; they are roughly 1/4 of the data packets sent or received.
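If you want to check that ratio without a calculator, a few lines of Python are enough. This sketch just parses the counters pasted from the output above (the tunnel header line is left out); the variable names are mine.

```python
# Counter lines copied from `show sdwan tunnel statistics fec` above.
output = """\
fec-rx-data-pkts 16243
fec-rx-parity-pkts 4075
fec-tx-data-pkts 7
fec-tx-parity-pkts 1
fec-reconstruct-pkts 935
fec-capable true
fec-dynamic false
"""

counters = {}
for line in output.splitlines():
    name, value = line.split()
    if value.isdigit():  # skip boolean fields such as fec-capable
        counters[name] = int(value)

rx_ratio = counters["fec-rx-parity-pkts"] / counters["fec-rx-data-pkts"]
print(f"rx parity/data ratio: {rx_ratio:.3f}")  # 0.251, close to the expected 1/4
```

The transmit counters in this capture (7 data, 1 parity) are too small to say anything meaningful about the ratio in that direction.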
The same information is also available through the real-time information in the Manager’s UI.
Testing FEC
Let’s run some tests to see FEC in action and the amount of packet loss that can be recovered. I will show different results to understand where FEC delivers the best results.
Note that there is loss outside of the SD-WAN routers that I cannot control. To get more precise results, I first had to find the rate at which iperf3 reported 0% packet loss most of the time, and then started introducing controlled loss from there.
iperf -c 172.16.100.11 -u -b 450k -t 30 -l 361 --dscp ef
A bandwidth of 450 kbps is roughly equivalent to 5 VoIP calls, and each datagram carries a payload of 361 bytes.
In this case, I am running unidirectional tests, but keep in mind FEC works in both directions.
Loss % introduced | Total Sent Packets | Total Received Packets | Recovered Packets | Effective Loss % |
---|---|---|---|---|
1 | 4693 | 4639 | 54 | 0 |
2 | 4694 | 4588 | 96 | 0.24 |
3 | 4693 | 4558 | 111 | 0.58 |
4 | 4694 | 4524 | 147 | 0.51 |
5 | 4693 | 4448 | 195 | 0.68 |
6 | 4693 | 4401 | 231 | 1.3 |
7 | 4693 | 4374 | 238 | 1.8 |
8 | 4693 | 4331 | 283 | 1.8 |
9 | 4693 | 4292 | 297 | 2.2 |
10 | 4693 | 4215 | 304 | 3.8 |
12 | 4693 | 4122 | 348 | 3.8 |
15 | 4695 | 3941 | 382 | 8 |
18 | 4696 | 3815 | 356 | 11 |
20 | 4696 | 3731 | 368 | 13 |
Let’s see some interesting visuals:
As packet loss increases, the number of recovered packets also grows, up to a certain point. This is expected: FEC adds redundancy, and the more packets are lost, the more recovery is needed. However, there’s a natural limit to this capability. If two or more packets within the same FEC block are lost (data or parity), the lost data packets cannot be rebuilt, and effective loss starts to climb.
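A rough back-of-the-envelope model helps explain where that limit comes from. Assuming every packet is dropped independently with probability p (an assumption of mine; real links often lose packets in bursts, and I am not claiming the lab impairment behaved exactly this way), a data packet stays lost only if it is dropped and at least one of the other 4 packets in its block (3 data + 1 parity) is dropped as well. That gives p × (1 − (1 − p)^4), sketched below with a hypothetical `residual_loss` helper.

```python
def residual_loss(p: float, data_pkts: int = 4) -> float:
    """Expected unrecoverable loss rate for data packets.

    Model: blocks of `data_pkts` data packets plus 1 parity packet,
    each packet dropped independently with probability p.
    """
    others = data_pkts  # (data_pkts - 1) other data packets + 1 parity packet
    return p * (1 - (1 - p) ** others)


for pct in (1, 2, 5, 10, 20):
    p = pct / 100
    print(f"{pct:>2}% introduced -> {100 * residual_loss(p):.2f}% expected effective loss")
```

Under that assumption the model gives roughly 0.9% residual loss at 5% introduced loss and 3.4% at 10%, which is in the same ballpark as the measured 0.68% and 3.8%.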
It’s also important to note that FEC is a resource-intensive feature, so it should be enabled only for critical traffic, ideally in adaptive mode with a packet loss threshold rather than always on.
While this lab setup isn’t a perfect replica of real-world conditions, the results are still insightful. FEC kept effective loss below 1% with up to 5% introduced loss and still recovered around 70% of the lost packets at ~9% loss. Beyond that, recovery efficiency starts to drop. That said, it’s uncommon to see consistent 10%+ loss in production WAN transports, and even more so in both directions.
Finally, although these tests were unidirectional, it’s worth noting that FEC can be applied independently in each direction. This means a well-tuned deployment could tolerate about 5% packet loss per direction while maintaining good performance.
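If you want to derive the recovery ratio for every row of the table yourself, the short script below does it. It assumes that "Total Received Packets" counts packets before FEC recovery, so that lost = sent − received; the first row (4639 + 54 = 4693) supports that reading, but it is my interpretation of the columns.

```python
# (introduced loss %, sent, received before recovery, recovered) from the table above
rows = [
    (1, 4693, 4639, 54), (2, 4694, 4588, 96), (3, 4693, 4558, 111),
    (4, 4694, 4524, 147), (5, 4693, 4448, 195), (6, 4693, 4401, 231),
    (7, 4693, 4374, 238), (8, 4693, 4331, 283), (9, 4693, 4292, 297),
    (10, 4693, 4215, 304), (12, 4693, 4122, 348), (15, 4695, 3941, 382),
    (18, 4696, 3815, 356), (20, 4696, 3731, 368),
]

recovery = {}
for loss, sent, received, recovered in rows:
    lost = sent - received  # packets dropped on the path
    recovery[loss] = recovered / lost
    print(f"{loss:>2}%: lost {lost:>3}, recovered {recovered:>3} ({recovery[loss]:.0%})")
```

At 9% introduced loss, for example, this works out to 297 of 401 lost packets, or about 74%.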
Conclusion
Forward Error Correction (FEC) is a proactive technique that adds redundancy before packet transmission, allowing the receiver to recover from certain losses without needing retransmissions. This makes it especially valuable for real-time applications like voice and video, where waiting for retries would introduce harmful delays.
Remember that FEC isn’t free: it adds bandwidth overhead, and the receiving SD-WAN edge must spend additional processing power to reconstruct lost packets. Use it for critical traffic, ideally only once a specified packet loss threshold is reached.
FEC is not a replacement for fixing poor network links. Instead, it acts as a smart mitigation layer that helps smooth over transient or moderate packet loss, keeping user experience consistent even when the network isn’t perfect.
Overall, when deployed correctly, FEC can be a powerful tool in your SD-WAN toolbox, helping ensure consistent application performance over imperfect networks.
💭 What’s your take on Forward Error Correction in SD-WAN? Have you used it before? Do you have questions about how it works or when to enable it?
Drop your thoughts or doubts in the comments! I’d love to hear how others are approaching FEC in real world deployments. Let’s learn from each other!