Section 1
Transport of fragmented UDP packets appears to be a poorly tested code path on network devices. Some devices appear to be incapable of transporting fragmented UDP packets, making it difficult to deploy RADIUS in a network where those devices are deployed.
[BA] In particular, filtering routers and firewalls often drop UDP fragments, since otherwise they would need to reassemble them in order to apply the filter rules. However, this is not so much a “transport” issue as a forwarding/filtering issue. Instead of “Transport” you might say “handling”.
* Connectionless transport. Neither clients nor servers receive positive statements that a "connection" is down. This information has to be deduced instead from the absence of a reply to a request.
[BA] The same thing is also true of TCP transport, unless you’re willing to wait for a Reset or connection timeout. That’s why the Watchdog timer is needed.
As RADIUS is widely deployed, and has been widely deployed for well over a decade, these issues have been minor in some use-cases, and problematic in others.. New systems may be interested in choosing a different set of trade-offs than those outlined in [RFC2865] Section 2.4. New systems may also be interested in choosing a more reliable transport for use-cases such as inter-server proxying. For those systems, we define RADIUS over TCP
[BA] Note the double period in the first sentence, and the missing period in the last sentence. As I read the document, it is really only suggesting a different set of tradeoffs for inter-server proxying. So you might say “For use-cases such as inter-server proxying, [RTLS] suggests an alternative transport and security model -- RADIUS over TLS. This document describes the transport implications of running RADIUS over TLS/TCP.”
1.1. Applicability of Reliable Transport
The intent of this document is to address transport issues related to RADIUS over TLS [RTLS]. The use of "bare" TCP transport (i.e. without TLS) is NOT RECOMMENDED, as there has been little implementational or operational experience with it. Additionally, [RFC2865] Section 2.4 contains a list of reasons why UDP was originally chosen as the transport protocol for RADIUS. UDP SHOULD be used as transport protocol in all cases where the rationale given in [RFC2865] Section 2.4 applies.
Deployment experience with RADIUS over TLS indicates that it is most useful for inter-server communication, such as inter-domain communication between proxies. These situations benefit from the confidentiality and ciphersuite negotiation that can be provided by TLS. Since TLS is already widely available within the operating systems used by proxies, implementation barriers are low.
RADIUS over TCP has a similar set of use cases. Use of TCP as a transport between a NAS and RADIUS server is a poor fit, since as noted in [RFC3539], there is likely to be insufficient traffic for the congestion window to remain above the minimum value on a long-term basis. The result is an increase in packets due to ACKs as compared to UDP, without a corresponding set of benefits.
In server-server communications the traffic levels in both directions are typically high enough to support a larger congestion window as well as ACK piggy-backing. Through use of an application-layer watchdog as described in [RFC3539], it is possible to address the objections to reliable transport described in [RFC2865] Section 2.4. However, in these scenarios "bare" TCP does not provide for confidentiality or enable negotiation of stronger ciphersuites than are available in RADIUS.
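To illustrate how the absence of traffic can be turned into a liveness decision, here is a minimal Python sketch of an application-layer watchdog. It is deliberately simpler than, and is not, the algorithm specified in [RFC3539]. The class name `Watchdog`, its methods, the returned status strings, and the 30-second interval used below are all hypothetical choices made for illustration.

```python
import time


class Watchdog:
    """Hypothetical liveness tracker for one connection to a peer."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval      # seconds of silence tolerated
        self.clock = clock            # injectable so the logic can be tested
        self.last_heard = clock()
        self.probe_sent = False

    def on_receive(self):
        """Call whenever any packet arrives from the peer."""
        self.last_heard = self.clock()
        self.probe_sent = False

    def poll(self):
        """Return 'ok', 'send_probe', 'waiting' or 'down'."""
        idle = self.clock() - self.last_heard
        if idle < self.interval:
            return "ok"               # regular traffic doubles as a heartbeat
        if not self.probe_sent:
            self.probe_sent = True    # caller should send a watchdog request now
            return "send_probe"
        if idle >= 2 * self.interval:
            return "down"             # still silent after twice the interval
        return "waiting"
```

A caller would invoke `poll()` periodically, for example with `Watchdog(interval=30.0)`. On a busy inter-server link, `on_receive()` is called often enough that `poll()` rarely asks for a probe, whereas a mostly idle link would need probes regularly.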
As a result of these considerations, use of RADIUS over TCP SHOULD be restricted to situations where RADIUS over TLS is employed. RADIUS over "bare" TCP is NOT RECOMMENDED.
There are still a number of benefits to using a reliable transport. For example, when RADIUS is used to carry EAP conversations [RFC3579], the EAP exchanges may involve 5 round trips at the RADIUS application layer. We may assume a probability P of packet loss in each direction (with P having a value of 1% or less). Any one authentication attempt will then have at least one lost packet, with a probability of approximately (10 * P).
These lost packets require the supplicant and/or the NAS to retransmit packets at the application layer. The difficulty with this approach is that retransmission implementations have historically been poor. Some implementations retransmit packets, others do not, and others send new packets rather than performing retransmission. Some implementations are incapable of detecting EAP retransmissions, and will instead treat the retransmitted packet as an error.
These retransmissions have a high likelihood of causing the entire authentication session to fail. For a system with a million logins a day, and having a packet loss probability of P=0.01%, we expect that 0.1% of connections will experience a lost packet. That is, 1,000 user sessions each day will experience authentication failure.
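The arithmetic above can be checked with a short Python sketch. It assumes that packet losses are independent, that each round trip consists of two packets (one in each direction), and, as the text does, that any lost packet leads to a failed session. The function names are illustrative only.

```python
def p_at_least_one_loss(round_trips: int, p_loss: float) -> float:
    """Exact probability that at least one packet of the exchange is lost."""
    packets = 2 * round_trips                 # one packet in each direction
    return 1.0 - (1.0 - p_loss) ** packets


def linear_approximation(round_trips: int, p_loss: float) -> float:
    """The (packets * P) shortcut, valid when P is small."""
    return 2 * round_trips * p_loss


def sessions_affected_per_day(logins_per_day: int, round_trips: int,
                              p_loss: float) -> float:
    """Expected number of sessions per day that see at least one lost packet."""
    return logins_per_day * p_at_least_one_loss(round_trips, p_loss)


if __name__ == "__main__":
    p = 0.0001                                # P = 0.01%
    print(p_at_least_one_loss(5, p))          # exact value
    print(linear_approximation(5, p))         # the 10 * P shortcut
    print(sessions_affected_per_day(1_000_000, 5, p))
```

For 5 round trips and P=0.01%, the exact value and the 10 * P shortcut agree closely, giving about 0.1% of sessions, or roughly 1,000 out of a million logins. The shortcut slightly overestimates, and the gap grows as P or the number of round trips increases.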
In addition, transport of fragmented UDP packets is a poorly tested code path on network devices. Some devices appear to be incapable of transporting fragmented UDP packets, meaning that the packet loss rate for fragmented packets approaches 100 percent. The net effect can be to prevent the deployment of authentication methods such as EAP-TLS that require large RADIUS packets.
Using a reliable transport method such as TCP means that RADIUS implementations can remove all application-layer retransmissions, and instead rely on the Operating System (OS) kernel's well-tested TCP transport to ensure reliable delivery. In addition, most TCP implementations discover Path MTU better than RADIUS application implementations, resulting in significantly fewer fragmented packets. Modern TCP implementations also implement anti-spoofing provisions, which is more difficult to do in UDP applications.
Transporting RADIUS over TCP means that the RADIUS applications can leverage these additional protections offered by TCP.
However, RADIUS over TCP also has some drawbacks, as noted in [RFC2865] Section 2.4. [RFC3539] Section 2 discusses further issues with using TCP as a transport for Authentication, Authorization, and/or Accounting (AAA) protocols such as RADIUS.
Specifically, as noted in [RFC3539] Section 2.1, for systems originating low numbers of RADIUS request packets, inter-packet spacing is often larger than the packet RTT. In those situations, RADIUS over TCP SHOULD NOT be used.
In general, RADIUS clients generating small amounts of RADIUS traffic SHOULD NOT use TCP. This suggestion will usually apply to most NASes, and to most clients that originate CoA-Request and Disconnect-Request packets.
RADIUS over TCP is most applicable to RADIUS proxies that exchange a large volume of packets with RADIUS clients and servers (10's to 1000's of packets per second). In those situations, RADIUS over TCP may be a good fit, and may result in increased network stability and performance.
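The rule of thumb above, comparing inter-packet spacing with the packet RTT, can be expressed as a small Python sketch. It uses the average spacing derived from a request rate, which is a simplification because real traffic is bursty. The function names and the example rates and RTT are assumptions for illustration, not values taken from [RFC3539].

```python
def inter_packet_spacing(requests_per_second: float) -> float:
    """Average time in seconds between consecutive requests."""
    if requests_per_second <= 0:
        return float("inf")                   # no traffic at all
    return 1.0 / requests_per_second


def tcp_is_poor_fit(requests_per_second: float, rtt_seconds: float) -> bool:
    """True when the average spacing between requests exceeds the RTT."""
    return inter_packet_spacing(requests_per_second) > rtt_seconds


if __name__ == "__main__":
    rtt = 0.050                               # assumed 50 ms round-trip time
    print(tcp_is_poor_fit(0.1, rtt))          # NAS: one request every 10 seconds
    print(tcp_is_poor_fit(100.0, rtt))        # proxy: 100 requests per second
```

With these assumed numbers, the low-volume NAS falls on the "SHOULD NOT use TCP" side of the comparison, while the high-volume proxy does not.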
[BA] Suggested rewrite:
Section 1.1
The intent of this document is to address transport issues related to RADIUS over TLS [RTLS] in inter-server communications scenarios, such as inter-domain communication between proxies. These situations benefit from the confidentiality and ciphersuite negotiation that can be provided by TLS. Since TLS is already widely available within the operating systems used by proxies, implementation barriers are low.
In scenarios where RADIUS proxies exchange a large volume of packets (10+ packets per second), it is likely that there will be sufficient traffic to enable the congestion window to be widened beyond the minimum value on a long-term basis, enabling ACK piggy-backing. Through use of an application-layer watchdog as described in [RFC3539], it is possible to address the objections to reliable transport described in [RFC2865] Section 2.4 without substantial watchdog traffic, since regular traffic is expected in both directions.
In addition, use of RADIUS over TLS/TCP has been found to improve operational performance when used with multi-round trip authentication mechanisms such as EAP over RADIUS [RFC3579]. In such exchanges, it is typical for EAP fragmentation to increase the number of round-trips required. For example, where EAP-TLS authentication [RFC5216] is attempted and both the EAP peer and server utilize certificate chains of 8KB, as many as 15 round-trips can be required if RADIUS packets are restricted to 1500 octets in size.
Fragmentation of RADIUS over UDP packets is generally inadvisable due to lack of fragmentation support within intermediate devices such as filtering routers, firewalls and NATs. However, since RADIUS over UDP implementations typically do not support MTU discovery, fragmentation can occur even when the maximum RADIUS over UDP packet size is restricted to 1500 octets. These problems disappear if a 4096-octet application-layer payload can be used alongside RADIUS over TLS/TCP. Since most TCP implementations support MTU discovery, the TCP MSS is automatically adjusted to account for the MTU, and the larger congestion window supported by TCP may allow multiple TCP segments to be sent within a single window. As a result, RADIUS/EAP traffic required for an EAP-TLS authentication with 8KB certificate chains may be reduced to 7 round-trips or fewer, resulting in substantially reduced authentication times.
In addition, experience indicates that EAP sessions transported over RTLS are less likely to abort unsuccessfully. Historically, RADIUS over UDP implementations have exhibited poor retransmission behavior. Some implementations retransmit packets, others do not, and others send new packets rather than performing retransmission. Some implementations are incapable of detecting EAP retransmissions, and will instead treat the retransmitted packet as an error. As a result, within RADIUS over UDP implementations, retransmissions have a high likelihood of causing an EAP authentication session to fail. For a system with a million logins a day running EAP-TLS mutual authentication with 15 round-trips, and having a packet loss probability of P=0.01%, we expect that 0.3% of connections will experience at least one lost packet. That is, 3,000 user sessions each day will experience authentication failure. This is an unacceptable failure rate for a mass-market network service.
Using a reliable transport method such as TCP means that RADIUS implementations can remove all application-layer retransmissions, and instead rely on the Operating System (OS) kernel's well-tested TCP transport to ensure reliable delivery. In addition, most TCP implementations discover Path MTU better than RADIUS application implementations, resulting in significantly fewer fragmented packets. Modern TCP implementations also implement anti-spoofing provisions, which is more difficult to do in UDP applications.
In contrast, use of TLS/TCP as a transport between a NAS and a RADIUS server is a poor fit. As noted in [RFC3539] Section 2.1, for systems originating low numbers of RADIUS request packets, inter-packet spacing is often larger than the packet RTT, and as a result, the congestion window will typically not remain above the minimum value on a long-term basis. The result is an increase in packets due to ACKs as compared to UDP, without a corresponding set of benefits. In addition, the lack of substantial traffic implies the need for additional watchdog traffic to confirm reachability.
As a result, the objections to reliable transport indicated in [RFC2865] Section 2.4 continue to apply to NAS-RADIUS server communications and UDP SHOULD continue to be used as the transport protocol in this scenario. In addition, implementations of "RADIUS Dynamic Authorization Extensions" [RFC5176] SHOULD continue to utilize UDP transport, since the volume of dynamic authorization traffic is usually expected to be small.
Since "bare" TCP does not provide for confidentiality or enable negotiation of credible ciphersuites, its use is not appropriate for inter-server communications where strong security is required. As a result, the use of "bare" TCP transport (i.e. without TLS) is NOT RECOMMENDED for use in any situation, and there has been little or no operational experience with it.