Here is my review.
Abstract
RFC 2865 defines a Status-Server code for use in RADIUS, but labels it as "Experimental" without further discussion. This document describes a practical use for the Status-Server packet code, which is to let clients query the status of a RADIUS server. These queries, and responses (if any) enable the client to make more informed decisions. The result is a more stable, and more robust RADIUS architecture.
How about something more along the lines of the RFC 5176 abstract?
This document describes a deployed extension to the Remote Authentication Dial In User Service (RADIUS) protocol, enabling clients to query the status of a RADIUS server. This extension utilizes the Status-Server (12) Code, which was reserved for experimental use in RFC 2865.
Section 1
The RADIUS Working Group was formed in 1995 to document the protocol of the same name, and created a number of standards surrounding the protocol. It also defined experimental commands within the protocol, without elaborating further on the potential uses of those commands. One of the commands so defined was Status-Server ([RFC2865] Section 3.).
This document describes how some current implementations are using Status-Server packets as a method for querying the status of a RADIUS server. These queries do not otherwise affect the normal operation of a server, and do not result in any side effects other than perhaps incrementing an internal packet counter.
These queries are not intended to implement the application-layer watchdog messages described in [RFC3539] Section 3.4. That document describes Authentication, Authorization, and Accounting (AAA) protocols that run over reliable transports which handle retransmissions internally. Since RADIUS runs over the User Datagram Protocol (UDP) rather than Transport Control Protocol (TCP), the full watchdog mechanism is not applicable here.
Not sure the history is necessarily correct (e.g. I believe that the RADIUS Working Group was formed earlier). In any case, it is probably best to focus on the purpose of this document. How about this?
This document specifies a deployed extension to the Remote Authentication Dial In User Service (RADIUS) protocol, enabling clients to query the status of a RADIUS server. While the Status-Server Code (12) was defined as experimental in [RFC2865] Section 3, details of the operation and potential uses of the Code were not provided.
As with the core RADIUS protocol, the Status-Server extension is stateless, and queries do not otherwise affect the normal operation of a server, nor do they result in any side effects, other than perhaps incrementing an internal packet counter. Most implementations of this extension have utilized it alongside implementations of RADIUS as defined in [RFC2865], so this document focuses solely on the use of this extension with UDP transport.
Network Access Server (NAS) The device providing access to the network. Also known as the Authenticator (in IEEE 802.1x terminology) or RADIUS client.
"x" -> "X"
Proxy Server A RADIUS server that acts as a Home Server to the NAS, but in turn proxies the request to another Proxy Server, or to a Home Server.
I am not sure that the use of the term "Home Server" here adds clarity. The definition of proxy from RFC 2607 might be more applicable:
RADIUS proxy In order to provide for the routing of RADIUS authentication and accounting requests, a RADIUS proxy can be employed. To the NAS, the RADIUS proxy appears to act as a RADIUS server, and to the RADIUS server, the proxy appears to act as a RADIUS client.
1.1 Applicability
I think this document needs an applicability section, to explain potential differences between this specification and existing implementations, as well as why it is being published as Informational, as opposed to Experimental or Standards Track. Suggest the following:
1.1. Applicability
This protocol is being recommended for publication as an Informational RFC rather than as a standards-track RFC because of problems that cannot be fixed without creating incompatibilities with deployed implementations. This includes security vulnerabilities. While fixes are recommended, they cannot be made mandatory since this would be incompatible with existing implementations.
Existing implementations of this protocol do not support the Message-Authenticator attribute. This enables spoofing of Status-Server packets. In order to remedy this problem, this specification recommends the use of the Message-Authenticator attribute to provide per-packet authentication and integrity protection.
With existing implementations of this protocol, the potential exists for Status-Server requests to be in conflict with Access-Request or Accounting-Request packets using the same Identifier. This specification recommends techniques to avoid this problem.
[Add information on other issues here]
2. Problem Statement
It is often useful to know if a RADIUS server is alive and responding to requests. The most accurate way to obtain this information is to query the server via application protocol traffic, as other methods are either less accurate, or cannot be performed remotely.
The reasons for wanting to know the status of a server are many. The administrator may simply be curious if the server is responding, and may not have access to NAS or traffic data that would give him that information. The queries may also be performed automatically by a NAS or proxy server, which is configured to send packets to a RADIUS server, and where that server may not be responding. That is, while [RFC2865] Section 2.6 indicates that sending Keep-Alives is harmful, it may be useful to send "Are you Alive" queries to a server once it has been marked "dead" due to prior unresponsiveness.
The occasional query to a "dead" server offers little additional load on the network or server, and permits clients to more quickly discover when the server returns to a responsive state. Overall, status queries can be a useful part of the deployment of a RADIUS server.
RFC 2865 Section 2.6 strongly discourages the use of keep-alives. From reading this section, I am unclear whether the intent is to refute the arguments made there, or to articulate how the uses of Status-Server defined here go beyond those of the "test RADIUS request" described in RFC 2865.
For example, unlike a RADIUS Access-Request, the Status-Server packet cannot be forwarded, and therefore the lack of a response can only be due either to packet loss or to a problem with the server to whom the packet is sent. In contrast, an Access-Request might not be answered because of a problem somewhere along the chain between the sender and the RADIUS server. This difference allows the Status-Server packet to be used as a diagnostic tool in ways that an Access-Request could not be.
Overall, I wonder whether some of the introductory material in Section 4.3 might be removed from that section and instead be revised and presented in this section. For example:
A common problem in RADIUS client implementations is the implementation of a robust fail-over mechanism between proxies. A client may have multiple proxies configured, with one proxy marked as primary and another marked as secondary. If the client does not receive a response to a request sent to the primary proxy, it can "fail over" to the secondary, and send requests to the secondary instead of to the primary proxy.
However, it is possible that the lack of a response to requests sent to the primary proxy was due not to a failure within the primary, but to alternative causes such as a failed link along the path to the destination server, or the failure of a downstream proxy or server. In such a situation, it may be useful for the client to be able to distinguish between failure causes. For example, if the primary proxy is down, then quick failover to the secondary proxy would be prudent, whereas if a downstream failure is the cause, then the value of failing over to a secondary proxy will depend on whether packets forwarded by the secondary will utilize independent links, intermediaries or destination servers.
Since the Status-Server packet is non-forwardable, lack of a response may only be due to packet loss or the failure of the server at the destination IP address, not due to faults in downstream links, proxies or servers. It therefore provides an unambiguous indication of the status of a proxy or server.
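To make the distinction above concrete, here is a rough sketch of how a client might use the outcome of direct Status-Server probes to decide between the two failure causes. It is only an illustration under my own assumptions, not something taken from the draft or from any implementation: the names FailureCause and classify_failure, and the threshold of three unanswered probes, are hypothetical.

```python
from enum import Enum


class FailureCause(Enum):
    NEXT_HOP_DOWN = "next hop down"        # quick failover is prudent
    DOWNSTREAM = "downstream problem"      # failover may or may not help
    UNDETERMINED = "undetermined"          # not enough probes yet


def classify_failure(status_replies, status_probes_sent, min_probes=3):
    """Classify why Access-Requests sent to the primary went unanswered.

    The counts refer to Status-Server probes sent directly to the primary.
    Because Status-Server is not forwarded, any reply shows that the
    primary itself is alive. min_probes is a hypothetical threshold.
    """
    if status_replies > status_probes_sent:
        raise ValueError("more replies than probes sent")
    if status_replies > 0:
        # The primary answered directly, so the missing Access-Request
        # responses point at something beyond the primary.
        return FailureCause.DOWNSTREAM
    if status_probes_sent >= min_probes:
        # Several consecutive unanswered probes: packet loss alone is
        # an unlikely explanation.
        return FailureCause.NEXT_HOP_DOWN
    # Too few probes to tell packet loss from server failure.
    return FailureCause.UNDETERMINED
```

A client would then fail over promptly on NEXT_HOP_DOWN, and on DOWNSTREAM would first consider whether the secondary shares the same downstream path.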
Sections 2.1-2.2
I find these sections puzzling, because they suggest alternatives to the Status-Server packet that do not serve the same function. Given that Status-Server packets are not forwardable, they serve a different purpose than the "test RADIUS requests" which RFC 2865 recommends against. Given this, talking about "alternatives" in these sections is somewhat confusing. What might make more sense is to describe the function that Status-Server packets provide and why this is not provided by alternatives such as the RADIUS Access-Request packet.
Section 2.3
Since the packet is otherwise undefined, it does not cause interoperability issues to create implementation-specific definitions for it. The difficulty until now has been defining an interoperable method of performing these queries.
While the Status-Server packet format was not defined in RFC 2865, it was implemented by Ascend and other vendors. As far as I know, a number of deployments did use NAS and RADIUS servers from different vendors so that "implementation-specific definitions" would indeed have resulted in interoperability problems.
In any case, as I understand it, the goal of this work item is to document existing implementations of the Status-Server extension, correct?
Section 2.3.1
Status-Server SHOULD be used instead of Access-Request to query the responsiveness of a server. In this use case, the protocol exchange between client and server is similar to the usual exchange of Access-Request and Access-Accept, as shown below.
   NAS                                 RADIUS server
   ---                                 -------------
   Status-Server/
   Message-Authenticator ->
                                       <- Access-Accept/
                                          Reply-Message
The Status-Server packet MUST contain a Message-Authenticator attribute for security. The response (if any) to a Status-Server packet sent to an authentication port SHOULD be an Access-Accept packet. Other response packet codes are NOT RECOMMENDED. The list of attributes that are permitted in the Access-Accept packet is given in the Table of Attributes in Section 6, below.
Given that the Status-Server packet is not forwardable, this section is a bit confusing. Also, I'm not clear how useful the diagram is. I'd suggest focusing on the basics of the exchange:
Status-Server Exchange
Status-Server packets are typically sent to the destination address and port of a RADIUS server or proxy. A Message-Authenticator attribute SHOULD be included so as to provide per-packet authentication and integrity protection. A single Status-Server packet MUST be included within a UDP datagram. RADIUS proxies MUST NOT forward Status-Server packets.
A RADIUS server or proxy implementing this specification SHOULD respond to a Status-Server packet with an Access-Accept. Other response packet codes (such as Access-Challenge or Access-Reject) are NOT RECOMMENDED. The list of attributes that are permitted in Status-Server packets, and in Access-Accept packets responding to them, is provided in Section 6.
BTW, I'm curious as to whether a Status-Server packet can be sent to the address and port (3799) of a DAS, and if so, what the appropriate response is.
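For reference, the request side of the exchange suggested above might be built roughly as follows. This is a sketch under stated assumptions rather than the draft's definition: it assumes the Message-Authenticator is computed as RFC 3579 describes for Access-Request (HMAC-MD5, keyed with the shared secret, over the whole packet with the attribute value zeroed), and it assumes a random Request Authenticator. The function name build_status_server is hypothetical.

```python
import hashlib
import hmac
import os
import struct

STATUS_SERVER = 12          # packet Code reserved in RFC 2865
MESSAGE_AUTHENTICATOR = 80  # attribute type, RFC 2869 / RFC 3579


def build_status_server(identifier, secret, authenticator=None):
    """Return a Status-Server packet carrying only Message-Authenticator.

    Assumption: the Request Authenticator is 16 random octets and the
    Message-Authenticator is calculated as for an Access-Request.
    """
    if authenticator is None:
        authenticator = os.urandom(16)
    if len(authenticator) != 16:
        raise ValueError("Request Authenticator must be 16 octets")
    length = 20 + 18  # 20-octet header plus one 18-octet attribute
    header = struct.pack("!BBH", STATUS_SERVER, identifier, length)
    header += authenticator
    attr_head = struct.pack("!BB", MESSAGE_AUTHENTICATOR, 18)
    # HMAC-MD5 over the packet with the attribute value set to zeros.
    zeroed = header + attr_head + bytes(16)
    mac = hmac.new(secret, zeroed, hashlib.md5).digest()
    return header + attr_head + mac
```

The resulting 38 octets would be sent as a single UDP datagram to the server's authentication port; a proxy receiving it would answer itself rather than forward it.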
Section 2.3.2
The Status-Server packet MUST contain a Message-Authenticator attribute for security.
Given that implementations exist that did not support Message-Authenticator, my suggestion is that this become a SHOULD.
Section 3
This method MUST be used to avoid conflicts between Status-Server and other packet types.
Given that implementations exist that did not support this, my suggestion is that this become a SHOULD as well.
In addition to the above requirements, all Status-Server packets MUST include a Message-Authenticator attribute. Failure to do so would mean that the packets could be trivially spoofed.
Suggest MUST -> SHOULD here.
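To illustrate what the SHOULD would mean on the receiving side, a server could validate Message-Authenticator whenever it is present and leave it to local policy whether to accept packets that lack it. The sketch below is only my assumption of how that might look, again assuming the RFC 3579 style HMAC-MD5 calculation; check_status_server and the require_mac flag are hypothetical names.

```python
import hashlib
import hmac

STATUS_SERVER = 12
MESSAGE_AUTHENTICATOR = 80


def check_status_server(packet, secret, require_mac=False):
    """Return True if a received Status-Server packet passes local policy.

    A present Message-Authenticator is always verified. If it is absent,
    the packet is accepted only when require_mac is False (hypothetical
    flag modelling SHOULD rather than MUST).
    """
    if len(packet) < 20 or packet[0] != STATUS_SERVER:
        return False
    length = int.from_bytes(packet[2:4], "big")
    if length < 20 or length > len(packet):
        return False
    pos = 20
    while pos + 2 <= length:
        attr_type, attr_len = packet[pos], packet[pos + 1]
        if attr_len < 2 or pos + attr_len > length:
            return False  # malformed attribute
        if attr_type == MESSAGE_AUTHENTICATOR:
            if attr_len != 18:
                return False
            # Recompute the HMAC with the attribute value zeroed.
            zeroed = packet[:pos + 2] + bytes(16) + packet[pos + 18:length]
            expected = hmac.new(secret, zeroed, hashlib.md5).digest()
            return hmac.compare_digest(expected, packet[pos + 2:pos + 18])
        pos += attr_len
    if pos != length:
        return False  # trailing octet that is not a full attribute
    # No Message-Authenticator found: accept only if policy allows it.
    return not require_mac
```

With require_mac left False, packets from older implementations are still accepted, which is also the case in which spoofing remains possible.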
=============================
This is a reminder of an ongoing RADEXT WG last call on the Status Server
specification, prior to sending this document on to the IESG for
publication as an Informational RFC. The document is available for
inspection here: http://tools.ietf.org/html/draft-ietf-radext-status-server
RADEXT WG last call will last until August 7, 2009. Please send comments to
the RADEXT WG mailing list using the format described in the RADEXT
Issues list (http://www.drizzle.com/~aboba/RADEXT/).