Re: [RRG] draft-farinacci-lisp-05
A few comments:
Thanks for your comments.
There is no version field in the LISP header. There should be. At
least a few "set to zero on send, ignore on receive" bits that can
be used for extensions without bumping the version number is also a
good idea.
We decided to use control-plane type codes.
"In order to eliminate the need for a mapping lookup in the reverse
direction, the ETR gleans RLOC information from the LISP header."
I find this undesirable because this way, the behavior of the system
can be different depending on which end initiated the communication.
Trusting information supplied directly by the other end is also
problematic security-wise. I would much prefer it if both ends did
an independent mapping lookup.
We stated this because there was a requirement from big content
providers. I will loosen the language and say "MAY glean".
"LISP Locator Reach Bits: in the LISP header are set by an ITR to
indicate to an ETR the reachability of the Locators in the source
site."
I wonder how useful this is in practice. First of all, having 32
RLOCs is way too many.
We have received feedback that it may not be enough. ;-)
Second, depending on the way xTRs are deployed (see my next
message), it could be quite hard for one xTR to know the status of
any others for a particular EID. But more fundamentally, this
assumes that reachability is a binary property: something is
reachable or it isn't. In reality, it's more complex: a certain
location may be reachable from some parts of the network and not
from others. Even if it is reachable, it may be preferable to use a
different RLOC. As such, I think it makes much more sense to
determine which RLOC is going to be used by a given ITR for a given
EID on a case-by-case basis rather than broadcast some reachability
bits.
As I said at the IETF, for a CE deployment of xTRs, the most common
failure points in the network that affect connectivity to the site
are the CE router going down, the CE-to-PE link going down, or the PE
router going down. Other failures are rerouted in the core based on
richness of connectivity or are damped out due to aggregation.
In these cases, the loc-reach-bits are extremely effective and get
the new status information to the other sites at data-plane rates.
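To make the loc-reach-bits concrete, here is a minimal Python sketch of
how a receiver might read the 32-bit field against the ordered RLOC list
of a mapping. The bit ordering (bit i, counted from the least significant
bit, maps to the i-th RLOC), the function name, and the addresses are
assumptions made for illustration, not something the draft text quoted
here specifies.

```python
def reachable_locators(rlocs, reach_bits):
    """Return the RLOCs whose bit is set in a 32-bit reachability bitmap.

    Assumption: bit i (counting from the least significant bit)
    corresponds to the i-th RLOC in the ordered mapping entry.
    """
    if len(rlocs) > 32:
        raise ValueError("the field covers at most 32 RLOCs")
    return [rloc for i, rloc in enumerate(rlocs) if (reach_bits >> i) & 1]


# Hypothetical site with three locators; the second one (bit 1) is down.
site_rlocs = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]
print(reachable_locators(site_rlocs, 0b101))
```

Because the bitmap rides in every data packet, a change in one bit
reaches remote sites as fast as traffic flows, which is the point made
above.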
There are also potential issues with synchronizing the numbering of
the RLOCs in the mapping system and in this field.
Right, there is no such thing as a free lunch. As mentioned at the
Chicago IETF, inserting new RLOCs doesn't have this problem; that is,
appending is easy to do. Removing from the middle of the RLOC list
gets tricky, but you could use a dummy slot for a certain period of
time.
We are experimenting with this to find the best way to mark existing
cache entries stale.
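The append-versus-remove distinction can be sketched as follows. This is
only an illustration under assumed names (TOMBSTONE, append_rloc,
retire_rloc are hypothetical); it shows why a dummy slot keeps the bit
positions of the remaining RLOCs stable.

```python
TOMBSTONE = None  # hypothetical placeholder for a retired RLOC slot


def append_rloc(rlocs, new_rloc):
    """Appending keeps every existing index, and so every bit position."""
    return rlocs + [new_rloc]


def retire_rloc(rlocs, old_rloc):
    """Replace a removed RLOC with a dummy slot instead of shifting the list."""
    return [TOMBSTONE if r == old_rloc else r for r in rlocs]


rlocs = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]
rlocs = retire_rloc(rlocs, "198.51.100.1")
rlocs = append_rloc(rlocs, "192.0.2.99")
# The third locator keeps index 2, so its reach bit means the same thing
# to sites that still hold the older mapping.
print(rlocs.index("203.0.113.1"))
```

How long the dummy slot has to stay in place is the open question; it
would need to outlive any cached copy of the old list.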
We did consider sending a "free cache" bit in the data plane so sites
that have cached state could time out the state and re-request a new
mapping if they needed it, but as you might guess it would cause a
request implosion to the site.
So if you have any ideas on how to solve this, we would like to hear
them and prototype the idea.
"Record TTL: The time in minutes the recipient of the Map-Reply will
store the mapping. If the TTL is 0, the entry should be removed from
the cache immediately. If the value is 0xffffffff, the recipient can
decide locally how long to store the mapping."
Don't think this is a good idea. Experience with the DNS TTL has
shown that people may be tempted to either set unreasonably low TTLs
or ignore TTLs. Mandated minimum and maximum caching times would
make the protocol more deterministic and remove an opportunity for
people to shoot themselves in the foot.
No one said the implementations would allow a user to set them. ;-)
But, umm, isn't determinism better?
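For reference, the Record TTL rules quoted above, and the bounded
alternative proposed in the comment, can be sketched in Python. The
local default and the minimum/maximum bounds are made-up values for
illustration; neither the draft text nor this thread gives numbers.

```python
TTL_LOCAL_POLICY = 0xFFFFFFFF


def cache_lifetime_seconds(record_ttl, local_default_minutes=60):
    """Translate a Record TTL (in minutes) into a cache lifetime in seconds.

    Returns 0 when the entry should be removed immediately.
    local_default_minutes is a hypothetical local policy value.
    """
    if record_ttl == 0:
        return 0
    if record_ttl == TTL_LOCAL_POLICY:
        return local_default_minutes * 60
    return record_ttl * 60


def clamped_lifetime_seconds(record_ttl, min_minutes=5, max_minutes=1440):
    """Variant with mandated bounds, as the comment proposes (bounds invented)."""
    return max(min_minutes, min(record_ttl, max_minutes)) * 60
```

With the clamped variant a TTL of 0 no longer means "remove now", which
is the trade-off between determinism and the special cases in the draft.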
" Priority: each RLOC is assigned a priority. Lower values are more
preferable. When multiple RLOCs have the same priority, they are
used in a load-split fashion. A value of 255 means the RLOC MUST NOT
be used."
Isn't it more natural to use 0 as the special case value?
We chose this to be consistent with other priority values, namely
administrative distance for routing protocols.
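A minimal sketch of the selection rule in the quoted text: lower
priority wins, 255 is never used, and equal priorities share the load.
The Rloc tuple, the function names, and the per-flow hash used for the
load split are assumptions; the quoted text only says "load-split
fashion" without saying how.

```python
from collections import namedtuple

Rloc = namedtuple("Rloc", ["address", "priority"])

UNUSABLE = 255  # a priority of 255 means the RLOC MUST NOT be used


def candidate_rlocs(rlocs):
    """Return the usable RLOCs that share the lowest (most preferred) priority."""
    usable = [r for r in rlocs if r.priority != UNUSABLE]
    if not usable:
        return []
    best = min(r.priority for r in usable)
    return [r for r in usable if r.priority == best]


def pick_rloc(rlocs, flow_hash):
    """Load-split across equal-priority RLOCs by a per-flow hash (assumed)."""
    candidates = candidate_rlocs(rlocs)
    if not candidates:
        return None
    return candidates[flow_hash % len(candidates)]


mapping = [Rloc("192.0.2.1", 1), Rloc("198.51.100.1", 1), Rloc("203.0.113.1", 255)]
print(pick_rloc(mapping, flow_hash=7).address)
```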
"Note that the destination RLOC address MAY be an anycast address if
the tunnel egress point may be via more than one physical device."
This makes me rather uncomfortable, as anycasting could get in the
way of reachability testing. We have room for a number of RLOCs, why
not put in RLOCs for all physical ETRs rather than anycast them?
Well, it provides another level of indirection at really no cost. The
xTRs at a site know if they are part of an anycast group and should
know not to clear a loc-reach-bit for another xTR that has gone down
in its anycast group.
In fact, it's simpler than that. Each xTR sets the bit and it remains
set until they all go down. When they all go down, there is no access
to the site, so there is no way to tell others about it. However, if
you have a mix of RLOCs that are anycast and unicast, then the above
paragraph needs to be implemented.
Is the "Loc-AFI" field the AFI for an RLOC? Why is this field only 8
bits while other AFI fields are 16 bits?
You found a bug, I will fix it. Thanks. Since I don't have an extra
bit for the R-bit, how about if we do this:
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /|    Priority   |    Weight     |R|           Loc-AFI           |
Loc +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \|                            Locator                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
But since we won't use all the AFIs that are encoded, a byte should be
sufficient. Please comment if you think we should stick with 16 bits
or compress all occurrences of AFI to 8 bits.
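To show what the proposed layout would look like on the wire, here is a
small Python sketch that packs and unpacks one locator entry. The field
widths (Priority 8 bits, Weight 8 bits, R 1 bit, Loc-AFI 15 bits) are a
reading of the diagram above, the function names are hypothetical, and
only an IPv4 locator is handled.

```python
import ipaddress
import struct

AFI_IPV4 = 1  # IANA address family number for IPv4


def pack_locator(priority, weight, reachable, afi, locator):
    """Pack Priority (8) | Weight (8) | R (1) | Loc-AFI (15) | IPv4 Locator (32)."""
    if not 0 <= priority <= 255 or not 0 <= weight <= 255:
        raise ValueError("priority and weight must fit in 8 bits")
    if not 0 <= afi < (1 << 15):
        raise ValueError("Loc-AFI must fit in 15 bits once the R bit is carved out")
    # The R bit takes the most significant bit of the 16-bit field.
    r_and_afi = (int(bool(reachable)) << 15) | afi
    addr = ipaddress.IPv4Address(locator).packed
    return struct.pack("!BBH", priority, weight, r_and_afi) + addr


def unpack_locator(data):
    """Inverse of pack_locator for an 8-byte IPv4 entry."""
    priority, weight, r_and_afi = struct.unpack("!BBH", data[:4])
    locator = str(ipaddress.IPv4Address(data[4:8]))
    return priority, weight, bool(r_and_afi >> 15), r_and_afi & 0x7FFF, locator


entry = pack_locator(1, 100, True, AFI_IPV4, "192.0.2.1")
print(entry.hex())
print(unpack_locator(entry))
```

Stealing one bit from a 16-bit AFI leaves room for 32768 address family
values, so the R bit fits without shrinking the field to a byte.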
"2. Locator unreachability is determined by an ITR by receiving ICMP
Network or Host Unreachable messages."
These can be spoofed or filtered/not generated, so neither their
presence nor their absence necessarily means anything. I suggest the
only action triggered by such a message should be a one-time
reduction in the time until a new mapping request is done.
(I.e., if this happens every 300 seconds, reduce by 240 seconds for
the first ICMP message but not subsequent ones, which means an
immediate request 80% of the time but a maximum of 1 request per 60
seconds.)
We enumerated the methods, we didn't say you should use all of them.
We recommend using only the loc-reach-bits.
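The arithmetic in the suggested one-time reduction can be checked with a
short Python sketch. The 300 and 240 second values come from the example
in the comment; the function name and the assumption that the ICMP
message arrives at a uniformly random point in the interval are
illustrative.

```python
REFRESH_INTERVAL = 300  # seconds between mapping requests (from the example)
REDUCTION = 240         # one-time cut applied on the first ICMP message


def time_until_request(elapsed, icmp_already_seen):
    """Seconds left until the next mapping request after an ICMP unreachable.

    elapsed is the number of seconds since the last mapping request.
    """
    remaining = REFRESH_INTERVAL - elapsed
    if icmp_already_seen:
        return remaining  # later ICMP messages change nothing
    return max(0, remaining - REDUCTION)


# Count the arrival times (whole seconds) that trigger an immediate request.
immediate = sum(1 for t in range(REFRESH_INTERVAL)
                if time_until_request(t, False) == 0)
print(immediate / REFRESH_INTERVAL)   # share of immediate requests
print(time_until_request(0, False))   # worst case: wait after a fresh request
```

An ICMP message arriving in the first 60 seconds only shortens the wait,
which is where the floor of one request per 60 seconds comes from.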
"3. ETR unreachability is determined when a host sends an ICMP Port
Unreachable message."
Which host? The one holding the EID? Or the ETR? The former won't
happen if the ETR is unreachable, and an ETR being connected to the
network but not being able to function as an ETR seems like a corner
case to me.
Well if an ITR didn't have any data to know if the destination site
was LISP-capable and encapsulated the packet by copying the inner DA
to the outer DA, and the site was using PA addressing, the packet
would enter the destination site and travel to the host. The host
would not recognize a destination port of 4341, so it would respond
with a port unreachable.
But we would not do this anymore with the advent of the
lisp-interworking draft.
So basically the use of mapping requests/replies is the only
reliable (implicit) reachability detection mechanism. However, it is
largely unspecified how ITRs should use this mechanism to determine
reachability.
That and receiving data packets from the site.
"perform a route-returnability check"
Return routability check?
A simple anti-spoofing technique.
"the use of a 6-byte Nonce field in the LISP encapsulation header"
The nonce field is only 32 bits in the LISP header earlier in the
document.
Will fix. Nice find.
" In practice, this is not really a problem. Hosts typically do not
originate IP packets larger than 1500 bytes. And second, an informal
survey of ISPs has been taken where nearly all ISP link MTUs are
either 4470 bytes or support Ethernet jumbo frames of 9180 bytes.
Therefore, we don't anticipate any problems with prepending
additional headers."
Even if we assume that all the links within a transit AS support
sufficiently large MTUs, this doesn't address the situation where
ISPs interconnect over a shared ethernet infrastructure (i.e., an
internet exchange). Also, the placement of ITRs/ETRs is left fairly
open, with the suggestion that ETRs can be placed in end-user
networks. In that case, it's very likely that there is at least one
hop in the path that is limited to a 1500-byte MTU. I think it would
be helpful to explicitly signal the MTU and possibly the maximum
size of packets that can be fragmented that an ETR supports back to
ITRs.
I guess we have had enough discussion on this. ;-)
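For readers following the MTU argument, a rough Python sketch of the
size check is below. The 20-byte outer IPv4 header (no options) and the
8-byte UDP header are standard sizes; the 8-byte LISP header length is
an assumption for illustration, since the header size for this draft
revision is not given in this thread.

```python
OUTER_IPV4_HEADER = 20  # bytes, IPv4 header without options
UDP_HEADER = 8          # bytes
LISP_HEADER = 8         # bytes, assumed value for illustration only

ENCAP_OVERHEAD = OUTER_IPV4_HEADER + UDP_HEADER + LISP_HEADER


def fits_without_fragmentation(inner_len, link_mtu, overhead=ENCAP_OVERHEAD):
    """True if the encapsulated packet fits in a single frame on this link."""
    return inner_len + overhead <= link_mtu


def max_inner_packet(link_mtu, overhead=ENCAP_OVERHEAD):
    """Largest inner packet that can cross the link after encapsulation."""
    return link_mtu - overhead


print(fits_without_fragmentation(1500, 4470))  # core link from the survey
print(fits_without_fragmentation(1500, 1500))  # exchange or end-user hop
print(max_inner_packet(1500))
```

Under these assumed sizes, a full-sized 1500-byte host packet crosses
the 4470-byte links untouched but not a 1500-byte hop, which is the case
the comment raises.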
In my opinion, it would be beneficial to remove pretty much all of
the text that doesn't pertain to actual LISP operation from this
draft, and move that which is still useful (some of it is a bit
stale after five iterations) to a new "LISP architecture" document.
Can you list what in this document doesn't pertain to LISP operation?
Thanks,
Dino