
Re: [RRG] draft-farinacci-lisp-05



A few comments:

Thanks for your comments.

There is no version field in the LISP header. There should be. Having at least a few "set to zero on send, ignore on receive" bits that can be used for extensions without bumping the version number is also a good idea.

We decided to use control-plane type codes.

"In order to eliminate the need for a mapping lookup in the reverse direction, the ETR gleans RLOC information from the LISP header."

I find this undesirable because this way, the behavior of the system can be different depending on which end initiated the communication. Trusting information supplied directly by the other end is also problematic security-wise. I would much prefer it if both ends did an independent mapping lookup.

We stated this because there was a requirement from big content providers. I will loosen the language and say "MAY glean".

"LISP Locator Reach Bits: in the LISP header are set by an ITR to indicate to an ETR the reachability of the Locators in the source site."

I wonder how useful this is in practice. First of all, having 32 RLOCs is way too many.

We have received feedback that it may not be enough.  ;-)

Second, depending on the way xTRs are deployed (see my next message), it could be quite hard for one xTR to know the status of any others for a particular EID. But more fundamentally, this assumes that reachability is a binary property: something is reachable or it isn't. In reality, it's more complex: a certain location may be reachable from some parts of the network and not from others. Even if it is reachable, it may be preferable to use a different RLOC. As such, I think it makes much more sense to determine which RLOC is going to be used by a given ITR for a given EID on a case-by-case basis rather than broadcast some reachability bits.

As I said at the IETF, for a CE deployment of xTRs, the most common failure points in the network that affect connectivity to the site are the CE router going down, the CE-to-PE link going down, or the PE router going down. Other failures are rerouted in the core based on richness of connectivity or are damped out due to aggregation.

In these cases, the loc-reach-bits are extremely effective and get the new status information to the other sites at data-plane rates.
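
A minimal sketch of how a 32-bit loc-reach-bits field could be packed by an ITR and read by an ETR. The function names and the bit ordering (bit 0 for the first RLOC in the mapping) are assumptions made for illustration; the draft text quoted here only says that the bits indicate the reachability of the Locators in the source site.

```python
MAX_RLOCS = 32  # the field discussed above has room for 32 RLOCs


def pack_reach_bits(reachable):
    """Pack a list of booleans (index = RLOC position) into one integer.

    Assumption: bit 0 corresponds to the first RLOC in the mapping.
    """
    if len(reachable) > MAX_RLOCS:
        raise ValueError("at most 32 RLOCs fit in the field")
    bits = 0
    for index, is_up in enumerate(reachable):
        if is_up:
            bits |= 1 << index
    return bits


def reachable_rlocs(bits, rloc_list):
    """Return the RLOCs whose bit is set in the received field."""
    return [rloc for index, rloc in enumerate(rloc_list) if bits & (1 << index)]
```

For example, a site with three RLOCs whose second CE router is down would send `pack_reach_bits([True, False, True])`, and the receiver would keep only the first and third RLOC as usable.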

There are also potential issues with synchronizing the numbering of the RLOCs in the mapping system and in this field.

Right, there is no such thing as a free lunch. As mentioned at the Chicago IETF, inserting new RLOCs doesn't have this problem. That is, appending is easy to do. Removing from the middle of the RLOC list gets tricky, but you could use a dummy slot for a certain period of time. We are experimenting with this to see the best way to stale existing cache entries.

We did consider sending a "free cache" bit in the data plane so sites that have cached state could time out the state and re-request a new mapping if they needed it, but as you might guess it would cause a request implosion to the site.

So if you have any ideas on how to solve this, we would like to hear them and prototype the idea.
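
A minimal sketch of the append-and-dummy-slot idea, assuming the RLOC list is kept as an ordered list whose positions match the bit positions in the reach-bits field. The `DUMMY` marker and the helper names are hypothetical and only illustrate why appending is safe while removal from the middle needs a placeholder.

```python
DUMMY = None  # hypothetical placeholder that keeps a removed RLOC's slot occupied


def append_rloc(rloc_list, rloc):
    """Add a new RLOC at the end; existing positions are unchanged."""
    rloc_list.append(rloc)
    return len(rloc_list) - 1


def remove_rloc(rloc_list, rloc):
    """Replace the RLOC with a dummy slot so later positions do not shift."""
    index = rloc_list.index(rloc)
    rloc_list[index] = DUMMY
    return index
```

With this layout, a remote cache that still numbers the RLOCs by the old list keeps interpreting the bits for the surviving RLOCs correctly until the dummy slot is eventually retired.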

"Record TTL: The time in minutes the recipient of the Map-Reply will store the mapping. If the TTL is 0, the entry should be removed from the cache immediately. If the value is 0xffffffff, the recipient can decide locally how long to store the mapping."

Don't think this is a good idea. Experience with the DNS TTL has shown that people may be tempted either to set unreasonably low TTLs or to ignore TTLs. Mandated minimum and maximum caching times would make the protocol more deterministic and remove an opportunity for people to shoot themselves in the foot.

No one said the implementations would allow a user to set them.  ;-)

But, umm, isn't determinism better?
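
A minimal sketch of the Record TTL rules quoted above, with the suggested mandated bounds as an optional step. The bound values, the local default, and the function name are assumptions chosen for illustration; neither the draft text nor this thread specifies them.

```python
LOCAL_DECISION = 0xFFFFFFFF   # per the quoted text: recipient decides locally
LOCAL_DEFAULT_MINUTES = 60    # assumption: what "decide locally" resolves to here
MIN_MINUTES = 1               # hypothetical mandated minimum caching time
MAX_MINUTES = 1440            # hypothetical mandated maximum caching time


def cache_lifetime_minutes(record_ttl, clamp=False):
    """Return how long to cache a mapping, in minutes (0 = remove immediately)."""
    if record_ttl == 0:
        return 0
    if record_ttl == LOCAL_DECISION:
        return LOCAL_DEFAULT_MINUTES
    if clamp:
        # The reviewer's suggestion: keep the lifetime inside fixed bounds.
        return max(MIN_MINUTES, min(record_ttl, MAX_MINUTES))
    return record_ttl
```

With `clamp=True`, a sender cannot push the cache lifetime outside the fixed range, which is the determinism argued for above.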

" Priority: each RLOC is assigned a priority. Lower values are more preferable. When multiple RLOCs have the same priority, they are used in a load-split fashion. A value of 255 means the RLOC MUST NOT be used."


Isn't it more natural to use 0 as the special case value?

We chose this way to be consistent with other priority values, namely administrative distance for routing protocols.
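
A minimal sketch of the priority rule quoted above: lower values win, equal priorities share the load, and 255 is never used. Splitting the load in proportion to a per-RLOC weight is an assumption here (the quoted text only says "load-split"), and the tuple layout and function names are hypothetical.

```python
import random

UNUSABLE = 255  # per the quoted text: this priority MUST NOT be used


def best_candidates(rlocs):
    """rlocs: list of (address, priority, weight) tuples.

    Returns the usable entries that share the lowest priority value.
    """
    usable = [entry for entry in rlocs if entry[1] != UNUSABLE]
    if not usable:
        return []
    best = min(priority for _, priority, _ in usable)
    return [entry for entry in usable if entry[1] == best]


def pick_rloc(rlocs, rng=random):
    """Pick one RLOC address, or None if every entry is unusable."""
    candidates = best_candidates(rlocs)
    if not candidates:
        return None
    weights = [weight for _, _, weight in candidates]
    if sum(weights) == 0:
        # Assumption: all-zero weights fall back to an even split.
        return rng.choice(candidates)[0]
    return rng.choices(candidates, weights=weights, k=1)[0][0]
```

Because 255 is simply the largest value, the "never use" case sits at the far end of the same ordering rather than needing a special case at 0.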

"Note that the destination RLOC address MAY be an anycast address if the tunnel egress point may be via more than one physical device."

This makes me rather uncomfortable, as anycasting could get in the way of reachability testing. We have room for a number of RLOCs, why not put in RLOCs for all physical ETRs rather than anycast them?

Well, it provides another level of indirection at really no cost. The xTRs at a site know if they are part of an anycast group and should know not to clear a loc-reach-bit for another xTR that has gone down in its anycast group.

In fact, it's simpler than that. Each xTR sets the bit and it remains set until they all go down. When they all go down, there is no access to the site so there is no way to tell others about it. However, if you have a mix of RLOCs that are anycast and unicast, then the above paragraph needs to be implemented.

Is the "Loc-AFI" field the AFI for an RLOC? Why is this field only 8 bits while other AFI fields are 16 bits?

You found a bug, I will fix it. Thanks. Since I don't have an extra bit for the R-bit, how about if we do this:

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/| Priority | Weight |R| Loc-AFI | Loc
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\| Locator |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

But since we won't use all the AFIs that are encoded, a byte should be sufficient. Please comment if you think we should stick with 16 bits or compress all occurrences of AFI to 8 bits.

"2. Locator unreachability is determined by an ITR by receiving ICMP Network or Host Unreachable messages."

These can be spoofed or filtered/not generated, so neither their presence nor their absence necessarily means anything. I suggest the only action triggered by such a message should be a one-time reduction in the time until a new mapping request is done. (I.e., if this happens every 300 seconds, reduce by 240 seconds for the first ICMP message but not subsequent ones, which means an immediate request 80% of the time but a maximum of 1 request per 60 seconds.)

We enumerated the methods, we didn't say you should use all of them. We recommend using only the loc-reach-bits.
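
A minimal sketch of the one-time timer reduction suggested in the comment above, using its example figures of 300 and 240 seconds. The class and method names are hypothetical, and this illustrates the reviewer's proposal only; it is not something the draft specifies.

```python
REFRESH_INTERVAL = 300  # seconds between mapping requests (example figure above)
ICMP_REDUCTION = 240    # one-time reduction on the first ICMP unreachable


class MappingTimer:
    """Tracks when the next mapping request is due, in seconds since the last one."""

    def __init__(self):
        self.deadline = REFRESH_INTERVAL
        self.reduced = False

    def on_icmp_unreachable(self, now):
        """Apply the reduction once; later ICMP messages are ignored."""
        if not self.reduced:
            self.reduced = True
            # The deadline moves 240 s earlier but never before "now".
            self.deadline = max(now, self.deadline - ICMP_REDUCTION)
        return self.deadline
```

An ICMP message arriving 60 seconds or more after the last request makes the next request due immediately, which covers 240 of the 300 seconds (80%). One arriving earlier only pulls the deadline in to the 60-second mark, so a spoofer cannot force more than one request per 60 seconds.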

"3. ETR unreachability is determined when a host sends an ICMP Port Unreachable message."

Which host? The one holding the EID? Or the ETR? The former won't happen if the ETR is unreachable, and an ETR being connected to the network but not being able to function as an ETR seems like a corner case to me.

Well if an ITR didn't have any data to know if the destination site was LISP-capable and encapsulated the packet by copying the inner DA to the outer DA, and the site was using PA addressing, the packet would enter the destination site and travel to the host. The host would not recognize a destination port of 4341, so it would respond with a port unreachable.

But we would not do this anymore with the advent of the lisp-interworking draft.

So basically the use of mapping requests/replies is the only reliable (implicit) reachability detection mechanism. However, it is largely unspecified how ITRs should use this mechanism to determine reachability.

That and receiving data packets from the site.

"perform a route-returnability check"

Return routability check?

A simple anti-spoofing technique.

"the use of a 6-byte Nonce field in the LISP encapsulation header"


The nonce field is only 32 bits in the LISP header earlier in the document.

Will fix. Nice find.

" In practice, this is not really a problem. Hosts typically do not originate IP packets larger than 1500 bytes. And second, an informal survey of ISPs has been taken where nearly all ISP link MTUs are either 4470 bytes or support Ethernet jumbo frames of 9180 bytes. Therefore, we don't anticipate any problems with prepending additional headers."

Even if we assume that all the links within a transit AS support sufficiently large MTUs, this doesn't address the situation where ISPs interconnect over a shared ethernet infrastructure (i.e., an internet exchange). Also, the placement of ITRs/ETRs is left fairly open, with the suggestion that ETRs can be placed in end-user networks. In that case, it's very likely that there is at least one hop in the path that is limited to a 1500-byte MTU. I think it would be helpful to explicitly signal the MTU and possibly the maximum size of packets that can be fragmented that an ETR supports back to ITRs.

I guess we have had enough discussion on this.  ;-)

In my opinion, it would be beneficial to remove pretty much all of the text that doesn't pertain to actual LISP operation from this draft, and move that which is still useful (some of it is a bit stale after five iterations) to a new "LISP architecture" document.

Can you list what in this document doesn't pertain to LISP operation?

Thanks,
Dino

--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg