Re: [RRG] draft-farinacci-lisp-05

To: Iljitsch van Beijnum <iljitsch@muada.com>
Subject: Re: [RRG] draft-farinacci-lisp-05
From: Dino Farinacci <dino@cisco.com>
Date: Tue, 18 Dec 2007 21:20:05 -0800
Cc: Routing Research Group list <rrg@psg.com>
In-reply-to: <5A0C2670-8696-41CC-8E72-2AE623BB8371@muada.com>
References: <5A0C2670-8696-41CC-8E72-2AE623BB8371@muada.com>

A few comments:

Thanks for your comments.

There is no version field in the LISP header. There should be. At least a few "set to zero on send, ignore on receive" bits that can be used for extensions without bumping the version number would also be a good idea.

We decided to use control-plane type codes.

"In order to eliminate the need for a mapping lookup in the reverse direction, the ETR gleans RLOC information from the LISP header."
I find this undesirable because this way, the behavior of the system can be different depending on which end initiated the communication. Trusting information supplied directly by the other end is also problematic security-wise. I would much prefer it if both ends did an independent mapping lookup.

We stated this because there was a requirement from big content providers. I will loosen the language and say "MAY glean".

"LISP Locator Reach Bits: in the LISP header are set by an ITR to indicate to an ETR the reachability of the Locators in the source site."
I wonder how useful this is in practice. First of all, having 32 RLOCs is way too many.

We have received feedback that it may not be enough.  ;-)

Second, depending on the way xTRs are deployed (see my next message), it could be quite hard for one xTR to know the status of any others for a particular EID. But more fundamentally, this assumes that reachability is a binary property: something is reachable or it isn't. In reality, it's more complex: a certain location may be reachable from some parts of the network and not from others. Even if it is reachable, it may be preferable to use a different RLOC. As such, I think it makes much more sense to determine which RLOC is going to be used by a given ITR for a given EID on a case-by-case basis rather than broadcast some reachability bits.

As I said at the IETF, for a CE deployment of xTRs, the most common failure points in the network that affect connectivity to the site are the CE router going down, the CE-to-PE link going down, or the PE router going down. Other failures are rerouted in the core based on richness of connectivity or are damped out due to aggregation.

In these cases, the loc-reach-bits are extremely effective and get the new status information to the other sites at data-plane rates.

There are also potential issues with synchronizing the numbering of the RLOCs in the mapping system and in this field.

Right, there is no such thing as a free lunch. As mentioned at the Chicago IETF, inserting new RLOCs doesn't have this problem. That is, appending is easy to do. Removing from the middle of the RLOC list gets tricky, but you could use a dummy slot for a certain period of time. We are experimenting with this to see the best way to stale existing cache entries.
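
A minimal sketch of the numbering problem and the dummy-slot idea, assuming that bit i of the 32-bit loc-reach-bits field refers to the i-th RLOC in the cached mapping (the bit order is an assumption here, as are the function names and the documentation addresses). A removed RLOC is replaced by a placeholder so the indices of later RLOCs do not shift.

```python
from typing import List, Optional

MAX_RLOCS = 32  # one loc-reach-bit per RLOC, as discussed in this thread


def reachable_rlocs(rloc_list: List[Optional[str]], reach_bits: int) -> List[str]:
    """Return the RLOCs whose loc-reach-bit is set.

    Assumption: bit i (counting from the least significant bit) maps to
    rloc_list[i]. A None entry is a dummy slot and is never returned.
    """
    if len(rloc_list) > MAX_RLOCS:
        raise ValueError("more RLOCs than loc-reach-bits")
    return [
        rloc
        for i, rloc in enumerate(rloc_list)
        if rloc is not None and (reach_bits >> i) & 1
    ]


def append_rloc(rloc_list: List[Optional[str]], rloc: str) -> List[Optional[str]]:
    """Appending is the easy case: existing bit positions keep their meaning."""
    return rloc_list + [rloc]


def remove_rloc(rloc_list: List[Optional[str]], rloc: str) -> List[Optional[str]]:
    """Replace the removed RLOC with a dummy slot instead of deleting it."""
    return [None if r == rloc else r for r in rloc_list]


rlocs = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]
rlocs = remove_rloc(rlocs, "198.51.100.1")
# Bit 2 still refers to 203.0.113.1 even though the middle RLOC is gone.
print(reachable_rlocs(rlocs, 0b111))
```

A site with a stale cache still interprets each bit against the same position, which is the property the dummy slot is meant to preserve until old entries expire.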

We did consider sending a "free cache" bit in the data plane so sites that have cached state could time out the state and re-request a new mapping if they needed it, but as you might guess it would cause a request implosion to the site.

So if you have any ideas on how to solve this, we would like to hear them and prototype the idea.

"Record TTL: The time in minutes the recipient of the Map-Reply will store the mapping. If the TTL is 0, the entry should be removed from the cache immediately. If the value is 0xffffffff, the recipient can decide locally how long to store the mapping."
Don't think this is a good idea. Experience with the DNS TTL has shown that people may be tempted to either set unreasonably low TTLs or ignore TTLs. Mandated minimum and maximum caching times would make the protocol more deterministic and remove an opportunity for people to shoot themselves in the foot.

No one said the implementations would allow a user to set them.  ;-)
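
For reference, the Record TTL rules quoted above can be sketched as follows. This is an illustration only: the function names, the local default, and the clamp bounds are hypothetical, and the clamp merely shows what mandated minimum and maximum caching times could look like.

```python
LOCAL_POLICY_TTL = 0xFFFFFFFF  # "recipient can decide locally"


def cache_lifetime_seconds(record_ttl_minutes: int,
                           local_default_minutes: int = 60) -> int:
    """Translate a Record TTL (in minutes) into a cache lifetime in seconds.

    local_default_minutes is a hypothetical local policy value.
    """
    if record_ttl_minutes == 0:
        return 0  # remove the entry from the cache immediately
    if record_ttl_minutes == LOCAL_POLICY_TTL:
        return local_default_minutes * 60
    return record_ttl_minutes * 60


def clamp_ttl_minutes(record_ttl_minutes: int,
                      min_minutes: int, max_minutes: int) -> int:
    """Force a TTL into [min_minutes, max_minutes]; the bounds are made up."""
    return max(min_minutes, min(record_ttl_minutes, max_minutes))


print(cache_lifetime_seconds(5))
print(clamp_ttl_minutes(1, min_minutes=5, max_minutes=1440))
```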

But, umm, isn't determinism better?

"Priority: each RLOC is assigned a priority. Lower values are more preferable. When multiple RLOCs have the same priority, they are used in a load-split fashion. A value of 255 means the RLOC MUST NOT be used."

Isn't it more natural to use 0 as the special case value?
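
A rough sketch of the quoted priority rule: RLOCs with priority 255 are excluded, the lowest remaining priority wins, and traffic is split across RLOCs that share that priority. Splitting by a hash of a flow key is an assumption made for this example, and the names are hypothetical; the draft text quoted here only says "load-split".

```python
import zlib
from typing import List, Optional, Tuple

UNUSABLE_PRIORITY = 255  # "A value of 255 means the RLOC MUST NOT be used."


def best_rlocs(rlocs: List[Tuple[str, int]]) -> List[str]:
    """Return the usable RLOCs that share the lowest (most preferred) priority."""
    usable = [(addr, prio) for addr, prio in rlocs if prio != UNUSABLE_PRIORITY]
    if not usable:
        return []
    best = min(prio for _, prio in usable)
    return sorted(addr for addr, prio in usable if prio == best)


def pick_rloc(rlocs: List[Tuple[str, int]], flow_key: bytes) -> Optional[str]:
    """Pick one RLOC for a flow, splitting load across equal-priority RLOCs."""
    candidates = best_rlocs(rlocs)
    if not candidates:
        return None
    # Assumption: a per-flow hash keeps packets of one flow on one RLOC.
    return candidates[zlib.crc32(flow_key) % len(candidates)]


mapping = [("192.0.2.1", 1), ("198.51.100.1", 1),
           ("203.0.113.1", 2), ("203.0.113.9", 255)]
print(best_rlocs(mapping))
print(pick_rloc(mapping, b"10.0.0.1->10.0.0.2"))
```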

We chose this way to be consistent with other priority values, namely administrative distance for routing protocols.

"Note that the destination RLOC address MAY be an anycast address if the tunnel egress point may be via more than one physical device."
This makes me rather uncomfortable, as anycasting could get in the way of reachability testing. We have room for a number of RLOCs, why not put in RLOCs for all physical ETRs rather than anycast them?

Well, it provides another level of indirection at really no cost. The xTRs at a site know if they are part of an anycast group and should know not to clear a loc-reach-bit for another xTR that has gone down in its anycast group.

In fact, it's simpler than that. Each xTR sets the bit and it remains set until they all go down. When they all go down, there is no access to the site so there is no way to tell others about it. However, if you have a mix of RLOCs that are anycast and unicast, then the above paragraph needs to be implemented.

Is the "Loc-AFI" field the AFI for an RLOC? Why is this field only 8 bits while other AFI fields are 16 bits?

You found a bug, I will fix it. Thanks. Since I don't have an extra bit for the R-bit, how about if we do this:

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 /  |    Priority   |    Weight     |R|           Loc-AFI           |
Loc +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 \  |                            Locator                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

But since we won't use all the AFIs that are encoded, a byte should be sufficient. Please comment if you think we should stick with 16 bits or compress all occurrences of AFI to 8 bits.

"2. Locator unreachability is determined by an ITR by receiving ICMP Network or Host Unreachable messages."
These can be spoofed or filtered/not generated, so both their presence and their absence don't necessarily mean anything. I suggest the only action triggered by such a message should be a one-time reduction in the time until a new mapping request is done. (I.e., if this happens every 300 seconds, reduce by 240 seconds for the first ICMP message but not subsequent ones, which means an immediate request 80% of the time but a maximum of 1 request per 60 seconds.)
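
The arithmetic of this suggestion can be sketched as a small timer, using the 300-second and 240-second figures from the example above. The class and method names are hypothetical. If the first ICMP message arrives 60 seconds or more after the last request (240 of every 300 seconds, i.e. 80% of the time), the reduced deadline has already passed and the request goes out immediately; otherwise it is deferred to the 60-second mark, which caps the rate at one request per 60 seconds.

```python
REFRESH_INTERVAL = 300  # seconds between routine mapping requests (from the example)
ICMP_REDUCTION = 240    # one-time reduction applied on the first ICMP message


class MappingTimer:
    """Tracks when the next mapping request is due for one cache entry."""

    def __init__(self, now: float) -> None:
        self.next_request = now + REFRESH_INTERVAL
        self.reduced = False

    def on_icmp_unreachable(self, now: float) -> float:
        """Apply the reduction once; later ICMP messages change nothing."""
        if not self.reduced:
            self.reduced = True
            self.next_request = max(now, self.next_request - ICMP_REDUCTION)
        return self.next_request

    def on_mapping_refreshed(self, now: float) -> None:
        """A completed request restarts the interval and re-arms the reduction."""
        self.next_request = now + REFRESH_INTERVAL
        self.reduced = False


late = MappingTimer(now=0)
print(late.on_icmp_unreachable(now=100))   # due immediately, at t=100

early = MappingTimer(now=0)
print(early.on_icmp_unreachable(now=10))   # deferred until t=60
```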

We enumerated the methods, we didn't say you should use all of them. We recommend using only the loc-reach-bits.

"3. ETR unreachability is determined when a host sends an ICMP Port Unreachable message."
Which host? The one holding the EID? Or the ETR? The former won't happen if the ETR is unreachable, and an ETR being connected to the network but not being able to function as an ETR seems like a corner case to me.

Well, if an ITR didn't have any data to know if the destination site was LISP-capable and encapsulated the packet by copying the inner DA to the outer DA, and the site was using PA addressing, the packet would enter the destination site and travel to the host. The host would not recognize a destination port of 4341, so it would respond with a port unreachable.

But we would not do this anymore with the advent of the lisp-interworking draft.

So basically the use of mapping requests/replies is the only reliable (implicit) reachability detection mechanism. However, it is largely unspecified how ITRs should use this mechanism to determine reachability.

That and receiving data packets from the site.

"perform a route-returnability check"

Return routability check?

A simple anti-spoofing technique.

"the use of a 6-byte Nonce field in the LISP encapsulation header"

The nonce field is only 32 bits in the LISP header earlier in the document.

Will fix. Nice find.

"In practice, this is not really a problem. Hosts typically do not originate IP packets larger than 1500 bytes. And second, an informal survey of ISPs has been taken where nearly all ISP link MTUs are either 4470 bytes or support Ethernet jumbo frames of 9180 bytes. Therefore, we don't anticipate any problems with prepending additional headers."
Even if we assume that all the links within a transit AS support sufficiently large MTUs, this doesn't address the situation where ISPs interconnect over a shared ethernet infrastructure (i.e., an internet exchange). Also, the placement of ITRs/ETRs is left fairly open, with the suggestion that ETRs can be placed in end-user networks. In that case, it's very likely that there is at least one hop in the path that is limited to a 1500-byte MTU. I think it would be helpful to explicitly signal the MTU and possibly the maximum size of packets that can be fragmented that an ETR supports back to ITRs.

I guess we have had enough discussion on this.  ;-)
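
For readers who want the numbers behind the MTU concern, here is a back-of-the-envelope sketch. The outer IPv4 header (20 bytes without options) and the UDP header (8 bytes) are standard sizes; the 8-byte LISP header is an assumption for illustration and should be replaced with whatever the draft revision in question specifies. The function names are hypothetical.

```python
OUTER_IPV4_HEADER = 20    # bytes, IPv4 without options
UDP_HEADER = 8            # bytes
ASSUMED_LISP_HEADER = 8   # bytes; an assumption, not taken from the draft


def encapsulated_size(inner_packet: int,
                      lisp_header: int = ASSUMED_LISP_HEADER) -> int:
    """Size of the packet on the wire after the ITR prepends its headers."""
    return inner_packet + OUTER_IPV4_HEADER + UDP_HEADER + lisp_header


def fits(inner_packet: int, link_mtu: int,
         lisp_header: int = ASSUMED_LISP_HEADER) -> bool:
    """True if the encapsulated packet crosses the link without fragmentation."""
    return encapsulated_size(inner_packet, lisp_header) <= link_mtu


def max_inner_packet(link_mtu: int,
                     lisp_header: int = ASSUMED_LISP_HEADER) -> int:
    """Largest inner packet that still fits after encapsulation."""
    return link_mtu - OUTER_IPV4_HEADER - UDP_HEADER - lisp_header


for mtu in (1500, 4470, 9180):
    print(mtu, fits(1500, mtu), max_inner_packet(mtu))
```

Under these assumptions a full-size 1500-byte host packet fits on 4470-byte and 9180-byte links but not on a 1500-byte hop, which is the case raised above for exchange points and end-user networks.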

In my opinion, it would be beneficial to remove pretty much all of the text that doesn't pertain to actual LISP operation from this draft, and move that which is still useful (some of it is a bit stale after five iterations) to a new "LISP architecture" document.

Can you list what in this document doesn't pertain to LISP operation?

Thanks,
Dino
