Re: [RRG] draft-farinacci-lisp-05

To: Dino Farinacci <dino@cisco.com>
Subject: Re: [RRG] draft-farinacci-lisp-05
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Sat, 22 Dec 2007 04:04:24 +0100
Cc: Routing Research Group list <rrg@psg.com>
In-reply-to: <4AEEC2E6-7205-49B2-ADDA-0874E4E98A02@cisco.com>
References: <5A0C2670-8696-41CC-8E72-2AE623BB8371@muada.com> <4AEEC2E6-7205-49B2-ADDA-0874E4E98A02@cisco.com>

On 19 dec 2007, at 6:20, Dino Farinacci wrote:

There is no version field in the LISP header. There should be. At least a few "set to zero on send, ignore on receive" bits that can be used for extensions without bumping the version number are also a good idea.

We decided to use control-plane type codes.

Unless I missed something, this means you can never make backwards-incompatible changes to the LISP header without running the old and new versions on different RLOCs.

"In order to eliminate the need for a mapping lookup in the reverse direction, the ETR gleans RLOC information from the LISP header."

I find this undesirable because this way, the behavior of the system can be different depending on which end initiated the communication. Trusting information supplied directly by the other end is also problematic security-wise. I would much prefer it if both ends did an independent mapping lookup.

We stated this because there was a requirement from big content providers. I will loosen the language and say "MAY glean".

This is one of the big problems with GSE: if someone contacts you with EID=windowsupdate.com and RLOC=l33th4x0r, and you trust this relationship, an attacker gets to redirect traffic for that EID to a random place. This is especially bad when the attacker can set up this state just as you're about to set up an outgoing connection to that EID, because then they get to intercept your outgoing traffic.

In the case of content sites that only receive incoming sessions it is probably possible to come up with a set of constraints within which there are no problems, but that would still make me EXTREMELY uncomfortable, as people do stuff that they weren't planning on doing when they set up their networks all the time. Also, as someone who has spent a fair bit of time debugging network problems, I am very much in favor of deterministic behavior. So one set of behavior when the connection is set up from A to B and another when it's from B to A is not good.

"LISP Locator Reach Bits: in the LISP header are set by an ITR to indicate to an ETR the reachability of the Locators in the source site."

I wonder how useful this is in practice. First of all, having 32 RLOCs is way too many.

We have received feedback that it may not be enough.  ;-)

As the LISP-related documents that I've read are light on failure detection and repair, it's hard to say anything definitive, but with 32 ITRs and 32 ETRs TCP has probably long since given up when you get around to learning that it's ITR 31 that can talk to ETR 31 but the other 1023 combinations don't work.

I'd be interested in learning the rationale behind that feedback, though.

[xTRs in ISP networks or (also) in end-user sites]

As I said at the IETF, for a CE deployment of xTRs, the most common failure points in the network that affect connectivity to the site are the CE router going down, the CE-to-PE link going down, or the PE router going down. Other failures are rerouted in the core based on richness of connectivity or are damped out due to aggregation.

That's what the big ISPs tell us. Myself, I haven't had too much trouble with my last kilometers, so my experiences may not be representative, but I can tell you that routing failures and brownouts DO happen and anyone who cares enough about their connectivity to be multihomed wants to be protected against that, too. Especially having a single ETR go down or function incorrectly (claiming incorrect (un)reachability for other ETRs serving the same EID) and then being unreachable or having severely degraded reachability would be highly unacceptable to any multihomer that I've ever known.

In these cases, the loc-reach-bits are extremely effective and get the new status information to the other sites at data-plane rates.

Only if the return traffic uses the ITR for the forward traffic as its ETR, which makes sense in your view. However, this is a good example of a more general tendency in LISP to commit to a narrow mode of operation rather than to make the whole thing more open so it's easy to change the protocol later for the IETF and to make different deployment tradeoffs for operators.

It's always difficult to come up with the right amount of extensibility: too much, and the protocol becomes unwieldy and inefficient; too little, and obvious improvements can't be made or require cumbersome kluges. I would point to BGP as a mostly successful example: we didn't have to come up with BGP-5 for more than a decade even though we added tons of new stuff because there was an adequate amount of extensibility.

I'm currently not seeing this amount of extensibility in LISP. The fact that all of this is happening in an IRTF wg and that we've had numerous previous efforts before that either failed to address the issue completely (IPv6) or created something that only solves part of the problem (shim6, and some would argue that I'm being generous) shows that closing off too many paths at this stage is probably not a good idea.

There are also potential issues with synchronizing the numbering of the RLOCs in the mapping system and in this field.

Right, there is no such thing as a free lunch. As mentioned at the Chicago IETF, inserting new RLOCs doesn't have this problem. That is, appending is easy to do. Removing from the middle of the RLOC list gets tricky, but you could use a dummy slot for a certain period of time. We are experimenting with this to see the best way to stale existing cache entries.

Right.

We did consider sending a "free cache" bit in the data plane so sites that have cached state could time out the state and re-request a new mapping if they needed it, but as you might guess it would cause a request implosion to the site.

This is the same argument that you used against my idea of having a code point in the LISP header to signal back reachability and other information on request.

So if you have any ideas on how to solve this, we would like to hear them and prototype the idea.

Well, mostly just the same idea as I had before, but let's flesh it out a bit:

The point is to avoid state on the ETRs but still signal information that becomes available at the ETR back to the ITR as fast as possible without making the ETR keep state or otherwise work too hard.

We do this by having the ITR ask the ETR to send updates of the relevant information (not important what exactly that information is right now) by setting a code point in the LISP header. To avoid having to send back information that's already known, the ETR keeps a "table version" like value. I think we only need two or three bits for this. When the ITR asks for an update, it supplies the "table version" of the latest information that it learned. If the table version at the ETR is the same as in the ITR's request, the ETR doesn't do anything. If the table version is different, the ETR sends back an update. The ITR keeps an RTT estimate and makes sure there is only one request in flight per RTT so the ETR won't be sending unnecessary copies of the information packets.

(We'll have to figure out if we need to do this per-ETR or per-EID, or possibly both.)
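
A minimal sketch of the request/update exchange described above, to make the moving parts concrete. Everything here is an assumption for illustration: the class names `Etr` and `Itr`, the 3-bit counter width (the text only says "two or three bits"), and the shape of the update message. It is not taken from any LISP draft.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

VERSION_BITS = 3                  # assumption: "two or three bits"
VERSION_MOD = 1 << VERSION_BITS


@dataclass
class Etr:
    """Hypothetical ETR: one small counter, no per-ITR state."""
    info: dict = field(default_factory=dict)
    version: int = 0

    def change(self, new_info: dict) -> None:
        # Any change to the signalled information bumps the table version.
        self.info = dict(new_info)
        self.version = (self.version + 1) % VERSION_MOD

    def on_request(self, itr_version: int) -> Optional[Tuple[int, dict]]:
        # Same version: the ITR is already up to date, so stay silent.
        if itr_version == self.version:
            return None
        return (self.version, dict(self.info))


@dataclass
class Itr:
    """Hypothetical ITR: remembers the last version it learned."""
    rtt: float                    # current RTT estimate, in seconds
    version: int = 0
    info: dict = field(default_factory=dict)
    last_request: Optional[float] = None

    def maybe_request(self, now: float) -> Optional[int]:
        # Return the version to carry in the header, or None to send no request.
        if self.last_request is not None and now - self.last_request < self.rtt:
            return None           # at most one request in flight per RTT
        self.last_request = now
        return self.version

    def on_update(self, update: Tuple[int, dict]) -> None:
        self.version, self.info = update
```

One limitation of this sketch: with a counter this small, exactly `VERSION_MOD` changes between two requests wrap around to the same value and would go unnoticed, so the counter width and the request rate would have to be chosen together.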

But, umm, isn't determinism better?

Yes, that was my point. (?)

"Priority: each RLOC is assigned a priority. Lower values are more preferable. When multiple RLOCs have the same priority, they are used in a load-split fashion. A value of 255 means the RLOC MUST NOT be used."

Isn't it more natural to use 0 as the special case value?

We chose this way to be consistent with other priority values, namely administrative distance for routing protocols.

Hm, if you make it more like a HSRP priority or local preference then 0 would be the right choice. :-)

Isn't higher == better also in DNS SRV records? I think that would be the right thing to copy here.
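
For reference, the rule as quoted from the draft (lower value preferred, 255 meaning "MUST NOT be used") can be sketched as below. The `Rloc` type, the function names, and the use of a flow hash modulo the candidate count as the "load-split" method are assumptions for illustration; the quoted text does not say how the split is done. The addresses are documentation prefixes.

```python
from typing import List, NamedTuple, Optional

UNUSABLE = 255                    # per the quoted text: MUST NOT be used


class Rloc(NamedTuple):
    address: str
    priority: int


def usable_rlocs(rlocs: List[Rloc]) -> List[Rloc]:
    """Return the RLOCs sharing the lowest (most preferred) priority."""
    candidates = [r for r in rlocs if r.priority != UNUSABLE]
    if not candidates:
        return []
    best = min(r.priority for r in candidates)
    return [r for r in candidates if r.priority == best]


def pick_rloc(rlocs: List[Rloc], flow_hash: int) -> Optional[Rloc]:
    """Split flows across equal-priority RLOCs (assumed: hash modulo count)."""
    best = usable_rlocs(rlocs)
    if not best:
        return None
    return best[flow_hash % len(best)]


rlocs = [
    Rloc("192.0.2.1", 1),
    Rloc("192.0.2.2", 1),
    Rloc("192.0.2.3", 2),
    Rloc("192.0.2.4", UNUSABLE),
]
```

With this list, only the two priority-1 RLOCs are candidates. Making 0 the special "do not use" value instead would only change the `UNUSABLE` constant and, if higher were to mean better, turn `min` into `max`.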

"Note that the destination RLOC address MAY be an anycast address if the tunnel egress point may be via more than one physical device."

This makes me rather uncomfortable, as anycasting could get in the way of reachability testing. We have room for a number of RLOCs, why not put in RLOCs for all physical ETRs rather than anycast them?

Well, it provides another level of indirection at really no cost. The xTRs at a site know if they are part of an anycast group and should know not to clear a loc-reach-bit for another xTR that has gone down in its anycast group.

I guess... Wouldn't it be better to leave this open and come up with a separate document that details how anycast would work here?

Is the "Loc-AFI" field the AFI for an RLOC? Why is this field only 8 bits while other AFI fields are 16 bits?

You found a bug, I will fix it. Thanks. Since I don't have an extra bit for the R-bit, how about if we do this:

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /|    Priority   |    Weight     |R|           Loc-AFI           |
Loc +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \|                            Locator                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

But since we won't use all the AFIs that are encoded, a byte should be sufficient. Please comment if you think we should stick with 16 bits or compress all occurrences of AFI to 8 bits.

Aren't multiprotocol BGP AFIs 16 bits? In that case, it saves IANA from creating another registry for something that they already register so if 16 bits is no imposition then that would be the natural choice. But if you need the bits for something else, then let IANA work a bit harder for their money... Obviously, they should all be the same size at least in the context of LISP.
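
To make the bit budget concrete, here is a sketch of packing the first 32-bit word of the locator record proposed above. The field widths are an assumption read off the diagram (8-bit Priority, 8-bit Weight, 1-bit R, and the remaining 15 bits for Loc-AFI); the function names are hypothetical. AFI value 1 is IPv4 in the IANA address family registry.

```python
import struct
from typing import Tuple

R_BIT = 0x8000        # assumed position: top bit of the second 16-bit half
AFI_MASK = 0x7FFF     # assumed: the 15 bits left over for Loc-AFI


def pack_locator_word(priority: int, weight: int, r_bit: bool, loc_afi: int) -> bytes:
    """Pack Priority, Weight, R and Loc-AFI into one network-order word."""
    if not (0 <= priority <= 255 and 0 <= weight <= 255):
        raise ValueError("priority and weight must fit in 8 bits")
    if not 0 <= loc_afi <= AFI_MASK:
        raise ValueError("Loc-AFI must fit in 15 bits in this layout")
    flags_afi = (R_BIT if r_bit else 0) | loc_afi
    return struct.pack("!BBH", priority, weight, flags_afi)


def unpack_locator_word(data: bytes) -> Tuple[int, int, bool, int]:
    """Inverse of pack_locator_word; reads only the first four bytes."""
    priority, weight, flags_afi = struct.unpack("!BBH", data[:4])
    return priority, weight, bool(flags_afi & R_BIT), flags_afi & AFI_MASK


word = pack_locator_word(priority=1, weight=100, r_bit=True, loc_afi=1)
```

Under these assumed widths, sharing the 16-bit half with the R bit leaves room for AFI values up to 32767, while an 8-bit Loc-AFI would cap them at 255.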

We enumerated the methods, we didn't say you should use all of them. We recommend using only the loc-reach-bits.

That would be a good example of something I'd like to see moved to another document.

Well, if an ITR didn't have any data to know if the destination site was LISP-capable and encapsulated the packet by copying the inner DA to the outer DA, and the site was using PA addressing, the packet would enter the destination site and travel to the host. The host would not recognize a destination port of 4341, so it would respond with a port unreachable.

But we would not do this anymore with the advent of the lisp-interworking draft.

That's good, because it doesn't make much sense to me to tunnel packets to destinations for which you don't know if they support the tunneling, even if we ignore for a moment how you would discover the nonexistent RLOC in that case.

So basically the use of mapping requests/replies is the only reliable (implicit) reachability detection mechanism. However, it is largely unspecified how ITRs should use this mechanism to determine reachability.

That and receiving data packets from the site.

Sometimes traffic only flows in one direction. Although this is rare for long periods unless you count asymmetric traffic flow, it's much more common for short times when a session goes from active to idle.

"perform a route-returnability check"

Return routability check?

A simple anti-spoofing technique.

Yes, but I think you used a less common ordering.

[MTU issue]

I guess we have had enough discussion on this.  ;-)

Well, let me put it this way: I'll gladly forego more discussion in lieu of more consensus. Unfortunately, most people haven't spoken out in favor of an approach towards the MTU thing.

In my opinion, it would be beneficial to remove pretty much all of the text that doesn't pertain to actual LISP operation from this draft, and move that which is still useful (some of it is a bit stale after five iterations) to a new "LISP architecture" document.

Can you list what in this document doesn't pertain to LISP operation?

I have a long list of other documents that I should review at some point, so I don't want to go over the LISP draft again at this time. If you agree with my suggestion, just keep your eye open for text that qualifies on the next iteration of the document. If you don't, don't.
