
Re: [RRG] Opportunistic Topological Aggregation in the RIB->FIB Calculation?



Of course all of this assumes that we keep the basic BGP paradigms
intact...

RFC 4271: "BGP Identifier: A 4-octet unsigned integer that indicates
the BGP Identifier of the sender... A given BGP speaker sets the value
of its BGP Identifier to an IP address assigned to that BGP speaker..."

Is the above going to change in IPv6?
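
To make the question concrete, here is a rough sketch of the size
mismatch I have in mind. Only the 4-octet width comes from the RFC
text above; the function names are made up for illustration, and the
addresses are documentation examples.

```python
import ipaddress
import struct

# Field width taken from the RFC 4271 text quoted above.
BGP_ID_OCTETS = 4


def fits_bgp_identifier(addr: str) -> bool:
    """True if the address, in packed form, is exactly 4 octets long."""
    return len(ipaddress.ip_address(addr).packed) == BGP_ID_OCTETS


def as_bgp_identifier(addr: str) -> int:
    """Interpret an address as a 4-octet unsigned integer (network order).

    Hypothetical helper: raises ValueError for anything that is not
    4 octets long, which includes every IPv6 address (16 octets).
    """
    packed = ipaddress.ip_address(addr).packed
    if len(packed) != BGP_ID_OCTETS:
        raise ValueError(f"{addr} is {len(packed)} octets, not {BGP_ID_OCTETS}")
    return struct.unpack("!I", packed)[0]


if __name__ == "__main__":
    for candidate in ("192.0.2.1", "2001:db8::1"):
        print(candidate, fits_bgp_identifier(candidate))
```

So a speaker that only has IPv6 addresses has nothing that fits the
field as it is described today.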

Thanks,

Peter


--- On Wed, 7/23/08, Iljitsch van Beijnum <iljitsch@muada.com> wrote:

> From: Iljitsch van Beijnum <iljitsch@muada.com>
> Subject: Re: [RRG] Opportunistic Topological Aggregation in the RIB->FIB Calculation?
> To: "William Herrin" <bill@herrin.us>
> Cc: tony.li@tony.li, "Routing Research Group" <rrg@psg.com>
> Date: Wednesday, July 23, 2008, 4:23 AM
> On 23 jul 2008, at 4:15, William Herrin wrote:
> 
> > C. Even if it isn't practical to build such a tunnel, you can
> > generally pick an alternate link that offers a high probability of
> > reachability for the impacted routes. On link failure, cut over to
> > the alternate while rebuilding the RIB and then FIB.
> 
> Hm, if you know your outage (= detection + convergence times) is short
> enough, say, less than 30 seconds, and certainly if less than 10
> seconds, it doesn't really matter where the packets go, you could even
> drop them and _most_ applications will continue without too much
> trouble.
> 
> If it's going to take a minute or more to restore reachability, you
> can't play fast and loose and risk loops, because applications will
> fail.
> 
> > None of this is perfect, but put together it enables a system that
> > isn't on the brink of collapse at 100 times the current number of
> > entries.
> 
> This is all highly optimistic. Let's assume you can get your 8 times
> parallelization, so try the current 250k table on a system that's 12 x
> slower than what's on the market now. Let's say a 133 MHz Pentium.
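
For what it's worth, the "12 x" figure appears to be the 100x table
growth divided by the assumed 8x parallel speedup. A back-of-the-
envelope sketch, assuming the work scales linearly with table size
(that linearity is my assumption, as is the function name):

```python
def equivalent_slowdown(table_growth: float, parallel_speedup: float) -> float:
    """How much slower one of today's CPUs would have to be to face the
    same per-CPU load as a future CPU handling its share of a larger table.

    Assumes processing cost grows linearly with the number of entries.
    """
    if parallel_speedup <= 0:
        raise ValueError("parallel_speedup must be positive")
    return table_growth / parallel_speedup


if __name__ == "__main__":
    # 100 times the entries, spread over 8 CPUs
    print(equivalent_slowdown(100, 8))
```

That gives 12.5, which matches the rounded "12 x" above.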
> 
> >> In fact, the PC hardware doesn't actually do that.  What we really
> >> see is that DRAM memory speed grows at about 1.2X every two years
> >> and that our growth rate is at least 1.3X every two years.
> 
> > That figure sounds fishy. As I recall, we were using a 100 MHz memory
> > bus in 1998 and moving to a 133 MHz memory bus. Today we're using a
> > 1300 MHz memory bus and moving to a 1600 MHz memory bus.
> 
> These busses are optimized for serial reading/writing of large cache
> lines. If you need a single byte, it's still slow.
> 
> > AMD Opteron processors embed the memory controller in the CPU. Each
> > CPU manages its own bank of memory with a dedicated memory bus. They
> > share with each other via a "hypertransport." If the portion of the
> > RIB associated with part of the address space is intentionally placed
> > in memory managed by the CPU which will calculate that portion of the
> > RIB then the computation can proceed in parallel without contention on
> > the memory bus.
> 
> The problem is that all your BGP updates come in over a single TCP
> session so they must be fanned out to the right CPU. It would be
> better if we could make it such that you'd have 8 sessions that each
> carry the right updates to the right CPU.
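
The fan-out step might look roughly like the sketch below, assuming
the address space is cut into 8 equal IPv4 slices by first octet. The
slicing rule and both function names are hypothetical, not something
proposed in this thread. Note that a short prefix can span several
slices and then has to be handed to more than one CPU.

```python
import ipaddress
from collections import defaultdict

NUM_WORKERS = 8  # the 8-way split discussed above


def workers_for_prefix(prefix: str, num_workers: int = NUM_WORKERS) -> list[int]:
    """Return every worker whose slice of IPv4 space overlaps the prefix.

    Slices are equal ranges of the first octet (hypothetical rule).
    """
    net = ipaddress.ip_network(prefix)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4 prefixes")
    first = (int(net.network_address) >> 24) * num_workers // 256
    last = (int(net.broadcast_address) >> 24) * num_workers // 256
    return list(range(first, last + 1))


def fan_out(prefixes: list[str], num_workers: int = NUM_WORKERS) -> dict[int, list[str]]:
    """Distribute updated prefixes from one session to per-worker queues."""
    queues: dict[int, list[str]] = defaultdict(list)
    for prefix in prefixes:
        for worker in workers_for_prefix(prefix, num_workers):
            queues[worker].append(prefix)
    return dict(queues)


if __name__ == "__main__":
    print(fan_out(["10.0.0.0/8", "192.0.2.0/24", "198.51.100.0/24"]))
```

With 8 separate sessions this dispatch would move to the sending side
instead, but the mapping from prefix to CPU would still be needed
somewhere.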
> 
> >> 100 times the entries * 100 times the churn = 10000 times the
> >> processing. I'm afraid that your DRAM isn't going to keep up with
> >> that in a traditional design.
> 
> > You have the combinatorics wrong. Each entry has some probability of
> > churning each second. So if you have 100 entries, you're 100 times as
> > likely to see a single-entry churn event. You are NOT 100 times as
> > likely to see a 100-entry churn event. In fact, you're no more likely
> > to see a full-table churn event than you were when there was only 1
> > entry, and each such full-table churn consumes only 100 times the
> > processing.
> 
> The problem is that the single "100 times" figure is rather
> meaningless as it can both mean what I said and what you said. We know
> that both the number of entries is growing and the number of updates
> per entry, so although it's not going to be as bad as I said, it's
> also not going to be as good as you said.
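
The two readings of "100 times" can be put side by side with a very
simple model: update load = number of entries x updates per entry. The
model and the function name are my own simplification, not anything
measured.

```python
def relative_load(entry_growth: float, per_entry_rate_growth: float) -> float:
    """Growth in total update load under a simple multiplicative model.

    load = entries * updates_per_entry, so the growth factors multiply.
    Both arguments are multipliers relative to today (1 = unchanged).
    """
    if entry_growth < 0 or per_entry_rate_growth < 0:
        raise ValueError("growth factors must be non-negative")
    return entry_growth * per_entry_rate_growth


if __name__ == "__main__":
    # Per-entry churn unchanged: load grows only with the table.
    print(relative_load(100, 1))
    # Per-entry churn also up 100 times: the pessimistic reading.
    print(relative_load(100, 100))
```

The first case gives 100 and the second 10000; if the per-entry
update rate grows by some factor in between, so does the result.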
> 
> Obviously the best way to get rid of volatility in the updating is to
> not carry the prefixes in the first place, but we did flap dampening
> 10 years ago, I see no reason why we can't do something similar that
> works a bit better.
> 
> Of course all of this assumes that we keep the basic BGP paradigms
> intact...
> 
> 
> --
> to unsubscribe send a message to rrg-request@psg.com with the
> word 'unsubscribe' in a single line as the message text body.
> archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg


      
