
Re: Issue 13.6: Discard changes



On Mon, Feb 09, 2004 at 05:49:19PM -0800, Andy Bierman wrote:
> At 04:21 PM 2/9/2004, Rob Enns wrote:
> >On Mon, Feb 09, 2004 at 04:04:59PM -0800, Andy Bierman wrote:
> >> At 03:16 PM 2/9/2004, Rob Enns wrote:
> >> >> 13.16) <discard-changes>
> >> >> 
> >> >> 13.16.1) Clarifications [PROT]
> >> >> 
> >> >>   Says content 'automatic' is allowed for the <discard-changes>
> >> >>   operation.  This is not actually documented.
> >> >>    
> >> >>   Closed: accepted, protocol document needs to be updated
> >> >
> >> >Here is the source of the confusion: there is a <discard-changes>
> >> >operation which tells the device to discard any changes made to
> >> >the candidate configuration.
> >> >
> >> >There is _also_ a <discard-changes> argument specified as part of the
> >> >lock operation which can indicate that the device should automatically
> >> >discard any changes to the configuration when the lock is released 
> >> >(either intentionally using <unlock/>, or implicitly, by session failure).
> >> >
> >> >The confusion can be fixed by renaming one of these elements. 
> >> >We could rename the <discard-changes> argument
> >> >to <rollback>, so that a lock request could look like:
> >> >
> >> ><lock>
> >> >  <target><candidate/></target>
> >> >  <rollback>automatic</rollback>
> >> ></lock>
> >> >
> >> >and this element could be conditional on the (tbd) #rollback capability.
> >> 
> >> I really don't like this approach at all.
> >> Rollback should not be coupled to the candidate config.
> >> This is a separate feature that is not dependent on whether
> >> the device supports writes to <candidate> or to <running>.
> > 
> >I am trying to propose _decoupling_ automatic rollback from the
> >candidate config. :-) That's why I said to make it conditional
> >on the rollback capability.
> >
> >> Why does the candidate config need its own extensions
> >> to the <lock> command?  This seems like really bad design.
> >> I think we should avoid special versions of the protocol
> >> operations unless absolutely required.
> >
> >Perhaps we are in agreement, maybe I was not clear. What I'm suggesting
> >is that we add the ability to specify that changes can be
> >rolled back (from either the candidate or the running configuration).
> >The reason to put it on the lock command is that lock/unlock
> >are one nice way NETCONF has to bracket a set of changes.
> >The other way is "all the changes in an <edit-config>", which
> >is okay too, but not as flexible. If we do rollback in
> >NETCONF, it should work both with a single <edit-config> 
> >granularity, as well as the set of changes between a lock and unlock.
> 
> I need to review the #candidate details more closely.
> The expected behavior isn't fully documented yet.
> Let me see if I understand.  Here's text from sec. 7.5.5.1:
> 
>    On devices implementing the #candidate capability the default target
>    of the <lock> and <unlock> operations is the candidate configuration
>    datastore.
> 
> This is bad design and can't be properly described in the XSD.
> Instead we should not have a default target for any command
> (minOccurs=1, no default)

Yes, I agree. We should ditch the default behavior in cases like this.
It was in there for brevity, but cleaning up the design should be
paramount.
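
To illustrate the "no default target" rule, here is a minimal Python
sketch of a client-side helper that refuses to build a <lock> request
unless the target is named. The helper name build_lock and the set of
allowed target names are assumptions for illustration only, not text
from the draft.

```python
import xml.etree.ElementTree as ET

# Assumed set of datastore names; the draft may define others.
ALLOWED_TARGETS = ("candidate", "running")

def build_lock(target):
    """Build a <lock> request; the caller must always name the target."""
    if target not in ALLOWED_TARGETS:
        raise ValueError("explicit target required: candidate or running")
    lock = ET.Element("lock")
    tgt = ET.SubElement(lock, "target")
    ET.SubElement(tgt, target)  # e.g. an empty <candidate> element
    return ET.tostring(lock, encoding="unicode")

print(build_lock("candidate"))
```

With no default, a request that omits the target is rejected up front
instead of silently locking whichever datastore the device prefers.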
 
>    Devices implementing the #candidate capability WILL NOT allow a
>    configuration lock to be acquired when there are outstanding changes
>    to the candidate configuration.  An error WILL be returned and the
>    status of the lock will remain unchanged.
> 
> This seems backwards and creates a race condition.  
> With no lock held, session A starts executing <edit-config>
> operations on the candidate config.  Session B then
> wants to obtain a lock and edit the candidate config,
> but the <lock> request fails because session A made changes.
> (The application for session B has to have special code to 
> distinguish a lock that failed because it is already held from one
> that failed because the candidate config has uncommitted changes in it.)

We designed NETCONF (XMLCONF) to work the way it does because
we felt it works well for both human users and automatons.
Since both share the same configuration database/system it's
important to have a locking strategy that isn't onerous
for either group.

In cases like the above, session A (the one doing things
without a lock) will be a human. If A were a program, it would
have a lock (automatons that don't take the lock aren't worth
worrying about).

Therefore, it wouldn't make sense to allow B to take the lock and
screw up A in the middle of using the CLI. A wouldn't know what
hit him.

B on the other hand is a program, and can simply back off and
try later. B should not log in without the lock and discard changes,
since that would toss A's work. In other words: if you're running 
your network and you allow both automatons and human operators
to change configuration (today most networks work this way), 
you don't want the automatons blasting the operators' work.

If you're running a purely automated network, life is also good,
because all your programs will ask for the lock and the race you
describe isn't an issue.
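
The locking rule and the "back off and try later" behavior described
above can be sketched with a toy in-memory model. This is only an
illustration in Python: the class and function names (Candidate,
LockDenied, lock_with_backoff) are hypothetical, and the wait callback
stands in for a real delay such as time.sleep.

```python
class LockDenied(Exception):
    """Raised when a lock or edit request is refused."""

class Candidate:
    def __init__(self):
        self.owner = None   # session holding the lock, if any
        self.dirty = False  # True while uncommitted changes exist

    def edit(self, session):
        if self.owner not in (None, session):
            raise LockDenied("locked by " + self.owner)
        self.dirty = True

    def commit(self, session):
        if self.owner not in (None, session):
            raise LockDenied("locked by " + self.owner)
        self.dirty = False

    def lock(self, session):
        if self.owner is not None:
            raise LockDenied("lock held by " + self.owner)
        if self.dirty:
            # The rule under discussion: no lock while changes are pending.
            raise LockDenied("outstanding changes in candidate")
        self.owner = session

def lock_with_backoff(ds, session, attempts, wait):
    """Try to take the lock, calling wait() after each refusal."""
    for _ in range(attempts):
        try:
            ds.lock(session)
            return True
        except LockDenied:
            wait()
    return False

ds = Candidate()
ds.edit("A")  # operator A edits without holding the lock

def operator_finishes():
    # Stand-in for time passing: A commits while B is waiting.
    ds.commit("A")

got_lock = lock_with_backoff(ds, "B", attempts=3, wait=operator_finishes)
print(got_lock, ds.owner)
```

In this model B never discards A's work; it just retries until the
candidate is clean.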

> So session B needs to complete 2 successive RPCs
> (a <discard-changes> followed by <lock>) without session A
> executing an <edit-config> (after the <discard-changes>
> but before the <lock>).  
>
> It seems simpler to have <lock> succeed, and then session B
> can choose to execute <discard-changes> before editing
> the candidate config.  Subsequent <edit-config> or 
> <discard-changes> operations executed by session A would fail.
> 
>    When a client fails with outstanding changes to the candidate
>    configuration, recovery can be difficult.  To facilitate easy
>    recovery, the #candidate capability adds a <discard-changes> element
>    to the <lock> operation.  If this element contains the value
>    "automatic", any outstanding changes are discarded when the lock is
>    released, whether explicitly with the <unlock> operation or
>    implicitly from session failure.
> 
> This is confusing.  The <discard-changes> is added to
> the <lock> operation, but doesn't discard previous
> changes, it discards any future changes by the lock owner
> if that session terminates without issuing an <unlock>.
> It seems to me that terminating without a <commit> is either
> intentional or a rare event, not worth optimizing.

Not at all. This also covers the case where the session was terminated
ungracefully. This feature allows programs to know that the device
will do the right thing (discard half-finished configuration changes)
in the event of a dropped connection. This is a big win for NETCONF.
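
As a rough sketch of that behavior, the toy Python model below discards
uncommitted changes when a lock taken with discard-changes "automatic"
is released, whether by unlock or by the session going away. All names
here (Datastore, session_closed, the discard_changes keyword) are
hypothetical, and the model skips the "no lock with outstanding
changes" check for brevity.

```python
class Datastore:
    def __init__(self, committed):
        self.running = dict(committed)    # committed configuration
        self.candidate = dict(committed)  # working copy
        self.owner = None
        self.auto_discard = False

    def lock(self, session, discard_changes=None):
        if self.owner is not None:
            raise RuntimeError("lock already held")
        self.owner = session
        self.auto_discard = (discard_changes == "automatic")

    def edit(self, session, key, value):
        if self.owner != session:
            raise RuntimeError("session does not hold the lock")
        self.candidate[key] = value

    def _release(self):
        if self.auto_discard:
            # Throw away anything not committed to running.
            self.candidate = dict(self.running)
        self.owner = None
        self.auto_discard = False

    def unlock(self, session):
        if self.owner != session:
            raise RuntimeError("session does not hold the lock")
        self._release()

    def session_closed(self, session):
        # Implicit release, e.g. after a dropped connection.
        if self.owner == session:
            self._release()

ds = Datastore({"mtu": 1500})
ds.lock("B", discard_changes="automatic")
ds.edit("B", "mtu", 9000)
ds.session_closed("B")  # connection drops before any commit
print(ds.candidate)
```

Without the "automatic" value, the same sequence would leave the
half-finished change sitting in the candidate after the lock is gone.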
 
> I think we should get rid of this extra parameter on <lock>.  
> If the session terminates without releasing the lock, then any
> uncommitted changes get discarded.  

We could do this as a simplification, but it would preclude intentionally
unlocking the configuration without committing changes. That may be rare,
but it should be supported.

thanks,
 Rob



> 
> 
> >thanks,
> > Rob
> 
> Andy
> 
> 
> 
> >> 
> >> >thanks,
> >> > Rob
> >> 
> >> Andy
> >> 
> >> 
> >> >--
> >> >to unsubscribe send a message to netconf-request@ops.ietf.org with
> >> >the word 'unsubscribe' in a single line as the message text body.
> >> >archive: <http://ops.ietf.org/lists/netconf/> 