RE:cross-protocol locks

To: "Harrington, David" <dbh@enterasys.com>
Subject: RE:cross-protocol locks
From: Andy Bierman <abierman@cisco.com>
Date: Tue, 04 May 2004 14:02:39 -0700
Cc: "Bobby Krupczak" <rdk@krupczak.org>, <j.schoenwaelder@iu-bremen.de>, <netconf@ops.ietf.org>
In-reply-to: <6D745637A7E0F94DA070743C55CDA9BA01977655@NHROCMBX1.ets.enterasys.com>
References: <6D745637A7E0F94DA070743C55CDA9BA01977655@NHROCMBX1.ets.enterasys.com>
At 11:41 AM 5/4/2004, Harrington, David wrote:
>Hi Andy, 
>
>Lots of questions on the locks. Would this be easier to break out into
>separate issues to resolve?

I don't see any new issues here.


>Comments inline.
>
>
>> -----Original Message-----
>> From: Andy Bierman [mailto:abierman@cisco.com] 
>> 
>> It could be the CLI holding the lock, not netconf.
>
>Absolutely. Never a question that it could be any of the protocols that
>can modify configurations. Remember this point because I will raise it
>again later in this message.
>
>> I don't think the duration of the lock is the important
>> point.  This just increases the probability of a race
>> condition occurring.
>
>I don't think it would be possible to define an absolute duration, and
>that's not what I'm looking for. I'm looking to understand the
>intentions of how long a lock might last. The -02- document says it is
>intended to be "short-lived". I question that wording in the
>specification. If it is expected that an application can lock a
>configuration and then run a script that takes over an hour to complete
>before doing an unlock, I don't consider that "short-lived", and I think
>the document needs to be modified to indicate the expected life of a
>lock, given different anticipated use-cases. Is that a better way to ask
>the question?

Do you want the word "short-lived" removed from the
document?  Is this your concern?   The lock duration
is up to the entity taking out the lock.   I don't
think the NETCONF WG needs to be concerned because
somebody might write a script that holds a lock a long
time.   The network operator can be responsible for
deciding how long scripts should hold locks.   This isn't
a protocol issue, it's an administrative issue.


>> 
>> 
>> >When a global lock is set, what level of non-netconf access is
>> >allowed, if any? Obviously I cannot SET values to a locked config,
>> >but can I read the values, even though they may be in the middle of
>> >being changed?
>> 
>> The lock is for operations that write the database
>> (edit, copy, delete).  Read access is not affected.
>
>OK. So any data read while a lock is set is potentially garbage, since
>the transaction has not yet completed. I consider that acceptable. But,
>so that the data-gatherer can tell that the data might be garbage
>because the read is being done in the middle of a configuration session,
>is it intended that the lock will be externally accessible, so the data
>gatherer can query the current state of the configuration process?

Why is everything read from a locked database garbage?
Why do you assume this?  Is it any more or less garbage
than any data ever read with SNMP (since SNMP has no
locking at all)?


>> 
>> If netconf is implemented correctly, then SNMP writes will
>> fail while the lock is held by a different entity.
>> Other writes will fail while the SNMP engine is processing
>> a Set transaction.
>
>Understood, and any non-snmp protocol will fail to write to a
>configuration locked by snmp. Will the lock be externally accessible so
>the other protocol can tell whether the failure is the result of a
>locked configuration, and is it intended that another entity (of the
>same or different protocol) wishing to do configuration can see who/what
>holds the lock?

The lock is an implementation requirement.  There is
no NETCONF requirement to create non-NETCONF mechanisms
to access the lock implementation.
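
As a rough illustration of the internal behavior described above (writes
from any other entity fail while the lock is held, reads are unaffected),
here is a minimal sketch. The Datastore class, LockedError, and the entity
names are hypothetical and are not taken from the NETCONF draft; a real
agent would wire the SNMP, CLI, and NETCONF code paths into whatever lock
implementation it already has.

```python
class LockedError(Exception):
    """Raised when an entity tries to lock or write while another holds the lock."""


class Datastore:
    """Hypothetical configuration datastore shared by several protocols."""

    def __init__(self):
        self.values = {}
        self.lock_owner = None  # entity name, or None when unlocked

    def lock(self, entity):
        if self.lock_owner not in (None, entity):
            raise LockedError(f"locked by {self.lock_owner}")
        self.lock_owner = entity

    def unlock(self, entity):
        # Only the current owner can release the lock.
        if self.lock_owner == entity:
            self.lock_owner = None

    def write(self, entity, key, value):
        # Writes are rejected while a different entity holds the lock.
        if self.lock_owner not in (None, entity):
            raise LockedError(f"locked by {self.lock_owner}")
        self.values[key] = value

    def read(self, key):
        # Reads are not affected by the lock.
        return self.values.get(key)


ds = Datastore()
ds.lock("cli")
ds.write("cli", "hostname", "router-1")

snmp_rejected = False
try:
    ds.write("snmp", "hostname", "router-2")
except LockedError:
    snmp_rejected = True

current = ds.read("hostname")  # allowed even though the CLI holds the lock
ds.unlock("cli")
```

Note that the lock holder here is the CLI, not NETCONF; the sketch treats
every writer the same way.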

>> 
>> 
>> >Is a locked netconf operation atomic, with as-if-simultaneous
>> >application of functionality at commit or at copy-config?
>> 
>> nope -- netconf operations are serialized.  There are no
>> atomic operations at all.
>
>OK. That response will be referenced later in this message.
>> 
>> 
>> >If I'm executing a large script, say on bootup, and it is important
>> >that all aspects of the device be configured before starting
>> >operation, can a lock span the whole script, or must you turn the
>> >lock on and off for each individual operation? The netconf protocol
>> >only says locks are intended to be short-lived; it does not specify
>> >what short-lived is, and it doesn't say MUST. If a script is running,
>> >and locks are truly short-lived within a script, can I do an SNMP SET
>> >to the same configuration target that the script is in the middle of
>> >configuring? So doesn't this argue that locks need to live as long as
>> >is required to complete the script, whether that is short- or
>> >long-lived?
>> 
>> The client decides how long the transaction will take.
>
>Right, and that was my argument against the claim of "short-lived". The
>locks need to exist however long it takes to complete the transaction. 

What is your point?! 
The network operator decides what is short-lived.
The protocol document shouldn't decide the maximum
duration of a lock.


>And an **application** can decide that a whole script is the transaction
>in question, and can determine how long the transaction will take. So
>again, "short-lived" comes into question, it depends on the use case. 
>
>Now, a quick side trip regarding a question I asked and you didn't
>answer in this last section - "If a script is running, and locks are
>truly short-lived within a script, can I do an SNMP SET to the same
>configuration target that the script is in the middle of configuring?"
>
>Here are two scenarios:
>
>Scenario A: a netconf script is written using short-term locks within a
>script, i.e. the section A of the script <lock>s a config, performs a
>transaction, <unlock>s the config, and then section B of the script
><lock>s the config, performs some transaction, then <unlock>s, and so
>on, and then performs a <commit> at the end of the script.
>
>Scenario B: a netconf script is written using short-term locks within a
>script, i.e. the section A of the script <lock>s a config, performs a
>transaction, <commit>s, <unlock>s the config, and then section B of the
>script <lock>s the config, performs some transaction, <commit>s, then
><unlock>s, and so on.
>
>If an SNMP SET came into the system while section A was executing, and
>got queued up by the implementation, would it be acceptable for the
>implementation to time-slice the SNMP SET between sections A and B of
>the netconf script, since the locks have been released? Is the answer
>different for scenario A and scenario B?

The netconf engine doesn't care how scripts are organized
on the host machine.  If a session releases a lock, then
any other entity (netconf or not) is free to grab the lock.
These are RPCs over the wire.  How is the agent supposed to
know there's a section B that will attempt a lock operation later?
Why should it care?


>In your answer that netconf is serialized, not atomic, what I heard was:
>it is perfectly acceptable that SNMP and other protocols can get their
>chance to modify the configuration between these sections of the script
>(as long as the script unlocks the target config). 

yep -- if you release a lock, some other entity can get it after you
release it.
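
To make the scenario concrete, here is a minimal sketch of a script that
unlocks between its sections. ConfigLock and the entity names are
hypothetical, used only for illustration; the agent sees nothing but
individual lock and unlock requests and has no idea that a "section B" is
coming.

```python
class ConfigLock:
    """Hypothetical single-owner lock on one configuration datastore."""

    def __init__(self):
        self.owner = None  # entity holding the lock, or None

    def lock(self, entity):
        # Fails if a different entity (NETCONF session, SNMP, CLI) holds the lock.
        if self.owner is not None and self.owner != entity:
            return False
        self.owner = entity
        return True

    def unlock(self, entity):
        # Only the current owner may release the lock.
        if self.owner != entity:
            return False
        self.owner = None
        return True


running = ConfigLock()
events = []

# Section A of the script
events.append(("script locks for A", running.lock("netconf-session-1")))
events.append(("snmp tries during A", running.lock("snmp-engine")))
events.append(("script unlocks after A", running.unlock("netconf-session-1")))

# Gap between sections: the lock is free, so any entity may take it.
events.append(("snmp tries between A and B", running.lock("snmp-engine")))

# Section B now has to wait until the SNMP engine releases the lock.
events.append(("script locks for B", running.lock("netconf-session-1")))

for name, ok in events:
    print(f"{name}: {'ok' if ok else 'refused'}")
```

A script that needs sections A and B to be applied without interference
would have to hold one lock across both.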


>OR... Does the text in -02- that says "the target configuration has
>already been modified and these changes have not been committed"
>override the explicit <unlock> command in the scenario A script, and
>force the system to reject the SNMP engine's request for a lock?

the text about candidate-specific locking needs to be fixed.
We already discussed this at the last IETF.


>End of side-trip.
>
>> Short vs. not-so-short lock duration does not remove
>> the race condition.  SNMP (and CLI, etc.) agent code
>> needs to be modified to utilize the configuration lock.
>> A conformant implementation cannot ignore the locks
>> because they're supposed to be short.
>
>I wasn't suggesting that SNMP or anybody else should ignore a lock. I
>want to understand the locking well enough to be able to write my code
>to respect the locks, and behave accordingly, but I'm not sure the
>current documentation gives me enough information about expected
>behaviors, so I'm asking questions about facilities available to
>understand the locking state and configuration state at run-time. 

The lock duration is controlled by the application writer.

>> 
>> 
>> >If the locks will impact other protocols, then netconf needs to be
>> >more specific about the scope/length of a lock, including
>> >expectations about the behavior of other protocols that share the
>> >lock. This is also true since other protocols can presumably set the
>> >locks, possibly remotely via a mib or other mechanism, and netconf
>> >needs to protect itself from poorly applied locks. Doesn't this then
>> >call for the ability to monitor/examine the lock-state by any of the
>> >impacted protocols to improve cross-protocol coordination?
>> 
>> No, netconf doesn't have to specify the duration of the lock.
>> Repeat again: The transaction is started and finished by the
>> application, not the agent.
>
>I think you've gotten hung up on "duration" - I'm not looking for
>absolutes. I'm looking for intended behaviors and anticipated use cases.
>So let me rephrase: 
>"If the locks will impact other protocols, then netconf needs to be
>more specific about the **expected** scope/length of a lock, including
>expectations about the behavior of other protocols that share the
>lock."

I think the phrase "short duration" is very subjective.
It's up to the script writer to decide. 


>So, for example, having COPS/PR set and hold the lock for the duration
>of its relationship between a policy server and the managed entity would
>be a usage that would not be considered a good use case for these locks.
>Right? Or does netconf not care, since that is really an
>application/deployment/administrative question that doesn't impact the
>protocol specification?

This situation is not the same as COPS-PR.  This is a shared
resource with access controlled by a lock.  The COPS-PR design
did not treat the config as a shared resource.


>If netconf doesn't care, then why state in -02- that it is intended that
>locks be short-lived?

Short-lived is a relative and non-normative term.
We can remove it from the text if it is the cause of 
too much confusion.



>> 
>> 
>> >If a netconf script gets into an infinite loop (and we all know that
>> >sooner or later this might happen), how will an operator
>> ><kill-session>? Shouldn't they have access to the global lock via,
>> >say, the CLI?
>> 
>> Start another netconf session, read the netconf-state objects,
>> figure out which session holds the lock, invoke kill-session
>> on that session number.
>> 
>> There are no CLI commands specified by the netconf standard. 
>
>And I'm not asking for any CLI commands. I'm asking for enough
>information about these locks we are designing to understand how I could
>write my own CLI command or SNMP application or other non-netconf
>protocol application to deal with these locks (locks that will constrain
>my non-netconf protocols).
>
>So in the information/data model that contains netconf-state objects,
>there are objects that indicate which sessions exist, and which session
>holds the lock? And that data model will be accessible via other netconf
>sessions? And other netconf sessions will be able to kill a session
>holding a lock? Is that correct? 

Read the text for <kill-session>.
Access control notwithstanding -- yes to all these questions.


>My knowledge of XML schemas is very limited, so help me understand this
>information model. I look at the netconf-state and find the
>NetconfSessionInfo (which has session-id, user, and login-time), and I
>look up NetconfConfigInfo (which contains configname and lockstatus).
>LockStatus contains the lock-state and locked-by. So "locked-by"
>contains the session-id? Is that correct? It doesn't actually say that,
>only that locked-by is a PositiveInteger.

yes - lockedBy contains the session ID of the lock owner,
or zero if a non-netconf entity holds the lock.  This text 
will be part of -03 I think.
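
A minimal sketch of the recovery procedure discussed above: read the lock
status, find the lock owner, and kill that session. The dictionary layout,
the "locked"/"unlocked" values, and the kill_session callback are
assumptions made for illustration; only the names configname, lock-state,
and locked-by come from this thread, and the actual schema is whatever the
draft defines.

```python
def find_lock_holder(configs, config_name):
    """Return the locked-by value for config_name, or None if it is not locked."""
    for cfg in configs:
        if cfg["configname"] == config_name and cfg["lock-state"] == "locked":
            return cfg["locked-by"]
    return None


def release_stuck_lock(configs, config_name, kill_session):
    """Kill the NETCONF session holding the lock, if there is one."""
    holder = find_lock_holder(configs, config_name)
    if holder is None:
        return "not locked"
    if holder == 0:
        # Zero means a non-NETCONF entity owns the lock; there is no
        # NETCONF session to kill in that case.
        return "held by non-NETCONF entity"
    kill_session(holder)  # hypothetical wrapper around <kill-session>
    return f"killed session {holder}"


# Hypothetical snapshot of the lock status, as another session might read it.
configs = [
    {"configname": "running", "lock-state": "locked", "locked-by": 7},
    {"configname": "candidate", "lock-state": "locked", "locked-by": 0},
    {"configname": "startup", "lock-state": "unlocked", "locked-by": 0},
]

killed = []
results = [
    release_stuck_lock(configs, name, killed.append)
    for name in ("running", "candidate", "startup")
]
```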


>This model doesn't show the locked-by identity of non-netconf protocol
>engines. Should it?

zero


>You mention the <kill-session> command as the way to forcibly release a
>lock. <kill-session> only works with other netconf sessions from the
>same client. 
>
>Can a lock be forcibly released by netconf if set by a non-netconf
>protocol? 

no

>Can a lock set by netconf be forcibly released by a
>non-netconf protocol? 

no

>If this happens, how should netconf respond/behave
>in this situation?

The session that thinks it owns the lock should be killed.

>> 
>> 
>> >Therefore I assume this is not just an internal instrumentation
>> >issue, but should be externally visible/accessible to all impacted
>> >protocols.
>> 
>> Any changes to SNMP which may be desirable to
>> improve SNMP error codes (or whatever) should be
>> done by a different WG. 
>
>I wasn't suggesting that any group do anything to change the SNMP
>protocol. I don't think SNMP needs to be changed to accommodate this at
>all, only applications that use SNMP.
>
>I am asking for enough description of expected behaviors regarding
>locks, and the expectations of both netconf and non-netconf protocol
>behaviors, that non-netconf protocol applications can be modified to
>know what to expect, and how to behave properly regarding these locks. 
>
>I think the current specification is inadequate regarding locks at this
>time, and am asking for some clarifications. 

ok, we will try to clarify the text.


>(Or is clarifying your documents also out of scope for this wg?) ;-)
>
>
>> 
>> 
>> >dbh
>> 
>> Andy

Andy


>> 
>> 
>> 
>> >> -----Original Message-----
>> >> From: Bobby Krupczak [mailto:rdk@krupczak.org] 
>> >> Sent: Tuesday, May 04, 2004 12:42 AM
>> >> To: Harrington, David
>> >> Cc: Andy Bierman; j.schoenwaelder@iu-bremen.de; netconf@ops.ietf.org
>> >> Subject: Re: unified transaction model proposal
>> >> 
>> >> Hi!
>> >> 
>> >> > global. 
>> >> > 
>> >> > I think it will be imperative to be able to manipulate different
>> >> > portions of the configuration simultaneously, so globally locking
>> >> > global configurations, like running, is a problem. We discussed
>> >> > this at the interim and it has been raised multiple times on the
>> >> > mailing list. I believe we will need partial configuration
>> >> > support, with locks to support that.
>> >> 
>> >> I thought David's questions on cross-management-platform locks to
>> >> be right on the money.
>> >> 
>> >> However, in my own dealing with this issue, I always assumed this
>> >> was an internal matter and that SNMP agents, operating systems
>> >> (e.g. IOS), and netconf agents, would do this behind the scenes
>> >> rather than making this locking mechanism externally visible.  Is
>> >> this not the case?
>> >> 
>> >> Bobby
>> >> 
>> 
>> 

