
Re: Unique identifiers and privacy



> I am concerned with the general statement that we should merely "do no
> worse than the current state of the art". I am specifically concerned
> with the use of long lived unique identifiers. We have already got
> significant feedback on such identifiers in a number of products, e.g.
> identifiers of CPU chips, identifiers of users of audio-video players,
> host identifiers in IPv6, use of social security numbers in data bases,
> and the list goes on. Any unique identifier is a privacy time bomb.

I was confused by your use of "unique" until I realized it can have two
different meanings:
 - unique as in one host having only one identifier
 - unique as in "globally unique" i.e. not being assigned to any other host

For a privacy attacker to be able to correlate different communications
with a single host (which might in turn be used by one or a few
users), the host would need to use only a single identifier, and that
identifier must not be shared by too many other hosts in the Internet.
(Global uniqueness probably isn't required, though, since the attacker
might very well be able to cope with a handful of hosts using the same
identifier.)
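
To make the second condition concrete, here is a rough sketch (not a
real attack tool). It assumes the observer already has a list of
(host, identifier) pairs; the function names and the max_candidates
threshold standing in for "a handful" are made up for illustration.

```python
from collections import defaultdict


def hosts_per_identifier(assignments):
    """Map each identifier to the set of hosts observed using it."""
    sharing = defaultdict(set)
    for host, identifier in assignments:
        sharing[identifier].add(host)
    return dict(sharing)


def correlatable(assignments, max_candidates=5):
    """Return identifiers shared by at most max_candidates hosts.

    max_candidates is an assumed stand-in for "a handful": the observer
    can still narrow traffic down to a small candidate set of hosts.
    """
    sharing = hosts_per_identifier(assignments)
    return {ident for ident, hosts in sharing.items()
            if len(hosts) <= max_candidates}


# Identifier "A" is shared by two hosts, "B" is unique, "C" by many.
observed = [("h1", "A"), ("h2", "A"), ("h3", "B")]
observed += [(f"x{i}", "C") for i in range(100)]
exposed = correlatable(observed, max_candidates=2)
```

In this example "A" and "B" end up in the exposed set even though "A"
is not globally unique, while "C" is shared widely enough to drop out.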

> Obviously, there are places where unique identifiers are unavoidable.
> For example, one cannot receive mail without publishing a mail address
> of some kind. But there are many places where identifiers are in fact
> not needed. For example, a vast majority of Internet connections involve
> resolving the name of a server, obtaining the server address or
> "locator", and exchanging a few packets between a single pair of
> locators. A cautious design would not mandate use of any identifier in
> such circumstances.

Agreed.

> If we do use identifiers, we should obviously allow systems to create
> short-lived identifiers, and to use different identifiers for different
> activities. However, we should be very concerned with the default
> behavior. In practice, many application developers don't bother with
> advanced API and just use whatever is the default behavior of the stack.
> A cautious design would be to err on the side of privacy, and to make
> sure that by default, an application's traffic will use an identifier
> that is both short-lived and specific to that application. The use of
> long lived global identifiers should be reserved to those applications
> that specifically request them.

I think the default is subject to debate, and that debate is most likely
a repeat of the one we had in the IPv6 WG on whether RFC 3041 addresses
should be preferred by default.

If we have the default err on the side of privacy, the risk is that
some applications will fail to communicate (since the identifiers might be
too short-lived for the application's needs).
And as you point out, if we have the default err on the side of identifier
stability, the risk is increased privacy exposure due to
applications/middleware not using the API to choose the short-lived
identifiers.

I personally think we do need the applications to get involved, since
both defaults are suboptimal.
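
As a rough illustration of what "getting involved" could look like,
here is a sketch in which the application states how long it needs an
identifier to remain valid and the stack picks the shortest-lived one
that satisfies it. Everything here (the Identifier type, the
select_identifier function, the lifetimes) is hypothetical and not
taken from any existing API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    value: str
    lifetime_s: float  # math.inf marks a long-lived identifier


def select_identifier(candidates, min_lifetime_s=0.0):
    """Pick the shortest-lived identifier that meets the stated need.

    With no stated need (min_lifetime_s=0.0) this errs on the side of
    privacy; an application that needs stability asks for it explicitly.
    """
    usable = [c for c in candidates if c.lifetime_s >= min_lifetime_s]
    if not usable:
        raise LookupError("no identifier lives long enough")
    return min(usable, key=lambda c: c.lifetime_s)


available = [
    Identifier("temporary", lifetime_s=3600.0),
    Identifier("stable", lifetime_s=math.inf),
]

short_lived = select_identifier(available)                    # no stated need
long_session = select_identifier(available, 86400.0)          # needs a day
server_like = select_identifier(available, math.inf)          # needs stability
```

An application that says nothing gets the temporary identifier, and
one that needs a day-long session or a stable identity gets the
long-lived one only because it asked.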

   Erik