Re: Duplicate In-Reply-To entries in reply buffer



> 1) Aren't message-id header created by mail transfer agents rather
> than mail user agents? Doesn't this make crazy headers less likely?

According to RFC 2822, the MUA should create the Message-ID (although
I don't do so myself).  Inappropriate settings may result in an
inappropriate Message-ID header, but the problem would then be
something like uniqueness rather than the format of the header.  I
don't think either an MUA or an MTA is more problematic than the
other, at least as far as the header's format is concerned.

> 2) The "strict" regex we've been discussing shouldn't have that
> dollar sign near the end. It conflicts with the \' in the tests I've
> done.

Of course, the $ should be removed.  Thanks for pointing that out.
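
One likely reason for the conflict: in Emacs regexps, $ is special
only at the end of the pattern or before \) or \|, so a $ placed
directly before \' is taken as a literal dollar sign.  The sketch
below is only an illustration.  It uses the tolerant pattern from
later in this message as a stand-in, not the actual "strict" regex
from the earlier discussion, and the variable names are hypothetical.

```elisp
;; Hypothetical names, for illustration only.
;; Same pattern twice, differing only in the "$" before "\\'".
(defconst my-msgid-re-with-dollar
  "\\`[ \n\t]*\\(<.+>\\)[ \n\t]*$\\'")
(defconst my-msgid-re-without-dollar
  "\\`[ \n\t]*\\(<.+>\\)[ \n\t]*\\'")

;; Because "$" is not at the end of the pattern here, it only matches
;; a literal dollar sign, so an ordinary Message-ID fails to match.
(string-match my-msgid-re-with-dollar "<zzz@example.com>")    ; => nil
(string-match my-msgid-re-without-dollar "<zzz@example.com>") ; => 0
```

Since \' already anchors the match at the end of the string, dropping
the $ loses nothing.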

> 3) Looking again at RFC 5322, I'm dismayed to see that comments (set
> off by parentheses) seem to be allowed in the message-id header field
> both before and after the actual message ID.

(snip)

> 4) the std11 module of flim seems to provide some machinery for
> handling this.

I know FLIM has lexical analyzers, but I didn't know that comments are
allowed in the Message-ID: header.  For example, the code below could
extract the Message-ID more strictly.

(require 'std11)  ; FLIM's parser module

(let ((string "<zzz@example.com>"))
  (let* ((tokens (std11-parse-msg-ids-string string))
         (id (assq 'msg-id tokens)))
    ;; Accept the result only when there is exactly one msg-id token.
    (setq id
          (unless (assq 'msg-id (delq id tokens))
            (std11-addr-to-string (cdr id))))
    ;; Return nil when result is "".
    (when (> (length id) 0) id)))

But FLIM's lexical analyzer is really strict.  If the string is an
invalid Message-ID, e.g. "<zzz.@example.com>", nil is returned.  I
don't think we have to support invalid Message-IDs, but being more
tolerant would be better.  Therefore, if we use FLIM's lexical
analyzer, it would be better to combine it with another extraction
method.

(let ((string "<zzz.@example.com>"))
  (or
   ;; First, try FLIM's strict lexical analyzer.
   (let* ((tokens (std11-parse-msg-ids-string string))
          (id (assq 'msg-id tokens)))
     (setq id
           (unless (assq 'msg-id (delq id tokens))
             (std11-addr-to-string (cdr id))))
     ;; Return nil when result is "".
     (when (> (length id) 0) id))
   ;; Otherwise, fall back to a tolerant regexp match.
   (and (string-match "\\`[ \n\t]*\\(<.+>\\)[ \n\t]*\\'" string)
        (match-string 1 string))))

As you described, this method is more costly than the current one.  I
will post another message to the ML about the performance issue.

-- 
Kazuhiro Ito