At Fri, 03 Aug 2012 14:41:47 +0200, David Maus wrote: > > [1 <text/plain; US-ASCII (7bit)>] > At Fri, 03 Aug 2012 20:33:27 +0900, > Kazuhiro Ito wrote: > > > > > Actual bugs are; > > > > > > 1-1. elmo-msgdb-get-message-id-from-buffer assumes buffer is narrowed > > > to a message header, but actually it is called from unnarrowed buffer > > > in some places. > > > > > > 1-2. We have multiple extracting methods for Message-ID and they can't > > > extract from some valid header. > > (snip) > > > Fix (my opinion); > > (snip) > > > 1-2. David's refactoring is efficient. As far as I tested, no major > > > performance problem was observed except using lexical analyzer. So, > > > if I can customize to disable lexical analyzer in this part, the most > > > strict method (such as the combination of lexical analyzer and strict > > > regexp in [wl-en: 05189]) would be permissive. > > > > I wrote the patch to fix 1-2. It is basically based on David's > > refactoring. There are two steps for extracting Message-ID. At > > first, more strict regexp matching is tested. Some valid headers, > > e.g. with comment, whould slip out of this test. But second step > > which uses lexical analyzer could pick them up. Because almost > > headers would be extracted in first step, I think there are little > > performance problem and no custamization for disabling lexical > > analyzer is needed. > > Thanks! I already applied the previous patch to WL main repository and > check this one on the weekend (likely tomorrow). Hi, Finally merged the patch to Wanderlust master and added some commits on top of it (see attached diff). Most notably: - customizable `elmo-always-prefer-std11-parser' (defaults to nil) - removal of `strict' parameter, dynamically bind `elmo-always-prefer-std11-parser' instead The modifications to elmo-nntp looked good and according to some tests w/ posts to alt.test seem to work as intended. FYI: Google adds Message-Id when the header field is missing or (considered to be) invalid. Best, -- David -- OpenPGP... 0x99ADB83B5A4478E6 Jabber.... dmjena@jabber.org Email..... dmaus@ictsoc.de
diff --git a/elmo/ChangeLog b/elmo/ChangeLog index 3a74b5d..3271260 100644 --- a/elmo/ChangeLog +++ b/elmo/ChangeLog @@ -1,3 +1,48 @@ +2012-08-15 David Maus <dmaus@ictsoc.de> + + * elmo-util.el (elmo-get-message-id-from-field): Fallback to std11 + parser if regexp didn't match. + + * elmo-vars.el (elmo-always-prefer-std11-parser): Renamed from + `elmo-prefer-std11-parser', default to nil. + +2012-08-14 David Maus <dmaus@ictsoc.de> + + * elmo-nntp.el (elmo-nntp-post): Dynamically bind + `elmo-prefer-std11-parser'. + (elmo-nntp-setup-crosspost-buffer): Dto. + + * elmo-util.el (elmo-get-message-id-from-field) + (elmo-get-message-id-from-header) + (elmo-get-message-id-from-buffer): Remove `strict' parameter, + dynamically bind `elmo-prefer-std11-parser' instead. + +2012-08-08 David Maus <dmaus@ictsoc.de> + + * elmo-util.el (elmo-parse-msgid-field): Renamed from + `elmo-normalize-msgid-field'. + (elmo-get-message-id-from-field): Dto. + +2012-08-05 David Maus <dmaus@ictsoc.de> + + * elmo-util.el (elmo-extract-std11-msgid-tokens): Retain order of + message-id values. + + * elmo-vars.el (elmo-prefer-std11-parser): New + customizable. Prefer std11 parser over regexp. + + * elmo-util.el (elmo-get-message-id-from-field): Use customizable. + +2012-08-05 David Maus <dmaus@w500.major-mode.org> + + * elmo-util.el (elmo-extract-std11-msgid-tokens): New + function. Extract msg-id tokens from std11-parsed field. + (elmo-normalize-msgid-field): New function. Return list of + message-id field values in field. + (elmo-get-message-id-from-field): New function. Return message-id + field value. + (elmo-get-message-id-from-header): Refactored. Use new functions. + 2012-08-01 Kazuhiro Ito <kzhr@d1.dion.ne.jp> * elmo-nntp.el (elmo-nntp-post): Remove the invalid Message-ID diff --git a/elmo/elmo-nntp.el b/elmo/elmo-nntp.el index 16735b4..5e3ecde 100644 --- a/elmo/elmo-nntp.el +++ b/elmo/elmo-nntp.el @@ -984,7 +984,8 @@ Don't cache if nil.") (goto-char (point-min)) (if (search-forward mail-header-separator nil t) (delete-region (match-beginning 0)(match-end 0))) - (setq has-message-id (elmo-get-message-id-from-buffer 'none t)) + (setq has-message-id (let ((elmo-prefer-std11-parser t)) + (elmo-get-message-id-from-buffer 'none))) (elmo-nntp-send-command session "post") (if (string-match "^340" (setq response (elmo-nntp-read-raw-response session))) @@ -1423,11 +1424,12 @@ Returns a list of cons cells like (NUMBER . VALUE)" ;; it is remembered in `temp-crosses' slot. ;; temp-crosses slot is a list of cons cell: ;; (NUMBER . (MESSAGE-ID (LIST-OF-NEWSGROUPS) 'ng)) - (let (newsgroups crosspost-newsgroups message-id) + (let ((elmo-prefer-std11-parser t) + newsgroups crosspost-newsgroups message-id) (save-restriction (std11-narrow-to-header) (setq newsgroups (std11-fetch-field "newsgroups") - message-id (elmo-get-message-id-from-header 'none t))) + message-id (elmo-get-message-id-from-header 'none))) (when newsgroups (when (setq crosspost-newsgroups (delete diff --git a/elmo/elmo-util.el b/elmo/elmo-util.el index 864ace1..b4ee5f1 100644 --- a/elmo/elmo-util.el +++ b/elmo/elmo-util.el @@ -259,7 +259,7 @@ Return value is a cons cell of (STRUCTURE . REST)" (elmo-condition-parse-error))) ;; or-expr ::= and-expr / -;; and-expr "|" or-expr +;; and-expr "|" or-expr (defun elmo-condition-parse-or-expr () (let ((left (elmo-condition-parse-and-expr))) (if (looking-at "| *") @@ -2235,32 +2235,37 @@ If ALIST is nil, `elmo-obsolete-variable-alist' is used." (elmo-replace-in-string (buffer-substring beg (point)) "\n[ \t]*" "")))))))) -(defun elmo-get-message-id-from-header (&optional when-invalid strict) - (let ((msgid (std11-fetch-field "message-id"))) - (when msgid - (or (and (null strict) - (string-match "\\`[ \n\t]*\\(<[^<>]+>\\)[ \n\t]*\\'" msgid) - (match-string 1 msgid)) - (let* ((tokens (std11-parse-msg-ids-string msgid)) - (id (assq 'msg-id tokens))) - (setq id (unless (assq 'msg-id (delq id tokens)) - (std11-msg-id-string id))) - ;; Return nil when result is "<>". - (when (> (length id) 2) id)) - ;; Invaild message-id. - (cond - ((eq when-invalid 'none) - nil) - ((eq when-invalid 'msgdb) - (concat "<" (std11-unfold-string msgid) ">")) - (t - (std11-unfold-string msgid))))))) - -(defun elmo-get-message-id-from-buffer (&optional when-invalid strict) +(defun elmo-extract-std11-msgid-tokens (msgid-string) + (let (ids) + (dolist (token msgid-string ids) + (when (eq (car token) 'msg-id) (setq ids (cons token ids)))) + (nreverse ids))) + +(defun elmo-parse-msgid-field (field) + (mapcar #'std11-msg-id-string (elmo-extract-std11-msgid-tokens (std11-parse-msg-ids-string field)))) + +(defun elmo-get-message-id-from-field (field) + (if (and (not elmo-always-prefer-std11-parser) + (string-match "\\`[ \n\t]*\\(<[^<>]+>\\)[ \n\t]*\\'" field)) + (match-string 1 field) + (let ((msgid-list (elmo-parse-msgid-field field))) + (when (null (cdr msgid-list)) (car msgid-list))))) + +(defun elmo-get-message-id-from-header (&optional when-invalid) + (let ((msgid-field (std11-fetch-field "message-id"))) + (when msgid-field + (let ((msgid (elmo-get-message-id-from-field msgid-field))) + (or msgid + (cond + ((eq when-invalid 'none) nil) + ((eq when-invalid 'msgdb) (concat "<" (std11-unfold-string msgid-field) ">")) + (t (std11-unfold-string msgid-field)))))))) + +(defun elmo-get-message-id-from-buffer (&optional when-invalid) (save-excursion (save-restriction (std11-narrow-to-header) - (elmo-get-message-id-from-header when-invalid strict)))) + (elmo-get-message-id-from-header when-invalid)))) (defun elmo-msgdb-get-message-id-from-header () (or (elmo-get-message-id-from-header 'msgdb) diff --git a/elmo/elmo-vars.el b/elmo/elmo-vars.el index 5872699..f971904 100644 --- a/elmo/elmo-vars.el +++ b/elmo/elmo-vars.el @@ -57,6 +57,11 @@ :prefix "elmo-" :group 'elmo) +(defcustom elmo-always-prefer-std11-parser nil + "Always prefer std11 parser over regexp." + :type 'boolean + :group 'elmo) + (defcustom elmo-digest-flags '(unread) "Flags which are treated as `digest'." :type '(repeat (symbol :tag "flag"))
Attachment:
pgpnZHOT2BYoS.pgp
Description: OpenPGP Digital Signature