[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Duplicate In-Reply-To entries in reply buffer



At Fri, 03 Aug 2012 14:41:47 +0200,
David Maus wrote:
>
> [1  <text/plain; US-ASCII (7bit)>]
> At Fri, 03 Aug 2012 20:33:27 +0900,
> Kazuhiro Ito wrote:
> >
> > > Actual bugs are;
> > >
> > > 1-1. elmo-msgdb-get-message-id-from-buffer assumes buffer is narrowed
> > > to a message header, but actually it is called from unnarrowed buffer
> > > in some places.
> > >
> > > 1-2. We have multiple extracting methods for Message-ID and they can't
> > > extract from some valid header.
> > (snip)
> > > Fix (my opinion);
> > (snip)
> > > 1-2. David's refactoring is efficient.  As far as I tested, no major
> > > performance problem was observed except using lexical analyzer.  So,
> > > if I can customize to disable lexical analyzer in this part, the most
> > > strict method (such as the combination of lexical analyzer and strict
> > > regexp in [wl-en: 05189]) would be permissive.
> >
> > I wrote the patch to fix 1-2.  It is basically based on David's
> > refactoring.  There are two steps for extracting Message-ID.  At
> > first, more strict regexp matching is tested.  Some valid headers,
> > e.g. with comment, whould slip out of this test.  But second step
> > which uses lexical analyzer could pick them up.  Because almost
> > headers would be extracted in first step, I think there are little
> > performance problem and no custamization for disabling lexical
> > analyzer is needed.
>
> Thanks! I already applied the previous patch to WL main repository and
> check this one on the weekend (likely tomorrow).

Hi,

Finally merged the patch to Wanderlust master and added some commits on top of it (see attached
diff). Most notably:

- customizable `elmo-always-prefer-std11-parser' (defaults to nil)
- removal of `strict' parameter, dynamically bind `elmo-always-prefer-std11-parser' instead

The modifications to elmo-nntp looked good and according to some tests w/ posts to alt.test seem
to work as intended.

FYI: Google adds Message-Id when the header field is missing or (considered to be) invalid.

Best,
  -- David
--
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de
diff --git a/elmo/ChangeLog b/elmo/ChangeLog
index 3a74b5d..3271260 100644
--- a/elmo/ChangeLog
+++ b/elmo/ChangeLog
@@ -1,3 +1,48 @@
+2012-08-15  David Maus  <dmaus@ictsoc.de>
+
+	* elmo-util.el (elmo-get-message-id-from-field): Fallback to std11
+	parser if regexp didn't match.
+
+	* elmo-vars.el (elmo-always-prefer-std11-parser): Renamed from
+	`elmo-prefer-std11-parser', default to nil.
+
+2012-08-14  David Maus  <dmaus@ictsoc.de>
+
+	* elmo-nntp.el (elmo-nntp-post): Dynamically bind
+	`elmo-prefer-std11-parser'.
+	(elmo-nntp-setup-crosspost-buffer): Dto.
+
+	* elmo-util.el (elmo-get-message-id-from-field)
+	(elmo-get-message-id-from-header)
+	(elmo-get-message-id-from-buffer): Remove `strict' parameter,
+	dynamically bind `elmo-prefer-std11-parser' instead.
+
+2012-08-08  David Maus  <dmaus@ictsoc.de>
+
+	* elmo-util.el (elmo-parse-msgid-field): Renamed from
+	`elmo-normalize-msgid-field'.
+	(elmo-get-message-id-from-field): Dto.
+
+2012-08-05  David Maus  <dmaus@ictsoc.de>
+
+	* elmo-util.el (elmo-extract-std11-msgid-tokens): Retain order of
+	message-id values.
+
+	* elmo-vars.el (elmo-prefer-std11-parser): New
+	customizable. Prefer std11 parser over regexp.
+
+	* elmo-util.el (elmo-get-message-id-from-field): Use customizable.
+
+2012-08-05  David Maus  <dmaus@w500.major-mode.org>
+
+	* elmo-util.el (elmo-extract-std11-msgid-tokens): New
+	function. Extract msg-id tokens from std11-parsed field.
+	(elmo-normalize-msgid-field): New function. Return list of
+	message-id field values in field.
+	(elmo-get-message-id-from-field): New function. Return message-id
+	field value.
+	(elmo-get-message-id-from-header): Refactored. Use new functions.
+
 2012-08-01  Kazuhiro Ito  <kzhr@d1.dion.ne.jp>
 
 	* elmo-nntp.el (elmo-nntp-post): Remove the invalid Message-ID
diff --git a/elmo/elmo-nntp.el b/elmo/elmo-nntp.el
index 16735b4..5e3ecde 100644
--- a/elmo/elmo-nntp.el
+++ b/elmo/elmo-nntp.el
@@ -984,7 +984,8 @@ Don't cache if nil.")
       (goto-char (point-min))
       (if (search-forward mail-header-separator nil t)
 	  (delete-region (match-beginning 0)(match-end 0)))
-      (setq has-message-id (elmo-get-message-id-from-buffer 'none t))
+      (setq has-message-id (let ((elmo-prefer-std11-parser t)) 
+			     (elmo-get-message-id-from-buffer 'none)))
       (elmo-nntp-send-command session "post")
       (if (string-match "^340" (setq response
 				     (elmo-nntp-read-raw-response session)))
@@ -1423,11 +1424,12 @@ Returns a list of cons cells like (NUMBER . VALUE)"
 ;;         it is remembered in `temp-crosses' slot.
 ;;         temp-crosses slot is a list of cons cell:
 ;;         (NUMBER . (MESSAGE-ID (LIST-OF-NEWSGROUPS) 'ng))
-  (let (newsgroups crosspost-newsgroups message-id)
+  (let ((elmo-prefer-std11-parser t)
+	newsgroups crosspost-newsgroups message-id)
     (save-restriction
       (std11-narrow-to-header)
       (setq newsgroups (std11-fetch-field "newsgroups")
-	    message-id (elmo-get-message-id-from-header 'none t)))
+	    message-id (elmo-get-message-id-from-header 'none)))
     (when newsgroups
       (when (setq crosspost-newsgroups
 		  (delete
diff --git a/elmo/elmo-util.el b/elmo/elmo-util.el
index 864ace1..b4ee5f1 100644
--- a/elmo/elmo-util.el
+++ b/elmo/elmo-util.el
@@ -259,7 +259,7 @@ Return value is a cons cell of (STRUCTURE . REST)"
       (elmo-condition-parse-error)))
 
 ;; or-expr      ::= and-expr /
-;;	            and-expr "|" or-expr
+;;		    and-expr "|" or-expr
 (defun elmo-condition-parse-or-expr ()
   (let ((left (elmo-condition-parse-and-expr)))
     (if (looking-at "| *")
@@ -2235,32 +2235,37 @@ If ALIST is nil, `elmo-obsolete-variable-alist' is used."
 		  (elmo-replace-in-string
 		   (buffer-substring beg (point)) "\n[ \t]*" ""))))))))
 
-(defun elmo-get-message-id-from-header (&optional when-invalid strict)
-   (let ((msgid (std11-fetch-field "message-id")))
-    (when msgid
-      (or (and (null strict)
-	       (string-match "\\`[ \n\t]*\\(<[^<>]+>\\)[ \n\t]*\\'" msgid)
-	       (match-string 1 msgid))
-	  (let* ((tokens (std11-parse-msg-ids-string msgid))
-		 (id (assq 'msg-id tokens)))
-	    (setq id (unless (assq 'msg-id (delq id tokens))
-		       (std11-msg-id-string id)))
-	    ;; Return nil when result is "<>".
-	    (when (> (length id) 2) id))
-	  ;; Invaild message-id.
-	  (cond
-	   ((eq when-invalid 'none)
-	    nil)
-	   ((eq when-invalid 'msgdb)
-	    (concat "<" (std11-unfold-string msgid) ">"))
-	   (t
-	    (std11-unfold-string msgid)))))))
-
-(defun elmo-get-message-id-from-buffer (&optional when-invalid strict)
+(defun elmo-extract-std11-msgid-tokens (msgid-string)
+  (let (ids)
+    (dolist (token msgid-string ids)
+      (when (eq (car token) 'msg-id) (setq ids (cons token ids))))
+    (nreverse ids)))
+
+(defun elmo-parse-msgid-field (field)
+  (mapcar #'std11-msg-id-string (elmo-extract-std11-msgid-tokens (std11-parse-msg-ids-string field))))
+
+(defun elmo-get-message-id-from-field (field)
+  (if (and (not elmo-always-prefer-std11-parser)
+	   (string-match "\\`[ \n\t]*\\(<[^<>]+>\\)[ \n\t]*\\'" field))
+      (match-string 1 field)
+    (let ((msgid-list (elmo-parse-msgid-field field)))
+      (when (null (cdr msgid-list)) (car msgid-list)))))
+
+(defun elmo-get-message-id-from-header (&optional when-invalid)
+  (let ((msgid-field (std11-fetch-field "message-id")))
+    (when msgid-field
+      (let ((msgid (elmo-get-message-id-from-field msgid-field)))
+	(or msgid
+	    (cond
+	     ((eq when-invalid 'none) nil)
+	     ((eq when-invalid 'msgdb) (concat "<" (std11-unfold-string msgid-field) ">"))
+	     (t (std11-unfold-string msgid-field))))))))
+
+(defun elmo-get-message-id-from-buffer (&optional when-invalid)
   (save-excursion
     (save-restriction
       (std11-narrow-to-header)
-      (elmo-get-message-id-from-header when-invalid strict))))
+      (elmo-get-message-id-from-header when-invalid))))
 
 (defun elmo-msgdb-get-message-id-from-header ()
   (or (elmo-get-message-id-from-header 'msgdb)
diff --git a/elmo/elmo-vars.el b/elmo/elmo-vars.el
index 5872699..f971904 100644
--- a/elmo/elmo-vars.el
+++ b/elmo/elmo-vars.el
@@ -57,6 +57,11 @@
   :prefix "elmo-"
   :group 'elmo)
 
+(defcustom elmo-always-prefer-std11-parser nil
+  "Always prefer std11 parser over regexp."
+  :type 'boolean
+  :group 'elmo)
+
 (defcustom elmo-digest-flags '(unread)
   "Flags which are treated as `digest'."
   :type '(repeat (symbol :tag "flag"))

Attachment: pgpnZHOT2BYoS.pgp
Description: OpenPGP Digital Signature