
Re: problems decoding base64 text/plain utf-8



At Wed, 18 Apr 2012 23:04:05 +0900, Kazuhiro Ito <kzhr@d1.dion.ne.jp> wrote:
Subject: Re: problems decoding base64 text/plain utf-8
> 
> > > >I just tried to read a message with the following MIME headers:
> > > >	Content-Type: text/plain; charset="utf-8"
> > > >	Content-Transfer-Encoding: base64
> > > >	MIME-Version: 1.0
> > > >but I was told:
> > > >	Can't decode current entity.
> (snip)
> > Here's a reference to the original message:
> > 
> > 	List-ID: <git.vger.kernel.org>
> > 	Message-ID: <657A681BEF27534399890012B8C8E50E1AD63D2268@lcs-exchange01.Lantekcs.com>
> 
> FLIM's base64 decoder does not accept a base64 encoded entity which
> has an unencoded footer, which is mostly added by ML engines.
> 
> Please see *1 and try workaround.
> 
> (*1) http://thread.gmane.org/gmane.mail.wanderlust.general.japanese/7628/focus=7640

Ah ha, thank you!

Sorry, I should have checked the list archives for further replies to my
original message -- it seems I was disconnected from the list at the
time you sent your original reply.

Indeed the problem is exactly as you described -- a non-MIME-friendly
(and non-standards-friendly in general, and especially non-BCP-friendly)
mailing list program has blindly appended a bunch of un-encoded "footer"
text to the tail of a message which the sender's MIME headers claim
consists entirely of base-64 encoded content.

Your work-around does indeed allow me to read such mangled messages.

Perhaps the implementation is a bit too pedantic though -- maybe the end
of the valid base-64 text should be detected simply by looking for
either a blank line or a line that does not start with a valid base-64
encoding character.  (GNU Mailman, IIUC, will always insert a line
consisting only of two minus sign characters ("--") before the
msg_footer text when it mangles a message being forwarded to the mailing
list members in non-digest mode.  Digest mode is a whole other nightmare.)
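
To make that heuristic concrete, here is a rough sketch in Python (not
elisp, and not FLIM's actual code).  The helper name split_base64_body is
made up purely for illustration, and the sample footer text is invented:

```python
import base64
import string

# Standard base64 alphabet plus the "=" padding character.
B64_CHARS = set(string.ascii_letters + string.digits + "+/=")


def split_base64_body(body):
    """Split a body into (joined base64 text, trailing text).

    Hypothetical helper: the encoded part is taken to end at the first
    blank line or the first line that does not start with a base64
    character.
    """
    lines = body.splitlines()
    end = len(lines)
    for i, line in enumerate(lines):
        if not line.strip() or line[0] not in B64_CHARS:
            end = i
            break
    return "".join(lines[:end]), "\n".join(lines[end:])


# A one-line base64 body followed by a "--" separator and a footer.
body = "SGVsbG8sIHdvcmxkIQo=\n--\nTo unsubscribe from this list ...\n"
encoded, footer = split_base64_body(body)
text = base64.b64decode(encoded).decode("utf-8")
```

Note that the first-character test alone would be fooled by a footer line
that happens to begin with a letter or digit, so in practice it is the
blank line or the "--" separator that does the real work here.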

For a permanent fix in FLIM I think a conservative approach to parsing
base-64 text, while still allowing for semi-structured mangling of
messages, would be more in line with the spirit of the Security
Considerations section of RFC 3548 (since obsoleted by RFC 4648).  I.e.
don't ignore invalid ("non-alphabet") characters, but rather stop
decoding at the first one and present the rest of the text as-is (if
possible).

I am indeed partially contradicting myself here -- I'm not sure what
should be done if a non-alphabet character is encountered mid-line.  It
would be nice to see as much of the decoded text as possible if indeed
the content-type is of a textual nature, but if it's a PDF or image or
other application file type then it's probably safest to put a big red
stop-sign-shaped button on the screen warning that the content was
invalid (while still showing the message headers and any other message
text).  Throwing an elisp error though is unfriendly and not the right
thing to do in any case.
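
As a sketch of the behaviour I have in mind (again in Python for
illustration only; decode_leniently and its return convention are my own
invention, not anything in FLIM): stop at the first non-alphabet
character, even mid-line, decode whatever whole 4-character groups came
before it, and return a warning instead of raising an error:

```python
import base64
import binascii
import re

# Anything that is not base64 alphabet, "=" padding, or a line ending.
NON_ALPHABET = re.compile(r"[^A-Za-z0-9+/=\r\n]")


def decode_leniently(body, content_type):
    """Return (decoded bytes, undecoded remainder, warning or None).

    Hypothetical policy: never raise; text entities get the decoded
    prefix plus the raw remainder, other types get only a warning.
    """
    m = NON_ALPHABET.search(body)
    cut = m.start() if m else len(body)
    encoded = "".join(body[:cut].split())
    rest = body[cut:]
    # Only whole 4-character groups can be decoded.
    usable = len(encoded) - len(encoded) % 4
    try:
        decoded = base64.b64decode(encoded[:usable], validate=True)
    except (binascii.Error, ValueError):
        return b"", body, "invalid base64 content"
    if not rest.strip() and usable == len(encoded):
        return decoded, "", None          # clean entity, nothing left over
    if content_type.startswith("text/"):
        return decoded, rest, None        # show decoded text, then the rest
    return b"", rest, "invalid base64 in non-text entity"


mangled = "SGVsbG8sIHdvcmxkIQo=\n--\nfooter\n"
as_text = decode_leniently(mangled, "text/plain")
as_pdf = decode_leniently(mangled, "application/pdf")
```

With the text/plain type the caller gets the decoded message plus the
footer as-is; with application/pdf it gets no data and a warning string
it can turn into that big red stop sign.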

-- 
						Greg A. Woods
						Planix, Inc.

<woods@planix.com>       +1 250 762-7675        http://www.planix.com/