This Korean spam

AMAKAWA Shuhei sa264 at
Thu Feb 7 22:08:11 GMT 2002

At Thu, 07 Feb 2002 12:26:06 +0200,
gabi veneat wrote:
> Probably you can add all the non-latin char sets.
> I don't think any of the users of the _UK_ freebsd list will feel like reading all the stuff in
> kanji, vietnamese, cyrillic, hebrew ... or any other encodings, even if there are readers who
> natively use these. Anyway, it's a uk users group after all.
> Gabi
> Paul Civati wrote:
> > Without wishing to start a massive thread on the spam problems (I am
> > an advocate of the delete key but this korean crap is starting to get
> > annoying now because exmh doesn't handle it at all well)..
> > 
> > This looks like a good header to filter on..
> > 
> >   Content-type: text/html; charset="ks_c_5601-1987"
> > 
> > Fair game I think, as this is an english speaking list.
> > 
> > Also, most if not all of it, seems to originate from,
> > which is Korea Telecom, wonder if it's worth complaining or not..
> > 
> > -Paul-

It's not a brilliant idea to filter based on the charset.
It assumes that:

(i) The specified charset is correct.
(ii) The specified charset is the lowest upper bound of the charset
that is actually used, i.e. the smallest charset that still covers
every character in the message.

(i) is OK(ish).
But (ii) is a bit more dodgy.

Since us-ascii is a subset of many other charsets, including
"exotic" ones, it is valid to send an email written completely in
us-ascii characters with the header "charset=foo-bar", if foo-bar
includes us-ascii as a subset.

Doing this is certainly not good practice, and the actual
lowest-upper-bound charset should be given in the header instead.
But many MUAs generate the Content-type: line according to the user
setting, not according to the actual content of the message.  So
it's quite a common bad practice.
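
A filter can avoid relying on assumption (ii) by checking the body
as well as the header. The following is only a rough sketch using
Python's standard email package; the function name
declares_and_uses_charset and its default argument are made up for
illustration, and it treats "contains a byte above 0x7F after
transfer decoding" as a stand-in for "is not pure us-ascii".

```python
from email import message_from_bytes


def declares_and_uses_charset(raw: bytes,
                              suspect_charset: str = "ks_c_5601-1987") -> bool:
    """Hypothetical helper: True only if some part declares
    suspect_charset AND its decoded body contains non-ASCII bytes."""
    msg = message_from_bytes(raw)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no body of their own
        # get_content_charset() returns the charset parameter lower-cased,
        # or None when the header does not give one.
        declared = part.get_content_charset() or "us-ascii"
        if declared != suspect_charset.lower():
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        body = part.get_payload(decode=True) or b""
        if any(byte > 0x7F for byte in body):
            return True
    return False
```

With this check, a message written entirely in us-ascii but labelled
charset="ks_c_5601-1987" by the sender's MUA is not matched, while a
message that really contains bytes outside us-ascii under that label
is. It says nothing about whether assumption (i) holds.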


