Re: Stripping HTML and/or MIME-encoded messages?

From: Ramsay D. Seielstad (mrhuey@wizvax.wizvax.net)
Date: Sat, 18 Dec 1999 01:23:56 -0500

-----BEGIN PGP SIGNED MESSAGE-----

Howdy fellow yarn-sters, I've half-heartedly been following the
thread and am pretty amazed at the lack of suggestions and of
awareness of the capabilities of yarn & souper. I would have to
disagree that there is a need for Chin to release the source code
just for you to adequately filter out unwanted e-mail and
newsgroup posts.

The first option I'd suggest is to try using YARN's internal
scoring options. 'Scoring' will be the least destructive method,
allowing you to refine your search until you are satisfied the
search parameters are catching only the news and mail you don't
want to see. Essentially, scoring will simply hide the articles
as if you have already read them.

The second option is 'filter.exe', a middle-ground option that
lets you either move the selected material to a specific
newsgroup or delete it outright. If you delete the material
here, it never makes it into the inbox or news-base, but it
should still be in the downloaded article file if you do need
to get at it.

The third option is a 'killfile' for souper, or vsoup, and this
IS THE MOST DESTRUCTIVE OPTION!!! This option applies your
filtering expressions as you download news and mail, so you
don't spend bandwidth on material you're not going to read. And
I do mean *DELETE*: in the case of e-mail, if someone sends just
the right phrase, you'll never see it and never be able to
retrieve it.

One of the really neat options is that you can use a combination
of all three. Scoring can be used globally for all newsgroups, as
well as individually in each specific newsgroup; filtering can
move questionable items to a pseudo-newsgroup you have created
for them; and a killfile takes care of those you never want to
hear from at all.
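
To make the combination a bit more concrete, here is a rough
sketch in Python. This is NOT the syntax of yarn's score files,
filter.exe, or the souper killfile; the rule table, the patterns,
the address and the action names ('kill', 'junk', 'hide', 'keep')
are all made up for illustration. It only shows the idea of
checking the most destructive rules first and falling through to
the gentler ones.

```python
import re

# Hypothetical rules, ordered from most to least destructive.
# None of these names or patterns come from yarn, filter.exe or souper.
RULES = [
    # 'kill': never downloaded at all (killfile stage)
    ("kill", re.compile(r"^From:.*spammer@example\.invalid", re.I | re.M)),
    # 'junk': moved to a pseudo-newsgroup for later review (filter stage)
    ("junk", re.compile(r"^Content-Type:[ \t]*text/html", re.I | re.M)),
    # 'hide': kept, but treated as already read (scoring stage)
    ("hide", re.compile(r"^Subject:.*\bmake money fast\b", re.I | re.M)),
]

def disposition(headers):
    """Return the action of the first rule that matches, else 'keep'."""
    for action, pattern in RULES:
        if pattern.search(headers):
            return action
    return "keep"
```

A message that matches nothing comes back as 'keep', so a rule
that is too narrow fails on the safe side.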

The drawback is that it helps to understand UNIX regular
expression matching and it takes time to learn this. You can
start simply and increase the complexity of the regular
expression as you learn. I'm still learning it after 15 or
so years, but I can set up some pretty interesting and
complex expressions to do the filtering with.
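
As an illustration of starting simply and then refining, here is
a small Python sketch for spotting HTML-ified messages. The two
patterns and the list of tag names are my own assumptions, not
anything yarn or souper ships with, and Python's regular
expression dialect is not identical to the one those tools use.

```python
import re

# First attempt: only catches a literal "<html>" tag.
SIMPLE = re.compile(r"<html>", re.IGNORECASE)

# Refined attempt (assumed tag list): also catches a text/html
# Content-Type header and common tags that carry attributes.
REFINED = re.compile(
    r"^Content-Type:[ \t]*text/html\b"
    r"|<(?:html|body|font|br|p|div)\b[^>]*>",
    re.IGNORECASE | re.MULTILINE,
)

def looks_like_html(message):
    """True if the refined pattern finds HTML markers in the message."""
    return REFINED.search(message) is not None
```

The refined pattern still leaves plain text such as
"if a < b and b > c" alone, which is the sort of false hit worth
testing for before trusting an expression with deletions.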

So, as an example, my global score file contains a regular
expression that matches postings crossposted to 12 or more
groups. My individual newsgroup score files contain Subjects,
Authors or specific items I'm not interested in seeing, but
might still need to look at.
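
For the curious, an expression for "12 or more groups" can be
sketched like this in Python. It is not copied from my score
file and is not in yarn's score-file syntax; the pattern and the
function name are assumptions for illustration. The idea is that
12 group names in a Newsgroups header means one name followed by
at least 11 more, each preceded by a comma.

```python
import re

# One group name, then 11 or more ",name" repetitions = 12+ groups.
CROSSPOST = re.compile(
    r"^Newsgroups:[ \t]*[^,\s]+(?:[ \t]*,[ \t]*[^,\s]+){11,}",
    re.IGNORECASE | re.MULTILINE,
)

def is_heavily_crossposted(headers):
    """True if the Newsgroups header names 12 or more groups."""
    return CROSSPOST.search(headers) is not None
```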

My filter expressions have pretty much passed muster; they send
the selected material to the 'junk' newsgroup for a couple of
months until I am positive I can safely move the expressions to
the souper killfile. Then there is the souper killfile: what
I don't see, I don't miss.

If you are truly interested in never seeing a news posting or
e-mail containing either HTML or MIME-encoded content, the most
final option is to simply give up e-mail and newsgroups!!! :-o

In article <PQCU4oQ0mA/F092yn@scn.org>, you wrote:

- ->In article <199912092004.MAA24762@shell1.ncal.verio.com>,
- ->Howard Schwartz <theo@ncal.verio.com> wrote:
- ->>
- ->>It is probably not wise to automatically delete ALL MIME-encoded messages
- ->>since some mailers introduce plain text as MIME-encoded, and you might
- ->>want a friend on occasion to send you a picture, or a binary program
- ->>in the mail.
- ->
- -> The people who e-mail me know that I don't accept binaries in
- ->e-mail ever. I just delete them outright and don't respond to the
- ->message(s).
- ->
- ->>similar program automatically skip the DISPLAY on the screen of a
- ->>particular type of MIME message:
- ->
- -> Actually, the MIME messages don't bother me all that much. It's
- ->really those HTML-ified messages that drive me nutty.
- ->
- ->>If you want html or other parts of messages actually deleted from your
- ->
- -> I don't want the html parts of messages deleted. I want the
- ->WHOLE message deleted if it has *any* HTML tags in it. There isn't any
- ->reason for that HTML stuff in messages and I'd prefer not to have to
- ->bother with them at all.
- ->
- -> I suspect I'm going to have to roll my own solution until (or
- ->unless) Chin decides to release the source to Yarn. But thanks for your
- ->replies. I appreciate them.
- ->
- ->

- --

+-----------------------------------------------------------------------------+
| Ramsay D. Seielstad | mrhuey@wizvax.net; Af029@Detroit.Freenet.Org |
| Schenectady, NY | ~~~~~~~~~~~~~~~~~ |
+-=-=-=-=-=-=-=-=-=-=-=+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| "No fancy terminators or trailers, No opinion, Just an average, everyday |
| guy with a bunch of unrelated hobbyist activities that have no significant |
| use or value other than to amuse myself and occupy me free time ... and |
| trust me, these ain't MY employer's opinions or views" |
+-----------------------------------------------------------------------------+
| To obtain my PGP Public Key: finger mrhuey@wizvax.net |
+-----------------------------------------------------------------------------+

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQCVAwUBOFs6hBh6cbDiY22VAQGFBQQAnuFlXuEDb7ah4naHpaCex/+qSdx04bLr
RvaCdipjid2lmD1d09MpQJNCGrTKPIToqc/gLaYYyJSopAhicT9jUPtath5bLnAD
90Yqg4KeyQHlA47/NiOvQMHRUzl11POP8DKZk4LVZeccS3posK2FGWKVy0o81YjL
H0h85Ap6hDs=
=sIuL
-----END PGP SIGNATURE-----