How to undigest yarn

From: Howard Schwartz (theo@ccnet.com)
Date: Wed, 09 Sep 1998 21:58:31 -0400

A while back a list member asked how to undigest posts from
the yarn-digest using Bob Rush's program, ddigest. I find that
the following ddigest configuration file will separate the yarn
digest's posts:

FindHeader=To:
FindDigest="Yarn Support List"
FirstArticleBreak=^$.*YARN-LIST__digest_.*$$
ArticleBreak=^$.*YARN-LIST__digest_.*$$
NewsGroup=yarn-list

In general, the documentation for ddigest is itself difficult to digest,
and it leaves out some important details. Moreover, Bob Rush's ftp site, and
Bob Rush himself, seem to have vanished from the e-waves. From what
I can tell, a config file can contain the following fields:

Field                 Value for this field
-----                 --------------------
FindHeader            Simple string, NOT a regular expression

FindDigest            Regular expression
FirstArticleBreak     Regular expression
ArticleBreak          Regular expression
ArticleHeader         Regular expression

MustUseArticleHeader  Yes/No

There may be other fields, but the documentation does not mention them. The
FindHeader field appears to give the beginning words of the line, in the
(single) header of the entire digest, that contains the ``FindDigest'' string.
Thus, above, I tell ddigest to look at lines starting with ``To:''
and see if one of those lines has the string ``"Yarn Support List"''.
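
For anyone who wants to see that matching rule spelled out, here is a
minimal Python sketch of how I understand it. This is NOT ddigest's own
code. The function name, the case-sensitive comparison, and the sample
address are all assumptions made for illustration.

```python
import re

def is_target_digest(header_lines, find_header="To:",
                     find_digest='"Yarn Support List"'):
    """Return True if a header line that starts with find_header
    also matches the find_digest regular expression."""
    pattern = re.compile(find_digest)
    for line in header_lines:
        # FindHeader is treated as a plain prefix, not as a regex.
        # Assumption: the comparison is case-sensitive.
        if line.startswith(find_header) and pattern.search(line):
            return True
    return False

# Made-up digest header, for illustration only.
sample_headers = [
    "From: someone@example.com",
    'To: "Yarn Support List" <yarn-list@example.com>',
]
print(is_target_digest(sample_headers))
```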

The FirstArticleBreak and ArticleBreak fields should usually be set to
(regular expression) values that span three lines -- usually a blank line,
then a line with some characters in it, then another blank line. The
documentation tells you how to do this. The reason is that ddigest removes
these lines from the digest when it finds them, and thus you get rid
of unnecessary blank lines.
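
The effect of such a break pattern can be sketched in Python. The
regular expression below is my own Python rendering of "blank line,
marker line, blank line" and is not ddigest's regex syntax. The marker
text in the sample digest is made up, and split_digest is a
hypothetical helper name.

```python
import re

# Assumed Python equivalent of a three-line article break:
# a blank line, a line containing the marker, another blank line.
ARTICLE_BREAK = re.compile(
    r"^\n[^\n]*YARN-LIST__digest_[^\n]*\n\n", re.MULTILINE
)

def split_digest(body):
    """Split a digest body into articles, dropping the break lines."""
    parts = ARTICLE_BREAK.split(body)
    # Discard empty pieces, e.g. when the body starts with a break.
    return [part for part in parts if part.strip()]

# Made-up digest body, for illustration only.
sample_body = (
    "first post\n"
    "\n"
    "---- YARN-LIST__digest_ ----\n"
    "\n"
    "second post\n"
)
for article in split_digest(sample_body):
    print(repr(article))
```

Because the whole three-line match is consumed by the split, neither
the marker line nor its surrounding blank lines end up in any article.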

The ArticleHeader field specifies some beginning characters of EVERY header
line in every message. The usual use for this is to get rid of the
beginning characters (e.g., ``>'').
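
As a rough illustration of that prefix stripping, here is a short
Python sketch. It is only my guess at the behaviour, not ddigest's
code: the helper name, the default pattern, and the rule that the
header block ends at the first line without the prefix are all
assumptions.

```python
import re

def strip_header_prefix(article, article_header=r"^> ?"):
    """Remove the ArticleHeader prefix from the leading header lines
    of one article, leaving the message body untouched."""
    prefix = re.compile(article_header)
    result = []
    in_headers = True
    for line in article.split("\n"):
        if in_headers:
            if prefix.match(line):
                line = prefix.sub("", line, count=1)
            else:
                # Assumption: the first line lacking the prefix
                # marks the end of the header block.
                in_headers = False
        result.append(line)
    return "\n".join(result)

# Made-up article, for illustration only.
sample_article = "> From: a@example.com\n> Subject: test\n\nbody > text"
print(strip_header_prefix(sample_article))
```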

I find the program slightly buggy, in that it will sometimes fail to
separate messages, or separate them incorrectly, depending on mistakes
and variations in the SOUP or Yarn folder formatting, which ddigest
should not care about but does.

Hope all this helps someone.