Re: Points about yarn

From: Howard Schwartz (theo@ncal.verio.com)
Date: Tue, 2 Nov 1999 02:00:01 -0800 (PST)

Lulu wrote:
> One thing one could do to automate things further is to use a "custom
> editor" that includes preprocessing. I do this along with Yep, but you
> don't need to.

You don't need a custom editor to preprocess. Any version of the vi
editor (well -- except the really tiny ones) will execute ``modeline''
commands at the beginning of a file, and these can preprocess the rest
of the text in the file. Also, many of the vi clones are designed
specifically to do preprocessing. The clone elvis (freeware), for
example, provides setup files that tell elvis to execute commands:
a) before it starts, b) before it reads in a file, c) right after it
reads in a file, d) right before it writes a file to disk, and e) right
after it writes a file to disk. All of this is automatic. This is a bit
excessive, but many ordinary editors now have preprocessing capability.

> My "editor" is a batch file, called YARNEDIT.CMD. In
> this, I do a little preprocessing:
>
> ::-- Next, clean commas from To:
> sed "s/\(^To:[ ]*([^,]*\),\([^,]*)\)/\1 -\2/" %1.1 > %1.2
>
>
> In terms of the second 'sed' -- the purpose was to deal with a class of
> probably RFC822 invalid addresses I was receiving. I was getting
> messages from a sender like:
>
> From: (John Doe, Acme Enterprises) jdoe@acme.com
>
> My sed command eliminates the commas from fullnames surrounded by
> parentheses.

No, it eliminates only a single comma, and only when there is just one
between the parentheses. Addresses like:

(Bob Doe, Cliff Doe, John Doe) people@isp.com

will still cause you problems.
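
A quick way to see this, as a sketch: the function below wraps the quoted
substitution unchanged, except that it reads standard input rather than
the batch file's %1.1. The name quoted_to_cleanup and the sample lines are
made up for illustration, and it assumes a sed (GNU sed, for one) that
treats ``^'' right after ``\('' as an anchor.

```shell
#!/bin/sh
# Hypothetical wrapper around the quoted substitution (stdin to stdout).
quoted_to_cleanup() {
    sed 's/\(^To:[ ]*([^,]*\),\([^,]*)\)/\1 -\2/'
}

# One comma inside the parentheses: the pattern matches and the comma
# is replaced by " -".
printf '%s\n' 'To: (John Doe, Acme Enterprises) jdoe@acme.com' | quoted_to_cleanup

# Two commas: [^,]* cannot cross the second comma to reach ")", so the
# pattern does not match and the line passes through unchanged.
printf '%s\n' 'To: (Bob Doe, Cliff Doe, John Doe) people@isp.com' | quoted_to_cleanup
```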

> Of course, it is imperfect, since it will only fix the
> first such problem address on the To: line, not subsequent ones.
>
> If anyone has a better regex, I'd be curious.

There are lots of bad addresses out there, and some of the GNKSA software
may already try to identify and correct address errors of many types.
Perl or awk may be better preprocessors for your jobs, since they let
you isolate the text between parentheses (or brackets) and edit just
that, instead of entire lines.
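
As a rough sketch of what isolating the parenthesized text looks like in
awk: the function below pulls out each ( ... ) group in turn, deletes the
commas inside it, and leaves the rest of the line alone. The name
awk_strip_paren_commas is made up, limiting it to To:, Cc:, and Bcc:
lines is an assumption, and it expects a POSIX awk with match() and gsub().

```shell
#!/bin/sh
# Hypothetical helper: strip commas inside "( ... )" on address headers.
awk_strip_paren_commas() {
    awk '
    /^(To|Cc|Bcc):/ {
        out = ""
        rest = $0
        # Take the next "( ... )" group from what is left of the line.
        while (match(rest, /\([^()]*\)/)) {
            group = substr(rest, RSTART, RLENGTH)
            gsub(/,/, "", group)    # edit only the text inside the parentheses
            out = out substr(rest, 1, RSTART - 1) group
            rest = substr(rest, RSTART + RLENGTH)
        }
        $0 = out rest
    }
    { print }
    '
}
```

Commas outside the parentheses, such as the ones separating two
addresses, are never handed to gsub() and so are kept.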

I can suggest sed commands that may work better for you.
However, we had better take this kind of dialogue off-list, since it is
probably of little interest to general yarn users. (Other users: you were
warned!) I am fairly good at regex commands, so feel free to mail me
privately if I can help with additional commands, editors, software, etc.

There are many versions of sed with different levels of ability. But,
assuming a sed with a minimal command set, consider this script:

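# Send only lines that begin with To:, Cc:, or Bcc: and have a ``(''
# in them to the commands after the line ``: doit''. Comments sit on
# their own lines because some versions of sed read everything after
# ``b'' or ``t'' up to the end of the line as the label.
/^[TtBbCc][oCc][Cc]*: .*(/b doit
b
: doit
: loop
# Look for a pair of parentheses that still has a comma inside it and
# remove the first comma you find, keeping the text on both sides.
s/\(([^(),]*\),\([^()]*)\)/\1\2/
# If the substitution succeeded, loop back and do the search/replace
# command again, until no pair of parentheses contains a comma.
t loop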

The above sed commands process the header lines that normally contain
internet mail addresses (To:, Cc:, and Bcc:). They remove every comma
that occurs between any and every set of parentheses on any of these
lines. The commands can go in a script file or on a single sed command
line.
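
For the single-command-line form, a sketch along these lines may help.
Each -e carries one command, so no comment shares a line with a label.
The substitution is written to keep both the text after the comma and the
closing parenthesis. The function name strip_paren_commas and the sample
address are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: remove commas inside "( ... )" on To:, Cc:, Bcc: lines.
strip_paren_commas() {
    # 1st -e: header lines containing "(" jump to the loop.
    # 2nd -e: every other line branches to the end and is printed as-is.
    # 4th -e: drop the first comma found inside a pair of parentheses.
    # 5th -e: repeat for as long as the substitution keeps succeeding.
    sed -e '/^[TtBbCc][oCc][Cc]*: .*(/b loop' \
        -e 'b' \
        -e ':loop' \
        -e 's/\(([^(),]*\),\([^()]*)\)/\1\2/' \
        -e 't loop'
}

printf '%s\n' 'To: (Bob Doe, Cliff Doe, John Doe) people@isp.com' | strip_paren_commas
```

It reads standard input and writes standard output, so it can sit in a
pipeline or take redirected files.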

Enjoy!