Re: find and replace strange characters

From: Paul Jarc (prj_at_po.cwru.edu)
Date: 09/28/04


Date: Mon, 27 Sep 2004 21:35:10 -0400

Rick <inkswamp@hotmail.com> wrote:
> I'd love to write a shell script to run through each night's web
> pages and replace those characters with the proper HTML code for
> bullets (&#8226;) but I'm not sure how to isolate what Unix is
> seeing when it "sees" these characters.

Open one of those files in your text editor and remove everything but
one of the garbage characters. Then add this around it:
s/garbage/\&#8226;/g
Make a similar line for each of the other kinds of garbage
characters. (If any of them comprise multiple bytes, and one of them
is "/", then you can use a different character for the three
separators within the line.) Save this to a new file fixhtml.sed, and
then run:
$ sed -f fixhtml.sed < quarkxpressfile > file.html
to create the HTML file.
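
For illustration, here is a minimal sketch of the whole procedure on a
made-up input. It assumes the garbage characters are the single byte 0xA5
(octal 245) and the two-byte sequence 0xAA followed by "/". Both are
stand-ins, so check your own files with od to see what is really there.
The names sample.txt and sample.html are hypothetical too. LC_ALL=C is
there so that sed treats the data as plain bytes whatever your locale is.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)"

# Hypothetical input: the bytes 0xA5 and 0xAA stand in for the garbage.
printf 'item \245 one\nitem \252/ two\n' > sample.txt

# Dump the raw bytes to see what the garbage characters really are.
od -c sample.txt

# Build fixhtml.sed. The "&" is escaped so that sed takes it literally,
# and the second rule uses "|" as separator because its pattern has a "/".
printf 's/\245/\\&#8226;/g\ns|\252/|\\&#8226;|g\n' > fixhtml.sed

# Apply the script, treating the input as plain bytes.
LC_ALL=C sed -f fixhtml.sed < sample.txt > sample.html
cat sample.html
```

Writing the script with printf avoids having to paste raw bytes by hand.
Cutting the character out in an editor, as described above, should work
just as well.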

paul