Re: Removing non-text/whitespace chars from a text file: How ?
From: Ed Morton (morton_at_lsupcaemnt.com)
Date: 03/19/05
Date: Sat, 19 Mar 2005 07:45:07 -0600
Al Dykes wrote:
> When I highlight and copy text from a web browser and paste it into a
> text file I frequently get extended ascii non-text bytes that I'd like
> to strip out. I like to remove anything above octal 126. What's the
> right tool for this ?
I don't know if there's a better tool for it and I don't know if this
exactly matches your request to "remove anything above octal 126", but a
POSIX sed will let you strip out any control (non-printable) characters:
    sed 's/[^[:print:]]//g' file > tmp
    mv tmp file
or with GNU sed:
    sed -i 's/[^[:print:]]//g' file
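A minimal sketch of how this behaves on a small test file, in case it helps. The file names (sample.txt, sed_out.txt, tr_out.txt) are made up for illustration. Two assumptions are baked in: that the "126" in the question is decimal (the tilde, which is octal 176), and that forcing the C locale is acceptable, so that every byte counts as one character and [:print:] means plain ASCII space through tilde. The tr command is an alternative not mentioned above; it is shown because [:print:] does not include the tab character, so the sed version strips tabs as well.

```shell
# Hypothetical sample: "caf", one Latin-1 byte (octal 351), a tab, then "ok".
printf 'caf\351\tok\n' > sample.txt

# sed in the C locale: removes every byte outside space..tilde,
# which includes the high byte and the tab.
LC_ALL=C sed 's/[^[:print:]]//g' sample.txt > sed_out.txt

# tr alternative: delete everything except tab (011), newline (012),
# and the range space..tilde (040-176), so tabs survive.
LC_ALL=C tr -cd '\011\012\040-\176' < sample.txt > tr_out.txt
```

On this sample the sed output should contain "cafok" and the tr output "caf", a tab, then "ok". Which one is preferable depends on whether tabs count as text for your purposes.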
You can learn more about POSIX character classes like [:print:] at
http://www.gnu.org/software/gawk/manual/html_node/Character-Lists.html
or just google for it.
Regards,
Ed.