Re: Read strings from one file and search for them in a directory containing htm files
From: Ed Morton (morton_at_lsupcaemnt.com)
Date: 11/28/05
- Next message: luke: "Env variable whit asterisk"
- Previous message: Chris F.A. Johnson: "Re: Unix shell script for folders and files moving"
- In reply to: Meghavvarnam: "Re: Read strings from one file and search for them in a directory containing htm files"
- Next in thread: Meghavvarnam: "Re: Read strings from one file and search for them in a directory containing htm files"
- Reply: Meghavvarnam: "Re: Read strings from one file and search for them in a directory containing htm files"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Mon, 28 Nov 2005 08:53:39 -0600
Meghavvarnam wrote:
> Ed Morton wrote:
<snip>
>>gawk 'NR==FNR{strings[$0]++;next}
>> { for (string in strings)
>> if (index($0,">"string"<")) {
>> usedStrings[string]++
>> delete strings[string] # for efficiency
>> }
>> }
>> END { for (string in usedStrings)
>> print string
>> }' allStrings.txt directory/*.htm > usedStrings.txt
<snip>
> This is the script that I tried -
>
> # listused
> # lists strings that are used in all .htm files
>
> gawk 'NR==FNR{strings[$0]++;next} {
> for (string in strings) #}
> print string
> if (index($0,">"string"<") || index($0,"\""string"\"")
> || index($0,">"string"\n")) {
> usedStrings[string]++
> delete strings[string] # for efficiency
> }
Note that the above is now:
    for (string in strings)
        print string
    if (index...) {
    }
By adding "print string" between the "for.." and the "if..", you've
taken the "if..." outside of the loop. Add braces {...} around the loop
body to make what you want explicit.
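To see the effect in isolation, here is a minimal sketch. The array a, the loop variable k and the counters are made up for illustration, and plain awk is used (gawk should behave the same way here):

```shell
# Without braces, only the first statement after "for" belongs to the loop.
unbraced=$(awk 'BEGIN {
    a["x"] = 1; a["y"] = 1; a["z"] = 1
    for (k in a)
        seen++          # loop body: runs once per element
    after++             # NOT in the loop: runs once, after it ends
    print seen, after
}')

# With braces, both statements belong to the loop.
braced=$(awk 'BEGIN {
    a["x"] = 1; a["y"] = 1; a["z"] = 1
    for (k in a) {
        seen++
        inside++        # now runs once per element as well
    }
    print seen, inside
}')

echo "without braces: $unbraced"
echo "with braces:    $braced"
```

The indentation in the first program is only cosmetic; awk decides what is in the loop by the braces, not by the layout.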
> }
> END {
> for (string in usedStrings)
> print string
> }' allStrings.txt htm/*.htm > usedStringsfile
>
> Please let me know, if there is any mistake in this.
Yes, there is. You now only have "print string" in the "for" loop. The
"if ..." is outside of it.
> I gave execute
> permission for the file that contained this script and ran it from the
> shell.
>
> usedStringsfile was empty at the end of it.
>
> Any pointers will be of great help.
>
>
>>If you'd like the awk script to tell you which strings are/aren't used,
>>that's trivial, e.g.:
>>
>>gawk 'NR==FNR{strings[$0]++;next}
>> { for (string in strings)
>> if (index($0,">"string"<")) {
>> usedStrings[string]++
>> delete strings[string] # for efficiency
>> }
>> }
>> END {
>> print "Used Strings:"
>> for (string in usedStrings)
>> printf "\t%s\n",string
>> print "Unused Strings:"
>> for (string in strings)
>> printf "\t%s\n",string
>> }' allStrings.txt directory/*.htm
>>
>
> I modified the script above to remove all parse errors.
What parse errors? There may be some since it's untested, but I don't
see any.
> Here is the
> script that I used to try out -
>
> gawk ' NR==FNR{strings[$0]++;next}
> { for (string1 in strings)
> string = sprintf("<%s>", string1)
Here again you've added a line and so taken the subsequent block (the
"if...") out of the loop.
> if (index($0,">"string"<")) {
> usedStrings[string]++
> delete strings[string] # for efficiency
> }
> }
> END {
> print "Used Strings:"
> for (string in usedStrings)
> printf "\t%s\n", string
> print "Unused Strings:"
> for (string in strings)
> printf "\t%s\n", string
> }' allStrings.txt htm/*.htm
>
> I see the same behaviour with this as with the earlier script.
By that do you mean that "usedStringsfile" is empty? Well, yes, it would
be since nowhere above do you direct any output to it, but additionally
you've broken the loop again.
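Both points (braces around the loop body, and an explicit redirect into usedStringsfile) can be tried on throwaway data. This is only a sketch: the sample strings and file contents are invented, the two index() tests are the ones from the earlier attempt, and plain awk stands in for gawk:

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
mkdir "$workdir/htm"
printf '%s\n' IDS_OK IDS_CANCEL IDS_HELP > "$workdir/allStrings.txt"
printf '%s\n' '<td>IDS_OK</td>' > "$workdir/htm/a.htm"
printf '%s\n' '<input value="IDS_CANCEL">' > "$workdir/htm/b.htm"

awk 'NR==FNR { strings[$0]++; next }
{
    for (string in strings) {          # braces keep the "if" inside the loop
        if (index($0, ">" string "<") || index($0, "\"" string "\"")) {
            usedStrings[string]++
            delete strings[string]     # for efficiency
        }
    }
}
END {
    print "Used Strings:"
    for (string in usedStrings)
        printf "\t%s\n", string
    print "Unused Strings:"
    for (string in strings)
        printf "\t%s\n", string
}' "$workdir/allStrings.txt" "$workdir"/htm/*.htm > "$workdir/usedStringsfile"

cat "$workdir/usedStringsfile"
```

With this data IDS_OK and IDS_CANCEL should be reported as used and IDS_HELP as unused; the order within each group is whatever "for (string in ...)" happens to produce.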
> Would we
> need a different approach for this thing at all ??
No.
> What does the line - NR==FNR{strings[$0]++;next} do.
See Janis' response.
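In short, NR counts records across all input files while FNR restarts at 1 for each file, so NR==FNR is true only while the first file is being read. A small sketch, with made-up file names and contents, and plain awk in place of gawk:

```shell
# Hypothetical input files in a scratch directory.
demo=$(mktemp -d)
printf '%s\n' alpha beta > "$demo/first.txt"
printf '%s\n' beta gamma alpha > "$demo/second.txt"

matches=$(awk '
    NR == FNR { strings[$0]++; next }        # first file: remember line, skip the rest
    ($0 in strings) { print "listed:", $0 }  # later files: test against the table
' "$demo/first.txt" "$demo/second.txt")

echo "$matches"
```

The "next" is what stops lines of the first file from falling through to the second rule. One caveat: if the first file is empty, NR==FNR stays true for the second file too.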
> Thank you in advance so much for your help.
You're welcome,
Ed.