Re: sometimes awk works and sometimes /usr/xpg4/bin/awk works ..
From: Dennis Clarke (dclarke_at_blastwave.org)
Date: 08/12/03
- Next message: Dave Uhring: "Re: partition sizes"
- Previous message: perl user: "partition sizes"
- In reply to: Richard L. Hamilton: "Re: sometimes awk works and sometimes /usr/xpg4/bin/awk works .."
- Next in thread: Richard L. Hamilton: "Re: sometimes awk works and sometimes /usr/xpg4/bin/awk works .."
- Reply: Richard L. Hamilton: "Re: sometimes awk works and sometimes /usr/xpg4/bin/awk works .."
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Mon, 11 Aug 2003 18:24:35 -0700
On Mon, 11 Aug 2003, Richard L. Hamilton wrote:
>In article <Pine.GSO.4.53.0308111201590.3426@blastwave>,
> Dennis Clarke <dclarke@blastwave.org> writes:
>>
>>
>> just a rant ..
>>
>> Sometimes it seems like I should just link /usr/bin/awk to /usr/xpg4/bin/awk
>>
>> $ grep "1000A-10-ENC" foo.dat | awk 'BEGIN{FS=";"}{print $1 "\t" $4 }'
>> awk: record `1000A-10-ENC; ;2990....' has too many fields
>> record number 1
>>
>>
>> $ grep "1000A-10-ENC" foo.dat | /usr/xpg4/bin/awk 'BEGIN{FS=";"}{print $1 "\t" $4 }'
>> 1000A-10-ENC 4
>
>Here's the results of some brute force testing of a line like
>
> perl -e 'print "x " x '"${f}"';' | ${awk} '{x=$'"${f}"'}'
>
>for values of ${f} starting with 1 and different flavors of awk for ${awk}:
>
>
>awk: trying to access field 100
>oawk failed with 100 fields
Yep, I get the same problem. I extracted data from a database and there are
768 fields to a record. Kaboom.
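For the simple "print field 1 and field 4" case, a record that wide can be sidestepped without awk at all. What follows is only a minimal sketch, assuming the fields themselves never contain a ';' and that the local cut and tr accept the options shown. The function name first_and_fourth is made up for illustration, and I have not checked whether any particular cut has a line-length limit of its own.

```shell
#!/bin/sh
# Hypothetical helper: print fields 1 and 4 of ';'-separated records,
# joined by a tab, without going through awk's field splitting.
first_and_fourth() {
    # cut keeps a single ';' between the two selected fields;
    # tr then turns that remaining ';' into a tab.
    cut -d';' -f1,4 | tr ';' '\t'
}
```

Usage would look like: grep "1000A-10-ENC" foo.dat | first_and_fourth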
>
>nawk: trying to access field 500
> source line number 1
> context is
> >>> {x=$500 <<< }
>nawk failed with 500 fields
I did not try anything other than /usr/xpg4/bin/awk, so I wouldn't know.
You seem to have nailed down the situation quite neatly though.
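The brute-force test quoted above can be sketched as a small script. This is a sketch under stated assumptions, not the script that was actually run: it assumes a POSIX sh (for the $(( )) arithmetic), it swaps in yes, head and tr for the perl generator, and the names probe_awk and find_limit are hypothetical. It doubles the field count each round, so it only brackets the limit rather than pinning it down exactly.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the awk named in $1 can read field $2
# of a record that has exactly $2 space-separated fields.
probe_awk() {
    awk_bin=$1
    nfields=$2
    # Build one record "x x x ... x" with nfields fields
    # (stands in for the perl one-liner quoted above).
    { yes x | head -n "$nfields" | tr '\n' ' '; echo; } |
        "$awk_bin" '{ x = $'"$nfields"' } END { exit ((x == "x") ? 0 : 1) }' 2>/dev/null
}

# Hypothetical helper: double the field count until the awk in $1 fails
# or the cap in $2 is passed, then print the last count that worked.
find_limit() {
    awk_bin=$1
    cap=$2
    n=1
    last_ok=0
    while [ "$n" -le "$cap" ]; do
        if probe_awk "$awk_bin" "$n"; then
            last_ok=$n
        else
            break
        fi
        n=$((n * 2))
    done
    echo "$last_ok"
}
```

Usage would look like: find_limit /usr/xpg4/bin/awk 8192
A result equal to the largest power of two under the cap means no limit was hit within that range.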
>
>/usr/xpg4/bin/awk: line 0 (NR=1): Too many fields (LIMIT: 4000)
>/usr/xpg4/bin/awk failed with 4001 fields
A limit of 4000 fields is enough for most demanding situations.
>
>/usr/xpg4/bin/awk does fairly well compared to the first two. But with
>gawk, I knew it would be hopeless to simply increment, so I went to doubling.
>It handled 2097152 fields just fine, but took long enough on the next doubling
>(4194304) that I didn't feel like waiting and killed it.
Okay, so it is reasonable to say that gawk would work even in those
situations where the record size is completely ridiculous. I would guess that
it would work given enough RAM. If you like, I can test it on a V880 with 8 GB
of RAM, just to see if it breaks at some reasonable (or unreasonable) point.
> Probably VM was
>the limitation; on a larger system with more than 1 1/8 GB RAM, it would've
>probably kept going somewhat longer (with enough RAM, nominally until it hit the
>limits of 32 bit (signed or even unsigned) numbers, but more realistically
>until it maxed out a 32-bit address space; I haven't needed to explore the
>possibility of building a 64-bit gawk executable).
I'll check with blastwave.org to see what's up with a 64-bit build of gawk.
I don't know how relevant it will be, though. It seems like everyone is going
Intel these days, and the 64-bit architecture is a great idea but not needed.
>
>So if you want a version of awk that for most practical purposes doesn't have
>a maximum number of fields limitation, it's rather clear to me which one that
>would be.
I agree.
>
>Recently I ran into a case where nawk was bombing during patchadd (too
>darn many patches installed, I guess). So I just moved it over and replaced
>it with a symlink to gawk. Thus far, nothing has broken, and I can once
>again install patches without that particular problem.
More than 500 patches?
>
>Of course you do have to scrounge and build gawk yourself (unless it's on
>the freeware CD, or you're willing to get it from one of the sites with
>prebuilt binaries for Solaris), but if you need it, it's well worth having.
It is probably in the GNU textutils package on the blastwave.org site.
>
>But I think the real answer for patching problems would be if Sun rewrote
>the patching (and any other package related) scripts in perl, which could
>also cut the number of child processes (since perl can do pretty much
>everything that sh, awk, sed, etc. can do and then some) and considerably
>speed up patch installation. Since a stable (for a given version of the
>OS) version of perl is pretty much a core part of Solaris >= 8, I can't see
>any good reason (aside from the manpower needed) _not_ to.
The source to awk probably has not been touched since Solaris 2.5.1 days.
Dennis