Re: SORT by text fields

From: Kevin Collins (spamtotrash_at_toomuchfiction.com)
Date: 12/23/03


Date: 22 Dec 2003 17:14:46 -0800

Tapani Tarvainen <gn20031221T182819@tt.oma.it.jyu.fi> wrote in message news:<n6ad5myvy2.fsf@tt.oma.it.jyu.fi>...
> Gnarlodious <gnarlodiousNULL@VOID.invalid.yahoo.com> writes:
>
> > I want to sort lines according to fields containing variable-length text.
> >
> > <td class=Name>variableLengthText<td class=Artist>variableLengthText<td
> > class=Album>variableLengthText<td class=Genre>variableLengthText<td
> > class=Size>variableLengthText<td class=Year>variableLengthText
> >
> > Let's say I want to sort using the "Genre" cell, what "sort" options will do
> > that?
>
> Try
>
> sort -t'>' -k5

Even assuming that works as requested (in my opinion it won't work
very well unless the rest of the HTML can be removed first), -k5 sorts
on everything _starting_ at field 5 through the end of the line, not
just on that field, which can yield wrong results. Limiting the key to
a single field takes -k5,5, and even then the key still carries the
trailing "<td class=Size" text.
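
As a rough illustration of the difference between an open-ended key
(-k5) and a key limited to one field (-k5,5), here is a small sketch.
The three sample rows are invented for this example, and LC_ALL=C is
set only so the byte ordering is predictable:

```shell
# Hypothetical sample rows in the format from the question (data invented here).
rows='<td class=Name>Song B<td class=Artist>Zed<td class=Album>One<td class=Genre>Jazz<td class=Size>9 MB<td class=Year>2001
<td class=Name>Song A<td class=Artist>Amy<td class=Album>Two<td class=Genre>Rock<td class=Size>3 MB<td class=Year>1999
<td class=Name>Song C<td class=Artist>Bob<td class=Album>Six<td class=Genre>Jazz<td class=Size>4 MB<td class=Year>2003'

# -k5: the key runs from field 5 to the end of the line,
# so Size and Year take part in the comparison.
open_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -t'>' -k5)

# -k5,5: the key is field 5 only ("Jazz<td class=Size");
# equal keys fall back to comparing the whole line.
one_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -t'>' -k5,5)

printf '%s\n' '--- sort -k5' "$open_key" '--- sort -k5,5' "$one_key"
```

Both runs group the two Jazz rows ahead of the Rock row, but they
order the Jazz rows differently, because only the first run lets the
Size text influence the comparison.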

Doing this robustly takes more than plain sort options. It could be
accomplished with various tools, although I would use Perl.

This will do what you need assuming you only have lines of a format as
described above and no other HTML:

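#!/usr/bin/perl
use strict;
use warnings;

my %lines;    # sort key => list of lines that share that key

while (<>)
{
    chomp;
    # Each line starts with a delimiter, so $F[0] is empty and $F[4] is Genre.
    my @F = split(/<td class=[^>]*>/);
    my $key = defined $F[4] ? $F[4] : '';

    # Keep every line per key; a plain scalar would silently drop duplicates.
    push @{ $lines{$key} }, $_;
}

foreach my $key (sort(keys(%lines)))
{
    print "$_\n" foreach @{ $lines{$key} };
}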

In this example '$F[4]' is the 5th element of the split (delimited by
a <td ...>): each line begins with a delimiter, so $F[0] is empty and
$F[4] is the Genre text. A one liner might look like this:

perl -an -F'<td class=[^>]*>' -e 'push @{$lines{$F[4]}}, $_; END {print
@{$lines{$_}} foreach (sort(keys(%lines)));}'

Since this is part of an HTML table definition you are asking for
trouble unless you have some way to pull these lines out of the rest
of the HTML and then put them back in... it can be done :)
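
One possible way to pull the rows out and put them back is sketched
below. It assumes the sortable rows are the contiguous lines that
start with "<td class=Name>" and that everything else is either before
or after them. The sample page and the variable names are invented for
this example:

```shell
# Hypothetical page (invented here): the rows to sort are the contiguous
# lines that start with "<td class=Name>".
page='<table>
<td class=Name>Song A<td class=Artist>Amy<td class=Album>Two<td class=Genre>Rock<td class=Size>3 MB<td class=Year>1999
<td class=Name>Song B<td class=Artist>Zed<td class=Album>One<td class=Genre>Jazz<td class=Size>9 MB<td class=Year>2001
</table>'

# Everything before the first row.
head_part=$(printf '%s\n' "$page" | awk '/^<td class=Name>/ { exit } { print }')

# The rows themselves, sorted on field 5 only (-k5,5 limits the key to that field).
sorted_rows=$(printf '%s\n' "$page" | grep '^<td class=Name>' | LC_ALL=C sort -t'>' -k5,5)

# Everything after the first row that is not itself a row.
tail_part=$(printf '%s\n' "$page" |
    awk 'seen && !/^<td class=Name>/ { print } /^<td class=Name>/ { seen = 1 }')

# Reassemble: head, sorted rows, tail.
result=$(printf '%s\n' "$head_part" "$sorted_rows" "$tail_part")
printf '%s\n' "$result"
```

If the rows are interleaved with other markup, this simple
head/rows/tail split no longer holds and a real HTML parser is the
safer choice.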

Good luck,

Kevin


