Re: Sorting and then removing sort-by cols from a fixed-width flat file
- From: aditya.chaudhary@xxxxxxxxx
- Date: 29 Jun 2006 02:20:26 -0700
Hey Logan,
Thanks for your input. I have some doubts and concerns regarding it. I
have answered your questions (comments) below:
Logan Shaw wrote:
aditya.chaudhary@xxxxxxxxx wrote:
Basically I would be merging 3 flat files and wanted to sort and group
their records so that they can be transmitted in a specified format.
Transmitted? I thought you were just merging them.
I will first merge them, then do the sorting, then remove the cols
which were added just to make sorting easy. Finally, I have to ftp
this file.
But
the problem is the group by col is situated at different position in
each of the 3 flat files.
Does the "group by col" mean the column which contains the sort key?
No. 'Group by col' means that there is a col existing in the records
which can be used for sorting the data along with Record_Type, but the
issue is that it exists at different positions in each of the 3 flat files'
records. If it had existed at, say, positions 15-18 in every file,
then I could have used it for sorting.
So I thought of adding 2 common cols in the
beginning of each of the 3 files and then sort the records using the
same 2 cols so that they can be grouped and then remove those cols
after sorting operation as they are not required to be present in the
final file.
That's one way to do it. The other way to do it is to do all that
work in your comparison function, so that the information is never
added to any files but is temporarily created only when you are
comparing two elements.
I don't get what you meant by this. Maybe because I'm not familiar with
many shell scripting techniques.
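For reference, the "add common cols at the beginning" step could be sketched roughly as below. This is only an illustration under assumptions: the helper name add_key, the sample records and the column offsets (5-8 and 11-14) are hypothetical, and only one 15-character key column is shown; a second one could be prepended the same way.

```shell
# add_key START LEN: prefix each input line with the LEN characters found at
# 1-based column START, left-justified and padded to 15 characters.
# (Hypothetical helper; the offsets passed below are made up.)
add_key() {
  awk -v start="$1" -v len="$2" \
    '{ printf "%-15s%s\n", substr($0, start, len), $0 }'
}

# Two hypothetical layouts: key at columns 5-8 in one file, 11-14 in another.
merged=$(
  printf '%s\n' 'AAAAK002restA' | add_key 5 4
  printf '%s\n' 'BBBBBBBBBBK001restB' | add_key 11 4
)
printf '%s\n' "$merged"
```

With real files, each add_key call would read one of the 3 input files instead of a printf, and the combined output would then go to the sort step.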
I would be creating a shell script to first merge the 3
files, then sorting them and then removing the cols.
Merging has a specific meaning when you are talking about sorting.
I believe what you are saying is that you will first convert all
three files into a common format, then sort them, then remove
the extra columns.
Yes. Common format is already defined - it has to be a fixed-width file.
So the 3 files have to be merged and then sorted so that data appears
in some 'order by' fashion when it goes to the user.
The 2 cols are Request_Id (string of rec length 15, i.e. position 1-15
in file) and Serial_Number (string of rec length 5, i.e. position 16-20
in file).
So I need to know:
1) how to write the 'sort' command of Unix so that I can use both these
cols for sorting the recs of the file.
Sorting by character column numbers is generally not the easiest thing
in Unix, at least if you are using the "sort" command. The "sort"
command expects a field separator character rather than using
fixed-length fields. There may be a way to "trick" it into using a fixed
range of columns, but it's much easier to use some character, like
":" as a separator. Then you can do
sort -t: +0 +1
in order to sort by the 1st and 2nd colon-delimited fields. Of course
you can use any character instead of ":" as long as that character
does not occur within your sort keys.
I cannot use a delimiter for just 2 new cols. The file is in
fixed-width format and I have to format such a file. So kindly suggest
the sort syntax for fixed width.
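One possibility for the fixed-width case, offered as a sketch rather than a tested answer for the real files, is the -k option of sort with character offsets (field.char). The sample records are made up, and the separator passed to -t (a tab) is assumed never to occur in the data, so that each whole line counts as field 1 and the offsets are plain character positions.

```shell
# Assumption: no tab character appears anywhere in the records.
TAB=$(printf '\t')

# Hypothetical records: 15-char Request_Id, 5-char Serial_Number, then payload.
# Key 1 is characters 1-15 and key 2 is characters 16-20 of each line.
# LC_ALL=C makes the comparison byte-wise rather than locale-dependent.
sorted=$(printf '%-15s%-5s%s\n' \
    REQ0002 00001 'payload B1' \
    REQ0001 00002 'payload A2' \
    REQ0001 00001 'payload A1' |
  LC_ALL=C sort -t "$TAB" -k 1.1,1.15 -k 1.16,1.20)
printf '%s\n' "$sorted"
```

On a real file the same options would be applied to the merged file name instead of the printf input.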
2) the remove command or process so that I can drop/remove these 2 cols
after all the recs have been sorted out.
Removing the first 20 characters from every line of a file is easy.
It's just as easy as this:
sed -e 's/.\{20\}//'
That matches the pattern
.\{20\}
which is 20 of any character ("." stands for any character) and
replaces the pattern it matches with the empty string.
Thanks for this solution. I will try it and let you know whether it
works.
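A quick way to try the sed command is to run it on a single made-up record (the record below is hypothetical). The cut -c21- line is an alternative that should give the same result on lines of at least 20 characters; on shorter lines sed leaves the line unchanged, while cut prints an empty line.

```shell
# Hypothetical record: 15-char Request_Id, 5-char Serial_Number, then payload.
line=$(printf '%-15s%-5s%s' REQ0001 00001 'payload A1')

# Drop the first 20 characters with the sed command quoted above.
stripped=$(printf '%s\n' "$line" | sed -e 's/.\{20\}//')

# Alternative: keep character 21 onwards.
alt=$(printf '%s\n' "$line" | cut -c21-)

printf '%s\n%s\n' "$stripped" "$alt"
```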
However, above I recommended that you use a delimiter instead of
fixed-width fields. In that case, to remove the first two
colon-delimited fields, you would instead want to use the cut
command:
cut -d: -f3-
That prints field 3 and following ("3-") with ":" as the delimiter.
Note that this is 1-based indexing, whereas the sort command (at
least in the syntax I gave -- it accepts more than one syntax)
uses 0-based indexing.
- Logan
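For comparison, the delimiter-based variant can be sketched end to end as below. The records are made up, and the sort keys are written with -k (1-based field numbers) instead of the older +0 +1 form, which newer sort implementations may not accept. This assumes ":" never occurs inside the two key fields.

```shell
# Hypothetical colon-delimited records: Request_Id:Serial_Number:payload.
# Sort on field 1, then field 2, then drop both key fields with cut.
sorted=$(printf '%s\n' \
    'REQ0002:00001:payload B1' \
    'REQ0001:00002:payload A2' \
    'REQ0001:00001:payload A1' |
  LC_ALL=C sort -t: -k1,1 -k2,2 |
  cut -d: -f3-)
printf '%s\n' "$sorted"
```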