dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Hi,
I have a requirement to sort a file whose records look like the ones shown below:
000164319079.010.H.0000XXYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000ZZZZ.
000164319079.010.L.00001.000.00000.009. .000000000000.0000000
000164319079.010.M.00001.0000.01.00000000000000.00000000000000.0000000000000.000
000164319079.010.T.00001.001.00001.001.TX001.
000164319079.010.L.00002.006.00600.028. .000000000000.0000000
000164319083.010.H.0000XXYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000ZZZZ.
000164319083.010.L.00000.0000000000000000000000000000 .000000000000.0000000
000164319083.010.V.00001.00001.0010000022.1.09999
000164319080.010.H.0000XXYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000ZZZZ.
000164319080.010.L.00001.014.02409.038. .000000000000.0000000
000164319080.010.L.00002.006.00600.246. .000000005996.0000000
Here 'H' is the header record. It is the only record type that carries all of my key fields, and each header is followed by a set of records that belong to it, up to the next header. (File Size is 470 bytes)
I want to sort the file on 1) XX, 2) YYYY, 3) MM/DD/YYYY, 4) ZZZZ.
You can also find that there is a 12 digit number in the front of each record, but that number will be unique only of one such XX & YYYY.
(i.e. another XX, YYYY combination can have the same number).
Can this be done using a SORT? Please help. Thank you.
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
One more point: there will be 2 million records.
Escapa
Senior Member
Joined: 16 Feb 2007 Posts: 1399 Location: IL, USA
Welcome to the forum...
Which sort product do you have at your shop? The solution may vary by product.
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
DFSORT & SYNCSORT.
Escapa
Senior Member
Joined: 16 Feb 2007 Posts: 1399 Location: IL, USA
Either you are surprising me, or your shop is one of the very few that paid for both (when that isn't required)...
OK... which product do you want the solution for? Please give the level/release information for that product.
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Can we go with SYNCSORT?
SYNCSORT FOR Z/OS 1.3.2.1R U.S. PATENTS: 4210961, 5117495 (C) 2007
z/OS 1.11.0
SYNCSORT LICENSED FOR CPU SERIAL NUMBER 05134, MODEL 2098 O02
Naish
New User
Joined: 07 Dec 2006 Posts: 82 Location: UK
Since you have both products, you can try this on DFSORT and then adapt it for SYNCSORT.
Two points of confusion:
1. You have not mentioned which of the two MM/DD/YYYY fields you want to sort on (it doesn't matter much; you can change the sort fields).
2.
Quote:
You can also find that there is a 12 digit number in the front of each record, but that number will be unique only of one such XX & YYYY.
(i.e. another XX, YYYY combination can have the same number).
Did you mean "can't"?
Try this sort (I have added a few records):
Code:
//SORTIN DD *
000164319079.010.H.0000XXYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000ZZZZ.
000164319079.010.L.00001.000.00000.009. .000000000000.0000000
000164319079.010.M.00001.0000.01.00000000000000.00000000000000.0000000000000.000
000164319079.010.T.00001.001.00001.001.TX001.
000164319079.010.L.00002.006.00600.028. .000000000000.0000000
000164319083.010.H.0000PPYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000AAAA.
000164319083.010.L.00000.0000000000000000000000000000 .000000000000.0000000
000164319083.010.V.00001.00001.0010000022.1.09999
999999999999.010.H.0000PPAAYY.00001.12/22/1212.MM/DD/YYYY HH:MM:SS. .001000CCCC.
999999999999.010.L.00000.0000000000000000000000000000 .000000000000.0000000
999999999999.010.V.00001.00001.0010000022.1.09999
111111111111.010.H.0000PAABCD.00001.02/22/1212.MM/DD/YYYY HH:MM:SS. .0010001223.
111111111111.010.L.00000.0000000000000000000000000000 .000000000000.0000000
111111111111.010.V.00001.00001.0010000022.1.09999
000164319080.010.H.0000AAYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000CADA.
000164319080.010.L.00001.014.02409.038. .000000000000.0000000
000164319080.010.L.00002.006.00600.246. .000000005996.0000000
//SORTOUT DD SYSOUT=*
//SYSIN DD *
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(18,1,CH,EQ,C'H'),
        PUSH=(501:24,2,503:26,4,507:37,10))
  SORT FIELDS=(501,2,CH,A,503,4,CH,A,507,10,CH,A)
  OUTREC FIELDS=(1,470)
//*
Output:
Code:
000164319080.010.H.0000AAYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000CADA
000164319080.010.L.00001.014.02409.038. .000000000000.0000000
000164319080.010.L.00002.006.00600.246. .000000005996.0000000
111111111111.010.H.0000PAABCD.00001.02/22/1212.MM/DD/YYYY HH:MM:SS. .0010001223
111111111111.010.L.00000.0000000000000000000000000000 .000000000000.0000000
111111111111.010.V.00001.00001.0010000022.1.09999
999999999999.010.H.0000PPAAYY.00001.12/22/1212.MM/DD/YYYY HH:MM:SS. .001000CCCC
999999999999.010.L.00000.0000000000000000000000000000 .000000000000.0000000
999999999999.010.V.00001.00001.0010000022.1.09999
000164319083.010.H.0000PPYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000AAAA
000164319083.010.L.00000.0000000000000000000000000000 .000000000000.0000000
000164319083.010.V.00001.00001.0010000022.1.09999
000164319079.010.H.0000XXYYYY.00001.MM/DD/YYYY.MM/DD/YYYY HH:MM:SS. .001000ZZZZ
000164319079.010.L.00001.000.00000.009. .000000000000.0000000
000164319079.010.M.00001.0000.01.00000000000000.00000000000000.0000000000000.00
000164319079.010.T.00001.001.00001.001.TX001.
000164319079.010.L.00002.006.00600.028. .000000000000.0000000
Hope this helps.
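A note on the date key: a character sort on MM/DD/YYYY orders records by month first, so 12/31/2011 would sort after 01/15/2012. If chronological order is what is wanted, the pushed date can be compared as year, then month, then day. This is only a minimal sketch, assuming fixed 472-byte records, the first date field at position 37, ZZZZ at position 76, and the keys pushed to position 473 onward; those positions are taken from the sample data and should be checked against the real layout. EQUALS is added so that records with identical keys stay in input order.

```text
* Assumed layout: LRECL=472, record type 'H' at 18, XX at 24,
* YYYY at 26, MM/DD/YYYY at 37, ZZZZ at 76.
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(18,1,CH,EQ,C'H'),
        PUSH=(473:24,2,475:26,4,479:37,10,489:76,4))
* Date is pushed to 479-488: MM at 479, DD at 482, year at 485.
  SORT FIELDS=(473,2,CH,A,475,4,CH,A,
               485,4,CH,A,479,2,CH,A,482,2,CH,A,
               489,4,CH,A),EQUALS
  OUTREC FIELDS=(1,472)
```

If plain MM/DD/YYYY character order is acceptable, the single 479,10,CH,A key is enough and the three date sub-keys are not needed.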
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Thank you, Naish, for your help. It is working fine.
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Hi, I got this informational message in the SYSOUT: WER238I POTENTIALLY INEFFICIENT USE OF INREC
Will the additional fields that INREC adds for sorting degrade performance? (This SORT step adds 20 bytes to each record.)
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Did you look at what the message says?
Are your records fixed or variable?
Why do the pushed items start at position 501?
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
The record length of the input file is fixed (472 bytes).
Since I need to sort the records as groups, the key fields have to be PUSHed onto every record in INREC.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Note, I didn't ask why you were PUSHing, but why you were PUSHing there.
So, amend the example you were given to start at position 473, not 501 (that was done for the example writer's convenience, not yours), and I'd bet your message goes away, along with the 28 unused extra bytes on each record.
And did you look at the message?
Naish
New User
Joined: 07 Dec 2006 Posts: 82 Location: UK
Sorry for the confusion, Bill.
dinfeo, I hope you amend it as Bill suggested.
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Thank you, Bill and Naish. I did use positions starting at 473.
Please find the SYSOUT details below.
SYSIN:
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(18,1,CH,EQ,C'H'),
        PUSH=(473:24,2,475:26,4,479:37,10,489:76,4))
  SORT FIELDS=(473,2,CH,A,475,4,CH,A,479,10,CH,A,489,4,CH,A)
  OUTREC FIELDS=(1,472)
WER108I SORTIN : RECFM=FB ; LRECL= 472; BLKSIZE= 27848
WER257I INREC RECORD LENGTH = 492
WER238I POTENTIALLY INEFFICIENT USE OF INREC
WER237I OUTREC RECORD LENGTH = 472
WER110I SORTOUT : RECFM=FB ; LRECL= 472; BLKSIZE= 27848
WER177I TURNAROUND SORT PERFORMED
WER045C END SORT PHASE
WER246I FILESIZE 95,940 BYTES
WER054I RCD IN 195, OUT 195
WER072I NOEQUALS, BALANCE IN EFFECT
WER169I RELEASE 1.3 BATCH 0506 TPF LEVEL 2.1
WER052I END SYNCSORT - SJEDD90D,STEP002,,DIAG=A200,7041,8A34,ACD7,EBCB,4CEB,
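One thing worth checking in the messages above: WER072I reports NOEQUALS in effect. Every record in a group carries identical pushed keys, and without EQUALS a sort does not guarantee that records with equal keys keep their input order, so a header could end up below its own detail records, or two groups with the same keys could interleave. A 195-record test may not show this even if 2 million records would. The following is a hedged sketch of the same control statements with EQUALS requested; the assumption to verify is that the installation default really is NOEQUALS, as the message suggests.

```text
* Same statements as posted, with EQUALS added (assumption: the
* installation default is NOEQUALS, as WER072I suggests).
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(18,1,CH,EQ,C'H'),
        PUSH=(473:24,2,475:26,4,479:37,10,489:76,4))
  SORT FIELDS=(473,2,CH,A,475,4,CH,A,479,10,CH,A,489,4,CH,A),EQUALS
  OUTREC FIELDS=(1,472)
```

After the change, the WER072I line in the SYSOUT is the place to confirm which option was actually in effect.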
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
And when (for the third time) you look up the 238, what does it say?
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
POTENTIALLY INEFFICIENT USE OF INREC
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
So, are you saying that when you look the message up, that is all it says?
Naish
New User
Joined: 07 Dec 2006 Posts: 82 Location: UK
Curiosity! You mentioned in your initial post:
Quote:
(File Size is 470 bytes)
Why did you use LRECL=472, then?
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Actually it is 472; I gave 470 by mistake. Sorry, Naish.
Naish
New User
Joined: 07 Dec 2006 Posts: 82 Location: UK
Curiosity #2!
What output did you get? I presume WER238I is an informational message (I'm not familiar with Syncsort).
Also, you never answered Bill's question: did you look up WER238I?
dinfeo
New User
Joined: 07 Jun 2012 Posts: 20 Location: India
Hi Bill,
The initial input record length is 472 bytes. During the SORT, INREC extends each record to 492 bytes, the records are sorted, and the final output is 472 bytes again.
So there is an increase of 20 bytes over the actual input.
This is all I can get from this SORT message. If this is not sufficient, maybe I didn't understand your question?
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
According to this, the message is issued simply when INREC creates a record bigger than the OUTREC one.
Since you need that extension for your sort, and everything else is working, I think you can just run with it.
I'd drop the "columns:" to make things easier to read, but what the heck?
Good spots, Naish.
EDIT: Note that you could have done the search as well, dinfeo.
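For what dropping the column prefixes might look like: in DFSORT's documented PUSH syntax, an item without a c: prefix is placed right after the previous item. Since all four keys are character fields sorted ascending and end up contiguous, they can also be compared as a single 20-byte key. This is a sketch under those assumptions; whether SyncSort 1.3 accepts the same shorthand is not confirmed anywhere in this thread, so treat it as something to test. EQUALS is included to keep each group's records in input order.

```text
* XX and YYYY are adjacent in the header (24-29), so one 6-byte
* push covers both; the date and ZZZZ then land at 479 and 489.
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(18,1,CH,EQ,C'H'),
        PUSH=(473:24,6,37,10,76,4))
* One 20-byte CH key is equivalent to the four adjacent CH,A keys.
  SORT FIELDS=(473,20,CH,A),EQUALS
  OUTREC FIELDS=(1,472)
```

The INREC record length stays at 492, so WER238I would still be expected; only the readability of the statements changes.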
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6248 Location: Mumbai, India
WER238I is essentially an informational message. It usually surfaces when INREC control has been used to increase the input record length. SyncSort says, "this can reduce SyncSort performance because a larger volume of data is being processed than if the OUTREC control statement were used to perform the same function. Typically, increasing the record length with INREC is only useful when expanding SUM fields with leading zeros to prevent an overflow condition during SUM."