satish.ms10 Active User Joined: 10 Aug 2009 Posts: 184 Location: India
Hi All,
Thank you so much for all your support and time.
Here is my requirement. I have 100 million records in a PS file. I need to find the first occurrence of a particular string (say 'A129') at a specific position (say, starting at the 4th byte). Starting from that record, I need to copy 10000 records to the output file.
Sample Input:
Code:
----+----1----+----2----+----3
XY A101
XY A111
XY A111
XY A121
XY A121
XY A123
XY A129
XY A129
XY A131
XY A141
XY A151
......
......
Expected Output:
Code:
----+----1----+----2----+----3
XY A129
XY A129
XY A131
XY A141
XY A151
......
......
Only 10000 records should be written to the output file.
The input file is a VB file with a maximum record length of 27000.
Could you please help me with the SORT card?
Thanks in advance.
Thanks,
Sati
Bill Woodger Moderator Emeritus Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
You have 100,000,000 variable-length records of up to 27,000 bytes?
I assume this is a once-off.
Code:
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,
                BUILD=(1,4,10C'0',5)),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(18,4,CH,EQ,C'A129'),
                RECORDS=10000,
                PUSH=(5:ID=10))
  OUTFIL OMIT=(5,10,CH,EQ,C'0000000000'),
         ACCEPT=10000,
         BUILD=(1,4,15)
Note: this will read all 100,000,000 of your records, every time.
If you are going to do it often, you'd at least want to compare its performance with that of a program.
Instead of SEQ= I've used ID= and an OMIT for zero, which may mean your ID does not need to be that size. The actual limit of 10000 is applied by the ACCEPT, so there's not too much to think about when several records carry that key value.
Edit: Fixed the typos and even gave it a light test.
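To make the interplay of ID=, OMIT and ACCEPT concrete, here is a hand-worked trace (not job output) of the statements above against some of the sample input from the first post, written as DFSORT comment lines. It assumes the usual WHEN=GROUP behaviour, in which every record that satisfies BEGIN starts a new group with the next ID; that matches the A1290001 / A1290002 values in the test output posted later in the thread.

```jcl
* Field positions (VB input, search string in data byte 4):
*   input  : 1-4 RDW, 5- data, so the string is at 4+4 = 8
*   INREC  : 1-4 RDW, 5-14 ten zeros, 15- data, string at 8+10 = 18
*   OUTFIL : BUILD=(1,4,15) drops 5-14 and restores the input layout
*
* Positions 5-14 after INREC   Data      OUTFIL
*   0000000000                 XY A101   omitted
*   0000000000                 XY A123   omitted
*   0000000001                 XY A129   kept (1st of at most 10000)
*   0000000002                 XY A129   kept
*   0000000002                 XY A131   kept
*   0000000002                 XY A141   kept
```

Only zero versus non-zero matters to the OMIT, and ACCEPT=10000 caps the output at 10000 records however many groups were started.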
enrico-sorichetti Superior Member Joined: 14 Mar 2007 Posts: 10873 Location: italy
here is a solution that works
Code:
//ENRICO1  JOB NOTIFY=&SYSUID,
//         MSGLEVEL=(1,1),CLASS=A,MSGCLASS=X
//*
//S1       EXEC PGM=SORT
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
XY A111
XY A114
XY A117
XY A120
XY A123
XY A126
XY A129
XY A132
XY A135
XY A138
XY A141
XY A144
XY A147
XY A150
XY A153
XY A156
XY A159
XY A162
XY A165
XY A168
XY A171
XY A174
XY A111
XY A114
XY A117
XY A120
XY A123
XY A126
XY A129
XY A132
XY A135
XY A138
XY A141
XY A144
XY A147
XY A150
XY A153
XY A156
XY A159
XY A162
XY A165
XY A168
XY A171
XY A174
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  OPTION COPY
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(4,4,CH,EQ,C'A129'),
                PUSH=(55:ID=4)),
        IFTHEN=(WHEN=(4,4,CH,EQ,C'A129'),
                OVERLAY=(51:C'A129'))
  OUTREC IFTHEN=(WHEN=GROUP,BEGIN=(51,8,CH,EQ,C'A1290001'),
                 PUSH=(61:51,8),RECORDS=24)
//* OUTFIL INCLUDE=(61,8,CH,EQ,C'A1290001')
Code:
********************************* TOP OF DATA **********************************
XY A111
XY A114
XY A117
XY A120
XY A123
XY A126
XY A129 A1290001 A1290001
XY A132 0001 A1290001
XY A135 0001 A1290001
XY A138 0001 A1290001
XY A141 0001 A1290001
XY A144 0001 A1290001
XY A147 0001 A1290001
XY A150 0001 A1290001
XY A153 0001 A1290001
XY A156 0001 A1290001
XY A159 0001 A1290001
XY A162 0001 A1290001
XY A165 0001 A1290001
XY A168 0001 A1290001
XY A171 0001 A1290001
XY A174 0001 A1290001
XY A111 0001 A1290001
XY A114 0001 A1290001
XY A117 0001 A1290001
XY A120 0001 A1290001
XY A123 0001 A1290001
XY A126 0001 A1290001
XY A129 A1290002 A1290001
XY A132 0002 A1290001
XY A135 0002
XY A138 0002
XY A141 0002
XY A144 0002
XY A147 0002
XY A150 0002
XY A153 0002
XY A156 0002
XY A159 0002
XY A162 0002
XY A165 0002
XY A168 0002
XY A171 0002
XY A174 0002
******************************** BOTTOM OF DATA ********************************
It is for FB records (in-stream data); the mods for VB data are obvious.
I commented out the INCLUDE and left out the BUILD to show the auxiliary columns.
Bill Woodger Moderator Emeritus Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Effected some tidying on the topic :-)
satish.ms10 Active User Joined: 10 Aug 2009 Posts: 184 Location: India
Hi Bill/Enrico,
It seems that a few of your replies and suggestions have disappeared.
Was that intentional, or did something go wrong?
Kindly assist.
Thanks,
Sati
Bill Woodger Moderator Emeritus Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Bill Woodger wrote:
Effected some tidying on the topic :-)
I've edited my original to remove the typos and demonstrate to myself that it works with 40 records when presented with 500 records of the same key.
The confusion yesterday was between the use of OMIT= and INCLUDE= on the OUTFIL.
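For reference, the two keywords select the same records when the comparison is negated. The exact statements involved in that mix-up are not shown in the thread, so the following is only a sketch using the field positions from the solution above:

```jcl
* Use one form or the other, not both in the same step.
* Form 1: drop records whose group ID is still all zeros
  OUTFIL OMIT=(5,10,CH,EQ,C'0000000000'),
         ACCEPT=10000,
         BUILD=(1,4,15)
* Form 2: keep records whose group ID is no longer all zeros
  OUTFIL INCLUDE=(5,10,CH,NE,C'0000000000'),
         ACCEPT=10000,
         BUILD=(1,4,15)
```

Swapping OMIT for INCLUDE without also swapping EQ for NE would select exactly the records that were meant to be discarded.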
satish.ms10 Active User Joined: 10 Aug 2009 Posts: 184 Location: India
Thank you so much Bill.
Sorry, I missed your post. :-(
satish.ms10 Active User Joined: 10 Aug 2009 Posts: 184 Location: India
Hi Bill,
Sorry to trouble you. I am getting the error below when I use your sort card.
Code:
INCONSISTENT REFORMATTING FOR *INREC : REASON CODE 02, IFTHEN 1
C5-I12416 C6-K90026 C7-K94453 C8-K94453 E7-I12416
SMF RECORD NOT WRITTEN TO THE SMF DATA SET(RC=20)
END OF DFSORT
Kindly assist.
Thanks,
Sati
satish.ms10 Active User Joined: 10 Aug 2009 Posts: 184 Location: India
Hi Bill,
Here is the job step code that I am using in my JCL:
Code:
//*
//S1 EXEC PGM=SORT
//SYSPRINT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SORTIN DD *
XY A111
XY A114
XY A117
XY A120
XY A123
XY A126
XY A129
XY A132
XY A135
XY A138
XY A141
XY A144
XY A147
XY A150
XY A153
XY A156
XY A159
XY A162
XY A165
XY A168
XY A171
XY A174
XY A111
XY A114
XY A117
XY A120
XY A123
XY A126
XY A129
XY A132
XY A135
XY A138
XY A141
XY A144
XY A147
XY A150
XY A153
XY A156
XY A159
XY A162
XY A165
XY A168
XY A171
XY A174
//SORTOUT DD SYSOUT=*
//SYSIN DD *
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,
                BUILD=(1,4,10C'0',5)),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(18,4,CH,EQ,C'A129'),
                RECORDS=10000,
                PUSH=(5:ID=10))
  OUTFIL OMIT=(5,10,CH,EQ,C'0000000000'),
         ACCEPT=10000,
         BUILD=(1,4,15)
//*
Below is the error message that I got from the above job:
Code:
INCONSISTENT REFORMATTING FOR *INREC : REASON CODE 02, IFTHEN 1
C5-I12416 C6-K90026 C7-K94453 C8-K94453 E7-I12416
SMF RECORD NOT WRITTEN TO THE SMF DATA SET(RC=20)
END OF DFSORT
I am unable to find the mistake.
Kindly help.
Thanks,
Bill Woodger Moderator Emeritus Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Well, you're trying to test it with fixed-length records, and it expects variable-length records.
You need a step to convert your fixed-length records to variable-length.
A simple COPY operation with OUTFIL FTOV,BUILD=(1,x), where x is the length of your test data, will do it. Write SORTOUT to a temporary dataset with DISP=(NEW,PASS), and use the same temporary dataset as SORTIN in your step, with DISP=(OLD,PASS).
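The conversion step described above might look like the following sketch. The step name TOVB, the temporary dataset name &&VBTEST, and the UNIT and SPACE values are placeholders that were not given in the thread. The value of x is taken as 7 because the test records ('XY A111') are 7 bytes long, and only a few of the test records are shown.

```jcl
//* Step 1: copy the fixed-length test data to a VB temporary dataset
//TOVB     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
XY A123
XY A126
XY A129
XY A132
XY A135
/*
//SORTOUT  DD DSN=&&VBTEST,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(TRK,(1,1))
//SYSIN    DD *
  OPTION COPY
  OUTFIL FTOV,BUILD=(1,7)
/*
//* Step 2: the original step, now reading the VB dataset
//S1       EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&VBTEST,DISP=(OLD,PASS)
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,
                BUILD=(1,4,10C'0',5)),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(18,4,CH,EQ,C'A129'),
                RECORDS=10000,
                PUSH=(5:ID=10))
  OUTFIL OMIT=(5,10,CH,EQ,C'0000000000'),
         ACCEPT=10000,
         BUILD=(1,4,15)
/*
```

With the input converted, step S1 sees an RDW in positions 1-4 and, after the WHEN=INIT shift, the search string at position 18, which is what the control statements expect.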
enrico-sorichetti Superior Member Joined: 14 Mar 2007 Posts: 10873 Location: italy
Bill's sort control statements are for a VARIABLE format dataset;
they take into account the 4 bytes of the RDW.
Your SORTIN is a FIXED format dataset.
Allocate two variable-format datasets, prime one of them with the data to be sorted,
and you will see that everything works.