GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
I'm sorry if this has been asked and answered before, but I couldn't find it for SyncSort.
Basically, I want all single-occurrence records in one file and all duplicates in another.
input:
| Code: |
AAAA 201401 data1
AAAA 201402 data2
BBBB 201401 data3
BBBB 201401 data4
BBBB 201402 data5 |
outfil1 :
| Code: |
AAAA 201401 data1
AAAA 201402 data2
BBBB 201402 data5 |
outfil2:
| Code: |
BBBB 201401 data3
BBBB 201401 data4 |
I could probably do this by summarizing and appending a count, then joining with the original file and testing on that count, but I'm hoping it can be a lot simpler. |
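A minimal sketch of the count-and-filter idea mentioned above, written in Python only to pin down the expected result (it is not SyncSort syntax). The function name `split_by_key` and the 11-byte key length are assumptions read off the sample data.

```python
from collections import Counter

def split_by_key(records, key_len=11):
    """Return (singles, dups): records whose key occurs once vs. more than once."""
    # First pass: count how often each key (first key_len bytes) occurs.
    counts = Counter(rec[:key_len] for rec in records)
    # Second pass: route every record by the count of its key, keeping input order.
    singles = [rec for rec in records if counts[rec[:key_len]] == 1]
    dups = [rec for rec in records if counts[rec[:key_len]] > 1]
    return singles, dups

records = [
    "AAAA 201401 data1",
    "AAAA 201402 data2",
    "BBBB 201401 data3",
    "BBBB 201401 data4",
    "BBBB 201402 data5",
]
singles, dups = split_by_key(records)
print(singles)  # the three records with a unique key
print(dups)     # both BBBB 201401 records
```

Note that every record of a duplicated key lands in the second list, including the first one.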
|
Anuj Dhawan
Superior Member

Joined: 22 Apr 2006 Posts: 6248 Location: Mumbai, India
|
|
|
|
Hi GuyC,
Nice to see you!
Try this, not tested:
| Code: |
//STEP010 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=your.input.file,DISP=SHR
//SORTOUT DD DSN=non.dup.recs,DISP=...
//SORTXSUM DD DSN=dup.recs,DISP= ...
//SYSIN DD *
 SORT FIELDS=(1,11,CH,A)
 SUM FIELDS=NONE,XSUM
/* |
|
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
Hi, I haven't been around for a while: job-hopping.
The solution is not what I wanted:
| Quote: |
SORT FIELDS=(1,11,CH,A)
SUM FIELDS=NONE,XSUM |
sortout :
| Code: |
AAAA 201401 DATA1
AAAA 201402 DATA2
BBBB 201401 DATA3
BBBB 201402 DATA5 |
sortxsum:
|
|
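Why this falls short can be modeled in a few lines of Python. This is a sketch under two assumptions: that XSUM routes the records dropped by SUM FIELDS=NONE to SORTXSUM, and that the first record of each key is the one retained (which is not guaranteed unless equal-key order is preserved). The function name is made up for illustration.

```python
def sum_fields_none_xsum(records, key_len=11):
    """Model of SUM FIELDS=NONE,XSUM: one record per key stays, the rest are split off."""
    seen = set()
    sortout, sortxsum = [], []
    # Python's sort is stable, so equal keys keep their input order.
    for rec in sorted(records, key=lambda r: r[:key_len]):
        key = rec[:key_len]
        if key in seen:
            sortxsum.append(rec)   # second and later records of a key
        else:
            seen.add(key)
            sortout.append(rec)    # first record of every key, duplicated or not
    return sortout, sortxsum

records = [
    "AAAA 201401 data1",
    "AAAA 201402 data2",
    "BBBB 201401 data3",
    "BBBB 201401 data4",
    "BBBB 201402 data5",
]
sortout, sortxsum = sum_fields_none_xsum(records)
print(sortout)
print(sortxsum)
```

In this model the first record of a duplicated key (data3) stays in the main output, whereas the requirement is for both data3 and data4 to end up in the duplicates file.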
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
If you have SyncSort 1.4 or later, this can be done with DUPKEYS NODUPS,XDUP, which needs a SORTXDUP DD for the excluded records. See the documentation for details.
If not, look at SELECT in SyncTool (there is no documentation, so you have to be creative about that part). |
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
Thanks.
We've got SYNCSORT FOR Z/OS 1.4.2.0R, but (like a newbie) I can't find the documentation, which means I can't get it working with DUPKEYS.
I managed with SyncTool (SYNCTOOL RELEASE 1.7.2):
| Code: |
//TOOLIN DD *
SELECT FROM(SORTIN) TO(OUTDUPS) ON(1,11,CH) ALLDUPS DISCARD(OUTREST) |
I thought that once I had this working I could implement it in the existing sort, but unfortunately I have no clue how.
Sorry for drip-feeding the requirement; I was hoping to be able to do it myself.
For those still willing: the original file also has a record type, and a day by which the output must be ordered.
So I need all the record-type-1 records that have more than one occurrence in a month, ordered by date.
| Code: |
AAAA 20140101 0 maindata
AAAA 20140101 1 DATA2type1
AAAA 20140101 2 DATA2type2
AAAA 20140201 1 DATA3type1
BBBB 20140101 0 maindata
BBBB 20140101 1 DATA3a
BBBB 20140115 1 DATA3b
BBBB 20140127 1 DATA3c
BBBB 20140201 1 DATA6 |
The original sort, which gave me all record-type-1 records sorted, was:
| Code: |
 SORT FIELDS=(1,13,CH,A)
 OUTFIL INCLUDE=(15,1,CH,EQ,C'1'),FNAMES=INCL001
 OUTFIL INCLUDE=(15,1,CH,NE,C'1'),FNAMES=EXCL001 |
So: can I combine these two (other than by running two steps)? |
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
| Code: |
 SORT FIELDS=(1,13,CH,A)
 DUPKEYS NODUPS,XDUP |
This will give you a SORTOUT with only records which had unique keys. All records for keys which were duplicate (two or more records same key) will be excluded.
The XDUP says write all the excluded records to the SORTXDUP DD.
However, this will not give you what you want (requirement creep), because you want to know if there are multiple records for a month, not for a day, which is what you are sorting on.
Another however, the sample data you show is already in order. If this is correct, then why SORT?
Change SORTIN DD to SORTIN01 DD, include SORTXDUP DD
| Code: |
 MERGE FIELDS=(1,11,CH,A)
 DUPKEYS NODUPS,XDUP |
However (yet again) I don't really understand where the record-type comes into it.
You can still have your two OUTFILs (you can specify SAVE instead of the INCLUDE= with NE) but your SORTXDUP DD will contain a mix of record-types.
You can try to work with that, or show sample output for your (representative) sample input. Include RECFM and LRECL.
I don't have access to SyncSort, so this is theoretical... |
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
I'll give you some background on where the requirement comes from:
The file contains a lot of different info which is input for printing/analysis.
Data recordtype 1: Sales figures per region, month , supplier-begindate, supplier-enddate
Normally there is only one supplier at a given time, but when there are two or more I need to apply some business logic (this is the new functionality I'm building).
| Code: |
Supplier1 goes from 01/01/0001 => 21/1
Supplier2 goes from 14/1 => 99/99/9999 |
I have to create three type-1 output records:
| Code: |
Supplier1 : 01/01/0001 => 13/1 : amounts * 13 / 21
CombineSuppl1&supp2 : 14/01/2014 => 21/01/2014 , weighted avg/sums
Supplier2 : 22/01/2014 => 99/99/9999 : amounts * 9 / 17 |
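The day-count factors in that example can be sketched as follows. This is only an illustration in Python, using plain day numbers within one month. The helper names are made up, and `month_days=30` is an assumption: it is the month length under which the supplier-2 factor comes out as 9/17 as quoted above.

```python
from fractions import Fraction

def split_overlap(s1_end_day, s2_start_day, month_days):
    """Split a month into supplier-1-only, shared, and supplier-2-only day ranges.

    Assumes supplier 1 is active from day 1, supplier 2 is active until month end,
    and the two periods overlap (s2_start_day <= s1_end_day).
    """
    s1_only = (1, s2_start_day - 1)
    shared = (s2_start_day, s1_end_day)
    s2_only = (s1_end_day + 1, month_days)
    return s1_only, shared, s2_only

def days(day_range):
    """Number of days in an inclusive (first, last) range."""
    first, last = day_range
    return max(0, last - first + 1)

# Supplier 1 ends on the 21st, supplier 2 starts on the 14th.
s1_only, shared, s2_only = split_overlap(s1_end_day=21, s2_start_day=14, month_days=30)

# Share of each supplier's amount that belongs to the period where it is alone.
s1_factor = Fraction(days(s1_only), days(s1_only) + days(shared))
s2_factor = Fraction(days(s2_only), days(shared) + days(s2_only))

print(s1_only, shared, s2_only)  # (1, 13) (14, 21) (22, 30)
print(s1_factor, s2_factor)      # 13/21 9/17
```

With a 31-day month the supplier-2 factor would be 10/18 instead, so the month-length convention matters.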
This involves remembering/storing data in working-storage arrays.
Since this is only 1% of the input file, I thought I would extract that 1% and use it as input.
Afterwards I would re-sort/concatenate my output file and the "discarded" rest of the input file for the next program.
So yes, my NODUPS file contains all kinds of things, but I only need to pass them on.
I cannot guarantee that the input file is in the correct order; I typed it that way because that is the order in which the print program (the last/next program in the chain) receives the records.
I'll try to get the duplicates with DUPKEYS and re-sort that 1% on a day basis, in two steps.
Thanks for your help. |
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2482 Location: @my desk
|
|
|
|
GuyC,
Thanks for sharing all that information. Would you mind posting the expected output for your earlier sample input? |
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
INPUT :
| Code: |
AAAA 20140101 0 maindata
BBBB 20140101 0 maindata
AAAA 20140101 1 DATA2 1000,00 SupX 01/01/0001 31/12/9999
AAAA 20140101 1 DATA2 0100,00 SupY 01/01/0001 31/12/9999
BBBB 20140101 1 DATA6 0111,00 SupA 01/01/0001 21/02/2014
AAAA 20140201 1 DATA3 0222,00 SupX 01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a 1000,00 SupA 01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b 0500,00 SupB 15/02/2014 31/12/9999
AAAA 20140101 2 DATAx
|
Run Sort <== This sort I had problems with :
SortOut1 : Include all Type1s with more than 1 record / month
| Code: |
AAAA 20140101 1 DATA2 1000,00 SupX 01/01/0001 31/12/9999
AAAA 20140101 1 DATA2 0100,00 SupY 01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a 1000,00 SupA 01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b 0500,00 SupB 15/02/2014 31/12/9999 |
Sortout2 : Everything else
Run Program
MyProgram_out
| Code: |
AAAA 20140101 1 DATA2 1100,00 SupX&Y 01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a 0658,00 SupA 01/01/0001 14/02/2014
BBBB 20140215 1 DATA3ab 0552,00 SupA&B 15/02/2014 21/02/2014
BBBB 20140222 1 DATA3b 0290,00 SupB 22/02/2014 31/12/9999 |
Run sort
sort Progout+sortout2
| Code: |
AAAA 20140101 0 maindata
AAAA 20140101 1 DATA2 1100,00 SupX&Y 01/01/0001 31/12/9999
AAAA 20140101 2 DATAx
AAAA 20140201 1 DATA3 0222,00 SupX 01/01/0001 31/12/9999
BBBB 20140101 0 maindata
BBBB 20140101 1 DATA6 0111,00 SupA 01/01/0001 21/02/2014
BBBB 20140201 1 DATA3a 0658,00 SupA 01/01/0001 14/02/2014
BBBB 20140215 1 DATA3ab 0552,00 SupA&B 15/02/2014 21/02/2014
BBBB 20140222 1 DATA3b 0290,00 SupB 22/02/2014 31/12/9999 |
|
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2482 Location: @my desk
|
|
|
|
GuyC,
Not sure if I got your problem right. The SyncTool job below writes all the duplicates for record type 1 to the 'OUT' DD and the remaining records to the 'DIS' DD.
| Code: |
//STEP01 EXEC PGM=SYNCTOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//*
//IN DD *
AAAA 20140101 0 maindata
BBBB 20140101 0 maindata
AAAA 20140101 1 DATA2 1000,00 SupX 01/01/0001 31/12/9999
AAAA 20140101 1 DATA2 0100,00 SupY 01/01/0001 31/12/9999
BBBB 20140101 1 DATA6 0111,00 SupA 01/01/0001 21/02/2014
AAAA 20140201 1 DATA3 0222,00 SupX 01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a 1000,00 SupA 01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b 0500,00 SupB 15/02/2014 31/12/9999
AAAA 20140101 2 DATAx
//*
//OUT DD SYSOUT=*
//DIS DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(1,11,CH) ON(81,4,CH) ALLDUPS -
DISCARD(DIS) USING(CTL1)
//CTL1CNTL DD *
 INREC IFTHEN=(WHEN=(15,1,ZD,EQ,1),OVERLAY=(81:C'0000')),
       IFTHEN=(WHEN=NONE,OVERLAY=(81:SEQNUM,4,ZD))
 SORT FIELDS=(01,13,CH,A)
 OUTFIL FNAMES=OUT,BUILD=(1,80)
 OUTFIL FNAMES=DIS,BUILD=(1,80) |
OUT
| Code: |
AAAA 20140101 1 DATA2 1000,00 SupX 01/01/0001 31/12/9999
AAAA 20140101 1 DATA2 0100,00 SupY 01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a 1000,00 SupA 01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b 0500,00 SupB 15/02/2014 31/12/9999 |
DIS
| Code: |
AAAA 20140101 0 maindata
AAAA 20140101 2 DATAx
AAAA 20140201 1 DATA3 0222,00 SupX 01/01/0001 31/12/9999
BBBB 20140101 0 maindata
BBBB 20140101 1 DATA6 0111,00 SupA 01/01/0001 21/02/2014 |
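The trick in the control cards above is the extra 4-byte key at column 81: type-1 records all get the constant '0000', so they can match each other on region and month, while every other record gets a unique sequence number and can never be a duplicate. A rough Python model of that logic follows. It is a sketch only, the function name is made up, and it assumes the sequence numbers start at 1.

```python
from collections import Counter

def select_alldups(records):
    """Model of SELECT ... ON(1,11,CH) ON(81,4,CH) ALLDUPS DISCARD(...)."""
    keyed = []
    seq = 0
    for rec in records:
        if rec[14] == "1":            # column 15: record type
            tag = "0000"              # shared tag, so type-1 records can match
        else:
            seq += 1
            tag = f"{seq:04d}"        # unique tag, never equal to '0000'
        keyed.append((rec[:11] + tag, rec))   # region + YYYYMM + tag
    counts = Counter(key for key, _ in keyed)
    # Both outputs are ordered on columns 1-13 (region + full date); sort is stable.
    out = sorted((r for k, r in keyed if counts[k] > 1), key=lambda r: r[:13])
    dis = sorted((r for k, r in keyed if counts[k] == 1), key=lambda r: r[:13])
    return out, dis

records = [
    "AAAA 20140101 0 maindata",
    "BBBB 20140101 0 maindata",
    "AAAA 20140101 1 DATA2 1000,00 SupX 01/01/0001 31/12/9999",
    "AAAA 20140101 1 DATA2 0100,00 SupY 01/01/0001 31/12/9999",
    "BBBB 20140101 1 DATA6 0111,00 SupA 01/01/0001 21/02/2014",
    "AAAA 20140201 1 DATA3 0222,00 SupX 01/01/0001 31/12/9999",
    "BBBB 20140201 1 DATA3a 1000,00 SupA 01/01/0001 21/02/2014",
    "BBBB 20140215 1 DATA3b 0500,00 SupB 15/02/2014 31/12/9999",
    "AAAA 20140101 2 DATAx",
]
out, dis = select_alldups(records)
for rec in out:
    print("OUT", rec)
for rec in dis:
    print("DIS", rec)
```

On the sample input this model puts the same four records in OUT and the same five in DIS as the listings above.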
|
|