syncsort - include when count > 1 ?


GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

Posted: Tue Apr 08, 2014 2:48 pm

I'm sorry if this has been asked and answered, but I couldn't find it for SyncSort.

Basically I want all single-occurrence records in one file and all duplicates in another (the key is positions 1-11).
input:
Code:
AAAA 201401 data1
AAAA 201402 data2
BBBB 201401 data3
BBBB 201401 data4
BBBB 201402 data5

outfil1 :
Code:
AAAA 201401 data1
AAAA 201402 data2
BBBB 201402 data5

outfil2:
Code:
BBBB 201401 data3
BBBB 201401 data4


I could probably do this by summarizing and appending a count, then joining with the original file and testing on that count, but I'm hoping it can be done a lot more simply.
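
For readers who want to check the requirement away from the mainframe, here is a minimal Python sketch of the split being asked for. It is not SyncSort syntax; the function name `split_by_key_count` and the 11-byte key length are assumptions taken from the sample data above.

```python
from collections import Counter


def split_by_key_count(records, key_len=11):
    """Split records into (singles, dups) by how often their key occurs.

    The key is assumed to be the first `key_len` characters of each record.
    """
    counts = Counter(rec[:key_len] for rec in records)
    singles = [rec for rec in records if counts[rec[:key_len]] == 1]
    dups = [rec for rec in records if counts[rec[:key_len]] > 1]
    return singles, dups


records = [
    "AAAA 201401 data1",
    "AAAA 201402 data2",
    "BBBB 201401 data3",
    "BBBB 201401 data4",
    "BBBB 201402 data5",
]
singles, dups = split_by_key_count(records)
print(singles)  # records whose key occurs once
print(dups)     # every record whose key occurs more than once
```

Note that both copies of a duplicated key go to the second list, which is the point of the question.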
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Tue Apr 08, 2014 3:13 pm

Hi GuyC,

Nice to see you!

Try this, not tested:

Code:
//STEP010 EXEC PGM=SORT                             
//SYSOUT   DD SYSOUT=*                                   
//SORTIN   DD DSN=your.input.file,DISP=SHR
//SORTOUT  DD DSN=non.dup.recs,DISP=...     
//SORTXSUM DD DSN=dup.recs,DISP= ...
//SYSIN    DD *                                         
  SORT FIELDS=(1,11,CH,A)                               
  SUM FIELDS=NONE,XSUM   
/*
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

Posted: Tue Apr 08, 2014 3:50 pm

Hi, I haven't been around for a while: job-hopping.

The solution is not what I wanted:
Quote:
SORT FIELDS=(1,11,CH,A)
SUM FIELDS=NONE,XSUM

sortout :
Code:
AAAA 201401 DATA1
AAAA 201402 DATA2
BBBB 201401 DATA3
BBBB 201402 DATA5

sortxsum:
Code:
BBBB 201401 DATA4
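
The result above follows from how XSUM works: one record per key is kept and only the extra copies are diverted. A rough Python sketch of that behaviour, for illustration only (the function name is made up, and which record of a duplicate set SyncSort actually keeps may depend on options such as EQUALS):

```python
def sum_fields_none_xsum(records, key_len=11):
    """Mimic SUM FIELDS=NONE,XSUM: keep one record per key, divert the rest.

    Assumption: the first record of each key (in stable sorted order) is kept.
    """
    seen = set()
    sortout, sortxsum = [], []
    for rec in sorted(records, key=lambda r: r[:key_len]):  # stable sort on the key
        key = rec[:key_len]
        if key in seen:
            sortxsum.append(rec)  # second and later records for this key
        else:
            seen.add(key)
            sortout.append(rec)   # first record for this key
    return sortout, sortxsum


records = [
    "AAAA 201401 DATA1",
    "AAAA 201402 DATA2",
    "BBBB 201401 DATA3",
    "BBBB 201401 DATA4",
    "BBBB 201402 DATA5",
]
sortout, sortxsum = sum_fields_none_xsum(records)
print(sortout)
print(sortxsum)
```

So `DATA3` stays with the unique records, whereas the requirement is to move both `DATA3` and `DATA4` out.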
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Apr 08, 2014 4:00 pm

If you have SyncSort 1.4 or later, this can be done with DUPKEYS NODUPS,XDUP, which needs a DD for the XDUP output. See the documentation for details.

If not, look at SELECT in SyncTool (no documentation, so you have to be creative about that part of it).
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

Posted: Tue Apr 08, 2014 5:38 pm

Thx,
We've got SYNCSORT FOR Z/OS 1.4.2.0R but (like a newbie) I can't find the documentation, which means I can't get it working with DUPKEYS.

I managed with synctool (SYNCTOOL RELEASE 1.7.2)
Code:
//TOOLIN   DD *                                                   
 SELECT FROM(SORTIN) TO(OUTDUPS) ON(1,11,CH) ALLDUPS DISCARD(OUTREST)

I thought that once I had this working I could fold it into the existing sort, but unfortunately I have no clue how.

Sorry for drip-feeding the requirement; I was hoping to be able to do it myself.
For those still willing: the original file also has a record type and a day, and the output needs to be in day order.
So I need all the type-1 records that have more than one occurrence in a month, ordered by date.

Code:
AAAA 20140101 0 maindata
AAAA 20140101 1 DATA2type1
AAAA 20140101 2 DATA2type2
AAAA 20140201 1 DATA3type1
BBBB 20140101 0 maindata
BBBB 20140101 1 DATA3a
BBBB 20140115 1 DATA3b
BBBB 20140127 1 DATA3c
BBBB 20140201 1 DATA6

and the original sort, which gave me all type-1 records sorted, was:
Code:
SORT FIELDS=(1,13,CH,A)
OUTFIL INCLUDE=(15,1,CH,EQ,C'1'),FNAMES=INCL001
OUTFIL INCLUDE=(15,1,CH,NE,C'1'),FNAMES=EXCL001

So: can I combine these two (other than by running two steps)?
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Apr 08, 2014 6:10 pm

Code:
 SORT FIELDS=(1,13,CH,A)
 DUPKEYS NODUPS,XDUP


This will give you a SORTOUT with only records which had unique keys. All records for keys which were duplicate (two or more records same key) will be excluded.

The XDUP says write all the excluded records to the SORTXDUP DD.

However, this will not give you what you want (requirement creep), because you want to know if there are multiple records for a month, not for a day, which is what you are sorting on.

Another however, the sample data you show is already in order. If this is correct, then why SORT?

Change the SORTIN DD to SORTIN01 and include a SORTXDUP DD:

Code:
 MERGE FIELDS=(1,11,CH,A)
 DUPKEYS NODUPS,XDUP


However (yet again) I don't really understand where the record-type comes into it.

You can still have your two OUTFILs (you can specify SAVE instead of the INCLUDE= with NE) but your SORTXDUP DD will contain a mix of record-types.

You can try to work with that, or show sample output for your (representative) sample input. Include RECFM and LRECL.

I don't have access to SyncSort, so this is theoretical...
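
To illustrate the month-versus-day point, here is a small Python sketch of the behaviour described above for DUPKEYS NODUPS,XDUP (unique keys to SORTOUT, every record of a duplicated key to SORTXDUP). It is only a model of that description, not SyncSort itself; the function name is invented, and only the type-1 records from the sample are used.

```python
from collections import Counter


def nodups_xdup(records, key_len):
    """Model of DUPKEYS NODUPS,XDUP as described in this post.

    Returns (sortout, sortxdup): records with a unique key, and all records
    belonging to keys that occur more than once, both in key order.
    """
    ordered = sorted(records, key=lambda r: r[:key_len])
    counts = Counter(rec[:key_len] for rec in ordered)
    sortout = [rec for rec in ordered if counts[rec[:key_len]] == 1]
    sortxdup = [rec for rec in ordered if counts[rec[:key_len]] > 1]
    return sortout, sortxdup


type1 = [
    "AAAA 20140101 1 DATA2type1",
    "AAAA 20140201 1 DATA3type1",
    "BBBB 20140101 1 DATA3a",
    "BBBB 20140115 1 DATA3b",
    "BBBB 20140127 1 DATA3c",
    "BBBB 20140201 1 DATA6",
]
day_out, day_dups = nodups_xdup(type1, key_len=13)      # key includes the day
month_out, month_dups = nodups_xdup(type1, key_len=11)  # key stops at the month
print(day_dups)    # nothing: every day-level key is unique
print(month_dups)  # the three BBBB 201401 records
```

With the 13-byte key no duplicates are found at all, which is why the key has to stop at position 11 to detect more than one record per month.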
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

Posted: Tue Apr 08, 2014 6:49 pm

I'll give you some background on where the requirement comes from:

The file contains a lot of different information, which is input for printing/analysis.
Record type 1 data: sales figures per region, month, supplier begin date, supplier end date.
Normally there is only one supplier at a given time, but when there are two or more I need to apply some business logic (this is the NEW functionality I'm building).

Code:
Supplier1 goes from 01/01/0001 => 21/1
Supplier2 goes from 14/1 => 99/99/9999
I have to create three type-1 output records:
Code:
Supplier1           : 01/01/0001 => 13/1  : amounts * 13 / 21 
CombineSuppl1&supp2 : 14/01/2014 => 21/01/2014 , weighted avg/sums
Supplier2           : 22/01/2014 => 99/99/9999 : amounts * 9 / 17

This involves remembering/storing stuff in working-storage arrays .
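
The date-splitting part of that logic can be sketched as follows, in Python rather than COBOL and purely as an illustration. The function name and data layout are assumptions, the dates are the SupA/SupB periods that appear in the sample input later in this thread, and the weighting of the amounts is deliberately left out.

```python
from datetime import date


def split_periods(periods):
    """Cut overlapping supplier periods into non-overlapping segments.

    `periods` is a list of (name, start, end) with inclusive dates. Returns a
    list of (names, start, end), one per segment in which the set of active
    suppliers does not change.
    """
    # Use ordinals so the day after 31/12/9999 never has to be a real date.
    cuts = set()
    for _, start, end in periods:
        cuts.add(start.toordinal())
        cuts.add(end.toordinal() + 1)
    cuts = sorted(cuts)

    segments = []
    for lo, hi in zip(cuts, cuts[1:]):
        active = [name for name, start, end in periods
                  if start.toordinal() <= lo and hi - 1 <= end.toordinal()]
        if active:
            segments.append(("&".join(active),
                             date.fromordinal(lo),
                             date.fromordinal(hi - 1)))
    return segments


periods = [
    ("SupA", date(1, 1, 1), date(2014, 2, 21)),
    ("SupB", date(2014, 2, 15), date(9999, 12, 31)),
]
segments = split_periods(periods)
for names, start, end in segments:
    print(names, start.isoformat(), end.isoformat())
```

Two overlapping periods give three segments: the first supplier alone, both together, and the second supplier alone.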

Since this is only 1% of the input file, I thought I would extract that 1% and use it as input.
Afterwards I would re-sort/concatenate my output file and the "discarded" rest of the input file for the next program.
So yes, my no-dups file contains all kinds of things, but I only need to pass them on.

I cannot guarantee that the input file is in the correct order; I typed it that way because that is the order in which the print program (the last/next program in the chain) receives the records.

I'll try to get the duplicates with DUPKEYS and re-sort that 1% on a day basis, in two steps.

Thanks for your help.
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Fri Apr 11, 2014 5:22 pm

GuyC,

Thanks for sharing all that information. Would you mind posting the expected output for your earlier sample input?
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

Posted: Fri Apr 11, 2014 7:49 pm

INPUT :
Code:
AAAA 20140101 0 maindata
BBBB 20140101 0 maindata
AAAA 20140101 1 DATA2   1000,00 SupX   01/01/0001 31/12/9999
AAAA 20140101 1 DATA2   0100,00 SupY   01/01/0001 31/12/9999
BBBB 20140101 1 DATA6   0111,00 SupA   01/01/0001 21/02/2014
AAAA 20140201 1 DATA3   0222,00 SupX   01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a  1000,00 SupA   01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b  0500,00 SupB   15/02/2014 31/12/9999
AAAA 20140101 2 DATAx

Run Sort <== This sort I had problems with :
SortOut1 : Include all Type1s with more than 1 record / month
Code:
AAAA 20140101 1 DATA2   1000,00 SupX   01/01/0001 31/12/9999
AAAA 20140101 1 DATA2   0100,00 SupY   01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a  1000,00 SupA   01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b  0500,00 SupB   15/02/2014 31/12/9999

Sortout2 : Everything else
Run Program
MyProgram_out
Code:
AAAA 20140101 1 DATA2   1100,00 SupX&Y 01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a  0658,00 SupA   01/01/0001 14/02/2014
BBBB 20140215 1 DATA3ab 0552,00 SupA&B 15/02/2014 21/02/2014
BBBB 20140222 1 DATA3b  0290,00 SupB   22/02/2014 31/12/9999

Run sort
sort Progout+sortout2
Code:
AAAA 20140101 0 maindata
AAAA 20140101 1 DATA2   1100,00 SupX&Y 01/01/0001 31/12/9999
AAAA 20140101 2 DATAx
AAAA 20140201 1 DATA3   0222,00 SupX   01/01/0001 31/12/9999
BBBB 20140101 0 maindata
BBBB 20140101 1 DATA6   0111,00 SupA   01/01/0001 21/02/2014
BBBB 20140201 1 DATA3a  0658,00 SupA   01/01/0001 14/02/2014
BBBB 20140215 1 DATA3ab 0552,00 SupA&B 15/02/2014 21/02/2014
BBBB 20140222 1 DATA3b  0290,00 SupB   22/02/2014 31/12/9999
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Wed Apr 16, 2014 9:29 pm

GuyC,

Not sure if I got your problem right. The SyncTool job below writes all the duplicates for record type 1 into the 'OUT' DD and the remaining records into the 'DIS' DD.
Code:
//STEP01    EXEC PGM=SYNCTOOL   
//TOOLMSG   DD SYSOUT=*         
//DFSMSG    DD SYSOUT=*         
//*                             
//IN        DD *                                             
AAAA 20140101 0 maindata                                     
BBBB 20140101 0 maindata                                     
AAAA 20140101 1 DATA2   1000,00 SupX   01/01/0001 31/12/9999
AAAA 20140101 1 DATA2   0100,00 SupY   01/01/0001 31/12/9999
BBBB 20140101 1 DATA6   0111,00 SupA   01/01/0001 21/02/2014
AAAA 20140201 1 DATA3   0222,00 SupX   01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a  1000,00 SupA   01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b  0500,00 SupB   15/02/2014 31/12/9999
AAAA 20140101 2 DATAx                                       
//*                                                         
//OUT       DD SYSOUT=*                                                 
//DIS       DD SYSOUT=*                                                 
//TOOLIN    DD *                                                       
 SELECT FROM(IN) TO(OUT) ON(1,11,CH) ON(81,4,CH) ALLDUPS -             
                         DISCARD(DIS) USING(CTL1)                       
//CTL1CNTL  DD *                                                       
 INREC IFTHEN=(WHEN=(15,1,ZD,EQ,1),OVERLAY=(81:C'0000')),               
       IFTHEN=(WHEN=NONE,OVERLAY=(81:SEQNUM,4,ZD))                     
 SORT FIELDS=(01,13,CH,A)                                               
 OUTFIL FNAMES=OUT,BUILD=(1,80)                                         
 OUTFIL FNAMES=DIS,BUILD=(1,80)

OUT
Code:
AAAA 20140101 1 DATA2   1000,00 SupX   01/01/0001 31/12/9999
AAAA 20140101 1 DATA2   0100,00 SupY   01/01/0001 31/12/9999
BBBB 20140201 1 DATA3a  1000,00 SupA   01/01/0001 21/02/2014
BBBB 20140215 1 DATA3b  0500,00 SupB   15/02/2014 31/12/9999

DIS
Code:
AAAA 20140101 0 maindata                                     
AAAA 20140101 2 DATAx                                       
AAAA 20140201 1 DATA3   0222,00 SupX   01/01/0001 31/12/9999
BBBB 20140101 0 maindata                                     
BBBB 20140101 1 DATA6   0111,00 SupA   01/01/0001 21/02/2014
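
The trick in the job above is the temporary marker at position 81: type-1 records all get `0000`, every other record gets a unique sequence number, so only type-1 records can ever match on key plus marker. A hedged Python model of that idea (the function name is invented, and the real SELECT operator's handling of equal keys is only approximated by a stable sort):

```python
from collections import Counter
from itertools import count


def select_type1_month_dups(records):
    """Model of the SELECT ... ALLDUPS job with the position-81 marker.

    Returns (out, dis): type-1 records whose month key (positions 1-11)
    occurs more than once, and everything else, both sorted on positions 1-13.
    """
    seq = count(1)

    def marker(rec):
        # Column 15 (index 14) is the record type; non-type-1 records get a
        # unique 4-digit sequence number so they can never be duplicates.
        return "0000" if rec[14] == "1" else f"{next(seq):04d}"

    tagged = [(rec[:11] + marker(rec), rec) for rec in records]
    counts = Counter(key for key, _ in tagged)
    out = sorted((rec for key, rec in tagged if counts[key] > 1),
                 key=lambda r: r[:13])
    dis = sorted((rec for key, rec in tagged if counts[key] == 1),
                 key=lambda r: r[:13])
    return out, dis


records = [
    "AAAA 20140101 0 maindata",
    "BBBB 20140101 0 maindata",
    "AAAA 20140101 1 DATA2   1000,00 SupX   01/01/0001 31/12/9999",
    "AAAA 20140101 1 DATA2   0100,00 SupY   01/01/0001 31/12/9999",
    "BBBB 20140101 1 DATA6   0111,00 SupA   01/01/0001 21/02/2014",
    "AAAA 20140201 1 DATA3   0222,00 SupX   01/01/0001 31/12/9999",
    "BBBB 20140201 1 DATA3a  1000,00 SupA   01/01/0001 21/02/2014",
    "BBBB 20140215 1 DATA3b  0500,00 SupB   15/02/2014 31/12/9999",
    "AAAA 20140101 2 DATAx",
]
out, dis = select_type1_month_dups(records)
print("\n".join(out))
print("\n".join(dis))
```

One thing to keep in mind with the real job: a 4-digit sequence number is only unique for the first 9999 non-type-1 records.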
All times are GMT + 6 Hours