mfarien
New User
Joined: 02 Mar 2007 Posts: 17 Location: USA
I want to sort a file on the key SORT FIELDS=(1,5,CH,A). I want to keep all the records and just flag the duplicates. Say my input file has a record length of 5. I want an output file with a record length of 6, with every record that has a duplicate key marked with a flag Y.
Example
abcde
abcde
qqqqq
rrrrr
qqqqq
ppppp
I want my o/p file as
abcdey
abcdey
ppppp
qqqqqy
qqqqqy
rrrrr
Frank Yaeger
DFSORT Developer

Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Here's a DFSORT/ICETOOL job that will do what you asked for:
| Code: |
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=... input file (FB/5)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=... output file (FB/6)
//TOOLIN DD *
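* Pass 1: sort on the key and number the records within each key.
SORT FROM(IN) TO(T1) USING(CTL1)
* Pass 2: give every record of a key the highest sequence number.
SPLICE FROM(T1) TO(OUT) ON(1,5,CH) KEEPBASE KEEPNODUPS -
  WITHALL WITH(1,5) USING(CTL2)
/*
//CTL1CNTL DD *
* Add an 8-byte sequence number at 6-13, restarting at 1 for each key.
  SORT FIELDS=(1,5,CH,A)
  OUTREC OVERLAY=(6:SEQNUM,8,ZD,RESTART=(1,5))
/*
//CTL2CNTL DD *
* Highest sequence number first within each key; flag keys that have
* more than one record.
  SORT FIELDS=(1,5,CH,A,6,8,ZD,D)
  OUTFIL FNAMES=OUT,
    IFTHEN=(WHEN=(6,8,ZD,GT,+1),BUILD=(1,5,C'y')),
    IFTHEN=(WHEN=NONE,BUILD=(1,5,X))
/*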
|
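For anyone following the control cards, the logic of the job above can be sketched outside DFSORT. This is a minimal Python illustration of the idea only, not DFSORT syntax: the function name `flag_duplicates`, its parameters, and the use of in-memory lists of fixed-length strings are all assumptions made for the example.

```python
from itertools import groupby


def flag_duplicates(records, key_len=5, flag="y"):
    """Hypothetical stand-in for the two ICETOOL passes (illustration only)."""
    def key_of(rec):
        return rec[:key_len]

    # Pass 1 (CTL1): sort on the key and number the records within each key,
    # which is what SEQNUM with RESTART=(key) does.
    numbered = []
    for _, group in groupby(sorted(records, key=key_of), key=key_of):
        for seq, rec in enumerate(group, start=1):
            numbered.append((rec, seq))

    # Pass 2 (CTL2 + SPLICE): put the highest sequence number first within
    # each key, then let every record of that key share it.
    numbered.sort(key=lambda pair: (key_of(pair[0]), -pair[1]))
    out = []
    for _, group in groupby(numbered, key=lambda pair: key_of(pair[0])):
        group = list(group)
        size = group[0][1]  # highest sequence number = records for this key
        mark = flag if size > 1 else " "
        out.extend(rec + mark for rec, _ in group)
    return out
```

Every record whose key occurs more than once gets the flag; the others get a trailing blank. As in the job, records with the same key may come out in a different order than they went in.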

mfarien
Thanks Frank,
Just to understand it better: for the same requirement, where I want all the input records in the output file with the duplicates flagged, how would I code it if my input record length is 100, the output record length is 101, the sort field is 1,15 and the flag should be at position 101?
I am trying to understand these control cards:
OUTREC OVERLAY=(6:SEQNUM,8,ZD,RESTART=(1,5))
SORT FIELDS=(1,5,CH,A,6,8,ZD,D)
OUTFIL FNAMES=OUT,
IFTHEN=(WHEN=(6,8,ZD,GT,+1),BUILD=(1,5,C'y')),
IFTHEN=(WHEN=NONE,BUILD=(1,5,X))

Frank Yaeger
I assumed that you only had the key in each record as shown in your original example, so you didn't care about the order of the records with the same key for output. If you have other fields in the record and do care about the order of the records with the same key for output, then we'd need to do it a different way.
Let's start over. Show me a better example of your input records and expected output records with the other fields in the record besides the key so I can see what you really want.

mfarien
OK. Got it.
Let me start over. I have one input file, LRECL=100. My sort key is 15 characters; the rest of the record doesn't matter for the sort. I have already sorted the file on the key. Say I have 100 records in my sorted input file, 20 of them duplicates, which means 80 unique records and 20 duplicates. I want the output file to have LRECL=101, with a flag at position 101 of each duplicate record, so that my COBOL program knows the record is a duplicate and can process it accordingly by checking the flag. So I will have 100 records in my output file, 20 with flags and 80 without.
The '.............' in the example stands for fields containing 9's, A's and X's. I want those left as they are; they have nothing to do with the sort or the duplicates.
Example.
105682004709136.......................... < 100>
105682004709136.......................... < 100>
105682025446815.......................... < 100>
105682093745261.......................... < 100>
105682093745261.......................... < 100>
105682095668485.......................... < 100>
I want my o/p file as
105682004709136.......................... < 100>Y
105682004709136.......................... < 100>Y
105682025446815.......................... < 100>
105682093745261.......................... < 100>Y
105682093745261.......................... < 100>Y
105682095668485.......................... < 100>

Frank Yaeger
I'm not sure what the answer to my previous question is so I'll ask it more directly:
Let's say your input is:
| Code: |
105682004709136.R01...................... < 100>
105682004709136.R02...................... < 100>
105682004709136.R03...................... < 100>
105682025446815.R04...................... < 100>
105682093745261.R05...................... < 100>
105682093745261.R06...................... < 100>
105682095668485.R07...................... < 100>
|
Can the output have the records with the same keys in any order (e.g. R03, R02, R01 for the first key):
| Code: |
105682004709136.R03...................... < 100>Y
105682004709136.R02...................... < 100>Y
105682004709136.R01...................... < 100>Y
...
|
Or must the output have the records with the same keys in their original order (e.g. R01, R02, R03 for the first key):
| Code: |
105682004709136.R01...................... < 100>Y
105682004709136.R02...................... < 100>Y
105682004709136.R03...................... < 100>Y
...
|

mfarien
Those could be in any order.
dick scherrer
Moderator Emeritus

Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Sorry to "charge in", but I have to ask. . .
Is there some reason processing the data 3 times is better than adding the bit of code needed to handle duplicates in the COBOL program (which requires only 1 pass of the data)?
Hopefully, there is something I am misunderstanding. . .

mfarien
If I have duplicates, they need to be reported and I need to sum their amounts. I cannot ignore them; the goal is not to delete or omit duplicates, but to flag them and then use the file, duplicates included, to calculate some amounts and to include them in reports.
The key may be the same while the other fields differ, for example the same key under different departments, each receiving some benefits. So I need to know all the benefits a key has received under the different departments, and to update the departments of the duplicate keys in the COBOL reports.
So I will have a good file ready with the duplicates flagged. I did write a program to do this, but what can be done in JCL for 100,000s of records will take time in COBOL.
I hope that explains it!

dick scherrer
Hello,
I believe that I understand what you need to do.
I also believe that proper coding would allow you to process the 100,000s of records only one time rather than the 3 times this approach will require.
The data you show is already in sequence, so that is not an issue.
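The single-pass idea can be sketched as follows, assuming the input is already sorted on the key. This is a minimal Python illustration rather than the COBOL the thread is about; the function name `flag_sorted_duplicates`, its parameters, and the in-memory record strings are hypothetical. It holds one record back so that each record can be compared with both its predecessor and its successor before it is written.

```python
def flag_sorted_duplicates(records, key_len=15, flag="Y"):
    """Yield each record plus a flag byte, reading sorted input once.

    Hypothetical helper: a record is flagged when its key matches the
    key of the record before it or the record after it.
    """
    held = None        # one-record lookahead buffer
    held_dup = False   # True if the held record matched its predecessor
    for rec in records:
        if held is not None:
            same = rec[:key_len] == held[:key_len]
            # The held record is a duplicate if it matched the previous
            # record or matches the one just read.
            yield held + (flag if (held_dup or same) else " ")
            held_dup = same
        held = rec
    if held is not None:
        # The last record can only match its predecessor.
        yield held + (flag if held_dup else " ")
```

Unlike a sort-based approach, this keeps records with the same key in their original order, and it needs only the current and the previous record in memory.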

Frank Yaeger
mfarien,
Here's an updated DFSORT/ICETOOL job for your "new" requirement.
| Code: |
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=... input file (FB/100)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=... output file (FB/101)
//TOOLIN DD *
SORT FROM(IN) TO(T1) USING(CTL1)
SPLICE FROM(T1) TO(OUT) ON(1,15,CH) KEEPBASE KEEPNODUPS -
  WITHALL WITH(1,100) USING(CTL2)
/*
//CTL1CNTL DD *
  SORT FIELDS=(1,15,CH,A)
  OUTREC OVERLAY=(102:SEQNUM,8,ZD,RESTART=(1,15))
/*
//CTL2CNTL DD *
  SORT FIELDS=(1,15,CH,A,102,8,ZD,D)
  OUTFIL FNAMES=OUT,
    IFTHEN=(WHEN=(102,8,ZD,GT,+1),BUILD=(1,100,C'Y')),
    IFTHEN=(WHEN=NONE,BUILD=(1,100,X))
/*
|

mfarien
Thanks Frank, it worked well, and I have learned something new that is good for my day-to-day use.

dick scherrer
Hello,
Well, you have learned something new. . .
For your requirement it is very likely not good and should surely not be used day to day.
Maybe someday you will also learn that it is nearly never a good decision to read all of the data, write all of the data, and read it all again when a single read would be sufficient.

mfarien
I am not processing it so many times as you have understood.
Here a raw file is sorted and flagged for duplicates in JCL, and later used for processing in a COBOL program.
What would be the best way to do it in a single read?
(When I do need to process duplicates in reports and transaction files differently in COBOL processing)

dick scherrer
Hello,
| Quote: |
| I am not processing it so many times as you have understood. |
The data you posted as the "input" is already in sequence, which would lead to "extra" processing. If the "real" data will not be in sequence, sorting it will not be extra overhead; it would be needed.
| Quote: |
| (When I do need to process duplicates in reports and transaction files differently in COBOL processing) |
Please clarify this. I do not understand. . .

mfarien
Yes, that's what we were discussing. If I use the above ICETOOL step, I am not going to have a sort step before it. I will remove my sort and use this, so the sort and the flagging happen in one step only. I already mentioned this in the post in which I gave the data: 'I have already sorted the file with the key'.
Hope we are on the same page now.

dick scherrer
Yup, i believe we are.
Good luck
d