Flagging duplicates

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

I want to sort a file on a key SORT FIELDS=(1,5,CH,A) . I want to sort, keep all the records, I just want to flag the duplicates. Say my input file is of length 5 . I want output file of length six and mark all the records those are duplicate with a flag Y.
Example

abcde
abcde
qqqqq
rrrrrr
qqqqq
ppppp

I want my o/p file as

abcdey
abcdey
ppppp
rrrrrr
qqqqqy
qqqqqy

Frank Yaeger · Posted: Thu Dec 06, 2007 10:33 pm

Here's a DFSORT/ICETOOL job that will do what you asked for:

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

Thanks Frank,

Just to understand it more, for the same condition, where I want all the i/p records in o/p file, with duplicates flagged, how I am going to code , if my input file length is 100, output is 101,sort fileds are 1,15 and flag should be at position 101.

I am trying to understand the control cards -

OUTREC OVERLAY=(6:SEQNUM,8,ZD,RESTART=(1,5)) .

SORT FIELDS=(1,5,CH,A,6,8,ZD,D)
OUTFIL FNAMES=OUT,
IFTHEN=(WHEN=(6,8,ZD,GT,+1),BUILD=(1,5,C'y')),
IFTHEN=(WHEN=NONE,BUILD=(1,5,X))

Frank Yaeger · Posted: Fri Dec 07, 2007 12:42 am

I assumed that you only had the key in each record as shown in your original example, so you didn't care about the order of the records with the same key for output. If you have other fields in the record and do care about the order of the records with the same key for output, then we'd need to do it a different way.

Let's start over. Show me a better example of your input records and expected output records with the other fields in the record besides the key so I can see what you really want.

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

OK. Got it.

Let me restart it. I have one i/p file, LRECL=100. My sort key is 15 chars.
rest all won't matter to me. I have already sorted the file with the key. Now say, I have 100 records in my sorted i/p file, with 20 duplicates, means total 80 unique records and 20 with duplicates. I want the output file LRECL=100 with a flag in each duplicate record at position 101 so in my cobol program, I know that it's a duplicate and I can process it accordingly checking for the flag. So I will have 100 in my o/p file and have 20 with flags and 80 without any flags.

The '.............' in the example are fields with 9's , A's and X's I want those as it is and they have nothing to do with sort or duplicates.

Example.
105682004709136.......................... < 100>
105682004709136.......................... < 100>
105682025446815.......................... < 100>
105682093745261.......................... < 100>
105682093745261.......................... < 100>
105682095668485.......................... < 100>

I want my o/p file as
105682004709136.......................... < 100>Y
105682004709136.......................... < 100>Y
105682025446815.......................... < 100>
105682093745261.......................... < 100>Y
105682093745261.......................... < 100>Y
105682095668485.......................... < 100>

Frank Yaeger · Posted: Fri Dec 07, 2007 2:04 am

I'm not sure what the answer to my previous question is so I'll ask it more directly:

Let's say your input is:

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

Those could be in any order.

dick scherrer · Posted: Fri Dec 07, 2007 2:36 am

Hello,

Sorry to "charge in", but i have to ask. . .

Is there some reason processing the data 3 times is better than adding the bit of code needed to handle duplicates in the COBOL program (which requires only 1 pass of the data)?

Hopefully, there is something i am misunderstanding. . .

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

If I have duplicates, those need to be reported and I need to add up the sum of amounts. I can not ignore, it is not to delete or omit duplicates, but to flag the duplicates and then use the file with the duplicates for calculating some amounts and also to put those in reporting.
There may be same key, but the other fields could be different. Means a same key, under different department, and getting some benefits. So need to know what all benefits key has received under the different department and to update the departments of duplicate keys in the cobol reports !
So I will have a good file ready with the duplicates flagged. I did write a program to do this, but what can be done in JCL for 100,000's of records will take time in Cobol.
I hope it explains !

dick scherrer · Posted: Fri Dec 07, 2007 3:04 am

Hello,

I believe that i understand what you need to do.

I also believe that proper coding would allow you to process the 100,000s of data only one time rather than the 3 times this approach will require.

The data you show is already in sequence, so that is not an issue.

Frank Yaeger · Posted: Fri Dec 07, 2007 3:28 am

mfarien,

Here's an updated DFSORT/ICETOOL job for your "new" requirement.

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

Thanks Frank it worked well and I have learned something new and good for my day to day use.

dick scherrer · Posted: Fri Dec 07, 2007 9:40 pm

Hello,

Well, you have learned something new. . .

For your requirement it is very likely not good and should surely not be used day to day.

Maybe someday you will also learn that it is nearly never a good decision to read all of the data, write all of the data, and read it all again when a single read would be sufficient.

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

I am not processing it so many times as you have understood.
Here a raw file is sorted and flagged for duplicates in JCL and later on used for processing in a Cobol.
What would be best way to do it in a single read ?
( When I do need an to process duplicates in reports and transaction files differently in cobol processing)

dick scherrer · Posted: Fri Dec 07, 2007 11:22 pm

Hello,

mfarien · New User Joined: 02 Mar 2007 Posts: 17 Location: USA

Yes, that's what we were discussing. If I am using the above ICETOOL step I am not going to have a sort step above it. I will remove my sort and use this. So sort and flag addition in 1 step only. I already mentioned in my post in which I gave the data 'I have already sorted the file with the key'. ....
Hope now we are on same page ..

dick scherrer · Posted: Sat Dec 08, 2007 1:14 am

Yup, i believe we are.

Good luck

d