Joinkeys with duplicated keys

juares castro · New User Joined: 04 May 2012 Posts: 34 Location: Brazil

Hi everyone!

I am trying to do a JOINKEYS in SYNCSORT with duplicate keys in both input files. In this example the two files are identical as far as the keys are concerned:
position 4, length 3.

Here are the input files:

SORTJNF1

Rohit Umarjikar · Posted: Fri Sep 23, 2016 8:19 am

Please learn to use code tags.
Why do you have

mistah kurtz · Posted: Fri Sep 23, 2016 10:48 am

You have mentioned key positions as POS: 4, LENGTH: 3, but your input/output suggest otherwise.

Try this sort card:

Bill Woodger · Posted: Fri Sep 23, 2016 12:12 pm

mistah kurtz,

If you use 3,4 as the keys, there are no duplicate keys, so there is evidence for OP being correct with 4,3 (their problem is with duplicate keys).

juares castro,

You are SORTing both files (each JOINKEYS SORTs its input unless told not to). Your sample data doesn't show any reason for this (the data is already in key order). Consider using SORTED on the JOINKEYS statement(s).

It looks as though you just want to "side-by-side" the data. Both your input files are the same. The best way to do this is to generate a sequence number on each record (appended for fixed-length records, prepended to the data for variable-length records) and use the sequence number as the key.
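The side-by-side idea can be illustrated outside of SORT. The following is only a minimal Python sketch of the logic, not SyncSort control statements; the 9-byte records, the 8-digit sequence width and the function names are assumptions made up for the illustration.

```python
def add_seqnum(records):
    """Append an 8-digit sequence number to each fixed-length record."""
    return [f"{rec}{n:08d}" for n, rec in enumerate(records, start=1)]


def join_on_seqnum(f1, f2, reclen):
    """Pair records whose appended sequence numbers are equal (inner join)."""
    # Index the second file by sequence number; the data part is rec[:reclen].
    f2_by_seq = {rec[reclen:]: rec[:reclen] for rec in add_seqnum(f2)}
    joined = []
    for rec in add_seqnum(f1):
        data, seq = rec[:reclen], rec[reclen:]
        if seq in f2_by_seq:  # unmatched records are dropped
            joined.append(data + f2_by_seq[seq])
    return joined


# Hypothetical sample records; the key sits at position 4, length 3.
f1 = ["AAA111A01", "AAA333A02", "AAA333A03"]
f2 = ["BBB111B01", "BBB333B02", "BBB333B03"]
side_by_side = join_on_seqnum(f1, f2, reclen=9)
```

Because every sequence number occurs once per file, record n of the first file is paired only with record n of the second, whatever the data keys contain.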

Arun Raj · Posted: Fri Sep 23, 2016 5:25 pm

Bill Woodger · Posted: Fri Sep 23, 2016 9:00 pm

No, Arun. On inspection the data shown, including the key, is identical and is claimed to be identical as you noted.

And no, I don't know why this:

Arun Raj · Posted: Fri Sep 23, 2016 9:20 pm

Bill - Records 3,4 and 5 are different.

RahulG31 · Active User Joined: 20 Dec 2014 Posts: 446 Location: USA

I wanted to make sure this comment from Bill is not missed; this should be what the OP wanted:

Bill Woodger · Posted: Fri Sep 23, 2016 9:40 pm

Thanks, Arun. Well, it wasn't a very good "inspection" then, was it? Now I understand what the TS/OP meant :-)

If the keys are identical on the two files, the simplest for me is to join on the sequence numbers. You can verify that the keys are identical if desired.

Each of the three (if I have looked correctly) "333" keys on F1 will match with each of the three "333" keys on F2, giving nine "333" keys on the joined data.
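The nine-from-three effect can be reproduced with a small Python sketch of a keyed join. This is an illustration of the matching logic only, not of how SyncSort is implemented; the six records per file are hypothetical stand-ins, since only the three "333" keys are known from the thread.

```python
from collections import defaultdict


def key_of(record):
    # Key at position 4 (1-based), length 3 -> Python slice [3:6].
    return record[3:6]


def join_on_key(f1, f2):
    """Pair every F1 record with every F2 record that has the same key."""
    f2_by_key = defaultdict(list)
    for rec in f2:
        f2_by_key[key_of(rec)].append(rec)
    return [r1 + r2 for r1 in f1 for r2 in f2_by_key.get(key_of(r1), [])]


# Hypothetical data: six records per file, three of them with key "333".
f1 = ["AAA111A01", "AAA222A02", "AAA333A03",
      "AAA333A04", "AAA333A05", "AAA444A06"]
f2 = ["BBB111B01", "BBB222B02", "BBB333B03",
      "BBB333B04", "BBB333B05", "BBB444B06"]

joined = join_on_key(f1, f2)
joined_333 = [rec for rec in joined if key_of(rec) == "333"]
```

With this data the unique keys each produce one joined record and the "333" key produces 3 x 3 = 9, so six input records per file become twelve on output.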

If there is not a one-to-one relationship in the keys, then the easiest is to make the keys unique, by adding a sequence number with RESTART= for the key value field. So, two-part key, 333 001, 333 002, 333 003, and then the join will be one-for-one.
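A rough Python sketch of the two-part key follows. It mimics a sequence number that restarts whenever the key value changes and then joins on (key, sequence). The record contents, the 3-digit sequence width and the function names are assumptions for illustration, and the input is assumed to be in key order.

```python
def add_restart_seq(records):
    """Append a 3-digit sequence number that restarts at 001 on each key change."""
    out, prev_key, n = [], None, 0
    for rec in records:
        key = rec[3:6]  # key at position 4, length 3
        n = n + 1 if key == prev_key else 1
        prev_key = key
        out.append(f"{rec}{n:03d}")
    return out


def join_one_for_one(f1, f2, reclen):
    """Inner join on the two-part key: (data key, restarted sequence number)."""
    f2_by_key = {(r[3:6], r[reclen:]): r[:reclen] for r in add_restart_seq(f2)}
    joined = []
    for r in add_restart_seq(f1):
        two_part_key = (r[3:6], r[reclen:])
        if two_part_key in f2_by_key:
            joined.append(r[:reclen] + f2_by_key[two_part_key])
    return joined


# Hypothetical data with a different number of "333" records on each file.
f1 = ["AAA111A01", "AAA333A02", "AAA333A03", "AAA333A04"]
f2 = ["BBB111B01", "BBB333B02", "BBB333B03"]
one_for_one = join_one_for_one(f1, f2, reclen=9)
```

Here the third "333" record on the first file has no partner (333 003 exists on one side only), so an inner join drops it rather than multiplying the matches.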

Arun Raj · Posted: Fri Sep 23, 2016 11:11 pm

Happens sometimes. You're welcome, Bill.

juares castro · New User Joined: 04 May 2012 Posts: 34 Location: Brazil

Sorry All!
The position of the keys is 4, length 3.
My intention is to understand why duplicate keys (or records) are being created.
I think my code is not OK... it is missing something.
In this example we have the same number of records, but we could have this scenario with different numbers of records. To demonstrate, I filled the 2nd file with different content after the keys.

Thanks in advance!

Bill Woodger · Posted: Sat Sep 24, 2016 10:25 am

And what is it you want to know that hasn't been covered?

sergeyken · Posted: Sat Sep 24, 2016 5:18 pm

juares castro · New User Joined: 04 May 2012 Posts: 34 Location: Brazil

Hi again!
Sorry, but first of all, I hope that to ask questions in this forum I do not need to know everything about SYNCSORT. I am not an expert.
My intention is to understand whether there is a different way to code what I have done that does not create extra records in SORTOUT. I mean, in my example we have the same number of records in the 2 files (6 recs) with identical keys (but we could have different numbers of records with non-identical keys).
In this example my intention is to create a SORTOUT with 6 records, like its input files.
Looking at the responses above, I guess that by using a sequence number I could get the result I want.

sergeyken · Posted: Sun Sep 25, 2016 3:19 am

I repeat in bold:

First of all, you have to understand what exactly you want (or are supposed) to do.
Then you can ask about SYNCSORT, or not SYNCSORT, or any other tool
- from your message it's not obvious whether SYNCSORT is the appropriate tool in your case.
You did not explain what exactly your task is.

juares castro · New User Joined: 04 May 2012 Posts: 34 Location: Brazil

Hi Sergeyken!
Repeating in normal text:
My intention is to match two files so that, when we have duplicate keys in one of them or in both, the result does not bring us "new" records.
I think SYNCSORT can do this, but I do not know how to do it using only the keys to match. As I said, maybe by using a sequence number. Could I be right? (in bold)

Abid Hasan · New User Joined: 25 Mar 2013 Posts: 88 Location: India

Hello Juares,

Though the same was already explained earlier in Mr. Woodger's post, I will try putting it across again: *SORT (i.e. SYNC* or DF*) works by performing a cartesian join on keys. In your case, the join on 3 bytes from position 4 achieved exactly this, as shown in your first post. Since you had multiple matching entries ON THESE key positions, a matched pair for each combination of these records was also built by sort. To reiterate: there are matching combinations of keys for 3 bytes from position 4.

To avoid this, you can have another key combination, i.e. 3 bytes from position 7, in conjunction with what you already have.
OR
You can use the suggestions others have already stated.

Again, the duplicates are not really duplicates, they are all 'unique combinations' if you compare them using the keys.

Rohit Umarjikar · Posted: Mon Sep 26, 2016 9:54 pm

"joinkeys with no repeated records": this is precisely what you want. You could do this yourself in future, as there are a lot of examples of this kind to learn from on this forum.

Bill Woodger · Posted: Tue Sep 27, 2016 1:05 am

I'm probably going to take a knife to this topic...

Unless you are on a very old SyncSORT, a solution from 2009 is not appropriate.

Use JNF1CNTL and JNF2CNTL to append/prepend (fixed-length records/variable-length records) sequence numbers, then include those as minor keys for your JOINKEYS statements.

If this doesn't work, post, in full, what you have tried, including the full output (all three tasks) from the spool, with representative sample data, the output you obtain, and the output you desire.

juares castro · New User Joined: 04 May 2012 Posts: 34 Location: Brazil

I thank you all!