I have records in 3 files, laid out as below:
Code:
File1 :
**********************
1234 y
1268 y
1299 y
1290 y
1000 y
file2 :
***********************
1000 y
1250 y
1386 y
1290 y
file3:
***********************
1000 y
1001 y
1003 y
***********************
Ignore the asterisks. I have put them there only to identify the columns.
Now I need to merge the records from these files into a single file, using the first 4 bytes as the key.
Code:
Output file:
**********************
1000 y y y
1001 y
1003 y
1234 y
1250 y
1268 y
1290 y y
1299 y
1386 y
Basically I am merging the data from more than 2 files into one file. We have a nice control card for 2 files that uses ICETOOL and sort to do this, but with more than 2 files I am getting confused by the permutations and combinations.
Even with join keys I was successful with only two files. For 3 files I have to join file 1 with 2, then 2 with 3, then 1 with 3, and then create a consolidated file.
I need your advice on avoiding this hardship and confusion. I am quite sure sort and ICETOOL can do this. If not, a COBOL program is my only way to resolve this issue.
If we can get this working, we are probably moving towards a generalized solution where I can process N files.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
k_rajesh,
If all files are of the same length and the Y flags are already in the right places, isn't it a simple task of just concatenating the files and splicing?
Here is an example that splices 3 files concatenated on the IN DD:
Code:
//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* Y flag columns: file 1 = 6, file 2 = 8, file 3 = 10
//IN       DD *
1234 Y
1268 Y
1299 Y
1290 Y
1000 Y
//         DD *
1000   Y
1250   Y
1386   Y
1290   Y
//         DD *
1000     Y
1001     Y
1003     Y
//OUT      DD SYSOUT=*
//TOOLIN   DD *
  SPLICE FROM(IN) TO(OUT) ON(1,4,CH) WITHEACH -
    WITH(8,1) WITH(10,1) KEEPNODUPS
//*
The output is
Code:
1000 Y Y Y
1001     Y
1003     Y
1234 Y
1250   Y
1268 Y
1290 Y Y
1299 Y
1386   Y
All you need to do is change the number of WITH operands to match the number of files.
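For readers who want to see what the WITHEACH splice does to each key, here is a rough Python model of it. It is only a sketch of the behaviour seen in this example, not DFSORT itself: the function name `splice_witheach`, its parameters and the 12-byte record length are assumptions made for illustration, and it ignores what happens when a key has more duplicates than WITH fields.

```python
def splice_witheach(records, key_len, with_fields, keep_nodups=True, lrecl=12):
    """Rough model of SPLICE ... WITHEACH on fixed-length records.

    with_fields is a list of (position, length) pairs, 1-based like DFSORT.
    lrecl is an assumed record length used only to pad the sample data.
    """
    groups = {}
    for rec in records:
        rec = rec.ljust(lrecl)                      # pad to fixed length
        groups.setdefault(rec[:key_len], []).append(rec)

    out = []
    for key in sorted(groups):                      # ON(1,4,CH): order by key
        base, *overlays = groups[key]               # first record is the base
        if not overlays and not keep_nodups:
            continue                                # no KEEPNODUPS: drop singles
        spliced = list(base)
        # WITHEACH: the first WITH field comes from the first overlay record,
        # the second WITH field from the second overlay record, and so on.
        for (pos, length), overlay in zip(with_fields, overlays):
            start = pos - 1
            spliced[start:start + length] = overlay[start:start + length]
        out.append("".join(spliced).rstrip())
    return out


# Y flag in column 6 for file 1, column 8 for file 2, column 10 for file 3
file1 = ["1234 Y", "1268 Y", "1299 Y", "1290 Y", "1000 Y"]
file2 = ["1000   Y", "1250   Y", "1386   Y", "1290   Y"]
file3 = ["1000     Y", "1001     Y", "1003     Y"]

result = splice_witheach(file1 + file2 + file3, 4, [(8, 1), (10, 1)])
for line in result:
    print(line)
```

The concatenation order matters in this model: the record from the first file that contains a key becomes the base, and later records for that key are consumed one per WITH field.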
Kolusu,
Just curious if we need to reposition 'Y' in three files before splicing. OP's input file shows 'Y' in the same position for all 3 input files.
I understand that the example you have shown has 'Y' in different positions and it gives expected results but OP's input had them in the same position.
Please correct me if I am wrong.
Quote:
File1 :
**********************
1234 y
1268 y
1299 y
1290 y
1000 y
file2 :
***********************
1000 y
1250 y
1386 y
1290 y
file3:
***********************
1000 y
1001 y
1003 y
***********************
Thanks for clarifying my doubt. I was looking in totally the wrong direction. Thanks for correcting me.
Also, your processing was right. The 'Y' flags are not in the same position in all files.
With them placed in the correct positions, I see that your splice does the magic I need.
Thanks,
K. Rajesh.
As usual while implementing the solution there were more developments.
Let me explain with the example below:
Master file:
1000 n n n n
1002 n n n n
1003 n n n n
1004 n n n n
1005 n n n n
File1:
1002 y n n n
1003 y n n n
File2:
1000 n y n n
1004 n y n n
File3:
1002 n n y n
File4:
1000 n n n y
1005 n n n y
When I excitedly applied the splice using only Files 1, 2, 3 and 4, I got the output below:
Rep1:
1000 n n n n
1002 y n n n
1003 y n n n
1004 n y n n
1005 n n n y
While I was expecting:
Rep1:
1000 n y n y
1002 y n y n
1003 y n n n
1004 n y n n
1005 n n n y
I realized that I had made a mistake with the WITH operands of the splice. I gave
SPLICE FROM(IN1) TO(OUT1) ON(1,4,CH) WITHEACH -
WITH(8,1) WITH(10,1) WITH(12,1)
This splice works well if the key is present in all the input files. It fails when a record is present in only some of them, for example only in files 2 and 4. So I created a master file containing all the keys without duplicates, as shown above, and then spliced the master file together with Files 1, 2, 3 and 4.
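The failure described here can be reproduced with a small Python model of WITHEACH. This is a sketch under assumptions, not DFSORT: the helper name `witheach` is made up, and it assumes KEEPNODUPS was in effect, since the reported output contains keys that occur in only one file. With WITHEACH, the first WITH field is always taken from the second record of a key, whichever file that record came from, so for key 1000 column 8 is copied from the File4 record and the y from File2 is overwritten.

```python
def witheach(records, key_len, with_fields):
    """Model of SPLICE ... WITHEACH ... KEEPNODUPS (KEEPNODUPS is assumed)."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[:key_len], []).append(rec)
    out = []
    for key in sorted(groups):
        base, *overlays = groups[key]
        spliced = list(base)
        # Field i is taken from overlay record i, regardless of its source file
        for (pos, length), overlay in zip(with_fields, overlays):
            spliced[pos - 1:pos - 1 + length] = overlay[pos - 1:pos - 1 + length]
        out.append("".join(spliced))
    return out


file1 = ["1002 y n n n", "1003 y n n n"]
file2 = ["1000 n y n n", "1004 n y n n"]
file3 = ["1002 n n y n"]
file4 = ["1000 n n n y", "1005 n n n y"]

got = witheach(file1 + file2 + file3 + file4, 4, [(8, 1), (10, 1), (12, 1)])
for line in got:
    print(line)
```

In this model key 1000 comes out as `1000 n n n n` and key 1002 as `1002 y n n n`, which matches the unexpected report above.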
I changed the splice as shown below:
SPLICE FROM(IN1) TO(OUT1) ON(1,4,CH) WITHEACH -
WITH(6,1) WITH(8,1) WITH(10,1) WITH(12,1)
I still did not get the desired output, because not every key is present in every file (e.g. 1000).
So I found that the only way was to splice the master file with each file in stages.
First I spliced the master file and File1, concatenated as IN1, into TMP1.
Then, in another JCL, I spliced TMP1 with File2, concatenated as IN2, using the splice below:
SPLICE FROM(IN2) TO(TMP2) ON(1,4,CH) WITHEACH -
WITH(6,7) KEEPNODUPS
In this manner I was able to get the required output, as shown below:
Rep1:
1000 n y n y
1002 y n y n
1003 y n n n
1004 n y n n
1005 n n n y
I hope this long diagnosis report is not irritating. I just wanted to detail everything I tried. I would be very happy if you could suggest an easier procedure.
Thanks,
K. Rajesh
P.S.: In a nutshell, my requirement is to first identify all the keys (assume employee codes) across all 4 files. Then I need to check which files each employee code is present in. The report should show each employee code and the files in which it is present, using Ys and Ns.
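Stated this way, the requirement is a presence report: one row per key, one Y/N column per file. A minimal Python sketch of that logic is below, only to pin down the expected result. The function name `presence_report` and the 4-byte key length default are assumptions for illustration, and it only looks at the key, not at the flags already stored in the records.

```python
def presence_report(files, key_len=4):
    """Return one line per key: the key followed by Y/N for each input file."""
    # Keys seen in each file, in file order
    key_sets = [{rec[:key_len] for rec in recs} for recs in files]
    # Union of all keys, sorted, plays the role of the master file
    all_keys = sorted(set().union(*key_sets))
    return [
        key + " " + " ".join("Y" if key in keys else "N" for keys in key_sets)
        for key in all_keys
    ]


files = [
    ["1002 Y N N N", "1003 Y N N N"],   # File1
    ["1000 N Y N N", "1004 N Y N N"],   # File2
    ["1002 N N Y N"],                   # File3
    ["1000 N N N Y", "1005 N N N Y"],   # File4
]

report = presence_report(files)
for line in report:
    print(line)
```

Because the columns are derived from the number of input lists, the same sketch works for any number of files.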
k_rajesh,
You are complicating a simple task. Use the following DFSORT JCL which will give you the desired results.
Code:
//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD *
1002 Y N N N
1003 Y N N N
//         DD *
1000 N Y N N
1004 N Y N N
//         DD *
1002 N N Y N
//         DD *
1000 N N N Y
1005 N N N Y
//OUT      DD SYSOUT=*
//TOOLIN   DD *
  SPLICE FROM(IN) TO(OUT) ON(1,4,CH) WITHANY USING(CTL1) -
    WITH(6,1) WITH(8,1) WITH(10,1) WITH(12,1) KEEPNODUPS
//CTL1CNTL DD *
  INREC FINDREP=(INOUT=(C'N',C' '),STARTPOS=6,DO=4)
  OUTFIL FNAMES=OUT,
  OVERLAY=(06:06,1,CHANGE=(1,C' ',C'N'),NOMATCH=(06,1),
           08:08,1,CHANGE=(1,C' ',C'N'),NOMATCH=(08,1),
           10:10,1,CHANGE=(1,C' ',C'N'),NOMATCH=(10,1),
           12:12,1,CHANGE=(1,C' ',C'N'),NOMATCH=(12,1))
//*
This will produce
Code:
1000 N Y N Y
1002 Y N Y N
1003 Y N N N
1004 N Y N N
1005 N N N Y
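To see why this job gives the right answer where WITHEACH did not, the three steps can be modelled in Python. This is a hedged sketch, not DFSORT: the name `splice_withany` and the `flag_cols` parameter are made up for illustration, and the model assumes 12-byte records with one-byte flags. INREC first blanks out every N, WITHANY then copies only nonblank flag bytes from the later records of a key into the first one, and OUTFIL finally turns the remaining blanks back into N.

```python
def splice_withany(records, key_len, flag_cols):
    """Model of: INREC FINDREP (N -> blank), SPLICE WITHANY, OUTFIL (blank -> N).

    flag_cols are 1-based column numbers of the one-byte flag fields.
    """
    groups = {}
    for rec in records:
        # INREC FINDREP: replace 'N' with a blank from column 6 onward
        rec = rec[:5] + rec[5:].replace("N", " ")
        groups.setdefault(rec[:key_len], []).append(list(rec))

    out = []
    for key in sorted(groups):
        base, *overlays = groups[key]          # KEEPNODUPS: singles are kept
        # WITHANY: a nonblank flag in any later record replaces the base value
        for overlay in overlays:
            for col in flag_cols:
                if overlay[col - 1] != " ":
                    base[col - 1] = overlay[col - 1]
        # OUTFIL OVERLAY with CHANGE: blanks left in the flag columns become 'N'
        for col in flag_cols:
            if base[col - 1] == " ":
                base[col - 1] = "N"
        out.append("".join(base))
    return out


records = [
    "1002 Y N N N", "1003 Y N N N",    # file 1
    "1000 N Y N N", "1004 N Y N N",    # file 2
    "1002 N N Y N",                    # file 3
    "1000 N N N Y", "1005 N N N Y",    # file 4
]

final = splice_withany(records, 4, [6, 8, 10, 12])
for line in final:
    print(line)
```

Unlike WITHEACH, the result here does not depend on which file each duplicate came from, so no master file or staged splicing is needed.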