obulisankar
New User
Joined: 03 May 2007 Posts: 20 Location: bangalore
Hi all,
I have a question: when comparing two or more files, which utility is best in terms of performance and space?
I am not sure whether to use ICETOOL or the SORT utility.
Any suggestions or pointers would be a great help.
Thanks in advance.

Skolusu
Senior Member
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
obulisankar,
It depends. Factors such as LRECL, RECFM, the volume of records, the number of duplicates, and the length of the fields to be compared all influence performance. Please provide sample input and the desired output, the DCB properties of all the files involved, and the rules for comparison.
Kolusu

obulisankar
New User
Joined: 03 May 2007 Posts: 20 Location: bangalore
Kolusu,
Here is the information you asked for:
LRECL --> 160
RECFM --> FB
Volume of records --> will be in the millions
No. of duplicates --> varies
No. of keys used for comparison --> 1
Length of key field --> 10
Position of key --> 147
The DCB properties are the same for all the input and output files.
I have 3 files: File1, File2, File3. All of these files contain records of the same type.
I need to compare these 3 files and get 2 output files.
Output File1: should contain the records that match between File1 and File2 but do not match File3.
Output File2: should contain the records that match across all 3 files.
Example:
--------
File1:
-----
AAA BBB 100 DDD
EEE FFF 200 ccc
DDD NNN 300 AAA
ZZZ NNN 300 AAA
File2:
------
AAA BBB 100 DDD
CCC FFF 600 EEE
XXX JJJ 400 VVV
ZZZ NNN 300 AAA
File3:
------
AAA BBB 100 DDD
CCC FFF 200 GGG
EEE JJJ 500 KKK
Output File1:
--------------
DDD NNN 300 AAA
ZZZ NNN 300 AAA
Output File2:
-------------
AAA BBB 100 DDD
Thanks in advance

Skolusu
Senior Member
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
obulisankar,
How do you want to handle duplicates? Do you need the duplicate records in the output when a match is found? Let's say File1 has 5 duplicates for the key AAA BBB 100 DDD, and it is a matching key in all 3 files. Do you need to see all 5 records in the output or just one?
Kolusu

obulisankar
New User
Joined: 03 May 2007 Posts: 20 Location: bangalore
Yes. It is like doing File1 - File2: when comparing two files, File1 and File2, only the duplicates from the first file should be retained.
Thanks

obulisankar
New User
Joined: 03 May 2007 Posts: 20 Location: bangalore
I want all the duplicates from the first file to be in the output file.

Skolusu
Senior Member
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
obulisankar,
The following DFSORT/ICETOOL JCL will give you the desired results.
Code: |
//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//FILE1 DD *
AAA BBB 100 DDD - DUP1
AAA BBB 100 DDD - DUP2
AAA BBB 100 DDD - DUP3
AAA BBB 100 DDD - DUP4
AAA BBB 100 DDD - DUP5
EEE FFF 200 CCC
DDD NNN 300 AAA
ZZZ NNN 300 AAA
//FILE2 DD *
AAA BBB 100 DDD
CCC FFF 600 EEE
DDD NNN 300 VVV
ZZZ NNN 300 AAA
//FILE3 DD *
AAA BBB 100 DDD
CCC FFF 200 GGG
EEE JJJ 500 KKK
//T1 DD DSN=&&T1,DISP=(MOD,PASS),SPACE=(CYL,(X,Y),RLSE)
//T2 DD DSN=&&T2,DISP=(MOD,PASS),SPACE=(CYL,(X,Y),RLSE)
//OUT1 DD SYSOUT=*
//OUT2 DD SYSOUT=*
//TOOLIN DD *
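* FILE2: one record per key, marker in 161-170
  SORT FROM(FILE2) USING(CTL1)
* FILE3: one record per key, marker in 171-180
  SORT FROM(FILE3) USING(CTL2)
* Combine the FILE2/FILE3 markers into one record per key
  SORT FROM(T1) USING(CTL3)
* Append every FILE1 record (duplicates included) to T2
  COPY FROM(FILE1) USING(CTL4)
* Overlay each FILE1 record (1-160) onto the marker record
  SPLICE FROM(T2) TO(OUT1) ON(147,10,CH) -
    WITHALL WITH(01,160) USING(CTL5)
//CTL1CNTL DD *
  SORT FIELDS=(147,10,CH,A)
  SUM FIELDS=NONE
  OUTFIL FNAMES=T1,OVERLAY=(161:147,10,10Z)
//CTL2CNTL DD *
  SORT FIELDS=(147,10,CH,A)
  SUM FIELDS=NONE
  OUTFIL FNAMES=T1,OVERLAY=(161:10Z,147,10)
//CTL3CNTL DD *
  SORT FIELDS=(147,10,CH,A)
  SUM FIELDS=(161,8,BI,169,8,BI,177,4,BI)
  OUTFIL FNAMES=T2
//CTL4CNTL DD *
  OUTFIL FNAMES=T2,OVERLAY=(161:20X)
//CTL5CNTL DD *
* OUT1: 171-180 is binary zeros, so the key is not in FILE3
  OUTFIL FNAMES=OUT1,INCLUDE=(171,10,BI,EQ,0),BUILD=(1,160)
* OUT2: all remaining spliced records
  OUTFIL FNAMES=OUT2,SAVE,BUILD=(1,160)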
/* |
OUT1
Code: |
DDD NNN 300 AAA
ZZZ NNN 300 AAA |
OUT2
Code: |
AAA BBB 100 DDD - DUP1
AAA BBB 100 DDD - DUP2
AAA BBB 100 DDD - DUP3
AAA BBB 100 DDD - DUP4
AAA BBB 100 DDD - DUP5
|

obulisankar
New User
Joined: 03 May 2007 Posts: 20 Location: bangalore
Kolusu,
Thanks for your reply. However, the volume of records in our datasets is very large, so my intention is to use as few temporary files as possible.
Would it be possible to use only one temporary dataset for the comparison of the 3 files? For example:
By copying records from File1 to T1 and marking them as "AAA"
By copying records from File2 to T1 and marking them as "BBB"
By copying records from File3 to T1 and marking them as "CCC"
Something like this would be really helpful. I am already able to compare 2 files using only 1 temporary file.
Thanks in advance
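For reference, here is a minimal sketch of the kind of two-file comparison with a single temporary dataset mentioned above. It follows a commonly used ICETOOL SPLICE pattern and is not taken from this thread's solution. It assumes that neither file contains duplicate keys, which is exactly the condition that does not hold for the three-file requirement. The dataset names, the DD names IN1, IN2, OUT12, OUT1 and OUT2, and the '11'/'22' markers (used here in place of "AAA"/"BBB") are all hypothetical.

```jcl
//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1      DD DSN=YOUR.FILE1,DISP=SHR
//IN2      DD DSN=YOUR.FILE2,DISP=SHR
//T1       DD DSN=&&T1,DISP=(MOD,PASS),SPACE=(CYL,(X,Y),RLSE)
//OUT12    DD SYSOUT=*
//OUT1     DD SYSOUT=*
//OUT2     DD SYSOUT=*
//TOOLIN   DD *
* Tag FILE1 records with '11' and FILE2 records with '22'
  COPY FROM(IN1) TO(T1) USING(CTL1)
  COPY FROM(IN2) TO(T1) USING(CTL2)
* Pair records with equal keys; the second id byte comes
* from the FILE2 record, so a matched pair ends up as '12'
  SPLICE FROM(T1) TO(OUT12) ON(147,10,CH) WITH(162,1) -
    USING(CTL3) KEEPNODUPS
/*
//CTL1CNTL DD *
  INREC OVERLAY=(161:C'11')
/*
//CTL2CNTL DD *
  INREC OVERLAY=(161:C'22')
/*
//CTL3CNTL DD *
* Key in both files
  OUTFIL FNAMES=OUT12,INCLUDE=(161,2,CH,EQ,C'12'),BUILD=(1,160)
* Key only in FILE1
  OUTFIL FNAMES=OUT1,INCLUDE=(161,2,CH,EQ,C'11'),BUILD=(1,160)
* Key only in FILE2
  OUTFIL FNAMES=OUT2,INCLUDE=(161,2,CH,EQ,C'22'),BUILD=(1,160)
/*
```

Because T1 is allocated with DISP=MOD, the second COPY appends to the first, so one temporary dataset holds both tagged inputs. Once a file can contain several records for the same key, this simple pairing no longer identifies matches reliably.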

Skolusu
Senior Member
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
obulisankar wrote: |
Thanks for your reply. However, the volume of records in our datasets is very large, so my intention is to use as few temporary files as possible.
Would it be possible to use only one temporary dataset for the comparison of the 3 files?
|
Not really. With duplicates in your files, and since you want to retain them, I can't think of a way to get it done with a single temp dataset.
Kolusu

ap_mainframes
Active User
Joined: 29 Dec 2005 Posts: 181 Location: Canada
obulisankar,
Do you have any option other than ICETOOL?
I think normal DFSORT can't handle two input files at all! And in your example you are doing a kind of file balancing.
Any corrections are more than welcome!

Skolusu
Senior Member
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
ap_mainframes wrote: |
I think normal DFSORT can't handle two input files at all! And in your example you are doing a kind of file balancing. |
ap_mainframes,
Can you explain what a normal DFSORT is? If possible, show me an example of how the above requirement can be solved.
Thanks

Craq Giegerich
Senior Member
Joined: 19 May 2007 Posts: 1512 Location: Virginia, USA
ap_mainframes wrote: |
obulisankar,
Do you have any option other than ICETOOL?
I think normal DFSORT can't handle two input files at all! And in your example you are doing a kind of file balancing.
Any corrections are more than welcome! |
Have you bothered to look at Skolusu's signature line?

ap_mainframes
Active User
Joined: 29 Dec 2005 Posts: 181 Location: Canada
Skolusu,
That's what I meant: there is no option other than ICETOOL.
To my knowledge, Sort can't handle two input files.

Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote: |
To my knowledge, Sort can't handle two input files. |
I'm not sure what you mean by this.
If you're talking about using PGM=ICEMAN (DFSORT) rather than PGM=ICETOOL (DFSORT's ICETOOL), then there are two cases where DFSORT can handle two (or more) input files:
1. Concatenated SORTIN input files for a SORT or COPY operation
2. Different SORTINdd input files for a MERGE operation
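The two cases above can be sketched roughly as follows. This is only an illustration: the step names and dataset names are hypothetical, and the key position and length (147,10) are borrowed from the requirement earlier in the thread.

```jcl
//* Case 1: concatenated SORTIN for a SORT (or COPY) operation
//SORT1    EXEC PGM=ICEMAN
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.FILE1,DISP=SHR
//         DD DSN=YOUR.FILE2,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=(147,10,CH,A)
/*
//* Case 2: separate SORTINnn DDs for a MERGE operation
//* (each input must already be in order on the merge key)
//MERGE1   EXEC PGM=ICEMAN
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=YOUR.SORTED.FILE1,DISP=SHR
//SORTIN02 DD DSN=YOUR.SORTED.FILE2,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  MERGE FIELDS=(147,10,CH,A)
/*
```

In both cases the inputs end up in one output stream ordered on the key. Neither step by itself separates matching from non-matching records.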

ap_mainframes
Active User
Joined: 29 Dec 2005 Posts: 181 Location: Canada
Frank,
I am under the impression that PGM=SORT invokes DFSORT.
I may be wrong, though. If I am, can you explain the difference between PGM=SORT and PGM=ICEMAN?
I think that with PGM=SORT you can't handle two files. That's what I meant. Correct me if I am wrong.
Thanks,

dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote: |
Can you explain the difference between PGM=SORT and PGM=ICEMAN? |
There is no difference. They both execute the exact same load module.
Looking at this previous topic may help:
ibmmainframes.com/about28740-15.html
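As a small illustration, the two steps below differ only in the program name on the EXEC statement. The step names and the dataset name are hypothetical, and the sketch assumes a site where DFSORT is the installed sort product, so that SORT and ICEMAN resolve to the same program.

```jcl
//* Copy step invoked as PGM=SORT
//COPY1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.FILE,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  OPTION COPY
/*
//* The same copy step invoked as PGM=ICEMAN
//COPY2    EXEC PGM=ICEMAN
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.FILE,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  OPTION COPY
/*
```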

obulisankar
New User
Joined: 03 May 2007 Posts: 20 Location: bangalore
Frank,
I'm new to ICEMAN. My intention is to compare two or more files using ICEMAN (the constraint is that only one temporary file should be used).
Below is the information:
LRECL --> 160
RECFM --> FB
Volume of records --> will be in the millions
No. of duplicates --> varies
No. of keys used for comparison --> 1
Length of key field --> 10
Position of key --> 147
The DCB properties are the same for all the input and output files.
I have 3 files: File1, File2, File3. All of these files contain records of the same type.
I need to compare these 3 files and get 2 output files.
Output File1: should contain the records that match between File1 and File2 but do not match File3. [(File1 & File2) - File3]
Output File2: should contain the records that match across all 3 files. [File1 & File2 & File3]
Example:
--------
File1:
-----
AAA BBB 100 DDD
AAA BBB 100 DDD
EEE FFF 200 CCC
DDD NNN 300 AAA
ZZZ NNN 300 AAA
File2:
------
AAA BBB 100 DDD
CCC FFF 600 EEE
DDD NNN 300 AAA
ZZZ NNN 300 AAA
File3:
------
AAA BBB 100 DDD
CCC FFF 200 GGG
EEE JJJ 500 KKK
Output File1:
--------------
DDD NNN 300 AAA
ZZZ NNN 300 AAA
Output File2:
-------------
AAA BBB 100 DDD
AAA BBB 100 DDD
Thanks in advance

dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Is there some reason you posted all of this again?
When you asked last week, you were given an answer from a DFSORT developer:
Quote: |
Not really. With duplicates in your files, and since you want to retain them, I can't think of a way to get it done with a single temp dataset. |
You need to move on.

Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
PGM=ICEMAN and PGM=SORT are identical - both invoke DFSORT. Read my previous comments as "If you're talking about PGM=ICEMAN or PGM=SORT (DFSORT) ...".