Akash Sharma
New User
Joined: 13 Jan 2009 Posts: 36 Location: India
Hi All,
I want some help on the below.
I have 2 PS: PS1 and PS2. Each having RECL as 500.
I want to compare PS2 with PS1 and write to PS3 the records that are present in PS2 but not in PS1. Please suggest how I can do this. Both PS1 and PS2 contain millions of records.
e.g.
Input PS1
Code: |
abcdefghi
pppppppp
qqqqqqqq
rrrrrrrrrrr
yyyyyyyy
nnnnnnnn
xxxxxxxx
|
Input PS2
Code: |
iiiiiiiiiiiiiiiiii
pppppppp
llllllllllllllllllllllll
zzzzzzzzzzzz
xxxxxxxx
nnnnnnnn
rrrrrrrrrrrrrr
|
Output PS3
Code: |
iiiiiiiiiiiiiiiiii
llllllllllllllllllllllll
zzzzzzzzzzzz
rrrrrrrrrrrrrr
|
Craq Giegerich
Senior Member
Joined: 19 May 2007 Posts: 1512 Location: Virginia, USA
Once again,
1. What is the RECFM of each file and the LRECL of each?
2. What are the positions, length and format of the keys for matching?
3. Are the files already sorted?
4. Are there duplicate keys in either file? |
Akash Sharma
New User
Joined: 13 Jan 2009 Posts: 36 Location: India
Craq Giegerich wrote: |
Once again,
1. What is the RECFM of each file and the LRECL of each?
2. What are the positions, length and format of the keys for matching?
3. Are the files already sorted?
4. Are there duplicate keys in either file? |
1. RECFM is FB. LRECL is 200.
2. We have to match the whole record(200 bytes) in PS2 against PS1. There is no unique key available.
3. No, the files are not sorted. They are just record extracts from a database.
4. No, no 200-byte record is duplicated within either file. |
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
A search on "compare files duplicates" turned up one that is (almost) exactly what you are asking for:
Compare File B with File A and eliminate duplicate
By the way,
Quote: |
Each having RECL as 500. |
Quote: |
the whole record(200 bytes) |
Which one is correct? |
Craq Giegerich
Senior Member
Joined: 19 May 2007 Posts: 1512 Location: Virginia, USA
Since the files are not in sequence and do not have matching keys, you would have to compare all of the records in PS1 with all of the records in PS2. For millions of records that would be a ridiculous idea; you had better redesign the process. |
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Akash,
Assuming that the LRECL is 200 and you want to match on all 200 bytes, here's a DFSORT/ICETOOL job that will do what you asked for.
Code: |
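//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1      DD DSN=... input file1 (FB/200)
//IN2      DD DSN=... input file2 (FB/200)
//T1       DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT      DD DSN=... output file (FB/200)
//TOOLIN   DD *
* Copy both inputs to T1, tagging each record with its source file.
COPY FROM(IN1) TO(T1) USING(CTL1)
COPY FROM(IN2) TO(T1) USING(CTL2)
* Keep only the records whose 200 bytes occur exactly once in T1.
SELECT FROM(T1) TO(OUT) ON(1,200,CH) NODUPS USING(CTL3)
/*
//CTL1CNTL DD *
* Tag file1 records with '1' in position 201.
  INREC OVERLAY=(201:C'1')
/*
//CTL2CNTL DD *
* Tag file2 records with '2' in position 201.
  INREC OVERLAY=(201:C'2')
/*
//CTL3CNTL DD *
* Write only the unique records that came from file2, minus the tag.
  OUTFIL FNAMES=OUT,INCLUDE=(201,1,CH,EQ,C'2'),
    BUILD=(1,200)
/*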
|
Note that the output records will be in sorted order. If you really want them in their original order, we'd have to add a sequence number and another pass to sort on that sequence number. |
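For readers who want to see the "sequence number plus another pass" idea outside of DFSORT, here is a minimal Python sketch of the logic only. It is not the DFSORT implementation; the function name `only_in_second_original_order` and the in-memory lists are assumptions made for illustration, and a real job with millions of records would do this in the sort product rather than in memory.

```python
from itertools import groupby


def only_in_second_original_order(file1_records, file2_records):
    """Records found in file2 but not in file1, in file2's original order."""
    # Tag every record with its source file and its position in that file
    # (the position plays the role of the sequence number).
    tagged = [(rec, "1", seq) for seq, rec in enumerate(file1_records)]
    tagged += [(rec, "2", seq) for seq, rec in enumerate(file2_records)]

    # Pass 1: sort on the record itself so equal records become adjacent.
    tagged.sort(key=lambda item: item[0])

    survivors = []
    for _, group in groupby(tagged, key=lambda item: item[0]):
        group = list(group)
        # Keep a record only if it occurs once and came from file2.
        if len(group) == 1 and group[0][1] == "2":
            survivors.append(group[0])

    # Pass 2: sort the survivors back into their original file2 order.
    survivors.sort(key=lambda item: item[2])
    return [rec for rec, _, _ in survivors]


print(only_in_second_original_order(["pp", "qq"], ["zz", "pp", "aa"]))
```

The second sort is the extra pass mentioned above: without it the survivors would come out in key order instead of input order.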
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
Now, Craq, for two million records in each file that would only require a total of 4 trillion compares. Surely the z10 could get 4 trillion compares done in a few thousand CPU seconds (maybe a week or two elapsed time)? |
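To put a rough number on that, the short Python calculation below compares the pairwise count with an estimate for a sort-based approach such as the SELECT job posted earlier. The sort estimate (n times the ceiling of log2 n compares, plus one pass over adjacent records) is a textbook approximation assumed here for illustration, not a measurement of DFSORT.

```python
import math

n1 = n2 = 2_000_000

# Every record of one file compared with every record of the other.
pairwise = n1 * n2

# Rough estimate: comparison sort of all records, then one adjacent-record pass.
total = n1 + n2
sort_based = total * math.ceil(math.log2(total)) + (total - 1)

print(f"pairwise compares:   {pairwise:,}")
print(f"sort-based compares: {sort_based:,}")
```

Under these assumptions the sort-based count stays below 100 million, against 4 trillion for the pairwise approach.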
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote: |
Since the files are not in sequence and do not have matching keys, you would have to compare all of the records in PS1 with all of the records in PS2. For millions of records that would be a ridiculous idea; you had better redesign the process. |
SELECT sorts the records so it only has to compare adjacent sorted records - it does not have to compare all of the records in PS1 with all of the records in PS2. |
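The "compare adjacent sorted records" idea can be sketched in a few lines of Python. This only illustrates the logic (tag, sort, check neighbours); it is not how DFSORT is implemented internally, and the function name `only_in_second` and the sample data are made up for the example.

```python
def only_in_second(file1_records, file2_records):
    """Records found in file2 but not in file1, in sorted order."""
    # Tag each record with its source file, then sort everything together.
    tagged = sorted(
        [(rec, "1") for rec in file1_records]
        + [(rec, "2") for rec in file2_records]
    )

    result = []
    for i, (rec, tag) in enumerate(tagged):
        # After sorting, a duplicate can only sit directly before or after.
        same_as_prev = i > 0 and tagged[i - 1][0] == rec
        same_as_next = i + 1 < len(tagged) and tagged[i + 1][0] == rec
        # Keep records with no equal neighbour that came from file2.
        if not same_as_prev and not same_as_next and tag == "2":
            result.append(rec)
    return result


ps1 = ["abcdefghi", "pppppppp", "qqqqqqqq", "rrrrrrrrrrr"]
ps2 = ["iiii", "pppppppp", "zzzz", "rrrrrrrrrrrrrr"]
print(only_in_second(ps1, ps2))
```

Each record is checked against at most two neighbours, so the work after the sort is a single pass over the combined data.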
Akash Sharma
New User
Joined: 13 Jan 2009 Posts: 36 Location: India
Thanks a lot, Frank.
It's working absolutely fine.
Your time and help are much appreciated. |