IBM Mainframe Forum Index
 
Sorting two files


IBM Mainframe Forums -> DFSORT/ICETOOL
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Thu Jul 02, 2009 12:27 am

Hi All,

One more solution needed. Thanks a lot for all your previous suggestions :-)

I have a requirement as below:
I need to compare two files and list the differences in two output files. Both files are VB with a record length of 4504.

File1 :
Code:

BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCC12345671CCCCCCCCCCCCCCCCCCCCCC
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAAAAAAAAAAA


File2:
Code:

AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAADDDDDDDD
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBAAAAAAAA
CCCCCCCCCCCCCCCCC12345671CCCCCCCCCCCCCCCCCCCCC


I need to compare byte by byte and list the differences in the output files. For that I have used the code below:

Code:
//TOOLIN DD *
COPY FROM(FILEB) USING(CTL1)
COPY FROM(FILEA) TO(T1) USING(CTL2)
SELECT FROM(T1) TO(FILEC) ON(1,1500,CH) ON(1500,1500,CH) -
   ON(3000,1500,CH) NODUPS USING(CTL3)
SELECT FROM(T1) TO(FILED) ON(1,1500,CH) ON(1500,1500,CH) -
   ON(3000,1500,CH) NODUPS USING(CTL4)
/*
//CTL1CNTL DD *
  INREC BUILD=(1,4,5:C'BB',7:4504)
/*
//CTL2CNTL DD *
  INREC BUILD=(1,4,5:C'VV',7:4504)
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=FILEC,INCLUDE=(5,2,CH,EQ,C'VB'),
    BUILD=(1,4,5:4504)
/*
//CTL4CNTL DD *
  OUTFIL FNAMES=FILED,INCLUDE=(5,2,CH,EQ,C'VV'),
    BUILD=(1,4,5:4504)
/*


It gives me FILEC and FILED as outputs correctly:

FILEC
Code:
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBBBBBBBBBB


FILED
Code:
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAADDDDDDDD
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBAAAAAAAA


It gives this output because these records have differences. If you observe, the records got sorted on the way to the output. My requirement is that the records be compared in full, but sorted only on the key that starts at position 18 and is 10 bytes long.

Which means my outputs should be:

FILEC
Code:
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBBBBBBBBBB
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAAAAAAAAAAA


FILED
Code:
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBAAAAAAAA
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAADDDDDDDD

Can we achieve this with sort? If yes, please help me out.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Thu Jul 02, 2009 12:37 am

Hima1985,

The total number of bytes you can sort on must NOT exceed 4092 (or 4088 bytes when the EQUALS option is in effect). You are trying to sort beyond that.

Here is a trick to sort 4500 bytes with FB input.

ibmmainframes.com/viewtopic.php?p=125375#125375

However with VB FILES it is always messy and I suggest you write a program for this type of comparison.
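
As a quick arithmetic check of the limit quoted in this post, here is a minimal Python sketch (not DFSORT code) that adds up the three ON fields from the SELECT statements in the first post. The function names are made up for illustration; the two limits are the figures stated above.

```python
# Limits as quoted in the post above (total bytes of sort key).
MAX_KEY_BYTES = 4092          # EQUALS not in effect
MAX_KEY_BYTES_EQUALS = 4088   # EQUALS in effect

def total_key_bytes(fields):
    """Sum the lengths of (position, length) key fields."""
    return sum(length for _position, length in fields)

def fits(fields, equals=False):
    """True if the combined key length is within the quoted limit."""
    limit = MAX_KEY_BYTES_EQUALS if equals else MAX_KEY_BYTES
    return total_key_bytes(fields) <= limit

# The three ON fields from the SELECT statements in the first post.
select_fields = [(1, 1500), (1500, 1500), (3000, 1500)]

print(total_key_bytes(select_fields))                  # 4500
print(fits(select_fields))                             # False
print(total_key_bytes(select_fields) - MAX_KEY_BYTES)  # 408 bytes over
```

The three 1500-byte fields add up to 4500 bytes, which is 408 bytes more than the quoted maximum.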
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Thu Jul 02, 2009 12:41 am

Skolusu,

Thanks for the quick reply.

Comparing the first 3500 bytes will do, as I will have only spaces in the remaining bytes.

I tried to write a PL/I program for this. My input files have lakhs of records, so every record needs to be compared with all the records in the other file. It takes a hell of a lot of I/O operations, and my program runs for 10+ hours.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8697
Location: Dubuque, Iowa, USA

PostPosted: Thu Jul 02, 2009 1:03 am

Why not just sort the 3500 bytes of each file since that CAN be done via sort? Then your PL/I comparison program is a simple two-way match program and will finish in a reasonable amount of time.
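
A two-way match of the kind suggested here can be sketched as follows. This is an illustration in Python rather than PL/I, it assumes both inputs are already sorted the same way, and the short sample records are made up. Each file is read once from front to back, so the work grows with the sum of the two file sizes instead of their product.

```python
def two_way_match(sorted_a, sorted_b):
    """Return (only_in_a, only_in_b) for two sorted lists of records."""
    only_a, only_b = [], []
    i = j = 0
    while i < len(sorted_a) and j < len(sorted_b):
        if sorted_a[i] == sorted_b[j]:
            # Same record in both files: skip it on both sides.
            i += 1
            j += 1
        elif sorted_a[i] < sorted_b[j]:
            # Record in A has no partner in B.
            only_a.append(sorted_a[i])
            i += 1
        else:
            # Record in B has no partner in A.
            only_b.append(sorted_b[j])
            j += 1
    # Whatever is left on either side is unmatched.
    only_a.extend(sorted_a[i:])
    only_b.extend(sorted_b[j:])
    return only_a, only_b

# Made-up sample records, sorted before matching.
file1 = sorted(["BB2-BBBB", "CC1-CCCC", "AA3-AAAA"])
file2 = sorted(["AA3-AADD", "BB2-BBAA", "CC1-CCCC"])

only1, only2 = two_way_match(file1, file2)
print(only1)  # ['AA3-AAAA', 'BB2-BBBB']
print(only2)  # ['AA3-AADD', 'BB2-BBAA']
```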
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Thu Jul 02, 2009 2:51 am

Hello,

Quote:
so every record needs to be compared with all the records in the other file. It takes a hell of a lot of I/O operations
The worst possible solution. . . (generating a QSAM cartesian product). Unless your family has the hardware sales contract :-)

As Robert suggests, you simply need a "two-way" match program. Just for such a need, there is a "Sticky" near the top of the COBOL part of the forum with working 2-file match/merge code. You should be able to easily adapt this to your requirement.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Thu Jul 02, 2009 6:16 am

Hima1985,

I don't understand why you need 4 passes over the data when both files have the same LRECL. Since both files are the same length, can't you concatenate them, sort on the 3500 bytes, and look for duplicates?

Here is a sample JCL which will give you the desired results. We create a header record in STEP0100 and concatenate it in front of each of your input files; the group function uses it to tag every record with the file it came from. Since you wanted the data sorted on the key at 18, we put a copy of the key at the beginning of the record, followed by your data, and sort on that so we retain the sequence you wanted. Any matching records will be summed and their total will be greater than 2, so we ignore them and select the records with indicator 1 and 2.


Code:

//STEP0100 EXEC PGM=SORT                                           
//SYSOUT   DD SYSOUT=*                                             
//SORTIN   DD *                                                   
HDR                                                               
//SORTOUT  DD DSN=&&H,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)         
//SYSIN    DD *                                                   
  SORT FIELDS=COPY                                                 
  INREC OVERLAY=(4500:X)                                           
  OUTFIL FTOV                                                     
/*                                                                 
//STEP0200 EXEC PGM=SORT                                           
//SYSOUT   DD SYSOUT=*                                             
//SORTIN   DD DSN=&&H,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT         
//         DD DSN=Your input file1,DISP=SHR 
//         DD DSN=&&H,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT         
//         DD DSN=Your input file2,DISP=SHR
//FILEC    DD SYSOUT=*                                             
//FILED    DD SYSOUT=*                                             
//SYSIN    DD *                                                   
  INREC IFTHEN=(WHEN=INIT,BUILD=(1,4,C'0',22,10,5)),               
  IFTHEN=(WHEN=GROUP,BEGIN=(16,3,CH,EQ,C'HDR'),PUSH=(5:ID=1))     
  SORT FIELDS=(1,4,BI,A,6,3510,BI,A)                               
  SUM FIELDS=(5,1,ZD)                                             
  OUTFIL FNAMES=FILEC,INCLUDE=(5,1,ZD,EQ,1),BUILD=(1,4,16)         
  OUTFIL FNAMES=FILED,INCLUDE=(5,1,ZD,EQ,2),BUILD=(1,4,16)         
/*
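
The tag, sort, sum and select idea behind STEP0200 can be mimicked in a few lines of Python. This is only a sketch of the logic, not a translation of the control statements: it works on in-memory strings, uses the record length in place of the RDW, compares whole records rather than the first 3500 bytes, and assumes no record appears twice within the same file. The `sample` helper and its records are made up.

```python
from itertools import groupby

def compare_files(file1, file2, key_start=18, key_len=10):
    """Return (only_in_file1, only_in_file2), ordered by length, key, data."""
    # Tag every record with a file indicator, like the pushed group ID.
    tagged = [(1, rec) for rec in file1] + [(2, rec) for rec in file2]

    def sort_key(item):
        _indicator, rec = item
        key = rec[key_start - 1:key_start - 1 + key_len]
        # Record length stands in for the RDW, then the key, then the data.
        return (len(rec), key, rec)

    tagged.sort(key=sort_key)

    only1, only2 = [], []
    for (_length, _key, rec), group in groupby(tagged, key=sort_key):
        # Identical records collapse into one group; add up their indicators.
        total = sum(indicator for indicator, _rec in group)
        if total == 1:
            only1.append(rec)
        elif total == 2:
            only2.append(rec)
        # total == 3: the record is in both files, so it is dropped.
    return only1, only2

def sample(key, flag):
    """Build a made-up 41-byte record with the key at position 18."""
    return "X" * 17 + key + "X" * 8 + flag + "X" * 5

in1 = [sample("1234567890", "B"), sample("1234567898", "N"), sample("1234567899", "D")]
in2 = [sample("1234567890", "B"), sample("1234567898", "D"), sample("1234567899", "S")]

only1, only2 = compare_files(in1, in2)
print(only1)
print(only2)
```

With these sample records, all of the same length, the unmatched records in each output come out in key order.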
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Fri Jul 03, 2009 1:08 am

Kolusu,

I have tried the solution you provided, but no luck.

It is comparing records perfectly, but it gives the output sorted on all the fields, not only on the 10 bytes starting at position 18. Also, why are we using BUILD=(1,4,16)?

Please suggest.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Fri Jul 03, 2009 3:19 am

Hima1985 wrote:
Kolusu,

I have tried the solution you provided, but no luck.

It is comparing records perfectly, but it gives the output sorted on all the fields, not only on the 10 bytes starting at position 18. Also, why are we using BUILD=(1,4,16)?

Please suggest.


Hima,

Did the comparison work or did it fail? The INREC reformats the records as follows:


Code:

RDW |IND|KEY(22-31)|ACTUAL DATA
1-4 |5-5|6-15      |16 - END   
--------------------------------
ZZLL| 1 |12345672BB|file1 data 
ZZLL| 1 |12345671CC|file1 data 
ZZLL| 1 |12345673AA|file1 data 
                               
ZZLL| 2 |12345673AA|file2 data 
ZZLL| 2 |12345672BB|file2 data 
ZZLL| 2 |12345671CC|file2 data 


Now we sort on the RDW (4) + the key we added at the beginning (10) + 3500 bytes of data, and sum on byte 5.

The reason we stick the key in the beginning is to sort on the key first and then the data. If RDW+key+data has a match in file 2, the sum will be equal to 3. If no match is found, the indicator will still be 1 or 2, and we write the record to the respective output file, removing the indicator and key which we added at the beginning.
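
To illustrate why the prefix is added and then removed again, here is a small Python sketch of the layout. It leaves out the 4-byte RDW, so the original data starts at offset 11 of the string here, where the real reformatted record has it at position 16. The function names and the sample record are made up.

```python
KEY_START = 18   # 1-based key position in the data, as stated in the thread
KEY_LEN = 10

def add_prefix(data, indicator):
    """Prepend a 1-byte indicator and a copy of the key to the data."""
    key = data[KEY_START - 1:KEY_START - 1 + KEY_LEN]
    return indicator + key + data

def strip_prefix(reformatted):
    """Drop the indicator and the key copy, leaving the original data."""
    return reformatted[1 + KEY_LEN:]

original = "B" * 17 + "12345672BB" + "B" * 20
work = add_prefix(original, "1")

print(work[:11])                        # 112345672BB (indicator + key copy)
print(strip_prefix(work) == original)   # True
```

The output record is the same as the input record; the indicator and key copy exist only while sorting and summing.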

When you say something does not work you need to do a better job of explaining the requirements with sample input from 2 files and desired output from them. I am not going to guess anymore.
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Tue Jul 07, 2009 4:18 pm

Kolusu,

Sorry for the delayed response. It is comparing records perfectly, but it gives the output sorted on all the fields. I mean:

IN1
Code:
XXXXXXXXXXXXXXXXX1234567890XXXXXXXXBXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX


IN2
Code:
XXXXXXXXXXXXXXXXX1234567890XXXXXXXXBXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXSXXXXX


With your suggested code, the output files come out as:

OP1
Code:
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX


OP2
Code:

XXXXXXXXXXXXXXXXX1234567898XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXSXXXXX


Which means it is sorting on the whole record, not on the key at position 18. Please observe the record which has D in it. My requirement is that the compare should happen on the complete record but the sort should be on the key position only, which means my expected output is:

OP1
Code:
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX



OP2
Code:
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXSXXXXX
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Wed Jul 08, 2009 12:05 am

Kolusu/Frank,

Any suggestions on this?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Wed Jul 08, 2009 12:50 am

Please. . . Do not pester . . .

Keep in mind that you have multiple topics waiting for someone else to do your work. . .

d
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Wed Jul 08, 2009 12:58 am

Fine Dick

Thanks, will wait.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Wed Jul 08, 2009 1:19 am

Quote:
With your suggested code, the output files come out as:

OP1
Code:
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX


I don't know how you got that result. When I run with Kolusu's code I get:

XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX

as you said you wanted. Since Kolusu is sorting on the key in 18-27 first, 1234567898 there would sort before 1234567899 there and the N and D wouldn't matter.

Note that we're assuming all of your records are the same length (at least 3504 bytes). If they are, in fact, different lengths, that could explain what you're seeing because we're sorting on the length first, so a shorter record would appear before a longer record.
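
The effect of sorting on the length first can be shown with a small Python sketch. It is only a hypothetical illustration of how a length difference could change the order: the two sample records are made up, with the shorter one standing for a record whose trailing bytes are missing, and the record length stands in for the RDW.

```python
KEY_START = 18   # 1-based key position, as stated in the thread
KEY_LEN = 10

def key_of(rec):
    """Extract the 10-byte key that starts at position 18."""
    return rec[KEY_START - 1:KEY_START - 1 + KEY_LEN]

def length_first_order(records):
    """Order by record length, then key, then data."""
    return sorted(records, key=lambda rec: (len(rec), key_of(rec), rec))

def key_only_order(records):
    """Order by the key alone."""
    return sorted(records, key=key_of)

# Made-up records: the first is 41 bytes, the second only 36 bytes.
long_rec = "X" * 17 + "1234567898" + "X" * 8 + "N" + "X" * 5
short_rec = "X" * 17 + "1234567899" + "X" * 8 + "D"

print(length_first_order([long_rec, short_rec]) == [short_rec, long_rec])  # True
print(key_only_order([short_rec, long_rec]) == [long_rec, short_rec])      # True
```

With the length as the leading sort field, the shorter record with key 1234567899 comes out ahead of the longer record with key 1234567898, even though its key is higher.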

We don't know exactly what your data looks like (you don't show the record lengths), so we can only go by what you tell us (or what we assume).

We have DFSORT development work to do, so we can't afford the time to play guessing games.
All times are GMT + 6 Hours