Sorting two files

 
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Thu Jul 02, 2009 12:27 am    Post subject: Sorting two files

Hi All,

I need one more solution. Thanks a lot for all your previous suggestions :-)

My requirement is as follows:
I need to compare two files and write the differences to two output files. Both files are VB with a record length of 4504.

File1 :
Code:

BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCC12345671CCCCCCCCCCCCCCCCCCCCCC
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAAAAAAAAAAA


File2:
Code:

AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAADDDDDDDD
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBAAAAAAAA
CCCCCCCCCCCCCCCCC12345671CCCCCCCCCCCCCCCCCCCCC


I need to compare the records byte by byte and list the differences in the output files. For that I have used the code below:

Code:
//TOOLIN DD *
COPY FROM(FILEB) USING(CTL1)
COPY FROM(FILEA) TO(T1) USING(CTL2)
SELECT FROM(T1) TO(FILEC) ON(1,1500,CH) ON(1500,1500,CH) -
   ON(3000,1500,CH) NODUPS USING(CTL3)
SELECT FROM(T1) TO(FILED) ON(1,1500,CH) ON(1500,1500,CH) -
   ON(3000,1500,CH) NODUPS USING(CTL4)
/*
//CTL1CNTL DD *
  INREC BUILD=(1,4,5:C'BB',7:4504)
/*
//CTL2CNTL DD *
  INREC BUILD=(1,4,5:C'VV',7:4504)
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=FILEC,INCLUDE=(5,2,CH,EQ,C'VB'),
    BUILD=(1,4,5:4504)
/*
//CTL4CNTL DD *
  OUTFIL FNAMES=FILED,INCLUDE=(5,2,CH,EQ,C'VV'),
    BUILD=(1,4,5:4504)
/*


It gives me FILEC and FILED as outputs correctly:

FILEC
Code:
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBBBBBBBBBB


FILED
Code:
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAADDDDDDDD
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBAAAAAAAA


These records are listed because they have differences. As you can observe, the records came out sorted. My requirement is that the records be compared in full, but sorted only on the key that starts at position 18 and is 10 bytes long.

This means my outputs should be:

FILEC
Code:
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBBBBBBBBBB
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAAAAAAAAAAA


FILED
Code:
BBBBBBBBBBBBBBBBB12345672BBBBBBBBBBBBBAAAAAAAA
AAAAAAAAAAAAAAAAA12345673AAAAAAAAAAAAADDDDDDDD

Can we achieve this with sort? If yes, please help me out.

Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Thu Jul 02, 2009 12:37 am    Post subject:

Hima1985,

The total number of bytes you can sort must NOT exceed 4092 (or 4088 bytes when the EQUALS option is in effect). You are trying to sort beyond that.

Here is a trick to sort 4500 bytes with FB input.

http://ibmmainframes.com/viewtopic.php?p=125375#125375

However, with VB files it is always messy, and I suggest you write a program for this type of comparison.
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Thu Jul 02, 2009 12:41 am    Post subject:

Skolusu,

Thanks for the quick reply.

Comparing the first 3500 bytes will do, as the remaining bytes contain only spaces.

I tried to write a PL/I program for this. My input files have lakhs of records, so every record needs to be compared with all the records in the other file. That takes a huge number of I/O operations, and my program runs for 10+ hours.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 7904
Location: Bellevue, IA

PostPosted: Thu Jul 02, 2009 1:03 am    Post subject:

Why not just sort the 3500 bytes of each file since that CAN be done via sort? Then your PL/I comparison program is a simple two-way match program and will finish in a reasonable amount of time.
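
A two-way match reads both sorted files once, in step, instead of comparing every record with every other record. The following is a minimal sketch in Python (not the PL/I or COBOL program discussed in this thread), assuming both inputs are sorted the same way and no record is repeated within a file. The names `two_way_match`, `order`, and `KEY`, and the sample records, are hypothetical; `KEY` is a 0-based slice standing in for the 10-byte key at position 18.

```python
KEY = slice(17, 27)  # bytes 18-27 of the data (0-based slice); assumed key position


def order(rec):
    """Comparison order: the key first, then the whole record."""
    return (rec[KEY], rec)


def two_way_match(left, right):
    """Walk two lists sorted by order(); return the records unique to each."""
    only_left, only_right = [], []
    i = j = 0
    while i < len(left) and j < len(right):
        a, b = order(left[i]), order(right[j])
        if a == b:            # identical record in both files: skip it
            i += 1
            j += 1
        elif a < b:           # left record has no partner on the right
            only_left.append(left[i])
            i += 1
        else:                 # right record has no partner on the left
            only_right.append(right[j])
            j += 1
    only_left.extend(left[i:])    # leftovers cannot have a partner
    only_right.extend(right[j:])
    return only_left, only_right


# Illustrative records modelled on the first post, padded to equal length.
file1 = ["B" * 17 + "12345672" + "B" * 22,
         "C" * 17 + "12345671" + "C" * 22,
         "A" * 17 + "12345673" + "A" * 22]
file2 = ["A" * 17 + "12345673" + "A" * 14 + "D" * 8,
         "B" * 17 + "12345672" + "B" * 14 + "A" * 8,
         "C" * 17 + "12345671" + "C" * 22]

only1, only2 = two_way_match(sorted(file1, key=order), sorted(file2, key=order))
```

Each file is read once, so the cost grows with the number of records rather than with the product of the two record counts. Because the comparison order starts with the key, the unmatched records come out in key sequence.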
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Thu Jul 02, 2009 2:51 am    Post subject:

Hello,

Quote:
so every record needs to be compared with all the records in other files... it's taking a hell of a lot of I/O operations..
The worst possible solution. . . (generating a QSAM Cartesian product). Unless your family has the hardware sales contract :-)

As Robert suggests, you simply need a "two-way" match program. Just for such a need, there is a "Sticky" near the top of the COBOL part of the forum with working 2-file match/merge code. You should be able to easily adapt this to your requirement.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Thu Jul 02, 2009 6:16 am    Post subject:

Hima1985,

I don't understand why you need 4 passes of data when both files have the same LRECL. Since both files are the same length, can't you concatenate them, sort on 3500 bytes, and look for duplicates?

Here is a sample JCL which will give you the desired results. We create a header record in STEP0100 and concatenate it in front of each of your input files; the GROUP function uses it to tag each record with the number of the file it came from. Since you wanted the data sorted on the key at position 18, we put a copy of the key at the beginning of the record (position 6), followed by your data, and sort on that so we retain the sequence you wanted. Any matching records will be summed and their total will be greater than 2, so we ignore them and select the records with indicator 1 and 2.


Code:

//STEP0100 EXEC PGM=SORT                                           
//SYSOUT   DD SYSOUT=*                                             
//SORTIN   DD *                                                   
HDR                                                               
//SORTOUT  DD DSN=&&H,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)         
//SYSIN    DD *                                                   
  SORT FIELDS=COPY                                                 
  INREC OVERLAY=(4500:X)                                           
  OUTFIL FTOV                                                     
/*                                                                 
//STEP0200 EXEC PGM=SORT                                           
//SYSOUT   DD SYSOUT=*                                             
//SORTIN   DD DSN=&&H,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT         
//         DD DSN=Your input file1,DISP=SHR 
//         DD DSN=&&H,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT         
//         DD DSN=Your input file2,DISP=SHR
//FILEC    DD SYSOUT=*                                             
//FILED    DD SYSOUT=*                                             
//SYSIN    DD *                                                   
  INREC IFTHEN=(WHEN=INIT,BUILD=(1,4,C'0',22,10,5)),               
  IFTHEN=(WHEN=GROUP,BEGIN=(16,3,CH,EQ,C'HDR'),PUSH=(5:ID=1))     
  SORT FIELDS=(1,4,BI,A,6,3510,BI,A)                               
  SUM FIELDS=(5,1,ZD)                                             
  OUTFIL FNAMES=FILEC,INCLUDE=(5,1,ZD,EQ,1),BUILD=(1,4,16)         
  OUTFIL FNAMES=FILED,INCLUDE=(5,1,ZD,EQ,2),BUILD=(1,4,16)         
/*
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Fri Jul 03, 2009 1:08 am    Post subject:

Kolusu,

I tried the solution you provided, but no luck.

It compares the records perfectly, but gives the output sorted on all the fields, not only on the 10 bytes starting at position 18. Also, why are we using BUILD=(1,4,16)?

Please suggest.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Fri Jul 03, 2009 3:19 am    Post subject: Reply to: Sorting two files

Hima1985 wrote:
Kolusu,

I tried the solution you provided, but no luck.

It compares the records perfectly, but gives the output sorted on all the fields, not only on the 10 bytes starting at position 18. Also, why are we using BUILD=(1,4,16)? Please suggest.


Hima,

Did the comparison work or did it fail? The INREC reformats the records as follows:


Code:

RDW |IND|KEY(22-31)|ACTUAL DATA
1-4 |5-5|6-15      |16 - END   
--------------------------------
ZZLL| 1 |12345672BB|file1 data 
ZZLL| 1 |12345671CC|file1 data 
ZZLL| 1 |12345673AA|file1 data 
                               
ZZLL| 2 |12345673AA|file2 data 
ZZLL| 2 |12345672BB|file2 data 
ZZLL| 2 |12345671CC|file2 data 


Now we sort on the RDW (4) + the key we added at the beginning (10) + 3500 bytes of data, and sum on byte 5.

The reason we stick the key at the beginning is to sort on the key first and then on the data. If RDW+key+data has a match in file 2, the sum will be equal to 3. If there is no match, the indicator will still be 1 or 2; we write that record to the respective output file and remove the indicator and key which we added at the beginning. That is what BUILD=(1,4,16) does: it keeps the RDW and everything from position 16 onwards.
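
The tag, sort, and sum logic described above can be modelled outside DFSORT. The following Python sketch is a simplified model, not the DFSORT job itself. It assumes each record occurs at most once per file (a record repeated within file 1 would sum to 2 and be mistaken for a file 2 record), and it compares the whole record rather than only the first 3500 data bytes. The names `diff_by_sort_and_sum`, `sort_key`, and `KEY`, and the sample records, are hypothetical.

```python
from itertools import groupby

KEY = slice(17, 27)  # bytes 18-27 of the data (0-based slice); assumed key position


def sort_key(tagged):
    """Model of the sort fields: record length (RDW), then key, then data."""
    _, rec = tagged
    return (len(rec), rec[KEY], rec)


def diff_by_sort_and_sum(file1, file2):
    # Tag every record with the number of the file it came from (the indicator).
    tagged = [(1, r) for r in file1] + [(2, r) for r in file2]
    tagged.sort(key=sort_key)
    only1, only2 = [], []
    # Records with equal sort fields collapse into one group, like SUM FIELDS.
    for _, group in groupby(tagged, key=sort_key):
        members = list(group)
        total = sum(ind for ind, _ in members)
        if total == 1:        # seen in file 1 only
            only1.append(members[0][1])
        elif total == 2:      # seen in file 2 only
            only2.append(members[0][1])
        # total == 3 means the record is in both files, so it is dropped
    return only1, only2


# Illustrative records modelled on the first post, padded to equal length.
file1 = ["B" * 17 + "12345672" + "B" * 22,
         "C" * 17 + "12345671" + "C" * 22,
         "A" * 17 + "12345673" + "A" * 22]
file2 = ["A" * 17 + "12345673" + "A" * 14 + "D" * 8,
         "B" * 17 + "12345672" + "B" * 14 + "A" * 8,
         "C" * 17 + "12345671" + "C" * 22]

only1, only2 = diff_by_sort_and_sum(file1, file2)
```

With these equal-length records, both outputs come back in key order: the 12345672 record first, then the 12345673 record.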

When you say something does not work you need to do a better job of explaining the requirements with sample input from 2 files and desired output from them. I am not going to guess anymore.
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Tue Jul 07, 2009 4:18 pm    Post subject:

Kolusu

Sorry for the delayed response. It compares the records perfectly, but gives the output sorted on all the fields. I mean:

IN1
Code:
XXXXXXXXXXXXXXXXX1234567890XXXXXXXXBXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX


IN2
Code:
XXXXXXXXXXXXXXXXX1234567890XXXXXXXXBXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXSXXXXX


With your suggested code, the output files come out as:

OP1
Code:
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX


OP2
Code:

XXXXXXXXXXXXXXXXX1234567898XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXSXXXXX


This means it is sorting on the whole record, not on the key at position 18; please look at the record that has D in it. My requirement is that the compare should happen on the complete record but the sort should be on the key position only, which means my expected output is:

OP1
Code:
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX



OP2
Code:
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXSXXXXX
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Wed Jul 08, 2009 12:05 am    Post subject:

Kolusu/Frank,

Any suggestions on this?
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Wed Jul 08, 2009 12:50 am    Post subject: Reply to: Sorting two files

Please. . . Do not pester . . .

Keep in mind that you have multiple topics waiting for someone else to do your work. . .

d
Hima1985

New User


Joined: 17 Apr 2009
Posts: 70
Location: India

PostPosted: Wed Jul 08, 2009 12:58 am    Post subject:

Fine Dick

Thanks, will wait.
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

PostPosted: Wed Jul 08, 2009 1:19 am    Post subject:

Quote:
With your suggested code, the output files come out as:

OP1
Code:
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX
XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX


I don't know how you got that result. When I run with Kolusu's code I get:

XXXXXXXXXXXXXXXXX1234567898XXXXXXXXNXXXXX
XXXXXXXXXXXXXXXXX1234567899XXXXXXXXDXXXXX

as you said you wanted. Since Kolusu is sorting on the key in 18-27 first, 1234567898 there would sort before 1234567899 there and the N and D wouldn't matter.

Note that we're assuming all of your records are the same length (at least 3504 bytes). If they are, in fact, different lengths, that could explain what you're seeing because we're sorting on the length first, so a shorter record would appear before a longer record.
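
The effect of sorting on the length first can be reproduced with two records of different lengths. In this Python sketch the record lengths are assumptions chosen for illustration (the thread never states the real lengths), and `KEY` is a hypothetical 0-based slice standing in for the key in 18-27.

```python
KEY = slice(17, 27)  # bytes 18-27 of the data (0-based slice); assumed key position

# Two illustrative records: the one with the higher key is 3 bytes shorter.
short_rec = "X" * 17 + "1234567899" + "XXXXXXXXDXXXXX"
long_rec = "X" * 17 + "1234567898" + "XXXXXXXXNXXXXX" + "   "

records = [long_rec, short_rec]

# Length first, then key: the shorter record wins regardless of its key.
by_len_then_key = sorted(records, key=lambda r: (len(r), r[KEY]))

# Key only: 1234567898 sorts before 1234567899.
by_key_only = sorted(records, key=lambda r: r[KEY])
```

Here `by_len_then_key` puts the 1234567899 record first, which is the order reported in OP1, while `by_key_only` gives the order that was expected.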

We don't know exactly what your data looks like (you don't show the record lengths), so we can only go by what you tell us (or what we assume).

We have DFSORT development work to do, so we can't afford the time to play guessing games.