Extract matching records from two files


murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

Posted: Wed Jan 24, 2007 1:19 pm

Hi All,

I have a flat file with n records (the count varies from time to time).

My requirement is to extract all of the above records (including duplicates) from another flat file (say FILE2). FILE2 contains a few million records.


File attributes:

(1) LRECL = 400
(2) RECFM = FB
(3) Key length = 14, starting at column 16.


I used the "Create files with matching and non-matching records" trick from SORTTRICK. As the number of records in FILE2 is in the millions, my job is abending with SB37.

Is there another way to achieve the same result (without using temp files or taking much space)?

Regards,
Murali

William Thompson

Global Moderator


Joined: 18 Nov 2006
Posts: 3156
Location: Tucson AZ

Posted: Wed Jan 24, 2007 2:37 pm

First off, is it working correctly (at least until it abends)? Have you tested it against a small subset of both files?
Is it B37ing on just the one output file? Does the output allocation equal the input allocation (since the maximum size could be the entire input file)?

Yes, there are other ways, but which one to use depends on the size of the key file and the sort order of the large file2.

murmohk1

Posted: Wed Jan 24, 2007 3:25 pm

The JCL was working fine when I tried with 67 sample records.

Regarding the B37 abend: since we are copying the entire file2, I'm unable to get the required space. Moreover, file2 is on tape, and writing files to tape for test purposes is prohibited in my shop.

William Thompson

Posted: Wed Jan 24, 2007 3:47 pm

murmohk1 wrote:
Regarding the B37 abend: since we are copying the entire file2, I'm unable to get the required space. Moreover, file2 is on tape, and writing files to tape for test purposes is prohibited in my shop.
That can be a problem. If the JCL is functioning correctly, the solution is that you will need more space.
Is it possible to post your JCL? Maybe somebody here will have some suggestions or improvements.

IQofaGerbil

Active User


Joined: 05 May 2006
Posts: 183
Location: Scotland

Posted: Wed Jan 24, 2007 5:50 pm

What is the purpose of your test?

1 - a full system test using production-sized datasets?
If yes, and you do not have enough DASD (have you tried multi-volume DASD allocation perhaps?) or permission to use tapes, then you would seem to be stuck.

2 - to test the functionality of your process?
If yes, then use a cut-down version of the production file: estimate how big a file you can get away with using, then base your test on that.

murmohk1

Posted: Thu Jan 25, 2007 10:14 am

Code:

//STEP3 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//*
//IN1 DD DSN=W.G.PRODSCHM.POL.BTR2,DISP=SHR     BTR2 FILE
//*
//IN2 DD DSN=ISTEST.W.G.PRODSCHM.POL.PREVMNTH.TEMP,DISP=SHR
//*
//F1 DD DSN=&&F11,UNIT=SYSDA,SPACE=(CYL,(1200,1200)),
//          DISP=(MOD,CATLG)
//T1 DD DSN=&&T11,UNIT=SYSDA,SPACE=(CYL,(1200,1200)),
//           DISP=(MOD,CATLG)
//*
//OUT12 DD DSN=ISTEST.CR7011.OUT12,DISP=(MOD,CATLG,DELETE),
//         SPACE=(CYL,(900,900),RLSE),DCB=*.IN2
//*
//OUT1 DD DUMMY
//OUT2 DD DSN=ISTEST.CR7011.OUT2,DISP=(MOD,CATLG,DELETE),
//        SPACE=(CYL,(900,900),RLSE),DCB=*.IN2
//*
//TOOLIN DD *
  SELECT FROM(IN1) TO(F1) ON(16,14,CH) FIRST
  SELECT FROM(IN2) TO(F1) ON(16,14,CH) FIRST
  SELECT FROM(F1) TO(T1) ON(16,14,CH) FIRSTDUP USING(CTL1)
  SELECT FROM(F1) TO(T1) ON(16,14,CH) NODUPS USING(CTL2)
  COPY FROM(IN1) TO(T1) USING(CTL3)
  COPY FROM(IN2) TO(T1) USING(CTL4)
  SPLICE FROM(T1) TO(OUT1) ON(16,14,CH) -
        WITHALL WITH(1,401) USING(CTL5)

/*
//CTL1CNTL DD *
* MARK RECORDS WITH FILE1/FILE2 MATCH WITH 'DD'.
       OUTFIL FNAMES=T1,OVERLAY=(401:C'DD')
/*
//CTL2CNTL DD *
* MARK RECORDS WITHOUT FILE1/FILE2 MATCH WITH 'UU'.
       OUTFIL FNAMES=T1,OVERLAY=(401:C'UU')
/*
//CTL3CNTL DD *
* MARK FILE1 RECORDS WITH '11'.
       OUTFIL FNAMES=T1,OVERLAY=(401:C'11')
/*
//CTL4CNTL DD *
*MARK FILE2 RECORDS WITH '22'.
       OUTFIL FNAMES=T1,OVERLAY=(401:C'22')
/*
//CTL5CNTL DD *
* WRITE FILE1 ONLY RECORDS TO OUT1 FILE. REMOVE ID.
       OUTFIL FNAMES=OUT1,INCLUDE=(401,2,CH,EQ,C'1U'),
          BUILD=(1,400)
* WRITE FILE2 ONLY RECORDS TO OUT2 FILE. REMOVE ID.
       OUTFIL FNAMES=OUT2,INCLUDE=(401,2,CH,EQ,C'2U'),
          BUILD=(1,400)
* WRITE MATCHING RECORDS TO OUT12 FILE. REMOVE ID.
       OUTFIL FNAMES=OUT12,SAVE,
          BUILD=(1,400)
/*



IN1 is the production dataset and IN2 has 67K+ records. Please note that the record count in both files varies from time to time, and the IN1 count keeps increasing.


This job is set up as a monthly job. I also tried multi-volume allocation, but it didn't work.

Is there a way to extract the data without getting a space abend?

IQofaGerbil

Posted: Thu Jan 25, 2007 4:23 pm

Which file is giving the B37?

muthuvel

Active User


Joined: 29 Nov 2005
Posts: 217
Location: Canada

Posted: Thu Jan 25, 2007 5:13 pm

Please try giving

Code:
//             DATACLAS=COMPRESS,


in the JCL DD statement for the output file.

murmohk1

Posted: Sun Jan 28, 2007 11:20 am

Thanks all for the replies.

F1 is throwing space abend.

William Thompson

Posted: Sun Jan 28, 2007 11:30 pm

I doubt this will help, but what is the current allocation of the two input files?
What is the content of the IEC030I message?
Quote:
The error was detected by the end-of-volume routine. This system completion code is accompanied by message IEC030I. Refer to the explanation of message IEC030I for complete information about the task that was ended and for an explanation of the return code (rc in the message text) in register 15.

dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Mon Jan 29, 2007 12:04 am

Hello,

If you wrote a small COBOL program that would "match" the 2 files and write out what you need to meet your requirement, you would eliminate the need for more dasd or permission to use "work" tape(s) or data compression or some other work-around.

It would very likely run as fast or faster than the process that needs very large intermediate/transient storage. A 2-file match/merge is a single pass of the data and will run about the same speed as merely reading the files sequentially.
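
For anyone unfamiliar with the technique, a 2-file match/merge can be sketched roughly as below. This is only an illustration, written in Python rather than COBOL, and it assumes both inputs are already sorted ascending on the key in columns 16-29 and arrive as fixed-length text records. The names `key_of`, `match_merge` and `make_record` are hypothetical, and Python string comparison stands in for the EBCDIC collating sequence the real files would be sorted in.

```python
from typing import Iterable, Iterator

KEY_START = 15               # column 16 in 1-based terms
KEY_END = KEY_START + 14     # key length 14


def key_of(record: str) -> str:
    """Return the 14-byte key from columns 16-29 of a record."""
    return record[KEY_START:KEY_END]


def match_merge(keys_sorted: Iterable[str],
                master_sorted: Iterable[str]) -> Iterator[str]:
    """Yield every master record whose key occurs in the key file.

    Both inputs must be sorted ascending on the key. Each file is read
    once, front to back; duplicate keys on either side are handled.
    """
    key_iter = iter(keys_sorted)
    current = next(key_iter, None)
    for record in master_sorted:
        k = key_of(record)
        # Advance the key file until its key is >= the master key.
        while current is not None and key_of(current) < k:
            current = next(key_iter, None)
        if current is None:
            break                      # key file exhausted, nothing more can match
        if key_of(current) == k:
            yield record               # duplicates in master are all written


def make_record(key: str, payload: str) -> str:
    """Build a 400-byte test record (hypothetical helper for the demo)."""
    return (" " * KEY_START + key.ljust(14) + payload).ljust(400)


if __name__ == "__main__":
    keys = [make_record("A001", "k"), make_record("B002", "k")]
    master = [make_record("A001", "m1"), make_record("A001", "m2"),
              make_record("A777", "m3"), make_record("B002", "m4")]
    for rec in match_merge(keys, master):
        print(rec[KEY_END:].strip())
```

Only one record from each file is held in memory at a time, so the only output space needed is for the matching records themselves.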

IQofaGerbil

Posted: Mon Jan 29, 2007 4:37 pm

If I have got this correct,
F1 will be smaller than IN1+IN2 because it has all of the dups removed
T1 will be bigger than IN1+IN2
so if F1 fails on space then T1 will surely also fail for the same reason?

You say that the main file has 'millions' of records. Do you know how many millions?

It looks to me that you need approx 7000 tracks per million records, so on a model-3 3390 disk you will squeeze in approx 7 million records.
That of course assumes that you get your hands on an empty volume (not likely!)

Unless you can view your storage pool to see what is available, it looks like you might need to calculate your storage requirements accurately and then speak to your storage management people to see if they can accommodate you.

Also consider (depending on your actual calculations) reducing your secondary space allocation request. You might be getting a B37 because the disk allocated to you does not have 1200 cyls available, when you might not actually need that much.
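
The tracks-per-million estimate can be reproduced with a little arithmetic. This is a rough sketch in Python, assuming 3390 geometry (15 tracks per cylinder, 3,339 cylinders on a model 3) and half-track blocking with blocks of at most 27,998 bytes. The function names are hypothetical, and the figures ignore the VTOC and any other overhead on the volume.

```python
import math

HALF_TRACK_BLOCK = 27_998   # largest block that fits twice on a 3390 track
TRACKS_PER_CYL = 15
MOD3_CYLINDERS = 3_339      # 3390 model 3


def records_per_track(lrecl: int) -> int:
    """Fixed-length records per track with two blocks per track."""
    return 2 * (HALF_TRACK_BLOCK // lrecl)


def tracks_for(records: int, lrecl: int = 400) -> int:
    """Tracks needed to hold the given number of records."""
    return math.ceil(records / records_per_track(lrecl))


def records_on_volume(cylinders: int = MOD3_CYLINDERS, lrecl: int = 400) -> int:
    """Records that fit on a completely empty volume."""
    return cylinders * TRACKS_PER_CYL * records_per_track(lrecl)


print(records_per_track(400))    # 138
print(tracks_for(1_000_000))     # 7247
print(records_on_volume())       # 6911730
```

Under those assumptions the numbers land close to the rule of thumb: a little over 7,000 tracks per million records, and just under 7 million records on an empty model-3 volume.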

murmohk1

Posted: Mon Jan 29, 2007 5:00 pm

Thanks IQofaGerbil for the information.

Since IN1 is a master file (kind of), its record count increases daily. As of now it holds close to 4 million records. As expected, I'm unable to get an empty volume.

Writing a program is ruled out as the records are stored randomly. I would need to open/close the file multiple times (which again is not a good programming technique).

Is there a way to extract the records in some other manner?

IQofaGerbil

Posted: Mon Jan 29, 2007 5:30 pm

Looks like you 'only' need approx 2000 cyls for each of T1 and F1.

Can you see your storage pools to find out if there are disks with that kind of space available?
Depending on the storage management system in your shop, there might be few disks with 'big' (1200 cyls) amounts of contiguous space but lots with small/medium amounts.

Use trial and error: why not try playing with your allocation numbers, e.g. (480,90) or (150,150)?
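
A candidate SPACE value can be sanity-checked before submitting the job. This is a rough sketch in Python under several assumptions: 138 records per track (LRECL=400 with half-track blocking on a 3390), a basic-format sequential data set limited to 16 extents per volume, the primary quantity obtained in a single extent, and one completely empty model-3 volume of 3,339 cylinders. The function names are hypothetical, and real results depend on fragmentation and on how storage management is set up in your shop.

```python
import math

RECORDS_PER_TRACK = 138        # assumed: LRECL=400, two blocks per 3390 track
TRACKS_PER_CYL = 15
MAX_EXTENTS_PER_VOLUME = 16    # assumed: basic-format sequential data set
VOLUME_CYLINDERS = 3_339       # assumed: empty 3390 model 3


def cylinders_needed(records: int) -> int:
    """Cylinders required to hold the given number of records."""
    tracks = math.ceil(records / RECORDS_PER_TRACK)
    return math.ceil(tracks / TRACKS_PER_CYL)


def max_cylinders_one_volume(primary: int, secondary: int) -> int:
    """Best-case growth on one volume: primary plus 15 secondary extents,
    capped by the size of the volume."""
    grown = primary + (MAX_EXTENTS_PER_VOLUME - 1) * secondary
    return min(grown, VOLUME_CYLINDERS)


need = cylinders_needed(4_000_000)
print("needed:", need)
for primary, secondary in [(480, 90), (150, 150), (1200, 1200)]:
    ceiling = max_cylinders_one_volume(primary, secondary)
    print((primary, secondary), ceiling, "enough" if ceiling >= need else "too small")
```

On these assumptions about 1,933 cylinders are needed for 4 million records. (480,90) tops out at 1,830 cylinders on a single volume and would still fall short, while (150,150) can reach 2,400.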

dick scherrer

Posted: Mon Jan 29, 2007 8:29 pm

Hello,

Writing a program should NOT be ruled out just because the way the data is stored is not convenient for this process. If your data is "random", sort it before comparing the files. You do not need to keep the sorted data, just use it for the compare, then delete it.

Depending on just how your process works, you may have created a process that will run for many, many hours - if it ever completes with the full volume of data. If you need to open/read/close a file containing several million records and do this 60-70 thousand times, my guess is that the job will never be allowed to complete. If you multiply 65,000 by 5 million, you get 325,000,000,000 "reads".
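
To put that figure next to the sort-then-match approach, here is a quick sketch in Python. The 65,000 and 5 million values are the round numbers used above, the function names are hypothetical, and the merge count ignores the I/O done by the sort step itself.

```python
KEY_RECORDS = 65_000          # round figure for the key file
MASTER_RECORDS = 5_000_000    # round figure for the master file


def rescan_reads(keys: int, master: int) -> int:
    """Reads if the master file is re-read from the top for every key."""
    return keys * master


def merge_reads(keys: int, master: int) -> int:
    """Reads for a single-pass match of two sorted files."""
    return keys + master


rescan = rescan_reads(KEY_RECORDS, MASTER_RECORDS)
merge = merge_reads(KEY_RECORDS, MASTER_RECORDS)
print(f"{rescan:,}")              # 325,000,000,000
print(f"{merge:,}")               # 5,065,000
print(round(rescan / merge))      # 64166
```

So the single-pass match does roughly 64,000 times fewer reads than rescanning the master file for every key.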

murmohk1

Posted: Tue Jan 30, 2007 5:57 pm

Whether the file is sorted or not, I guess it occupies the same space. Since the required space was not available for my job (using the DFSORT technique), the job is failing with a space abend.

shuklas

New User


Joined: 21 Dec 2006
Posts: 20
Location: London

Posted: Tue Jan 30, 2007 7:09 pm

You can use DATACLAS=DSIZE10

It can accommodate 10MB of data.

dick scherrer

Posted: Tue Jan 30, 2007 8:54 pm

Hello,

Please post your sort jcl, the control statements, and the abend info.

murmohk1

Posted: Thu Feb 01, 2007 5:48 pm

I posted my JCL in a previous post. Attached is the spool content (xdc).

Note: I ran the job again today to get the spool content. I changed only the SPACE parameter; everything else is as it was in the previous run.

dick scherrer

Posted: Thu Feb 01, 2007 9:05 pm

Hello,

Are you sure you posted the output that was created from the JCL you posted previously? Your posted JCL says it is STEP3. The jesysmsg.doc has no STEP3. This is where the abend occurred in the attached output:
Code:

* STEPNAME  PGM NAME   COMP   
* BTR2      ICETOOL   *SB37   


From the same jes output, please post the jcl that was actually used in the run that provided the jesysmsg.log you attached.

Also, while this may be possible with your space restrictions and using the sort, it could have taken you just a couple of hours to have it running in COBOL, and it would use MUCH less machine resources.

If you change your UNIT parameter on the big output and intermediate datasets to UNIT=(SYSDA,16) and your SPACE to SPACE=(CYL,(2500,500),RLSE) you will have a better chance. The current allocations will abend when you fill up the volume initially allocated (unless your system dynamically spans volumes - many don't). The sortwork space may dynamically get more work space, but "real" files are usually bound by your unit/space specifications from the JCL.

murmohk1

Posted: Fri Feb 02, 2007 2:14 pm

In my shop, volume allocation is done dynamically by the system (as told by the batch management people). So, I haven't used the UNIT parameter in the job.

Also, I had attached the job from the JES output.

dick scherrer

Posted: Fri Feb 02, 2007 8:46 pm

Hello,

From here, I'd recommend one or more of the following:

1. Talk with the batch management people and find out how much space is available in the storage class your job dynamically uses.

2. Try a run with these JCL changes (from above: UNIT=(SYSDA,16) and SPACE=(CYL,(2500,500),RLSE)) and see if that helps. If 2500 is too big, lower it, but if you cannot get 2500 initially, I suspect you will still have space issues. In this shop, our datasets often dynamically span packs (we do use the basic unit parameter), but when I ran into space problems, including the ",16" got around the abends.

3. Go ahead and write the COBOL code. After all, most of the folks here ARE programmers :D

Good luck and keep us posted.
All times are GMT + 6 Hours