I used "Create files with matching and non-matching records" from SORTTRICK. As file2 contains millions of records, my job is abending with SB37.
Is there another way to achieve the same result in a different manner (without using temp files or taking much space)?
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
First off, is it working correctly (at least until it abends)? Have you tested it against a small subset of both files?
Is it b37ing just the one output file? Does the output allocation equal the input allocation (since the max size could be the entire input file)?
Yes, there are other ways, but which one is dependent on the size of the key file and the sort order of the large file2.
Joined: 29 Jun 2006 Posts: 1436 Location: Bangalore,India
The JCL was working fine when I tried with 67 sample records.
Regarding the B37 abend, since we are copying the entire file2, I'm unable to get the required space. Moreover, file2 is on tape, and writing files to tape for test purposes is prohibited in my shop.
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
murmohk1 wrote:
Regarding B37 abend, since we are copying entire file2, Im unable to get the required space. Moreover file2 is on tape and writing files on tape for test purpose is prohibited in my shop.
That can be a problem. If the JCL is functioning correctly, the solution is that you will need more space.
Is it possible to post your JCL? Somebody here might have some suggestions or improvements.
1 - a full system test using production-sized datasets?
If yes, and you do not have enough DASD (have you tried a multi-volume DASD allocation perhaps?) or permission to use tapes, then you would seem to be stuck.
2 - to test the functionality of your process?
If yes, then use a cut-down version of the production file: estimate how big a file you can get away with using, then base your test on that.
/*
//CTL1CNTL DD *
* MARK RECORDS WITH FILE1/FILE2 MATCH WITH 'DD'.
OUTFIL FNAMES=T1,OVERLAY=(401:C'DD')
/*
//CTL2CNTL DD *
* MARK RECORDS WITHOUT FILE1/FILE2 MATCH WITH 'UU'.
OUTFIL FNAMES=T1,OVERLAY=(401:C'UU')
/*
//CTL3CNTL DD *
* MARK FILE1 RECORDS WITH '11'.
OUTFIL FNAMES=T1,OVERLAY=(401:C'11')
/*
//CTL4CNTL DD *
* MARK FILE2 RECORDS WITH '22'.
OUTFIL FNAMES=T1,OVERLAY=(401:C'22')
/*
//CTL5CNTL DD *
* WRITE FILE1 ONLY RECORDS TO OUT1 FILE. REMOVE ID.
OUTFIL FNAMES=OUT1,INCLUDE=(401,2,CH,EQ,C'1U'),
BUILD=(1,400)
* WRITE FILE2 ONLY RECORDS TO OUT2 FILE. REMOVE ID.
OUTFIL FNAMES=OUT2,INCLUDE=(401,2,CH,EQ,C'2U'),
BUILD=(1,400)
* WRITE MATCHING RECORDS TO OUT12 FILE. REMOVE ID.
OUTFIL FNAMES=OUT12,SAVE,
BUILD=(1,400)
/*
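For desk-checking the logic on a small sample, the routing that these OUTFIL statements perform can be sketched outside the mainframe. This Python sketch is only an illustration, not the DFSORT processing itself: the key position and length are assumptions (the actual match fields are not shown in the posted JCL), and it holds both files in memory, so it is suitable only for small test extracts.

```python
def split_files(file1_recs, file2_recs, key=lambda r: r[:10]):
    """Route records the way the OUTFIL statements above do:
    file1-only ('1U' -> OUT1), file2-only ('2U' -> OUT2),
    matched (-> OUT12). `key` extracts the assumed match field."""
    keys1 = {key(r) for r in file1_recs}
    keys2 = {key(r) for r in file2_recs}
    out1 = [r for r in file1_recs if key(r) not in keys2]
    out2 = [r for r in file2_recs if key(r) not in keys1]
    out12 = ([r for r in file1_recs if key(r) in keys2]
             + [r for r in file2_recs if key(r) in keys1])
    return out1, out2, out12
```

Because the small key file here is only tens of thousands of records, a set of its keys fits comfortably in memory, and the large file then needs only a single pass with no intermediate copy the size of both inputs.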
IN1 is the production dataset and IN2 has 67+K records. Please note the record count varies in both files from time to time, and the IN1 count keeps increasing.
This job is set up as a monthly job. I tried multi-volume also, but it didn't work.
Is there a way to extract the data without getting a space abend?
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
I doubt this will help, but what is the current allocation of the two input files?
What is the content of the IEC030I message?
Quote:
The error was detected by the end-of-volume routine. This system completion code is accompanied by message IEC030I. Refer to the explanation of message IEC030I for complete information about the task that was ended and for an explanation of the return code (rc in the message text) in register 15.
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
If you wrote a small COBOL program that would "match" the 2 files and write out what you need to meet your requirement, you would eliminate the need for more dasd or permission to use "work" tape(s) or data compression or some other work-around.
It would very likely run as fast or faster than the process that needs very large intermediate/transient storage. A 2-file match/merge is a single pass of the data and will run about the same speed as merely reading the files sequentially.
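As a sketch only (not the actual COBOL), the single-pass two-file match described above looks roughly like this; it assumes both inputs are already in key sequence and that keys are unique on each side, which may not hold for the real data:

```python
def match_merge(sorted1, sorted2, key=lambda r: r[:10]):
    """Single pass over two key-sequenced inputs; yields
    (record, source) where source is '1', '2', or '12'."""
    i = j = 0
    while i < len(sorted1) and j < len(sorted2):
        k1, k2 = key(sorted1[i]), key(sorted2[j])
        if k1 < k2:                      # record only in file 1
            yield sorted1[i], '1'
            i += 1
        elif k1 > k2:                    # record only in file 2
            yield sorted2[j], '2'
            j += 1
        else:                            # keys match: advance both
            yield sorted1[i], '12'
            i += 1
            j += 1
    for r in sorted1[i:]:                # leftovers after one side ends
        yield r, '1'
    for r in sorted2[j:]:
        yield r, '2'
```

Each input record is read exactly once, which is why this runs at roughly the speed of a sequential read of both files.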
If I have got this correct,
F1 will be smaller than IN1+IN2 because it has all of the dups removed
T1 will be bigger than IN1+IN2
so if F1 fails on space, then T1 will surely also fail for the same reason?
You say that the main file has 'millions' of records. Do you know how many millions?
It looks to me like you need approx 7000 tracks per million records, so on a model-3 3390 disk you will squeeze in approx 7 million records.
That of course assumes that you get your hands on an empty volume (not likely!)
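As a rough check of that arithmetic (treating the 7000-tracks-per-million figure and the standard 3390-3 geometry of 3,339 cylinders at 15 tracks each as the working assumptions):

```python
# Rough capacity check for one 3390 model-3 volume.
CYLS_3390_3 = 3339          # cylinders on a 3390-3 (standard geometry)
TRACKS_PER_CYL = 15
TRACKS_PER_MILLION = 7000   # estimate quoted above for this record length

total_tracks = CYLS_3390_3 * TRACKS_PER_CYL           # 50,085 tracks
whole_millions = total_tracks // TRACKS_PER_MILLION   # ~7 million records
print(total_tracks, whole_millions)
```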
Unless you can view your storage pool to see what is available, it looks like you might need to calculate your storage requirements accurately and then speak to your storage management people to see if they can accommodate you.
Also consider (depending on your actual calculations) reducing your secondary space allocation request; you might be getting a B37 because the disk allocated to you does not have 1200 cyls available when you might not actually need that much.
Joined: 29 Jun 2006 Posts: 1436 Location: Bangalore,India
Thanks IQofaGerbil for the information.
Since IN1 is a master file (of sorts), the record count increases daily. As of now it's holding close to 4 million records. As expected, I'm unable to get an empty volume.
Writing a program is ruled out, as the records are stored randomly. I would need to open/close the file multiple times (which again is not a good programming technique).
Is there a way to extract the records in some other manner?
Looks like you 'only' need approx 2000 cyls for each of T1 and F1.
Can you see your storage pools to find out if there are disks with that kind of space available?
Depending on the storage management system in your shop, there might be few disks with 'big' (1200 cyls) amounts of contiguous space but lots with small/medium amounts.
Use trial and error: why not try playing with your allocation numbers, e.g. (480,90) or (150,150)?
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Writing a program should NOT be ruled out just because the way the data is stored is not convenient for this process. If your data is "random", sort it before comparing the files. You do not need to keep the sorted data; just use it for the compare, then delete it.
Depending on just how your process works, you may have created a process that will run for many, many hours - if it ever completes with the full volume of data. If you need to open/read/close a file containing several million records and do this 60-70 thousand times, my guess is that the job will never be allowed to complete. If you multiply 65,000 by 5 million, you get 325,000,000,000 "reads".
Joined: 29 Jun 2006 Posts: 1436 Location: Bangalore,India
Whether the file is sorted or not, I guess it occupies the same space. Since the required space was not available for my job (using the DFSORT technique), the job is failing with a space abend.
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Are you sure you posted the output that was created from the JCL you posted previously? Your posted JCL says it is STEP3, but the jesysmsg.doc has no STEP3. This is where the abend occurred in the attached output:
Code:
* STEPNAME PGM NAME COMP
* BTR2 ICETOOL *SB37
From the same jes output, please post the jcl that was actually used in the run that provided the jesysmsg.log you attached.
Also, while this may be possible within your space restrictions using the sort, it could have taken you just a couple of hours to have it running in COBOL, and it would use MUCH less machine resource.
If you change your UNIT parameter on the big output and intermediate datasets to UNIT=(SYSDA,16) and your SPACE to SPACE=(CYL,(2500,500),rlse) you will have a better chance. The current allocations will abend when you fill up the volume initially allocated (unless your system dynamically spans volumes - many don't). The sortwork space may dynamically get more work space, but "real" files are usually bound by your unit/space specifications from the jcl.
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
From here, i'd recommend one or more of the following:
1. Talk with the batch management people and find out how much space is available in the storage class your job dynamically uses.
2. Try a run with these JCL changes (from above - UNIT=(SYSDA,16) and SPACE=(CYL,(2500,500),rlse)) and see if that helps. If 2500 is too big, lower it, but if you cannot get 2500 initially, i suspect you will still have space issues. In this shop, our datasets often dynamically span packs (we do use the basic unit parameter), but when i ran into space problems, including the ",16" got around the abends.
3. Go ahead and write the COBOL code. After all, most of the folks here ARE programmers.