ajsimon
New User
Joined: 12 Oct 2006 Posts: 5 Location: USA
I have written a COBOL program that reads two sequential files, performs a small calculation, and writes the results to two output files depending on a flag in the input records.
During processing, the program takes the first record from the first file and checks it against all the records in the second file. If a matching record is found in the second file, it performs the calculation and then goes back to the first file for the next record.
The program works fine with my test records. However, when I ran it with the full load, it ran for more than 24 hours and failed with an S322 (timeout) abend.
When I tested the program, I selected 175 input records covering various test scenarios.
The actual output matched my expected output. There is looping during the processing of the test records.
My full load has 342,738 records, whereas my test input has only 175.
The program took 9 seconds to process the 175 records.
Is there any way I can reduce the processing time, for example by loading the input data into a DB2 table and processing it there?
Thank you!
ajsimon
New User
Joined: 12 Oct 2006 Posts: 5 Location: USA
Oops!
I meant to say there is no looping during my testing.
Sorry, and thanks.
socker_dad
Active User
Joined: 05 Dec 2006 Posts: 177 Location: Seattle, WA
With that kind of execution time, something is desperately wrong - somewhere, somehow, you are sliding into an infinite loop.
An unexpected condition is occurring that your program does not handle.
Can you upload your program, or at least post the main logic?
How is the data defined? Are there keys? Has the data been sorted on the keys? What about duplicate keys?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Please rethink your approach to your requirement.
If you read one file and, for each record in it, read all of the records in the other file, you may have created a process that can never be run. You've posted that one file has 342,738 records; how many does the "other" file have?
How are the 2 files related? What are you comparing to determine whether a record is a "hit" or not?
Your better solution is probably going to be putting both files (or a working copy of them) in the same sequence and "matching" them to determine the hits and misses.
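For illustration, here is a minimal sketch of the kind of sorted two-file match being suggested. The file names, record lengths, and the 6-byte account key are assumptions for the example, not taken from the original program, and it assumes the second file has unique keys while the first may repeat them. Each file is read exactly once, front to back.
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TWOMATCH.
      * Minimal two-file match sketch: both inputs must already be
      * sorted ascending on the account key.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT FILE1 ASSIGN TO INFILE1.
           SELECT FILE2 ASSIGN TO INFILE2.
       DATA DIVISION.
       FILE SECTION.
       FD  FILE1.
       01  F1-REC.
           05  F1-ACCT          PIC X(6).
           05  FILLER           PIC X(74).
       FD  FILE2.
       01  F2-REC.
           05  F2-ACCT          PIC X(6).
           05  FILLER           PIC X(34).
       WORKING-STORAGE SECTION.
       01  F1-EOF               PIC X VALUE 'N'.
       01  F2-EOF               PIC X VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT FILE1 FILE2
           PERFORM READ-F1
           PERFORM READ-F2
           PERFORM UNTIL F1-EOF = 'Y'
               EVALUATE TRUE
                   WHEN F2-EOF = 'Y' OR F1-ACCT < F2-ACCT
      *                miss: no FILE2 record for this key
                       PERFORM READ-F1
                   WHEN F1-ACCT = F2-ACCT
      *                hit: do the calculation and write the output
      *                here, then advance FILE1 only, in case the
      *                next FILE1 record carries the same account
                       PERFORM READ-F1
                   WHEN OTHER
                       PERFORM READ-F2
               END-EVALUATE
           END-PERFORM
           CLOSE FILE1 FILE2
           GOBACK.
       READ-F1.
           READ FILE1
               AT END MOVE 'Y' TO F1-EOF
           END-READ.
       READ-F2.
           READ FILE2
               AT END MOVE 'Y' TO F2-EOF
           END-READ.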
lcmontanez
New User
Joined: 19 Jun 2007 Posts: 50 Location: Chicago
dick scherrer wrote:
    Your better solution is probably going to be putting both files (or a working copy of them) in the same sequence and "matching" them to determine the hits and misses.

I agree with Dick.
FYI, Easytrieve Plus has a great synchronized file processing feature (very easy) if that is an option at your shop (nothing wrong with COBOL doing it, though).
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
    The program took 9 seconds to process the 175 records.

I just happened to run a job that processed 6,885 records, and it ran in 6 seconds. 9 seconds is a long time for so few records (unless your system is not so fast, or it is "pegged").
As a test, you might just read one file and see how long that takes. Reading both files and processing the hits and misses should take only slightly longer than reading both files one time.
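A minimal sketch of that read-only test (the DD name and record length are assumptions): read the file front to back, count the records, and compare the job's elapsed time against the full run.
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READTEST.
      * Read one input file front to back and report the count;
      * the job's elapsed time shows the raw I/O cost.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INFILE ASSIGN TO INFILE1.
       DATA DIVISION.
       FILE SECTION.
       FD  INFILE.
       01  IN-REC               PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-EOF               PIC X     VALUE 'N'.
       01  WS-COUNT             PIC 9(9)  VALUE ZERO.
       PROCEDURE DIVISION.
           OPEN INPUT INFILE
           PERFORM UNTIL WS-EOF = 'Y'
               READ INFILE
                   AT END MOVE 'Y' TO WS-EOF
                   NOT AT END ADD 1 TO WS-COUNT
               END-READ
           END-PERFORM
           DISPLAY 'RECORDS READ: ' WS-COUNT
           CLOSE INFILE
           GOBACK.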
William Thompson
Global Moderator
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
dick scherrer wrote:
    Reading both files and processing the hits and misses should take only slightly longer than reading both files one time.

342,738 times as long?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Methinks that "342,738 times as long?" is far more than slightly...
Cartesian products do require quadratically more resources.
kgumraj2
New User
Joined: 01 Aug 2007 Posts: 42 Location: Hyderabad
Hi,
If your program uses the INSPECT verb, try to avoid it, as it is not recommended.
We had a similar issue.
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6248 Location: Mumbai, India
ajsimon,
Try one more thing: in the JCL that executes your program, use the TIME parameter with a low value, and use DISP=(NEW,CATLG,CATLG) for the outputs. Put DISPLAYs at the start and end of the sections of your COBOL program where the calculations are done. Browse the output files after the job completes and look at the SYSOUT; hopefully this will help you find the reason behind your problem.
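A minimal sketch of that kind of checkpointing (the program, paragraph, and counter names are hypothetical, and the PERFORM ... TIMES loop stands in for the real per-record processing). Displaying only every N records keeps the SYSOUT line count from blowing up on a large input:
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHKPOINT.
      * Sketch of checkpoint DISPLAYs with a progress counter.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-REC-CNT           PIC 9(9) VALUE ZERO.
       PROCEDURE DIVISION.
           DISPLAY 'CALCULATION START'
      *    stand-in for the real read/calculate loop
           PERFORM CALC-PARA 342738 TIMES
           DISPLAY 'CALCULATION END, RECORDS: ' WS-REC-CNT
           GOBACK.
       CALC-PARA.
           ADD 1 TO WS-REC-CNT
      *    display progress every 10,000 records rather than on
      *    every record, so SYSOUT does not exceed its line limit
           IF FUNCTION MOD(WS-REC-CNT, 10000) = 0
               DISPLAY 'RECORDS PROCESSED: ' WS-REC-CNT
           END-IF.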
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
As Dick and William mentioned, the root problem here is a full search of file 2. You read one record from file 1 and search through 300,000 records in file 2, then read the next record from file 1 and search through all 300,000 records in file 2 again, and so on. So if file 1 has 300,000 records and file 2 has 300,000 records, you are doing 90,000,000,000 reads of file 2. Sure, INSPECT is not so good, but 90,000,000,000 reads for a full-file search is far worse.
Sort both files into the same order before running the program, then run a "normal" 2-file compare, as in the match sketch earlier in the thread.
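The pre-sort would usually be a DFSORT/SYNCSORT utility step in the JCL; as an in-program alternative, here is a minimal COBOL SORT sketch for one of the files (the file names, record length, and key position are assumptions):
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRESORT.
      * Sort one input file ascending on its 6-byte account key.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT UNSORTED-FILE ASSIGN TO INFILE1.
           SELECT SORTED-FILE   ASSIGN TO OUTFILE1.
           SELECT SORT-WORK     ASSIGN TO SORTWK01.
       DATA DIVISION.
       FILE SECTION.
       FD  UNSORTED-FILE.
       01  UNS-REC              PIC X(80).
       FD  SORTED-FILE.
       01  SRT-OUT-REC          PIC X(80).
       SD  SORT-WORK.
       01  SRT-REC.
           05  SRT-ACCT         PIC X(6).
           05  FILLER           PIC X(74).
       PROCEDURE DIVISION.
           SORT SORT-WORK
               ON ASCENDING KEY SRT-ACCT
               USING UNSORTED-FILE
               GIVING SORTED-FILE
           GOBACK.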
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
    If your program uses the INSPECT verb, try to avoid it, as it is not recommended.
    We had a similar issue.

It might not be recommended at your site, but there is no good reason not to use INSPECT in cases where it performs better than writing the same functionality in your own code.
I'll predict that INSPECT is not what causes the extremely long run-time.
If you need to create a 2-file match/merge and are unsure how to proceed, there is a "sticky" at the top of the COBOL forum that contains sample code.
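For reference, a minimal example of the kind of thing INSPECT does in one statement (the data values are made up): counting and replacing characters without a hand-written scan loop.
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INSPDEMO.
      * INSPECT in one statement vs. a hand-rolled scan loop.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TEXT              PIC X(20) VALUE 'A,B,,C,D'.
       01  WS-COMMAS            PIC 9(4)  VALUE ZERO.
       PROCEDURE DIVISION.
      *    count every comma in the field
           INSPECT WS-TEXT TALLYING WS-COMMAS FOR ALL ','
           DISPLAY 'COMMAS: ' WS-COMMAS
      *    then replace them all with semicolons
           INSPECT WS-TEXT REPLACING ALL ',' BY ';'
           DISPLAY 'RESULT: ' WS-TEXT
           GOBACK.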
gpowell382
New User
Joined: 25 Aug 2005 Posts: 31 Location: USA
Have you tried running a utility sort before you run your program, to put both files in the same order? (You may want to make a copy of the original files for your program to use.) I process over 6 million records comparing two inputs, and it runs in less than half an hour.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
    I process over 6 million records comparing two inputs, and it runs in less than half an hour.

Yup, sorting the files first and using a match/merge instead of multiple full-file traversals should take this job down to several minutes...
ajsimon
New User
Joined: 12 Oct 2006 Posts: 5 Location: USA
Hi everyone,
Thanks a lot for all your help.
I am going to try your suggestions and see how things go.
I will surely update the status.
Just to let you know, my program does not use the INSPECT verb anywhere.
My first file is the one that has 342,738 records, but my second file has only 10,000 records.
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
So without the sort it is 3,427,380,000 reads of file 2, instead of the 90,000,000,000 estimated earlier.
ajsimon
New User
Joined: 12 Oct 2006 Posts: 5 Location: USA
That's correct, but the input files are already sorted by the FOCUS program.
Thanks
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Does this
Quote:
    the input files are already sorted by the FOCUS program

mean that both files are already in the same sequence? Regardless of what "the FOCUS program" did, the 2 files need to be in the same sequence for your process to work well. This may require an additional sort of one or both files.
Once you have the data in the proper sequence, you might want to check out the 2-file match sample code that is in the "sticky".
ajsimon
New User
Joined: 12 Oct 2006 Posts: 5 Location: USA
Hi,
The files are different from each other: they have different layouts and different record lengths. However, the first field is the same in both files, and both are sorted by that field. The first field in each file is a 6-digit account number, and I am matching on this account field to find the related records and perform my calculation.
My main concern is this: when I run the job with my test records it works fine, performs all the calculations, and prints all the records to my output files the way it is supposed to. I do not see any looping during the process.
But when I run with the full volume of data, the job runs for more than 24 hours and fails with the timeout error, so I am not able to find out whether the job is looping or still processing.
If I use DISPLAY, the job fails with a "lines exceeded" abend, even though I increased the output line limit in the JCL.
Is there any way I can troubleshoot this problem? Or can you recommend another testing process?
Thanks,
Simon
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
Don't do a full search through all of file 2 after each read of file 1; that is your biggest processing-time problem. As Dick and I have both said, write a 2-file-compare type of program. You will read each record in file 1 once and each record in file 2 once, for a total of 352,738 reads, instead of the 3,427,722,738 reads the way you coded it.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
    The files are different from each other: they have different layouts and different record lengths.

This is quite normal - almost always the 2 files to be matched have different formats. Do not let this complicate your thoughts. You've mentioned that there is a common "key", so you will be able to sort both files into that sequence and match them.
Your program is most likely not "looping" uncontrollably; it is the process as you've defined it that is taking forever. Your implementation strategy does work with tiny amounts of data - it does not go into a loop. But it is an unacceptable approach for even modest volumes, let alone a situation that will cause more than a billion reads. One way to verify that your program is not in a loop is to look at the job while it is running and notice that the number of EXCPs keeps getting larger; if your code were stuck in a loop, the EXCP count would not increase.
You cannot "debug" what you have - it is simply an incorrect approach to the requirement. You need to stop looking at what you have currently and put an acceptable solution in place (matching/merging the 2 files). Even if you find a way to get this to run in only 18 hours, it should never be allowed into production. As has already been posted, properly implemented it should run in minutes, not hours.
Once you put the replacement code together, we can help if you have questions or problems.