Joined: 06 Jun 2008 Posts: 8154 Location: East Dubuque, Illinois, USA
My production job, which has been running for more than 20 years, is failing
Who cares how long the program ran without failing? What matters now is that SOMETHING CHANGED and the program is now failing.
The entry for the 219 message in the MAC manual states what to do:
System Action: The system issues an SVC Dump, writes a software error record to the logrec data set, and the task is ended.
Operator Response: Start a generalized trace facility (GTF) trace, and re-create the problem. Reply to message AHL100A with:
On the DD statement for the data set in error, specify:
Application Programmer Response: Make sure that your program does not alter the DCB or IOB during processing of SVC 25.
System Programmer Response: If the error recurs and the program is not in error, look at the messages in the job log for more information. Search problem reporting data bases for a fix for the problem. If no fix exists, contact the IBM Support Center. Provide the JCL, the program listing for the job, and the logrec data set error record.
while the CEE3204S message in the manual indicates
CEE3204S The system detected a protection exception (System Completion
Explanation: Your program attempted to access a storage location to which it was not authorized.
Programmer Response: Check your application for these common errors:
Using the wrong AMODE to reference storage
Trying to use a pointer that has not been set
Trying to store data into storage reserved for the system
Using an invalid index to an array
See a Principles of Operation manual for a full list of protection exceptions.
System Action: The thread is terminated.
Symbolic Feedback Code: CEE344
If you have tried the diagnostics in the 219 message, then the next step is to contact IBM and open a PMR.
Joined: 23 Nov 2006 Posts: 19270 Location: Inside the Matrix
There were no changes made to this for at least 2 years. I am sure about that.
Possibly not, but that has nothing to do with what I mentioned earlier.
Something HAS changed somewhere. It could be the data or ANY of the other possibilities mentioned above. There is also the chance that the problem has been in the code all along and just never caused a failure until now.
Be suspicious of any arrays or called modules (as DBZ mentioned).
Explanation: A recursive error was detected. A condition was raised,
causing the number of nested conditions to exceed the limit set by the
DEPTHCONDLMT option. The reason code indicates which subcomponent or
process was active when the exception was detected.
Reason code Explanation
X'07' (7) While Language Environment was trying to output a message, a
subsequent condition was raised.
Programmer Response: In the case of CEEHDLR routine, recursion can occur
when you use the DEPTHCONDLMT run-time option. It may be helpful to
generate a system dump of the original error by using run-time options
TERMTHDACT(UAIMM) and TRAP(ON,NOSPIE).
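For a batch job, one way to supply those two run-time options is a CEEOPTS DD, with a SYSMDUMP DD to receive the dump. This is only a sketch: the program name, data set names, and space values are placeholders, and it assumes your installation lets the job override these options.

```jcl
//* Hypothetical: MYPROG and the MY.* names are placeholders
//RUN      EXEC PGM=MYPROG
//STEPLIB  DD DSN=MY.LOADLIB,DISP=SHR
//* Language Environment reads run-time options from CEEOPTS
//CEEOPTS  DD *
  TERMTHDACT(UAIMM),TRAP(ON,NOSPIE)
/*
//* The system dump of the original error is written to SYSMDUMP
//SYSMDUMP DD DSN=MY.ABEND.SYSMDUMP,DISP=(NEW,CATLG,CATLG),
//            UNIT=SYSDA,SPACE=(CYL,(200,100),RLSE),
//            DCB=(RECFM=FBS,LRECL=4160,BLKSIZE=4160)
//SYSOUT   DD SYSOUT=*
```

The resulting dump can then be examined with IPCS. If the options do not take effect, check the options report or ask your system programmer whether they are fixed as non-overridable at your site.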
+CEE0374C CONDITION=CEE3204S TOKEN=00030C84 59C3C5C5 00000000 933
WHILE RUNNING PROGRAM IGG019BP
+CEE0374C CONDITION=CEE3206S TOKEN=00030C86 59C3C5C5 00000000 934
WHILE RUNNING PROGRAM CEEBINIT
You have failed in a system module, via CEEBINIT.
CEEBINIT is used when you call a COBOL program.
As others have suggested, something has corrupted your storage somewhere, badly enough to bring down IGG019BP.
The current data that you are using or possibly the immediately previous data or, if you are unlucky, data some time earlier, has caused your problem.
If possible, recompile with the SSRANGE option and rerun; it should catch table references that run past the end of a table. Otherwise it is likely one of the called modules that is overwriting storage.
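A recompile with SSRANGE might look something like this. It is a sketch that assumes the IBM-supplied compile and link procedure IGYWCL is available under that name at your site; the data set and member names are placeholders.

```jcl
//* Hypothetical recompile: MY.* names and MYPROG are placeholders
//* SSRANGE adds range checking; LIST and MAP help when reading dumps
//COMPILE  EXEC IGYWCL,PARM.COBOL='SSRANGE,LIST,MAP'
//COBOL.SYSIN   DD DSN=MY.SOURCE(MYPROG),DISP=SHR
//LKED.SYSLMOD  DD DSN=MY.TEST.LOADLIB(MYPROG),DISP=SHR
```

Only code compiled with SSRANGE is checked, so any called COBOL modules you suspect need the same treatment. Depending on the compiler level, the CHECK(ON) run-time option may also have to be in effect. Link into a test library rather than production, since the checking adds overhead.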
dbz's suggestion is useful to you. If you shorten your file but keep about 1000 records before the abend, it may help you to track it down.
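One way to cut the file down is a DFSORT copy with SKIPREC and STOPAFT. The sketch below assumes the failure occurs around record 250000, which is a made-up number, and the data set names are placeholders as well.

```jcl
//* Hypothetical: data set names and record counts are placeholders.
//* Skip the first 249000 records, then copy the next 1100, which
//* covers about 1000 records before the assumed failure point.
//TRIM     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.PROD.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.TEST.INPUT,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,10),RLSE)
//SYSIN    DD *
  OPTION COPY,SKIPREC=249000,STOPAFT=1100
/*
```

If the job no longer abends with the shortened file, the damage probably depends on earlier records, so widen the window and try again.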
I would go back through the calling chain and see what program was called to get to the abend. Fortunately even the IBM routines follow the call/save conventions, so it is just some work going back through the dump.
When you find the module call, that module might not be the one causing the problem. Storage has been overwritten at some point, but that may have happened well before the call, because the abend won't occur until something tries to use the corrupted storage as instructions.