Sporadic 0C1/AEKA abend in CICS

Mainak_Dalal · New User Joined: 05 May 2010 Posts: 19 Location: USA

One of our CICS COBOL programs is sporadically abending with 0C1/AEKA. Looking at the assembler listing, the abending instruction is MVCL 6,0, which is move character long. The instruction is actually moving 320 KB of data.
I checked the CICS logs, the PSW, and the register values, but found no clue as to why the MVCL instruction causes a 0C1 sporadically.
Can you please suggest what else to check, what could be the cause of this rare issue and what could be the solution?
As a temporary solution, we newcopy / phase in the load module and it solves the issue.

Robert Sample · Posted: Fri Feb 12, 2016 1:05 am

The S0C1 may be caused by something else (such as table overflowing into code) and have nothing to do with the MVCL. You may have to use one -- or more -- traces in CICS to figure out what, exactly, is happening. The fact that a newcopy resolves the problem points to some kind of storage issue causing the operation exception.

And WHY are you moving 320KB of data in a CICS program? CICS programs should be transaction oriented and hence deal with no more than a few K at most.

Transient issues, especially S0C1 and S0C4 ABENDs, are notoriously hard to resolve since they frequently are caused by something far removed from where the problem becomes apparent.

Bill O'Boyle · Posted: Fri Feb 12, 2016 5:29 am

Is there anything in the DFHTACB? The registers at the time of the dump begin at X'60' off the DFHTACB.

Mainak_Dalal · New User Joined: 05 May 2010 Posts: 19 Location: USA

Robert, the program handles compression and decompression of a very large VSAM file that stores the transaction history of the last 20+ years. It initializes a large array, hence the 320 KB move through MVCL. It looks to me like some kind of storage overlay. I will try to trace it. The problem is that it happens very infrequently, and the programs involved have not been changed in several years.

Bill, I will check the DFHTACB and will keep you posted.

Thanks for your suggestions.

Bill Woodger · Posted: Fri Feb 12, 2016 4:15 pm

Note that the execution of an MVCL cannot cause a S0C1.

Your program code must have been overwritten, or a wild branch has entered part-way through an instruction.

Most likely the former, since the problem disappears with a new executable loaded.

As Robert has said, this is not necessarily easy to track. It will be a program that was changed immediately (where immediately is some timespan) before the failures started to occur. It need have nothing to do with the program that is failing. But it might, and that is probably more likely than for other programs. When was the program last changed? What was the change?

Mainak_Dalal · New User Joined: 05 May 2010 Posts: 19 Location: USA

When I checked the memory dump, I saw missing module names.

Robert Sample · Posted: Mon Feb 15, 2016 10:20 pm

FWIW, X'27C5E7D2' is an X'27' followed by EXK -- perhaps the second half of a length followed by the starting bytes of a field? If you don't have a dump from the region, get one and look for those bytes in the dump.
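To make that reading of the bytes concrete, here is a minimal Python sketch that renders a hex string the way a dump's character column would and scans a byte string for the pattern. The function names are hypothetical, and code page cp037 is an assumption; the region may use a different EBCDIC code page.

```python
def ebcdic_view(raw):
    """Show bytes as EBCDIC text, with '.' for anything non-printable."""
    text = raw.decode("cp037")  # assumption: cp037; other EBCDIC pages differ slightly
    return "".join(ch if ch.isprintable() else "." for ch in text)


def find_all(haystack, needle):
    """Return every offset at which needle occurs in haystack."""
    offsets = []
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = haystack.find(needle, pos + 1)
    return offsets


word = bytes.fromhex("27C5E7D2")
print(word.hex(" ").upper(), "->", ebcdic_view(word))
```

Given the raw bytes of a dump in a variable, `find_all(dump_bytes, word)` would list the offsets worth inspecting; how the dump is read into bytes is left out because it depends on the dump format.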

If the problem only started last year, could it be date-related? Have you pulled together a list of occurrences to look for commonalities (same day of month, Julian day incremented by some constant, that type of thing)?
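One way to look for such commonalities is to tabulate a few calendar features of each occurrence. The sketch below is only an illustration: the function name is hypothetical and the sample dates are made up, not taken from this thread.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def summarize(abend_dates):
    """Tabulate simple calendar features of each occurrence."""
    days = sorted(abend_dates)
    return {
        "day_of_month": Counter(d.day for d in days),
        "weekday": Counter(WEEKDAYS[d.weekday()] for d in days),
        "julian_day": [d.timetuple().tm_yday for d in days],
        # Days between consecutive occurrences; a constant gap stands out here.
        "gap_days": [(b - a).days for a, b in zip(days, days[1:])],
    }


# Hypothetical occurrence dates, for illustration only.
sample = [date(2015, 11, 2), date(2015, 12, 7), date(2016, 1, 4), date(2016, 2, 1)]
for name, value in summarize(sample).items():
    print(name, value)
```

Replacing `sample` with the real abend dates would show at a glance whether the failures cluster on a weekday, a day of the month, or a fixed interval.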

As we said, these issues can be very tough to debug since you don't have much of a starting idea. Does the CICS region have storage protection turned on? Is the problem occurring in production AND test?

Mainak_Dalal · New User Joined: 05 May 2010 Posts: 19 Location: USA

Robert, it is only in production.

Bill Woodger · Posted: Mon Feb 15, 2016 11:49 pm

It is highly likely that you've trashed your program. You don't seem to understand yet what a S0C1 is. BALR cannot give you a S0C1. You can branch to a wild address and happen to get a S0C1, or your BALR can get overwritten and give an invalid instruction. What is the actual instruction pointed-to in the dump?

Mainak_Dalal · New User Joined: 05 May 2010 Posts: 19 Location: USA

Bill, the actual instruction in the dump is a BALR. Most likely it is causing a wild branch. Register 15 contains the address that you are seeing in DSA 3 in my screenshot, which has offset +27C5E7D2. But when I checked the dump, I did not see any content at this offset.

What confuses me is that the chain of programs is called from the prod CICS region hundreds of times a day, but they seldom abend, and when they do it is mostly on the first call after the region job is restarted. Our shop has AUTOINSTALL, so the first call to the program loads the program and does a phase in. The majority of the S0C1 abends with this load module happen when a massive VSAM file is being accessed.

PeterHolland · Posted: Tue Feb 16, 2016 2:42 pm

Maybe the following link can help you pinpoint your problem:

194.196.36.29/support/knowledgecenter/SSGMCP_4.2.0/com.ibm.cics.ts.doc/dfhs1/topics/dfhs14h.html

Bill Woodger · Posted: Tue Feb 16, 2016 4:37 pm

Bill O'Boyle · Posted: Tue Feb 16, 2016 5:23 pm

As Bill has said, it would be safe to assume that your compiler supports the BALR instruction. But, you had said from the start that it was an MVCL? Under certain circumstances, COBOL will bypass the in-line MVCL and CALL (BALR) to a run-time routine to perform the MOVE.

Could you check the Assembler expansion? To the right, where R15 is loaded with the routine's address, the name will/should be present. If this is one of the routines which didn't resolve during link-edit, then I think you'll have your answer.

Or (as an exercise) divide the length of the target storage-area by 256, get the quotient and use reference modification in an in-line PERFORM UNTIL, moving 256 bytes at a time.

When the PERFORM UNTIL completes, if you had a remainder after the divide, move it with a stand-alone reference-modification MOVE whose starting position is "Quotient" times 256 plus 1 and whose length-value is the remainder; or you can omit the length-value and the compiler will calculate it.

This would be a process of elimination, overriding the MVCL issue.
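The quotient/remainder arithmetic described above can be sketched as follows. This is a Python model of the positions only, not COBOL, and it says nothing about which instructions the compiler generates; the function names and the 772-byte sample are hypothetical.

```python
CHUNK = 256


def chunk_plan(total_len, chunk=CHUNK):
    """Return 1-based (start, length) pairs, as used in reference modification."""
    quotient, remainder = divmod(total_len, chunk)
    plan = [(i * chunk + 1, chunk) for i in range(quotient)]
    if remainder:
        # Stand-alone final move: starts at quotient * chunk + 1.
        plan.append((quotient * chunk + 1, remainder))
    return plan


def chunked_move(source, target):
    """Copy source into target piece by piece, following chunk_plan."""
    for start, length in chunk_plan(len(source)):
        lo = start - 1  # convert the 1-based position to a Python slice offset
        target[lo:lo + length] = source[lo:lo + length]


src = bytes(range(256)) * 3 + b"tail"  # 772 bytes: 3 full chunks plus 4 left over
dst = bytearray(len(src))
chunked_move(src, dst)
print(chunk_plan(len(src))[-1], bytes(dst) == src)
```

For a 320 KB area (327,680 bytes) the plan has 1,280 full chunks and no remainder, so the stand-alone final MOVE would only matter if the area length is not a multiple of 256.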