It appears that CM00L540 is the villain. Looking at the source code, I see a copybook that matches the overlaid storage:
05 WS-TS-QUEUE-5 PIC X(04) VALUE 'TSQ5'.
05 WS-TS-QUEUE-6 PIC X(04) VALUE 'TSQ6'.
05 WS-CMMSD124 PIC X(08) VALUE 'CMMSD124'.
05 WS-CMMSALL PIC X(08) VALUE 'CMMSALL '.
05 WS-CMMSCLM PIC X(08) VALUE 'CMMSCLM '.
05 WS-CMMSACCS PIC X(08) VALUE 'CMMSACCS'.
05 WS-VALIDALL PIC X(08) VALUE 'VALIDALL'.
C67202 05 WS-VALIDGRP PIC X(08) VALUE 'VALIDGRP'.
C67202 05 WS-ENTR PIC X(04) VALUE 'ENTR'.
C67202 05 WS-NOTE PIC X(04) VALUE 'NOTE'.
05 WS-PLUS-100 PIC S9(03) VALUE +100 COMP-3.
C01325 05 WS-CLAIM-AREA-REQUEST-VALUES.
C01325 10 WS-GETMAIN-READ-REQUEST PIC X(01) VALUE '1'.
C01325 10 WS-SAVE-IN-TS-REQUEST PIC X(01) VALUE '2'.
C01325 10 WS-FREEMAIN-REQUEST PIC X(01) VALUE '3'.
C01325 10 WS-GETMAIN-REQUEST PIC X(01) VALUE '4'.
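To compare a copybook like this against the bytes in a dump, it can help to build the storage image the VALUE clauses would produce. The following is only a rough sketch: it assumes the fields are laid out contiguously with no slack bytes, that the region uses EBCDIC code page 037, and that `pack_comp3` and `expected_image` are hypothetical helper names, not part of any CICS or COBOL tooling.

```python
# Hypothetical helper: rebuild the byte image implied by the VALUE clauses above.
# Assumptions: contiguous fields, no SYNCHRONIZED slack bytes, EBCDIC code page 037.

CHAR_FIELDS = [
    ("WS-TS-QUEUE-5", "TSQ5"),
    ("WS-TS-QUEUE-6", "TSQ6"),
    ("WS-CMMSD124", "CMMSD124"),
    ("WS-CMMSALL", "CMMSALL "),
    ("WS-CMMSCLM", "CMMSCLM "),
    ("WS-CMMSACCS", "CMMSACCS"),
    ("WS-VALIDALL", "VALIDALL"),
    ("WS-VALIDGRP", "VALIDGRP"),
    ("WS-ENTR", "ENTR"),
    ("WS-NOTE", "NOTE"),
]


def pack_comp3(value, digits):
    """Pack a signed integer as COMP-3: one digit per nibble, sign in the last nibble."""
    sign = 0x0C if value >= 0 else 0x0D
    text = str(abs(value)).zfill(digits)
    if len(text) % 2 == 0:
        # digits plus the sign nibble must fill whole bytes
        text = "0" + text
    nibbles = [int(c) for c in text] + [sign]
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


def expected_image(codepage="cp037"):
    """Return the bytes the copybook fields would occupy, in declaration order."""
    image = b"".join(text.encode(codepage) for _, text in CHAR_FIELDS)
    image += pack_comp3(100, 3)           # WS-PLUS-100 PIC S9(03) COMP-3
    image += "1234".encode(codepage)      # the four one-byte request values
    return image


if __name__ == "__main__":
    # Print the image as hex so it can be compared by eye with the dump.
    print(expected_image().hex())
```

The hex string can then be compared against the overlaid area in the formatted dump. A match on the literals only shows where the data came from, not which instruction moved it there.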
Now I look at the compiled listing. We use MicroFocus' APS and it adds a few generated flags:
01 TEXT-MSG PIC X(30)
VALUE 'PLEASE ENTER NEXT TRANSID'.
02 TRUX PIC X VA
88 ALWAYS VA
88 NEVER VA
02 FALSX PIC X VA
02 DEBIT-CREDIT-ADJ-FLG PIC X.
88 DEBIT-CREDIT-ADJ VA
02 DUPLICATE-OVRD-FOUND-FLG PIC X.
88 DUPLICATE-OVRD-FOUND VA
01 dfhb0040 comp pic s9(8) is global.
01 dfhb0041 comp pic s9(8) is global.
01 dfhb0042 comp pic s9(8) is global.
01 dfhb0043 comp pic s9(8) is global.
01 dfhb0044 comp pic s9(8) is global.
The COBOL translator is adding the dfhb004* definitions, and it looks like something is writing past its storage. The program never references any of the dfhb004* variables. Is it possible that these generated fields are the culprit?
Joined: 06 Jun 2008 Posts: 8217 Location: Dubuque, Iowa, USA
From the manual (with emphasis added by me):
CICS has detected a storage violation
CICS® can detect storage violations when:
1. The duplicate storage accounting area (SAA®) or the initial SAA of a TIOA storage element has become corrupted.
2. The leading storage check zone or the trailing storage check zone of a user-task storage element has become corrupted.
CICS detects storage violations involving TIOAs by checking the SAA chains when it receives a command to FREEMAIN an individual element of TIOA storage, at least as far as the target element. It also checks the chains when it FREEMAINs the storage belonging to a TCTTE after the last output has taken place. CICS detects storage violations involving user-task storage by checking the storage check zones of an element of user-task storage when it receives a command to FREEMAIN that element of storage. It also checks the chains when it FREEMAINs all the storage belonging to a task when the task ends.
The storage violation is detected not at the time it occurs, but only when the SAA chain or the storage check zones are checked. This is illustrated in Figure 18, which shows the sequence of events when CICS detects a violation of a user task storage element. The sequence is the same when CICS detects a violation of a TIOA storage element.
The fact that the SAA or storage check zone is overlaid some time before it is detected does not matter too much for user storage where the trailing storage check zone has been overlaid, because the transaction whose storage has been violated is also very likely to be the one responsible for the violation. It is fairly common for transactions to write data beyond the end of the allocated area in a storage element and into the check zone. This is the cause of the violation in Figure 18.
The situation could be more serious if the leading check zone has been overlaid, because in that case it could be that some other unrelated transaction was to blame. However, storage elements belonging to individual tasks are likely to be more or less contiguous, and overwrites could extend beyond the end of one element and into the next.
If the leading storage check zone was only overwritten by chance by some other task, the problem might not be reproducible. On other occasions, other parts of storage might be affected. If you have this sort of problem, you need to investigate it as though CICS had not detected it, using the techniques of Storage violations that affect innocent transactions.
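The delayed detection described in the manual text can be illustrated with a toy model. This is not CICS code and does not reflect the real layout of check zones or SAAs; `ToyStorage`, its methods, and the `CHKZONE!` marker are all made up for illustration. The only point is that an unchecked write damages a zone silently, and the damage surfaces later when that element is freed.

```python
# Toy model of leading/trailing check zones. Everything here is hypothetical:
# real CICS check zones have different content and are managed by the storage manager.

ZONE = b"CHKZONE!"  # made-up 8-byte marker


class ToyStorage:
    def __init__(self, size):
        self.mem = bytearray(size)
        self.elements = {}   # name -> (data_start, data_length)
        self.next_free = 0

    def getmain(self, name, length):
        """Allocate an element bracketed by a leading and a trailing check zone."""
        start = self.next_free
        data = start + 8
        self.mem[start:data] = ZONE
        self.mem[data + length:data + length + 8] = ZONE
        self.elements[name] = (data, length)
        self.next_free = data + length + 8
        return data

    def write(self, address, payload):
        """Unchecked write, like a MOVE driven by a bad length or subscript."""
        self.mem[address:address + len(payload)] = payload

    def freemain(self, name):
        """Check zones are only examined here, long after any overlay occurred."""
        data, length = self.elements.pop(name)
        lead = bytes(self.mem[data - 8:data])
        trail = bytes(self.mem[data + length:data + length + 8])
        if lead != ZONE or trail != ZONE:
            raise RuntimeError(f"storage violation detected while freeing {name}")


def demo():
    storage = ToyStorage(256)
    a = storage.getmain("task-A", 16)
    storage.getmain("task-B", 16)
    storage.write(a, b"X" * 20)  # 4 bytes too long: runs into task-A's trailing zone
    events = ["overlay happened, nothing detected yet"]
    for name in ("task-B", "task-A"):
        try:
            storage.freemain(name)
            events.append(f"{name} freed cleanly")
        except RuntimeError as err:
            events.append(str(err))
    return events


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this sketch the overrun stays inside task-A's own trailing zone, which corresponds to the common case the manual describes. A longer write would also reach the leading zone of the adjacent element, which is the case where an unrelated task takes the abend.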
To recap: there is essentially no chance that the compiler-generated fields you suspect are causing your problem. There may be a rogue transaction in your CICS region causing this problem, or your program could be doing it -- without extensive debugging (and probably going through trace output), you're not going to know which it is. This is a case where I strongly recommend you contact your CICS support group and use their resources to assist in figuring out the error -- especially since it may not be in your program at all.
Joined: 16 Apr 2008 Posts: 104 Location: South Carolina
I am in the CICS systems group. I am trying to debug this for an application area. I am trying to increase the trace table, but this program runs for so long that I cannot get a full trace of everything it is doing, or of the getmain.
I am wondering if the program that was beneath this task in storage might have gone out the bottom of a table and overwrote the top of itself and the bottom of this task...
Joined: 06 Jun 2008 Posts: 8217 Location: Dubuque, Iowa, USA
valyk, again I emphasize -- the transaction with the storage abend may have absolutely nothing to do with the storage overlay; it could be any transaction running in the CICS region. There is little, if any, chance of memory wrap such as you mentioned -- too much system data in high memory (and low memory) to have any belief in that. The CICS region probably would have crashed completely long before the SAA overlay was discovered if a memory wrap was occurring.
Ideas from the manual I referenced in my last post:
If you received a transaction abend message, read What the transaction abend message can tell you. Otherwise, go on to What the CICS system dump can tell you.
What the transaction abend message can tell you
If you get a transaction abend message, it is very likely that CICS detected the storage violation when it was attempting to satisfy a FREEMAIN request for user storage. Make a note of the information the message contains, including:
* The transaction abend code.
* The identity of the transaction whose storage has been violated.
* The identity of the program running at the time the violation was detected.
* The identity of the terminal at which the task was started.
Because CICS does not detect the overlay at the time it occurs, the program identified in the abend message probably is not the one in error. However, it is likely that it issued the FREEMAIN request on which the error was detected. One of the other programs in the abended transaction might have violated the storage in the first place.
What the CICS system dump can tell you
Before looking at the system dump, you must format it using the appropriate formatting keywords. The ones you need for investigating storage violations are:
* TR, to get you the internal trace table
* TCP, to get you terminal-related areas
* AP, to get you the TCAs and user storage.
The dump formatting program reports the damaged storage check zone or SAA chain when it attempts to format the storage areas, and this can help you with diagnosis by identifying the TCA or TCTTE owning the storage.
When you have formatted the dump, take a look at the data overlaying the SAA or storage check zone to see if its nature suggests which program put it there. There are two places you can see this, one being the exception trace entry in the internal trace table, and the other being the violated area of storage itself. Look first at the exception trace entry in the internal trace table to check that it shows the data overlaying the SAA or storage check zone. Does the data suggest what program put it there? Remember that the program is likely to be part of the violated transaction in the case of user storage. For terminal storage, you probably have more than one transaction to consider.
As the SAAs and storage check zones are only 8 bytes long, there might not be enough data for you to identify the program. In this case, find the overlaid data in the formatted dump. The area is pointed to in the diagnostic message from the dump formatting program. The data should tell you what program put it there, and, more importantly, what part of the program was being executed when the overlay occurred.
If the investigations you have done so far have enabled you to find the cause of the overlay, you should be able to fix the problem.
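As a rough aid for the "does the data suggest what program put it there?" step, the overlaid bytes can be scanned for literals that appear in candidate programs. This is a sketch only: `find_literals` is a hypothetical helper, the sample bytes are invented, and it assumes the dump data is EBCDIC code page 037. It is not a feature of the CICS dump formatter.

```python
# Hypothetical helper: look for known program literals inside overlaid bytes.
# Assumption: the dump bytes are EBCDIC (code page 037).

def find_literals(dump_bytes, literals, codepage="cp037"):
    """Return sorted (offset, literal) pairs for every occurrence found."""
    hits = []
    for text in literals:
        needle = text.encode(codepage)
        pos = dump_bytes.find(needle)
        while pos != -1:
            hits.append((pos, text))
            pos = dump_bytes.find(needle, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Invented sample: 8 bytes of zeros followed by three literals from a copybook.
    sample = b"\x00" * 8 + "TSQ5TSQ6CMMSD124".encode("cp037")
    for offset, literal in find_literals(sample, ["TSQ6", "CMMSD124", "VALIDALL"]):
        print(f"offset {offset:#06x}: {literal}")
```

Because a check zone is only 8 bytes, a literal may be cut off at the zone boundary, so it is worth scanning the surrounding storage as well as the zone itself.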
What to do if you cannot find what is overlaying the SAA
The technique described in this section enables you to locate the code responsible for the error by narrowing your search to the sequence of instructions executing between the last two successive old-style trace entries in the trace table.
You do this by forcing CICS to check the SAA chain of terminal storage and the storage check zones of user-task storage every time an old-style trace entry is made from AP domain. These types of trace entry have point IDs of the form AP 00xx, "xx" being two hexadecimal digits. Storage chain checking is not done for new-style trace entries from AP domain or any other domain. (For a discussion of old and new-style trace entries, see Using traces in problem determination.)
The procedure has a significant processing overhead, because it involves a large amount of tracing. You are likely to use it only when you have had no success with other methods.
How you can force storage chain checking
You can force storage chain checking either by using the CSFE DEBUG transaction, or by using the CHKSTSK or CHKSTRM system initialization parameter. Tracing must also be active, or CICS will do no extra checking. The CSFE transaction has the advantage that you need not bring CICS down before you can use it.
Joined: 14 Jan 2008 Posts: 2504 Location: Atlanta, Georgia, USA
On the FREEMAIN, is this explicitly issued by the program or implicitly issued by CICS?
If it's implicit, then most likely it's occurring near task termination and CICS is attempting to free storage which had been implicitly acquired previously by CICS, such as WS, commarea, etc.
As Robert has said, it may not be the program within your task that's causing the SV as you could be suffering from "Sympathy Sickness", where another task in the mix at the time trashed your task's SAA/SCZ, which I wouldn't rule out.
Is this the program invoking a FREEMAIN, or CICS doing the housekeeping?
The reason why I think it is this program is that I have looked at three storage violations in the past two days (this is happening several times during the day in about 8 AORs) and the overlaid storage is identical. It looks like several DB2 rows being overlaid in this program.