IBM Mainframe Forum Index
 
 

working on a storage violation...


IBM Mainframe Forums -> CICS
valyk

Active User


Joined: 16 Apr 2008
Posts: 104
Location: South Carolina

Posted: Wed Dec 16, 2009 7:21 pm

We are receiving a storage violation that is kicking my butt:

Code:

SM 0D11 SMMF  *EXC* - Storage_check_failed_on_freemain_request - FUNCTION(FREE

              TASK-00258 KE_NUM-005E TCB-QR   /007F8270 RET-BC2FFC22 TIME-17:1
                1-0000  00800000 00000011 00000000 00000000  B4090000 00000000
                  0020  3C771CA0 3F1F21F8 00000002 3BBA4070  01000000 01551101
                  0040  00000001 01000000 A0000000 00000000  3C771FB4 0000003F
                  0060  3C771FB4 0010505C 3D259340 800A698C  3BBFE700 3BB42058
                2-0000  3F1F21F0
                3-0000  E4F0F0F0 F0F2F5F8 C8C1D5C3 3F1F3518  3F1E5EA8 00000000
                  0020  00001308 00000000 3F1F21F8 000012E8  000012DB 00000000
                  0040  00000000 00000000 00000000 00000000  00000000 00000000
                  0060  00000000 00000000 00000000 00000000  00000000 00000000
                4-0000  40404040 40404040 40404040 40404040



Code:

** DFHPD0124  Storage violation detected at 3F1F3510. Leading SAA is invalid.


Code:

3F1F3510.:3F1F351F.--All bytes contain X'40', C' '
3F1F3520   3F1E5EA8   40404040   40404040   40404040   | ..;y             |
3F1F3530.:3F1F354F.--All bytes contain X'40', C' '
3F1F3550   40404040   40404040   00004040   40404040   |         ..       |
3F1F3560.:3F1F35EF.--All bytes contain X'40', C' '
3F1F35F0   40404040   40404040   E2D8D3C3   C1404040   |         SQLCA    |
3F1F3600   00000088   00000064   00004040   40404040   | ...h......       |
3F1F3610.:3F1F364F.--All bytes contain X'40', C' '
3F1F3650   C4E2D5E7   D9C6C640   FFFFFF92   00000000   | DSNXRFF ...k.... |
3F1F3660   00000000   FFFFFFFF   00000000   00000000   | ................ |
3F1F3670   40404040   40404040   404040F0   F2F0F0F0   |            02000 |
3F1F3680   D7F8F1F0   40404040   40404040   40404040   | P810             |
3F1F3690.:3F1F36CF.--All bytes contain X'40', C' '
3F1F36D0   40404040   40404040   F8F1F040   40404040   |         810      |
3F1F36E0.:3F1F370F.--All bytes contain X'40', C' '
3F1F3710   F8F1F0D6   C3C54040   40404040   40404040   | 810OCE           |
3F1F3720.:3F1F372F.--All bytes contain X'40', C' '
3F1F3730   40404040   40F2F0F0   F960F1F2   60F0F340   |      2009-12-03  |
3F1F3740.:3F1F37BF.--All bytes contain X'40', C' '
3F1F37C0   40404040   40404040   00000000   00000000   |         ........ |
3F1F37D0.:3F1F3C3F.--All bytes contain X'00'
3F1F3C40   00010000   00000000   00284000   001EC3D4   | .......... ...CM |


Paging a few lines up, we see part of a working storage section of a program.

Code:

3F1F29F0.:3F1F2A0F.--All bytes contain X'40', C' '
3F1F2A10   40400000   00000000   00000000   00000000   |   .............. |
3F1F2A20.:3F1F2A7F.--All bytes contain X'00'
3F1F2A80   00000000   00000000   C4C1E3C1   C2C1E2C5   | ........DATABASE |
3F1F2A90   40C5D9D9   D6D94060   40404040   40404040   |  ERROR -         |
3F1F2AA0   4040C3C1   D3D340C8   C5D3D740   C4C5E2D2   |   CALL HELP DESK |
3F1F2AB0   40404040   40404040   40404000   00000000   |            ..... |
3F1F2AC0   00000000   00000000   00000000   00E3E2D8   | .............TSQ |
3F1F2AD0   F1E3E2D8   F2E3E2D8   F3E3E2D8   F4E3E2D8   | 1TSQ2TSQ3TSQ4TSQ |
3F1F2AE0   F5E3E2D8   F6C3D4D4   E2C4F1F2   F4C3D4D4   | 5TSQ6CMMSD124CMM |
3F1F2AF0   E2C1D3D3   40C3D4D4   E2C3D3D4   40C3D4D4   | SALL CMMSCLM CMM |
3F1F2B00   E2C1C3C3   E2E5C1D3   C9C4C1D3   D3E5C1D3   | SACCSVALIDALLVAL |
3F1F2B10   C9C4C7D9   D7C5D5E3   D9D5D6E3   C5100CF1   | IDGRPENTRNOTE..1 |
3F1F2B20   F2F3F400   00000000   D7D3C5C1   E2C540C5   | 234.....PLEASE E |
3F1F2B30   D5E3C5D9   40D5C5E7   E340E3D9   C1D5E2C9   | NTER NEXT TRANSI |
3F1F2B40   C4404040   40400000   E3C6C6C6   00000000   | D     ..TFFF.... |
3F1F2B50.:3F1F2CCF.--All bytes contain X'00'
3F1F2CD0   00000000   00000000   40404040   40404040   | ........         |
3F1F2CE0.:3F1F351F.--All bytes contain X'40', C' '
         ^-- LEADING SAA SHOULD HAVE BEEN IN THIS ADDRESS RANGE
3F1F3520   3F1E5EA8   40404040   40404040   40404040   | ..;y             |
3F1F3530.:3F1F354F.--All bytes contain X'40', C' '
3F1F3550   40404040   40404040   00004040   40404040   |         ..       |
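
A side note on reading these dumps: the character column on the right is just the hex words decoded as EBCDIC. A minimal Python sketch, assuming code page 037 (the dump does not say which EBCDIC code page is in use) and a made-up helper name `parse_dump_line`, reproduces that column for data lines in the four-word layout shown above. It does not handle the "All bytes contain" summary lines.

```python
def parse_dump_line(line):
    """Split one formatted dump data line into (address, decoded text).

    Assumes the layout 'ADDRESS  WORD WORD WORD WORD  | chars |' and
    EBCDIC code page 037; both are assumptions for illustration.
    """
    hex_part = line.split("|")[0]          # drop the character column
    tokens = hex_part.split()              # address followed by hex words
    address = int(tokens[0], 16)
    raw = bytes.fromhex("".join(tokens[1:]))
    text = raw.decode("cp037")
    # Show unprintable bytes as '.', the same way the dump formatter does.
    printable = "".join(ch if ch.isprintable() else "." for ch in text)
    return address, printable


address, text = parse_dump_line(
    "3F1F35F0   40404040   40404040   E2D8D3C3   C1404040   |         SQLCA    |"
)
print(hex(address), repr(text))
```

Decoding the words directly is handy when only the hex is available, for example in the exception trace entry, which has no character column.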


Next, I locate the previous SAA:

Code:

3F1F21F0   E4F0F0F0   F0F2F5F8   C8C1D5C3   3F1F3518   | U0000258HANC.... |
3F1F2200   3F1E5EA8   00000000   3F1F21F8   00000000   | ..;y.......8.... |
3F1F2210   00001308   00000000   3F1F21F8   000012E8   | ...........8...Y |
3F1F2220   000012DB   00000000   00000000   00000000   | ................ |
3F1F2230.:3F1F223F.--All bytes contain X'00'
3F1F2240   00000000   00000000   C9C7E9E2   D9E3C3C4   | ........IGZSRTCD |
3F1F2250.:3F1F225F.--All bytes contain X'00'
3F1F2260   00000000   00000000   E2E8E2D6   E4E34040   | ........SYSOUT   |
3F1F2270   00000000   00000000   0E000000   00000000   | ................ |
3F1F2280   0F000000   00000000   00000000   00000000   | ................ |
3F1F2290.:3F1F229F.--All bytes contain X'40', C' '
3F1F22A0   40404040   40404040   40404040   40400000   |               .. |
3F1F22B0   5C5C5C40   C3D4F0F0   D3F5F4F0   40E6E240   | *** CM00L540 WS  |
3F1F22C0   C2C5C7C9   D5E2405C   5C5C0000   00000000   | BEGINS ***...... |
3F1F22D0   C3D4F0F0   D3F5F4F0   D4C6E4D3   D3F8C4E4   | CM00L540MFULL8DU |
3F1F22E0   C2C4C4F6   D7C7C200   F0C4F6D7   C7C2F040   | BDD6PGB.0D6PGB0  |
3F1F22F0   40404040   F0404040   4040F040   40404040   |     0     0      |
3F1F2300   F0404040   4040F040   40404040   F0404040   | 0     0     0    |
3F1F2310   4040F040   40404040   F0404040   4040F040   |   0     0     0  |
3F1F2320   40404040   00000000   C5D4C3F2   C4F8F3F0   |     ....EMC2D830 |
3F1F2330   F7C4F6D7   C7C2C4C1   E4E3C8C4   F8C8E2E3   | 7D6PGBDAUTHD8HST |
3F1F2340   C4404040   40400000   00000000   00000000   | D     .......... |


It appears that CM00L540 is the villain. Looking at the source code, I see a copybook that matches the overlaid storage:

Code:

           05  WS-TS-QUEUE-5              PIC X(04) VALUE 'TSQ5'.
           05  WS-TS-QUEUE-6              PIC X(04) VALUE 'TSQ6'.
      *****************************************************************
           05  WS-CMMSD124                PIC X(08) VALUE 'CMMSD124'.
           05  WS-CMMSALL                 PIC X(08) VALUE 'CMMSALL '.
           05  WS-CMMSCLM                 PIC X(08) VALUE 'CMMSCLM '.
           05  WS-CMMSACCS                PIC X(08) VALUE 'CMMSACCS'.
           05  WS-VALIDALL                PIC X(08) VALUE 'VALIDALL'.
C67202     05  WS-VALIDGRP                PIC X(08) VALUE 'VALIDGRP'.
C67202     05  WS-ENTR                    PIC X(04) VALUE 'ENTR'.
C67202     05  WS-NOTE                    PIC X(04) VALUE 'NOTE'.
           05  WS-PLUS-100                PIC S9(03) VALUE +100 COMP-3.
C01325*****************************************************************
C01325     05  WS-CLAIM-AREA-REQUEST-VALUES.
C01325         10  WS-GETMAIN-READ-REQUEST PIC X(01) VALUE '1'.
C01325         10  WS-SAVE-IN-TS-REQUEST   PIC X(01) VALUE '2'.
C01325         10  WS-FREEMAIN-REQUEST     PIC X(01) VALUE '3'.
C01325         10  WS-GETMAIN-REQUEST      PIC X(01) VALUE '4'.
C01325*****************************************************************
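
One way to double-check that the storage in the dump really is this copybook is to look for the non-display fields. WS-PLUS-100 is COMP-3, and the dump line at 3F1F2B10 shows X'100C' right after NOTE, followed by F1 F2 F3 F4 for the four request values. The packing rule can be sketched in Python as below; `pack_comp3` is a hypothetical helper, and it assumes the usual sign nibbles of C for positive and D for negative.

```python
def pack_comp3(value, digits):
    """Pack an integer as signed packed decimal (COMP-3) with `digits` digits.

    Sketch only: assumes sign nibble C for positive and D for negative.
    """
    text = str(abs(value)).zfill(digits)
    if len(text) > digits:
        raise ValueError("value does not fit in the given number of digits")
    sign = 0xC if value >= 0 else 0xD
    nibbles = [int(ch) for ch in text] + [sign]
    if len(nibbles) % 2:                  # pad to a whole number of bytes
        nibbles.insert(0, 0)
    return bytes(
        (nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2)
    )


# PIC S9(03) VALUE +100 COMP-3
print(pack_comp3(100, 3).hex().upper())
```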


Now I look at the compiled listing. We use Micro Focus APS, which adds a few generated flags:

Code:

               01  TEXT-MSG                    PIC X(30)
                           VALUE 'PLEASE ENTER NEXT TRANSID'.

               01  GENERATED-FLAGS.
                   02  TRUX                                    PIC X VA
                       88  ALWAYS                                    VA
                       88  NEVER                                     VA
                   02  FALSX                                   PIC X VA
                   02  DEBIT-CREDIT-ADJ-FLG                    PIC X.
                       88  DEBIT-CREDIT-ADJ                          VA
                   02  DUPLICATE-OVRD-FOUND-FLG                PIC X.
                       88  DUPLICATE-OVRD-FOUND                      VA



               01   dfhb0040  comp pic s9(8) is global.
               01   dfhb0041  comp pic s9(8) is global.
               01   dfhb0042  comp pic s9(8) is global.
               01   dfhb0043  comp pic s9(8) is global.
               01   dfhb0044  comp pic s9(8) is global.


The COBOL translator is adding the dfhb004* definitions, and it looks like it is somehow writing past its storage. The program never references any of the dfhb004* variables. Is it possible that this is the culprit?
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Wed Dec 16, 2009 7:33 pm

From the manual (with emphasis added by me):
Quote:
CICS has detected a storage violation

CICS® can detect storage violations when:

1. The duplicate storage accounting area (SAA®) or the initial SAA of a TIOA storage element has become corrupted.
2. The leading storage check zone or the trailing storage check zone of a user-task storage element has become corrupted.

CICS detects storage violations involving TIOAs by checking the SAA chains when it receives a command to FREEMAIN an individual element of TIOA storage, at least as far as the target element. It also checks the chains when it FREEMAINs the storage belonging to a TCTTE after the last output has taken place. CICS detects storage violations involving user-task storage by checking the storage check zones of an element of user-task storage when it receives a command to FREEMAIN that element of storage. It also checks the chains when it FREEMAINs all the storage belonging to a task when the task ends.

The storage violation is detected not at the time it occurs, but only when the SAA chain or the storage check zones are checked. This is illustrated in Figure 18, which shows the sequence of events when CICS detects a violation of a user task storage element. The sequence is the same when CICS detects a violation of a TIOA storage element.

The fact that the SAA or storage check zone is overlaid some time before it is detected does not matter too much for user storage where the trailing storage check zone has been overlaid, because the transaction whose storage has been violated is also very likely to be the one responsible for the violation. It is fairly common for transactions to write data beyond the end of the allocated area in a storage element and into the check zone. This is the cause of the violation in Figure 18.

The situation could be more serious if the leading check zone has been overlaid, because in that case it could be that some other unrelated transaction was to blame. However, storage elements belonging to individual tasks are likely to be more or less contiguous, and overwrites could extend beyond the end of one element and into the next.

If the leading storage check zone was only overwritten by chance by some other task, the problem might not be reproducible. On other occasions, other parts of storage might be affected. If you have this sort of problem, you need to investigate it as though CICS had not detected it, using the techniques of Storage violations that affect innocent transactions.

To recap: there is essentially no chance that the compiler-generated fields you suspect are causing your problem. There may be a rogue transaction in your CICS region causing this problem, or your program could be doing it -- without extensive debugging (and probably going through trace output), you're not going to know which it is. This is a case where I strongly recommend you contact your CICS support group and use their resources to assist in figuring out the error -- especially since it may not be in your program at all.
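
The key point in the quoted text, that the overlay is only noticed when the zones are checked at FREEMAIN time, can be illustrated with a toy model. The Python sketch below is not how CICS storage manager works internally. The class `ToyElement`, its methods, and the zone layout (a letter plus a seven-digit task number, encoded with code page 037) are all assumptions made for illustration.

```python
ZONE_LEN = 8


class ToyElement:
    """Toy storage element: leading zone + user area + trailing zone."""

    def __init__(self, task_number, length):
        self.zone = f"U{task_number:07d}".encode("cp037")
        self.area = bytearray(self.zone + bytes(length) + self.zone)

    def write(self, offset, data):
        # Deliberately no check against the user length, like a MOVE that
        # runs past the end of a table. Nothing is reported at this point.
        start = ZONE_LEN + offset
        self.area[start:start + len(data)] = data

    def freemain(self):
        # The zones are only inspected here, long after the bad write.
        lead = bytes(self.area[:ZONE_LEN])
        trail = bytes(self.area[-ZONE_LEN:])
        if lead != self.zone or trail != self.zone:
            raise RuntimeError("storage check failed on freemain request")


element = ToyElement(task_number=258, length=16)
element.write(0, b"\x40" * 20)      # 4 bytes too many; goes unnoticed
try:
    element.freemain()
except RuntimeError as error:
    print(error)
```

In this model the write that does the damage succeeds silently, and the error surfaces only in `freemain`, which is why the program named in the abend is not necessarily the one at fault.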
valyk

Active User


Joined: 16 Apr 2008
Posts: 104
Location: South Carolina

Posted: Wed Dec 16, 2009 7:44 pm

:) I am in the CICS systems group. I am trying to debug this for an application area. I am trying to increase the trace table, but this program runs for so long that I cannot get a full trace of everything that it is doing, or of the getmain.

I am wondering if the program that was beneath this task in storage might have gone out the bottom of a table and overwrote the top of itself and the bottom of this task...
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Wed Dec 16, 2009 7:49 pm

valyk, again I emphasize -- the transaction with the storage abend may have absolutely nothing to do with the storage overlay; it could be any transaction running in the CICS region. There is little, if any, chance of memory wrap such as you mentioned -- too much system data in high memory (and low memory) to have any belief in that. The CICS region probably would have crashed completely long before the SAA overlay was discovered if a memory wrap was occurring.

Ideas from the manual I referenced in my last post:
Quote:
If you received a transaction abend message, read What the transaction abend message can tell you. Otherwise, go on to What the CICS system dump can tell you.

What the transaction abend message can tell you

If you get a transaction abend message, it is very likely that CICS detected the storage violation when it was attempting to satisfy a FREEMAIN request for user storage. Make a note of the information the message contains, including:

* The transaction abend code.
* The identity of the transaction whose storage has been violated.
* The identity of the program running at the time the violation was detected.
* The identity of the terminal at which the task was started.

Because CICS does not detect the overlay at the time it occurs, the program identified in the abend message probably is not the one in error. However, it is likely that it issued the FREEMAIN request on which the error was detected. One of the other programs in the abended transaction might have violated the storage in the first place.

What the CICS system dump can tell you

Before looking at the system dump, you must format it using the appropriate formatting keywords. The ones you need for investigating storage violations are:

* TR, to get you the internal trace table
* TCP, to get you terminal-related areas
* AP, to get you the TCAs and user storage.

The dump formatting program reports the damaged storage check zone or SAA chain when it attempts to format the storage areas, and this can help you with diagnosis by identifying the TCA or TCTTE owning the storage.

When you have formatted the dump, take a look at the data overlaying the SAA or storage check zone to see if its nature suggests which program put it there. There are two places you can see this, one being the exception trace entry in the internal trace table, and the other being the violated area of storage itself. Look first at the exception trace entry in the internal trace table to check that it shows the data overlaying the SAA or storage check zone. Does the data suggest what program put it there? Remember that the program is likely to be part of the violated transaction in the case of user storage. For terminal storage, you probably have more than one transaction to consider.

As the SAAs and storage check zones are only 8 bytes long, there might not be enough data for you to identify the program. In this case, find the overlaid data in the formatted dump. The area is pointed to in the diagnostic message from the dump formatting program. The data should tell you what program put it there, and, more importantly, what part of the program was being executed when the overlay occurred.

If the investigations you have done so far have enabled you to find the cause of the overlay, you should be able to fix the problem.

What to do if you cannot find what is overlaying the SAA

The technique described in this section enables you to locate the code responsible for the error by narrowing your search to the sequence of instructions executing between the last two successive old-style trace entries in the trace table.

You do this by forcing CICS to check the SAA chain of terminal storage and the storage check zones of user-task storage every time an old-style trace entry is made from AP domain. These types of trace entry have point IDs of the form AP 00xx, "xx" being two hexadecimal digits. Storage chain checking is not done for new-style trace entries from AP domain or any other domain. (For a discussion of old and new-style trace entries, see Using traces in problem determination.)

The procedure has a significant processing overhead, because it involves a large amount of tracing. You are likely to use it only when you have had no success with other methods.

How you can force storage chain checking

You can force storage chain checking either by using the CSFE DEBUG transaction, or by using the CHKSTSK or CHKSTRM system initialization parameter. Tracing must also be active, or CICS will do no extra checking. The CSFE transaction has the advantage that you need not bring CICS down before you can use it.
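
Why checking at every trace entry narrows the search can be shown with a small simulation. This Python sketch is only an analogy and says nothing about how CHKSTSK or CHKSTRM are implemented. The function `first_corrupting_step`, the step functions, and the zone value are hypothetical names and data chosen for illustration.

```python
def first_corrupting_step(area, zone, steps):
    """Run each step, checking both zones after every one.

    Returns the index of the first step that leaves a zone damaged,
    or None if all steps leave the zones intact.
    """
    size = len(zone)
    for index, step in enumerate(steps):
        step(area)
        if bytes(area[:size]) != zone or bytes(area[-size:]) != zone:
            return index
    return None


def fill_exact(area):
    area[8:24] = b"\x40" * 16      # stays inside the 16-byte user area


def fill_too_far(area):
    area[8:28] = b"\x40" * 20      # runs 4 bytes into the trailing zone


def unrelated(area):
    area[8:12] = b"\x00" * 4       # harmless later activity


zone = "U0000258".encode("cp037")
area = bytearray(zone + bytes(16) + zone)
print(first_corrupting_step(area, zone, [fill_exact, fill_too_far, unrelated]))
```

With a check only at the end, all three steps would be suspects. With a check after each step, the damage is pinned to the interval between two consecutive checks, at the cost of doing far more checking.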
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Wed Dec 16, 2009 7:52 pm

On the FREEMAIN, is this explicitly issued by the program or implicitly issued by CICS?

If it's implicit, then most likely it's occurring near task termination and CICS is attempting to free storage which had been implicitly acquired previously by CICS, such as WS, commarea, etc.

As Robert has said, it may not be the program within your task that's causing the SV as you could be suffering from "Sympathy Sickness", where another task in the mix at the time trashed your task's SAA/SCZ, which I wouldn't rule out.

Bill
valyk

Active User


Joined: 16 Apr 2008
Posts: 104
Location: South Carolina

Posted: Wed Dec 16, 2009 8:08 pm

This is what is being displayed for the freemain:

Code:
*EXC* - Storage_check_failed_on_freemain_request - FUNCTION(FREEMAIN) ADDRESS(41459BA8) CALLER(EXEC) EXEC_KEY(USER)


Is this the program invoking a FREEMAIN, or CICS doing the housekeeping?

The reason why I think it is this program is that I have looked at three storage violations in the past two days (this is happening several times during the day in about 8 AORs) and the overlaid storage is identical. It looks like several DB2 rows being overlaid in this program.
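
Since the same exception shows up several times a day across several AORs, it may help to pull the keyword fields out of each exception trace line and compare them side by side. A minimal Python sketch, assuming every field has the KEYWORD(value) shape seen in the line above; `parse_exc_fields` is a made-up helper name, and the sketch does not interpret what the values mean.

```python
import re

# Matches KEYWORD(value) pairs such as ADDRESS(41459BA8).
FIELD_PATTERN = re.compile(r"(\w+)\(([^)]*)\)")


def parse_exc_fields(line):
    """Return the KEYWORD(value) pairs of a trace line as a dict."""
    return dict(FIELD_PATTERN.findall(line))


fields = parse_exc_fields(
    "*EXC* - Storage_check_failed_on_freemain_request - "
    "FUNCTION(FREEMAIN) ADDRESS(41459BA8) CALLER(EXEC) EXEC_KEY(USER)"
)
print(fields)
```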
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Wed Dec 16, 2009 8:15 pm

Maybe the following link could help -

www-01.ibm.com/support/docview.wss?rs=1083&uid=swg27007891

Bill
All times are GMT + 6 Hours