IBM Mainframe Forum Index
 
 

How to stop a runaway CICS trans with DB2


IBM Mainframe Forums -> CICS
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Fri Aug 02, 2019 8:38 pm

This probably doesn't have much to do with the problem, but just in case: I have a CICS transaction that is invoked as a web service provider using CICS channels and containers. It looped on a cursor, ran for half an hour, and wrote 47M lines to CEEMSG before it could be stopped.

The fix is obvious (I added it as a comment). But this was in test CICS, where mistakes happen. I am concerned that this type of problem could easily occur again. Here's the code. It's basically doing the fetch and writing a message to CEEMSG over and over...

Code:
EXEC SQL OPEN GETDOCPD END-EXEC.
PERFORM P100-FETCH-DOCPD THRU P100-EXIT
  UNTIL WS-EOF = 'Y' OR XX > 99.       
EXEC SQL CLOSE GETDOCPD END-EXEC. 

**************************************
 P100-FETCH-DOCPD.                   
     EXEC SQL                         
        FETCH  GETDOCPD               
         INTO :DOC-TYPE,             
              :BATCH,                 
              :DOC-ID,               
              :DATE-TIME,             
              :MESSAGE-ID,           
              :MESSAGE-TEXT,         
              :SUSFDATE :SUSFNULL     
     END-EXEC.                       
                                     
     IF SQLCODE = 100
**       MOVE 'Y' TO WS-EOF  *** missing statement ***
         DISPLAY 'None related  '
         MOVE SPACES TO SciArray(1)
         PERFORM P610-PUT-CHILD
         GO TO P100-EXIT
     END-IF.
.
.
.
P100-EXIT. EXIT.

 *******************************************************
  P610-PUT-CHILD.                                       
      COMPUTE OutMem = (Length of SciArray * XX).       
      MOVE XX TO SciArray-num.                           
      EXEC CICS                                         
          PUT CONTAINER('SCISTAT3001')                   
         FROM (SciArray-Hdr)                             
         FLENGTH(OutMem)                                 
      END-EXEC.                                         


John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Fri Aug 02, 2019 9:06 pm

I included the P610-PUT-CHILD routine because it contains an "EXEC CICS" command, which I think will impact loop detection.
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Fri Aug 02, 2019 9:31 pm

1. Why isn't XX incremented on every fetch?
2. Why is there a DISPLAY in an online module?
3. Check for SQLCODE = 0 before treating a fetch as successful.
4. You can fetch into an array, close the cursor, and then loop over the array to PUT into the container (the P610 paragraph).
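
The suggestion to test SQLCODE explicitly could look something like the sketch below. It assumes the cursor, host variables, and WS-EOF switch from the first post; the processing on a found row is not shown in the thread, so it is only marked with a comment, and the DISPLAY is left out. Every non-zero SQLCODE ends the loop, which is an assumption about how errors should be handled here, not something the original program does.

```cobol
       P100-FETCH-DOCPD.
           EXEC SQL
               FETCH GETDOCPD
                INTO :DOC-TYPE, :BATCH, :DOC-ID, :DATE-TIME,
                     :MESSAGE-ID, :MESSAGE-TEXT,
                     :SUSFDATE :SUSFNULL
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
      *            Row found: the existing processing (which
      *            increments XX) would go here.
                   CONTINUE
               WHEN 100
      *            End of cursor: set the switch so the PERFORM
      *            ... UNTIL WS-EOF = 'Y' actually ends.
                   MOVE 'Y' TO WS-EOF
                   MOVE SPACES TO SciArray(1)
                   PERFORM P610-PUT-CHILD
               WHEN OTHER
      *            Assumed handling: any other SQLCODE also ends
      *            the loop instead of fetching again.
                   MOVE 'Y' TO WS-EOF
           END-EVALUATE.
       P100-EXIT. EXIT.
```

With this shape there is no path through the paragraph that leaves WS-EOF unset after a failed fetch, so the GO TO is no longer needed.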
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Fri Aug 02, 2019 10:05 pm

Rohit Umarjikar wrote:
1. Why isn't XX incremented on every fetch?
2. Why is there a DISPLAY in an online module?
3. Check for SQLCODE = 0 before treating a fetch as successful.
4. You can fetch into an array, close the cursor, and then loop over the array to PUT into the container (the P610 paragraph).


1. It only gets incremented on a found condition.
2. It is in test CICS to display unusual conditions… not in Production.
3. Agreed that using SQLCODE to limit the perform loop is a better idea. The WS-EOF switch was due to other DB2 functions being invoked that might have reset SQLCODE.
4. Found data undergoes a great deal of modification and other DB2 calls before being added to the array.
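
On point 3, one way to keep the loop driven by the fetch result even when other DB2 calls overwrite SQLCODE is to save it immediately after the FETCH. A minimal sketch, assuming the cursor and host variables from the first post; WS-FETCH-SQLCODE is a hypothetical field that is not in the original program, and the two procedure fragments would sit in different places in it:

```cobol
      * WORKING-STORAGE: hypothetical field, not in the original.
       01  WS-FETCH-SQLCODE        PIC S9(9) COMP VALUE 0.

      * PROCEDURE DIVISION, mainline fragment.
           EXEC SQL OPEN GETDOCPD END-EXEC.
           MOVE 0 TO WS-FETCH-SQLCODE.
           PERFORM P100-FETCH-DOCPD THRU P100-EXIT
               UNTIL WS-FETCH-SQLCODE NOT = 0 OR XX > 99.
           EXEC SQL CLOSE GETDOCPD END-EXEC.

      * PROCEDURE DIVISION, fetch paragraph.
       P100-FETCH-DOCPD.
           EXEC SQL
               FETCH GETDOCPD
                INTO :DOC-TYPE, :BATCH, :DOC-ID, :DATE-TIME,
                     :MESSAGE-ID, :MESSAGE-TEXT,
                     :SUSFDATE :SUSFNULL
           END-EXEC.
      * Save the FETCH result before any other SQL can change it.
           MOVE SQLCODE TO WS-FETCH-SQLCODE.
           IF WS-FETCH-SQLCODE NOT = 0
               GO TO P100-EXIT
           END-IF.
      * Row found: existing processing and other DB2 calls go here.
       P100-EXIT. EXIT.
```

Because the loop tests the saved copy, later SQL statements in the found-row logic can set SQLCODE to anything without hiding the end-of-cursor condition.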

I have shown only the logic that relates to the problem, not the logic that runs on a found condition. A found condition could not have occurred: in this test, the programmer simply entered a value that wasn't found.

My point is that this was a simple mistake anyone could make when testing a program. My objective is to stop the runaway loop from occurring in CICS. I'm not really concerned about the program logic; I am concerned that this can happen.
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Fri Aug 02, 2019 10:51 pm

I don't think ICVR will catch the loop because the EXEC CICS command in P610 will reset the timer. I don't know of any other values I could set. Maybe something in DB2?
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Fri Aug 02, 2019 11:53 pm

John,
1. What is the ICVR value in the test region's SIT? I hope it's not 0.
2. You can detect which CICS region the task is running in and, if it's the test region, cap the fetch loop at 100 iterations (or whatever number is allowed). That way you can force the loop to quit (IN TEST ONLY).
3. As far as I know, only a CICS SUSPEND resets the ICVR timer, not every EXEC CICS command.
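
A rough sketch of point 2, assuming the loop from the first post. EXEC CICS ASSIGN APPLID returns the region's 8-character APPLID; the APPLID value 'CICSTEST', the limit of 100, and the fields WS-APPLID, WS-MAX-FETCHES, and WS-FETCH-COUNT are all hypothetical and would need to match the actual installation and program.

```cobol
      * WORKING-STORAGE: all three fields are hypothetical.
       01  WS-APPLID               PIC X(8).
       01  WS-MAX-FETCHES          PIC S9(9) COMP.
       01  WS-FETCH-COUNT          PIC S9(9) COMP VALUE 0.

      * PROCEDURE DIVISION, mainline fragment.
           EXEC CICS ASSIGN APPLID(WS-APPLID) END-EXEC.
      * 'CICSTEST' is an assumed APPLID for the test region.
           IF WS-APPLID = 'CICSTEST'
               MOVE 100 TO WS-MAX-FETCHES
           ELSE
      *        Effectively no cap outside the test region.
               MOVE 999999999 TO WS-MAX-FETCHES
           END-IF.
           MOVE 0 TO WS-FETCH-COUNT.

           EXEC SQL OPEN GETDOCPD END-EXEC.
           PERFORM P100-FETCH-DOCPD THRU P100-EXIT
               UNTIL WS-EOF = 'Y'
                  OR XX > 99
                  OR WS-FETCH-COUNT >= WS-MAX-FETCHES.
           EXEC SQL CLOSE GETDOCPD END-EXEC.

      * In P100-FETCH-DOCPD, immediately after the FETCH, count
      * every attempt whether or not a row was found:
      *     ADD 1 TO WS-FETCH-COUNT.
```

Unlike XX, which only moves on a found row, the counter advances on every fetch attempt, so the cap still ends the loop when nothing is ever found.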
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Tue Aug 06, 2019 7:03 pm

ICVR is set to 3000. Display of guilty transaction shows:

Code:
 RUnaway        : System             System | 0 | 500-2700000


I think that means take system default of 3000.

So, an EXEC CICS command will not reset the ICVR timer?

Should I have just cancelled the task in CICS when I became aware of the situation? I contacted Operations instead. It took them half an hour to kill the CICS test region, because the request had to go through the problem reporting procedure.
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Tue Aug 06, 2019 7:39 pm

Quote:
I think that means take system default of 3000.
If you specify ICVR=0, then RUNAWAY=SYSTEM makes the transaction use 0, which means no runaway detection. You could also ask your DB2 DBA for suggestions on the Db2 side, such as setting up ASULIMIT.
As far as I know, WAIT and SUSPEND do reset it. Wouldn't point #2 help prevent this from happening regardless of the ICVR setting?
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Tue Aug 06, 2019 10:02 pm

Yes, Point #2 would have helped. But, that would take a program change that requires testing to ensure it would work.
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Tue Aug 06, 2019 11:47 pm

ibmmainframes.com/about66815.html
Quote:
Yes, Point #2 would have helped. But, that would take a program change that requires testing to ensure it would work.
But you already fixed the program by adding WS-EOF, right? My point is that XX > 99 is already a loop condition, so why doesn't it become true in the existing program? If SQLCODE = 0, XX should be incremented; if that's not happening, isn't that a bug to fix rather than worrying about ICVR?
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Wed Aug 07, 2019 1:18 am

No code was changed. The code is flawed, but the problem could not have occurred in a production environment because the missing key could not have been entered there. The bad key was entered manually by a developer running a test to document the process for conversion.

Our mainframe system is going away. It is being replaced by customized enterprise software packages at the end of this year. We have only a caretaker staff remaining and program changes are not allowed unless they are absolute requirements. A bug that can only occur in test doesn't qualify. So, I do what I can to prevent problems from recurring. I document the cause and what could be done. But, sometimes that's all I can do.
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Wed Aug 07, 2019 1:29 am

The flaw is that XX isn't working, so the PERFORM of the fetch goes into an infinite loop? Is that the right understanding?
If yes, then there are three options:
1. Change ASULIMIT to a lower value; your DBA can help you.
2. If point #1 doesn't help, remove the index(es) from the table so that SQLCODE = -904 fails the task for the bad or missing key.
3. If neither of the above works, keep only minimal test data in the tables, say no more than 100 rows.
John Poulakos

Active User


Joined: 13 Jun 2012
Posts: 178
Location: United States

PostPosted: Wed Aug 07, 2019 2:11 am

Thank you Rohit,

You have been very helpful, as usual.
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Wed Aug 07, 2019 10:29 am

Welcome John.
All times are GMT + 6 Hours

 

