John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
This probably doesn't have much to do with the problem, but just in case: I have a CICS transaction that is invoked as a web service provider using CICS channels and containers. It looped on a cursor, ran for half an hour, and wrote 47M lines to CEEMSG before it could be stopped.
The fix is obvious (I added it as a comment). But this was in test CICS, where mistakes happen, and I am concerned that this type of problem could easily occur again. Here's the code. It basically does the fetch and writes a message to CEEMSG over and over...
Code: |
    EXEC SQL OPEN GETDOCPD END-EXEC.
    PERFORM P100-FETCH-DOCPD THRU P100-EXIT
        UNTIL WS-EOF = 'Y' OR XX > 99.
    EXEC SQL CLOSE GETDOCPD END-EXEC.
**************************************
P100-FETCH-DOCPD.
    EXEC SQL
        FETCH GETDOCPD
        INTO :DOC-TYPE,
             :BATCH,
             :DOC-ID,
             :DATE-TIME,
             :MESSAGE-ID,
             :MESSAGE-TEXT,
             :SUSFDATE :SUSFNULL
    END-EXEC.
    IF SQLCODE = 100
** MOVE 'Y' TO WS-EOF *** missing statement ***
        Display 'None related '
        MOVE SPACES TO SciArray(1)
        PERFORM P610-PUT-CHILD
        GO TO P100-EXIT
    END-IF.
    .
    .
    .
P100-EXIT. EXIT.
*******************************************************
P610-PUT-CHILD.
    COMPUTE OutMem = (Length of SciArray * XX).
    MOVE XX TO SciArray-num.
    EXEC CICS
        PUT CONTAINER('SCISTAT3001')
        FROM (SciArray-Hdr)
        FLENGTH(OutMem)
    END-EXEC.
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
I included the P610-PUT-CHILD routine because it contains an "EXEC CICS" command, which I think will impact loop detection. |
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
1. Why don't I see XX being incremented for every fetch?
2. Why do you have a DISPLAY in an online module?
3. Check for SQLCODE = 0 before treating a fetch as successful.
4. You can fetch into an array, close the cursor, and then loop through the array to PUT into the container (the P610 paragraph).
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
Rohit Umarjikar wrote: |
1. Why don't I see XX being incremented for every fetch?
2. Why do you have a DISPLAY in an online module?
3. Check for SQLCODE = 0 before treating a fetch as successful.
4. You can fetch into an array, close the cursor, and then loop through the array to PUT into the container (the P610 paragraph). |
1. It only gets incremented on a found condition.
2. It is in test CICS to display unusual conditions; it is not in Production.
3. Agreed that using SQLCODE to limit the PERFORM loop is a better idea. The WS-EOF switch was used because other DB2 functions are invoked that might have reset SQLCODE.
4. Found data undergoes a great deal of modification and other DB2 calls before being added to the array.
I have only shown the logic that relates to the problem, not the logic that runs on a found condition. A found condition could not have occurred: in this test, the programmer simply entered a value that wasn't found.
My point is that this was a simple mistake anyone could make when testing a program. My objective is to stop the runaway loop from occurring in CICS. I'm not really concerned about the program logic. I am concerned that this can happen.
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
I don't think ICVR will catch the loop because the EXEC CICS command in P610 will reset the timer. I don't know of any other values I could set. Maybe something in DB2? |
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
John,
1. What is the ICVR value in the test region's SIT? I hope it's not 0.
2. You can detect which CICS region the task is running in and, if it is a test region, limit the fetch loop to 100 iterations or whatever number is allowed. That way you can force the loop to quit (IN TEST ONLY).
3. As far as I know, only a CICS SUSPEND resets the ICVR timer, not every EXEC CICS command.
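Point 2 can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the program from this thread: the P900-SIMULATE-FETCH paragraph is only a stand-in for the real EXEC SQL FETCH, and WS-SQLCODE, WS-FETCH-COUNT and WS-MAX-FETCHES are hypothetical names that do not appear in the original code. The idea is to count every fetch attempt, found or not, so the PERFORM has an exit that does not depend on XX or on WS-EOF being set correctly.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPGRD.
      * Sketch only: P900-SIMULATE-FETCH stands in for the real
      * EXEC SQL FETCH so the loop guard can be tried on its own.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-EOF              PIC X      VALUE 'N'.
      * Stand-in for SQLCODE from the SQLCA (assumption).
       01  WS-SQLCODE          PIC S9(9)  COMP VALUE 0.
       01  XX                  PIC 9(3)   VALUE 0.
      * Hypothetical names: count every attempt, found or not.
       01  WS-FETCH-COUNT      PIC 9(5)   VALUE 0.
       01  WS-MAX-FETCHES      PIC 9(5)   VALUE 200.
       PROCEDURE DIVISION.
       P000-MAIN.
           PERFORM P100-FETCH-DOCPD THRU P100-EXIT
               UNTIL WS-EOF = 'Y'
                  OR XX > 99
                  OR WS-FETCH-COUNT >= WS-MAX-FETCHES.
           DISPLAY 'FETCH ATTEMPTS: ' WS-FETCH-COUNT.
           GOBACK.
       P100-FETCH-DOCPD.
           ADD 1 TO WS-FETCH-COUNT.
           PERFORM P900-SIMULATE-FETCH.
      *    Test the code right after the fetch, before any other
      *    DB2 call can overwrite it. Not-found (+100) and any
      *    error both end the loop.
           IF WS-SQLCODE NOT = 0
               MOVE 'Y' TO WS-EOF
               GO TO P100-EXIT
           END-IF.
      *    Found-row processing would go here.
           ADD 1 TO XX.
       P100-EXIT.
           EXIT.
       P900-SIMULATE-FETCH.
      *    Simulates the test case from this thread: key not found.
           MOVE 100 TO WS-SQLCODE.
```

With the simulated not-found condition the loop ends after a single attempt. In the real program the cap could be applied only when the task runs in a test region, as suggested in point 2. How the region is identified is left out of this sketch.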
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
ICVR is set to 3000. A display of the offending transaction shows:
Code: |
RUnaway : System System | 0 | 500-2700000 |
I think that means it takes the system default of 3000.
So, an EXEC CICS command will not reset the ICVR timer?
Should I have just cancelled the task in CICS when I became aware of the situation? I contacted Operations instead. It took them half an hour to kill the CICS test region. It has to go through the problem reporting procedure.
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
Quote: |
I think that means take system default of 3000.
|
If you specify ICVR=0, then RUNAWAY=SYSTEM makes the transaction use 0, which means no runaway detection. You may also ask your DB2 DBA for suggestions on the DB2 side, such as setting ASULIMIT.
As far as I know, WAIT and SUSPEND do reset it. Wouldn't point #2 help prevent this from happening regardless of the ICVR setting?
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
Yes, Point #2 would have helped. But, that would take a program change that requires testing to ensure it would work. |
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
ibmmainframes.com/about66815.html
Quote: |
Yes, Point #2 would have helped. But, that would take a program change that requires testing to ensure it would work. |
But you already fixed the program by adding WS-EOF, right? My point is that XX > 99 is already a condition, so why doesn't it become true in the existing program? If SQLCODE = 0, XX should be incremented, and if that is not happening, isn't that a bug to fix rather than worrying about ICVR?
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
No code was changed. The code is flawed, but the problem could not have occurred in a production environment because the missing key could not have been entered. The bad key was entered manually by a developer doing a test to document the process for conversion.
Our mainframe system is going away. It is being replaced by customized enterprise software packages at the end of this year. We have only a caretaker staff remaining and program changes are not allowed unless they are absolute requirements. A bug that can only occur in test doesn't qualify. So, I do what I can to prevent problems from recurring. I document the cause and what could be done. But, sometimes that's all I can do. |
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
The flaw is that XX isn't working, so the PERFORM of the fetch goes into an infinite loop? Is that the right understanding?
If yes, then there are a few options:
1. Change ASULIMIT to a lower value; your DBA can help you.
2. If point #1 doesn't help, remove the index(es) from the table so that SQLCODE = -904 fails the task for the bad or missing key.
3. If none of the above works, just keep minimal test data in the tables, such as no more than 100 rows.
John Poulakos
Active User
Joined: 13 Jun 2012 Posts: 178 Location: United States
Thank you Rohit,
You have been very helpful, as usual. |
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
Welcome John. |