How to tell from the spool that a job is in an infinite loop

ashik.mohan · New User Joined: 13 Dec 2006 Posts: 2 Location: Chennai

How can we tell from the spool that a job is in an infinite loop, and what is the difference between a job that is taking a long time and a job that has gone into an infinite loop?

mull · New User Joined: 21 Sep 2006 Posts: 15 Location: india

If the job is in an infinite loop, the EXCP-Cnt field will not increase.
That means no input or output operation is taking place.
If it is running but just taking a long time, the EXCP-Cnt will still increase slowly.

Phrzby Phil · Posted: Thu Apr 05, 2007 5:53 pm

Suppose I/O is inside the infinite loop?

expat · Posted: Thu Apr 05, 2007 6:34 pm

I was going to say that if I/O was included inside the loop then eventually an EOF condition would occur. But I won't say that now.

There was an example of that on this board not so long ago, where a REXX program was looping because it never got to EOF: the way it was coded, it kept closing a file and then reading the exact same records as it did last time.

So NO, CPU and EXCPs going up isn't a guarantee that there is not a loop.

If you suspect a loop, I'd add some displays to your code to help you trace what is happening inside the program.
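
To illustrate the kind of loop described above (a file that keeps being closed and re-read, so EOF is never reached), here is a minimal sketch in Python rather than REXX. The function names and sample data are made up for illustration, and the iteration cap exists only so the demo terminates; it is not a reconstruction of the original program.

```python
import io

def read_all_buggy(open_file, max_iterations=1000):
    """Reopen the source before every read, so the read position always resets."""
    records_seen = 0
    for _ in range(max_iterations):   # safety cap so this demo terminates
        f = open_file()               # BUG: a fresh handle on every pass
        line = f.readline()           # always returns the first record again
        f.close()
        if line == "":                # EOF is never seen on non-empty input
            return records_seen, True
        records_seen += 1
    return records_seen, False        # cap reached without ever hitting EOF

def read_all_fixed(open_file):
    """Open once and read until EOF."""
    f = open_file()
    count = 0
    try:
        for _ in f:
            count += 1
    finally:
        f.close()
    return count

# Hypothetical three-record input held in memory.
DATA = "rec1\nrec2\nrec3\n"

def make_source():
    return io.StringIO(DATA)

buggy_count, buggy_hit_eof = read_all_buggy(make_source)
fixed_count = read_all_fixed(make_source)
```

Note that the buggy version does I/O on every pass, so an I/O counter would keep climbing even though no progress is being made.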

satyabrat jena · New User Joined: 06 Apr 2007 Posts: 3 Location: BANGALORE

Hi,
Ashik, I am a new member in this group. I think I can help you.

After successful execution, MAXCC will be 0. If your job does not respond for a long time or goes into an infinite loop, you will be informed by your administrator. Then you will know that your job is in an infinite loop.

ybhavesh · New User Joined: 17 Feb 2007 Posts: 46 Location: mumbai

Hi

There can be many reasons for a job to get into a loop. One is that something is wrong in the condition parameter.
A second case is when the datasets are shared with other users.
Sometimes our resources are blocked by another user, so you think the job has gone into a loop. Always check that another user is not accessing our dataset.
You can find that out by issuing the command
D GRS,A

Nimesh.Srivastava · New User Joined: 30 Nov 2006 Posts: 78 Location: SINGAPORE

Hi,
We had similar problems in our shop. Somewhere there is a setting that specifies the maximum CPU time a job in a given job class is allowed to use, e.g. Class B - 30 mins. If the job exceeds that time limit, it abends.
And if you are sure that the job is going to take time then you can specify the

expat · Posted: Wed Apr 11, 2007 12:36 pm

Nimesh.Srivastava

Why do you think that sysprogs go to the trouble of implementing rules like this? To limit the CPU consumption by job class. It is because jobs that go into loops hold an initiator and stop others from working, and also consume valuable CPU resource while doing nothing, again impacting others.

I had an example of some idiot using the TIME=1440 parameter, submitted his job on Friday, and when he returned to work on the Tuesday, lo and behold his job was still looping.

The whole idea of this is that if S322 abends happen, the programmer should check his code out rather than just throw resource at it for no apparent reason.

dick scherrer · Posted: Wed Apr 11, 2007 7:35 pm

Hello,

As has been posted in these forums several times - do not use 1440. Also, as has been posted before, many of the better sites will automatically abort jobs submitted with 1440 unless they are things like CICS or database.

If your job actually needs lots of cpu time, talk with your scheduling and/or operations people for the proper job class to run the job in.

It may be that the job is not really a "big" job, but is just very poorly coded. If you have 50,000 records it should not take lots of cpu. If you have 500,000,000 records, it probably needs to be run in a class that supports more cpu use.

Arun Raj · Posted: Thu Jul 10, 2008 12:45 pm

Hello,

We have a similar issue, so I just thought of posting here.

We have a job which is running long (3 hours) in production, while the same job completed in 30 mins in the test region with the same inputs.

We noticed the EXCP-CNT is not increasing in production. What could be the reason for this?

Thanks,
Arun

murmohk1 · Posted: Thu Jul 10, 2008 3:18 pm

Arun,

expat · Posted: Thu Jul 10, 2008 3:48 pm

Is the job swapped out for periods of time?
Is the CPU usage increasing without EXCP increase?
Is the input really the same data?
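
The questions above can be turned into a rough rule of thumb. The sketch below is only an illustration: the function name, the sample values and the `min_cpu_delta` threshold are assumptions, and the samples are imagined as (CPU seconds, EXCP count) pairs noted down by hand from the job display at intervals. As said earlier in the thread, rising CPU and EXCPs do not rule out a loop, so the third answer is deliberately non-committal.

```python
def classify(samples, min_cpu_delta=1.0):
    """Classify a job from (cpu_seconds, excp_count) samples, oldest first.

    This is a heuristic only; it cannot prove or disprove a loop.
    """
    if len(samples) < 2:
        return "not enough samples"
    cpu_delta = samples[-1][0] - samples[0][0]
    excp_delta = samples[-1][1] - samples[0][1]
    if cpu_delta < min_cpu_delta and excp_delta == 0:
        # Neither counter moves: the job is not getting to run at all.
        return "not running: possibly swapped out or waiting on a resource"
    if cpu_delta >= min_cpu_delta and excp_delta == 0:
        # CPU is burned but no I/O is issued.
        return "CPU rising with no I/O: possible CPU-bound loop"
    # I/O is moving; this can still be a loop that does I/O.
    return "CPU and/or I/O progressing: long-running, or a loop that does I/O"

# Hypothetical samples, for illustration only.
waiting = classify([(10.0, 500), (10.1, 500)])
cpu_loop = classify([(10.0, 500), (70.0, 500)])
progressing = classify([(10.0, 500), (20.0, 900)])
```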

Arun Raj · Posted: Thu Jul 10, 2008 4:34 pm

Murali,

dick scherrer · Posted: Thu Jul 10, 2008 8:44 pm

Hello Arun,

Well, something is different. . .

Either some different code is running or some data is different.

What you describe says the process is most likely in a loop.

How many dd statements are in this batch process? No symbolic parameters to determine a prod from a test run?

Are the contents of any of the files changing or is 100% of the data static?

Arun Raj · Posted: Thu Jul 10, 2008 9:37 pm

Hello Dick,

Something is really different. That's what we are unable to find out.

We ran the exact production code in test region. If it was a looping mistake, the same should have happened in test region also.

OK, let me explain more about this. There are no files involved in this.
Input comes from a driver cursor; based on that, the job performs delete/insert/update on some other tables.

We tested after copying all production tables to test region. Still same results. Table indexes are also same in both regions.

Thanks,
Arun

Pedro · Posted: Thu Jul 10, 2008 10:05 pm

dick scherrer · Posted: Fri Jul 11, 2008 12:42 am

Hello,

If the data in all of the database table(s) used is not the exact same as production, different results would almost surely occur. . .

If the code re-processed the same row or small set of rows over and over, it could appear like the program was in a loop.
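
As a small illustration of how re-processing the same rows can look like a loop, here is a sketch using Python's built-in sqlite3 as a stand-in for the real database. The table name, column names, status values and the `max_passes` cap are all hypothetical and are not taken from the job discussed here. The driver query selects rows still marked 'NEW'; if the processing step never changes that status, every pass picks up the same rows again.

```python
import sqlite3

def run_driver(mark_done, max_passes=50):
    """Process rows selected by a driver query until none qualify.

    Returns (passes, rows_processed). max_passes is a safety cap for the demo.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE work (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO work (id, status) VALUES (?, 'NEW')",
        [(1,), (2,), (3,)],
    )
    passes = 0
    processed = 0
    while passes < max_passes:
        rows = conn.execute(
            "SELECT id FROM work WHERE status = 'NEW' ORDER BY id"
        ).fetchall()
        if not rows:
            break                     # nothing left to drive the process
        passes += 1
        for (row_id,) in rows:
            processed += 1
            if mark_done:
                # Without this update the row qualifies again on the next pass.
                conn.execute(
                    "UPDATE work SET status = 'DONE' WHERE id = ?", (row_id,)
                )
    conn.close()
    return passes, processed

good_passes, good_processed = run_driver(mark_done=True)
stuck_passes, stuck_processed = run_driver(mark_done=False)
```

In the stuck case the program is busy the whole time, but the set of qualifying rows never shrinks, so it only stops because of the cap.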

Arun Raj · Posted: Fri Jul 11, 2008 11:34 am

Hello dick,

All the tables in test and prod were in sync (exactly the same rows) prior to testing.

Thanks,
Arun

dick scherrer · Posted: Fri Jul 11, 2008 8:09 pm

Hi Arun,

When your system promotes code from test to production, is the code re-compiled, re-linked, re-bound, etc?