DB2 Unload batch job monopolizes CPU

Michael Bieganski · New User Joined: 31 Jul 2007 Posts: 6 Location: Chicago

Hi! A while back a DBA ran a batch job to unload some partitions from a database. We know categorically that the job's WLM service class was the lowest we have defined (Importance = 5, Velocity = 1), and yet the entire time the job was running, it was at the very top of CPU%, at the expense of production jobs and tasks that had higher service class importance and velocities.
The job was not looping. The WLM reports listing "promotions" for that time frame and service class do not explain how this bottom-feeder job goes to the top of the list in CPU% on a busy CPU where just about everything else is running at a higher service class than this unload job.
(We've tried this 3 times, and all 3 times this user job went to the top of the CPU chain. SDSF shows the job remaining in the same low BATD service class.)

Is there something magical about a simple DB2 load/unload batch job that can cause it to override its low service class and become king of a busy LPAR?
(Again, we don't believe enq-lock promotion tells the whole story.)
Does the batch job somehow take on the higher dispatching juice of the DB2 started task?
Thanks... bewildered

Bill Woodger · Posted: Thu Aug 04, 2011 4:52 am

I'm sure a couple of guys will be around tomorrow to give you a better answer.

If this sort of effect is "routine", then I'm sure it would be covered in the manual. Worth checking as some exceptional circumstances affecting performance might be covered.

Has your DBA unloaded other partitions with/without the problem occurring - or, another way, is it only these partitions which cause the effect?

If only those, is it all of those? If logical, can they be unloaded individually, and if so, do they all do the same thing to the CPU usage?

Having identified the CPU hoggers, what is it that is different about them from other partitions? And what does the manual say about the differences?

I have seen the effect you describe (inheriting priority from elsewhere) on DOS/VSE. A program doing not much but reading a file and outputting print, which was set up to route to VM where the printers were attached. Whatever priority was assigned to it, it ended up with 50% of the CPU, with the other 50% used by POWER (sort of JES on VSE), and all we could do was wait for it to finish. The printers were attached to a "VIRTUAL MACHINE" under VM for ease of swapping to a physically different CPU. Stuck with it.

Similar running a syntax checker under ICCF. Hogs the whole CPU (all other terminals lock on next transmit, batch jobs stop processing, tapes stop reeling, disks keep spinning, but only cos they never stop, not because they are being used).

Once we knew what was going on, we could deal with it. First job, don't schedule alongside other work. ICCF syntax checking, don't do it. Anybody who does it (everybody knows, cos lights dim) loses fingers, one at a time :-) Or buys all the drinks on Friday...

Robert Sample · Posted: Thu Aug 04, 2011 5:05 am

What was the CPU utilization while the unload job was running?

And importance=5, velocity=1 is not the very lowest priority in WLM -- that is reserved for DISCRETIONARY work.

Do you have MXG or MICS to analyze the SMF data? MXG's ASUMCEC and RMFINTRV data sets can provide some insight -- but only if you have the product.

sushanth bobby · Posted: Thu Aug 04, 2011 4:59 pm

Hi,

In our shop, the system administrator told us that DB2 is given 65%-70% priority in WLM. In our case, we need to be aware of queries coming from open systems. Being a one-LPAR shop, something run unknowingly in development or UAT can cause lots of issues.

I don't know if this helps, but a talk with the system admin regarding WLM settings would definitely help.

Thanks,
Sushanth