IBM Mainframe Computers Forums Index
 
Batch job tuning

 
IBMMAINFRAMES.com Support Forums -> Testing & Performance analysis
sgandhla

New User


Joined: 23 Mar 2017
Posts: 3
Location: USA

Posted: Fri Mar 24, 2017 9:41 pm    Post subject: Batch job tuning

Hi Everyone,

This is my first post in the forum. I have spent a month analyzing an issue in my new role without any luck, and I hope to find some assistance here.

We have a batch job hosted at two different locations, both running on EC12 machines. Every day we recycle the job at 15:00 GMT, so the job is down from 11:00 to 15:00. When we restart the job, one site is able to process ~15 million txn per 15-minute interval, while the other site can process only ~3 million. By the end of the day both have processed the same number of txn (~200 million): the slow-running LPAR catches up after about 2 hours, when it is able to process ~10 million txn in 15 minutes.

So far I have looked at a few things. System load at that time: both are running at less than 50%. WLM policies: they are the same. The job: it is exactly the same (as per the application team). I changed the weights of the LPAR to get more vertical highs, without any improvement in performance. I pulled numbers from the SMFINTRV member, which show that both sides consume the same CPU time, but the LPAR that processes slowly has a higher I/O time than the other one. As one more attempt, we made a WLM change to the slow-running LPAR's service class to increase its I/O priority to high.

The one unsolved puzzle: while the job is processing the lower number of txn, it goes to DW status and does nothing for at least 10 minutes out of every 15 when I watch it in real time from SDSF.
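To put the figures above side by side, here is a minimal Python sketch of the arithmetic. It uses only the numbers quoted in this post. The assumption that the slow site is actually working for about 5 of every 15 minutes comes from the SDSF observation and is approximate.

```python
# Figures quoted in the post; everything else is derived arithmetic.
INTERVAL_SECONDS = 15 * 60


def rate_per_second(txn_count, seconds=INTERVAL_SECONDS):
    """Average transactions per second over a measurement interval."""
    return txn_count / seconds


fast_site = rate_per_second(15_000_000)
slow_site = rate_per_second(3_000_000)
slow_site_later = rate_per_second(10_000_000)

# Assumption: the slow site waits for about 10 of the 15 minutes,
# so its rate while actually running is based on roughly 5 minutes.
slow_site_active = rate_per_second(3_000_000, seconds=5 * 60)

print(f"fast site:              {fast_site:,.0f} txn/s")
print(f"slow site (at restart): {slow_site:,.0f} txn/s")
print(f"slow site (later):      {slow_site_later:,.0f} txn/s")
print(f"slow site while active: {slow_site_active:,.0f} txn/s")
print(f"fast/slow ratio:        {fast_site / slow_site:.1f}")
```

Under that assumption the slow site runs at about 10,000 txn/s while it is active, which is close to its later catch-up rate. That would suggest the shortfall is time spent waiting rather than slower processing, though the thread does not confirm it.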

Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8054
Location: East Dubuque, Illinois, USA

Posted: Thu Mar 30, 2017 12:41 am    Post subject: Reply to: Batch job tuning

If the LPARs are defined the same (same memory, same processor weight, and so forth), I'd look at the I/O situation. Look at the SMF type 70 and 72 records for each LPAR to see their I/O and channel stats.
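One way to make that comparison systematic is to put the same metrics from both LPARs side by side and flag the ones that differ by more than a chosen margin. The sketch below is only an illustration: the metric names and values are hypothetical placeholders, not real SMF or RMF field names, and the 25% threshold is an arbitrary assumption.

```python
def compare_metrics(lpar_a, lpar_b, threshold=0.25):
    """Return metrics whose relative difference exceeds the threshold."""
    flagged = {}
    for name in sorted(lpar_a.keys() & lpar_b.keys()):
        a, b = lpar_a[name], lpar_b[name]
        base = max(abs(a), abs(b))
        if base == 0:
            continue  # both values are zero, nothing to compare
        rel_diff = abs(a - b) / base
        if rel_diff > threshold:
            flagged[name] = (a, b, rel_diff)
    return flagged


# Hypothetical values for illustration only, not taken from the thread.
fast = {"io_response_ms": 0.4, "channel_busy_pct": 22.0, "cpu_busy_pct": 41.0}
slow = {"io_response_ms": 0.6, "channel_busy_pct": 23.0, "cpu_busy_pct": 43.0}

for name, (a, b, diff) in compare_metrics(fast, slow).items():
    print(f"{name}: fast={a} slow={b} relative difference={diff:.0%}")
```

With these made-up inputs only the I/O response time is flagged; real values would have to come from the site's own SMF data or reporting tool.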
sgandhla

New User


Joined: 23 Mar 2017
Posts: 3
Location: USA

Posted: Thu Mar 30, 2017 1:28 am    Post subject: Re: Reply to: Batch job tuning

Hi Robert,
Thanks for the response. The weights are different: we have fewer LPARs in the CEC where we get the faster response, but both CECs are running at less than 50%. I looked at the type 70 and 72 records, and the channel stats look fine. In the I/O stats, over a day the CPU time remains the same but the I/O time is 50% higher at the slow-processing site. We are using the same number of flash drives on both sides.
I did run a Strobe report to see if there are any issues on the application side. It shows IEAVEWAT (wait service) at 88.48% in a 10-minute sample. I am stuck again, since this module is unknown to me. Google didn't help much beyond giving some information about cross-memory references, linkage, or I/O interrupts.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8054
Location: East Dubuque, Illinois, USA

Posted: Thu Mar 30, 2017 2:33 am    Post subject:

Quote:
weights are different we have less number of LPARS in the CEC where we get faster response, but both the CEC'S are running less than 50%
If the CECs are the same machine / model, then different weights automatically imply performance will be different between the two LPARs. If the machine / model is different for the two CECs, then you'd have to look at the weighted LPAR for each machine to make any kind of valid comparison.

And the CEC running less than 50% means what? For example, if the LPAR is capped and the other LPARs are running very low utilizations while the LPAR in question is running 100% CPU utilization then the CEC utilization being under 50% would mean absolutely nothing since the 100% LPAR utilization is what would matter.
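The point about CEC versus LPAR utilization can be illustrated with a small calculation. All numbers here are hypothetical (a 20-CP CEC with three LPARs, one of them capped at 4 CPs); they are not from the thread, and the model ignores everything except busy CPs divided by available CPs.

```python
def utilization(busy_cps, available_cps):
    """Utilization as a percentage of the available processor capacity."""
    return 100.0 * busy_cps / available_cps


# Hypothetical configuration: one CEC with 20 physical CPs.
CEC_PHYSICAL_CPS = 20
lpars = {
    "LPARA": {"cap_cps": 4.0, "busy_cps": 4.0},  # capped and saturated
    "LPARB": {"cap_cps": 8.0, "busy_cps": 1.5},
    "LPARC": {"cap_cps": 8.0, "busy_cps": 2.0},
}

total_busy = sum(lpar["busy_cps"] for lpar in lpars.values())
cec_util = utilization(total_busy, CEC_PHYSICAL_CPS)
lpar_util = {
    name: utilization(lpar["busy_cps"], lpar["cap_cps"])
    for name, lpar in lpars.items()
}

print(f"CEC utilization: {cec_util:.1f}%")
for name, pct in lpar_util.items():
    print(f"{name} utilization: {pct:.1f}%")
```

In this made-up case the CEC is under 50% busy while LPARA has no capacity left, which is the situation described above.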

I think you're going down the wrong path looking at application performance with STROBE. The difference in I/O rates (a 5:1 ratio between the 2 LPARs) in the system configuration is significant -- application performance is not likely to be relevant with such a difference. Something is not the same between the LPARs -- WLM policy, channels, I/O paths, or whatever -- to have such an impact on performance. You may have to start with the IODF for each LPAR and look at everything to find the reason for the difference, but it seems extremely likely that there is something making a difference.
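Once the settings for each LPAR have been collected into some comparable form, finding what is "not the same" is a plain key-by-key diff. The sketch below assumes the settings have already been extracted into dictionaries by hand or by a site tool. The keys and values are hypothetical placeholders, not real IODF or WLM keywords.

```python
def config_diff(fast, slow):
    """Return {setting: (fast_value, slow_value)} for settings that differ."""
    diff = {}
    for key in sorted(fast.keys() | slow.keys()):
        fast_value, slow_value = fast.get(key), slow.get(key)
        if fast_value != slow_value:
            diff[key] = (fast_value, slow_value)
    return diff


# Hypothetical placeholder settings; these are not real IODF or WLM keywords.
fast_lpar = {"paths_to_dasd": 8, "io_priority_mgmt": "yes", "service_class": "BATCHHI"}
slow_lpar = {"paths_to_dasd": 4, "io_priority_mgmt": "yes", "service_class": "BATCHHI"}

for setting, (fast_value, slow_value) in config_diff(fast_lpar, slow_lpar).items():
    print(f"{setting}: fast={fast_value} slow={slow_value}")
```

A setting present on only one side shows up with `None` for the other, so missing definitions are reported as differences too.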
sgandhla

New User


Joined: 23 Mar 2017
Posts: 3
Location: USA

Posted: Thu Mar 30, 2017 3:19 am    Post subject:

Robert Sample wrote:
Quote:
weights are different we have less number of LPARS in the CEC where we get faster response, but both the CEC'S are running less than 50%
If the CEC are the same machine / model, then different weights automatically imply performance will be different between the two LPARs. If the machine / model are different for the two CEC's then you'd have to look at the weighted LPAR for each machine to make any kind of valid comparison.

And the CEC running less than 50% means what? For example, if the LPAR is capped and the other LPARs are running very low utilizations while the LPAR in question is running 100% CPU utilization then the CEC utilization being under 50% would mean absolutely nothing since the 100% LPAR utilization is what would matter.



None of the LPARs are capped in either CEC. When I say the CECs are running at 50%, I mean we have 50% headroom in terms of CPU for the LPARs to expand into if they have demand. I also adjusted the weights on both sides to get the same number of vertical highs and maintain the same polarization. With all of that, the only difference I see between the two sides is in I/O time. I am pretty new to storage performance tuning. Is there something you can suggest on where to start with storage performance stats, and what to look at to find the smoking gun? I will start looking from the bottom up on both systems.
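For reference, the weight-based share behind this adjustment can be sketched as below: each LPAR's entitlement is its weight divided by the total weight of the active LPARs, multiplied by the number of shared physical CPs. The LPAR names, weights, and CP counts are hypothetical, and the sketch does not model how that entitlement is then split into vertical high, medium, and low processors.

```python
def entitlement(weights, physical_cps):
    """Weight-based share of the shared physical CPs for each LPAR."""
    total = sum(weights.values())
    return {name: physical_cps * w / total for name, w in weights.items()}


# Hypothetical weights: the same PROD weight on a CEC with fewer LPARs
# and on a CEC with more LPARs, both with 10 shared physical CPs.
fast_cec = entitlement({"PROD": 600, "TEST": 400}, physical_cps=10)
slow_cec = entitlement(
    {"PROD": 600, "TEST": 400, "DEV1": 500, "DEV2": 500}, physical_cps=10
)

print(f"PROD entitlement, fewer LPARs: {fast_cec['PROD']:.1f} CPs")
print(f"PROD entitlement, more LPARs:  {slow_cec['PROD']:.1f} CPs")
```

The same weight is worth fewer CPs on the CEC that hosts more LPARs, which is why weights have to be compared relative to the total on each machine.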
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8054
Location: East Dubuque, Illinois, USA

Posted: Thu Mar 30, 2017 4:31 am    Post subject:

Quote:
we have 50% wide space in terms of CPU for the LPARS to expand if they have demand
I am not sure what you mean by this. CEC utilization can be important in a heavily used system, but almost all the time the LPAR utilization is VASTLY more important to batch job performance. Does your site have MXG or MICS or another SMF analysis tool? If so, look at this data rather than the raw SMF records since the raw data needs a lot of work to be usable. What is the LPAR utilization during this time period?

There has to be something different in the CEC / LPAR definitions to see such a radical difference in I/O performance. The hard part is figuring out what that difference is!
All times are GMT + 6 Hours

 
