DASD Response times


vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

PostPosted: Wed Jul 09, 2014 8:58 pm

Hi,

Quote:
DASD Response time = CONN + DISC + PEND + IOSQ

IOSQ time is a delay time accumulated while the I/O is still in MVS and is waiting
for a UCB to allow the I/O against the device. (A UCB is an MVS control block.)


Could you please explain in layman's terms what IOSQ means?
Does it mean that the UCB is unavailable to a particular request for the IOSQ period of time?

An example of IOSQ in a real-world scenario would also help my understanding.

We are seeing more than 30 ms response times with a moderate I/O activity rate. The IOSQ value is high for some volumes (10-20 ms).

Thanks & Regards,
Paul Voyner

New User


Joined: 26 Nov 2012
Posts: 52
Location: UK

PostPosted: Thu Jul 10, 2014 12:20 pm

Vasanthz, you could probably have found the answer yourself if you'd googled it.
Here's one of many clear explanations: "IOS Queue represents the average time that an I/O waits because the device is already in use by another task on this system, signified by the device's UCBBUSY bit being on".
In layman's terms, more than one address space* is trying to access the disk, so the requests have to queue until the disk - or more accurately the UCB for that disk - is free. Just like when only one checkout is open in the supermarket, everyone queues up waiting their turn.
(* @pedants, yes it could also be one address space and multiple TCBs, but the guy wants a simple explanation)
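
To put some made-up numbers against the formula you quoted (purely hypothetical figures, just to show where the time goes):
Code:
Response time = IOSQ + PEND + DISC + CONN
      30 ms   =  18  +  1   +  6   +  5

If most of a 30 ms response is IOSQ, the disk itself isn't slow - the requests are simply queueing inside z/OS waiting for the UCB.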

BTW those response times are very bad. You'll need to investigate with a monitoring tool, e.g. RMF or Omegamon.
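If you want the IOSQ/PEND/DISC/CONN split per volume out of your SMF data, the RMF postprocessor device activity report gives it to you. A rough sketch of the step, under your own JOB card (the SMF dataset name is invented - substitute your own dump dataset):
Code:
//RMFPP    EXEC PGM=ERBRMFPP
//MFPINPUT DD DISP=SHR,DSN=YOUR.SMF.DUMP.DATA
//MFPMSGDS DD SYSOUT=*
//SYSIN    DD *
  REPORTS(DEVICE(DASD))
  SUMMARY(INT)
/*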
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

PostPosted: Thu Jul 10, 2014 3:48 pm

Hi Paul,

Thank you very much for the clear explanation. I now have a mental picture of what it is.
Quote:
You'll need to investigate with a monitoring tool e.g. RMF, Omegamon
We are using them to investigate.

Thanks & Regards
Ed Goodman

Active Member


Joined: 08 Jun 2011
Posts: 556
Location: USA

PostPosted: Thu Jul 10, 2014 6:15 pm

Is...is that a velociraptor with a machine gun and a bomb...riding a great white shark???

A case study from the old STROBE manual had a situation where a travel agency had a spike in DASD wait time. They discovered that the spot on the disk was for DisneyLand reservation information. So everyone was banging against the same spot on the disk.

If you are adding cases/invoices/customer records at the end of a file area, then doing most of your work against those new entries, this can happen.

If you are using mammoth buffers trying to speed things up, but one program is locking down records for updates, this can happen.
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1788
Location: Bloomington, IL

PostPosted: Thu Jul 10, 2014 7:31 pm

Ed Goodman wrote:
Is...is that a velociraptor with a machine gun and a bomb...riding a great white shark???

Indeed it is.
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

PostPosted: Thu Jul 10, 2014 8:29 pm

Hello Ed,

Quote:
Is...is that a velociraptor with a machine gun and a bomb...riding a great white shark???


Yes, correct, except that's an Uzi SMG :-)

Quote:
If you are using mammoth buffers trying to speed things up, but one program is locking down records for updates, this can happen.

My understanding was that large buffers equate to lengthy transfers, and that this time is attributed to CONN time, not IOSQ time.

Or do large buffers affect both IOSQ and CONN?

I have a lot of reading to do :S

Thanks & Regards,
steve-myers

Active Member


Joined: 30 Nov 2013
Posts: 917
Location: The Universe

PostPosted: Thu Jul 10, 2014 10:25 pm

vasanthz wrote:
... Or do large buffers affect both IOSQ and CONN?

I have a lot of reading to do :S
Sometimes in performance analysis you run into (sorry about the language for a family-oriented web site, but I don't think too many will be offended) situations where you're damned if you do and damned if you don't.

Most performance analysts regard large buffers - typically 1/2 track for 3380 and 3390 devices, and full track for older devices - as a Good Thing for batch processing of basically sequential data, BUT (there's usually a BUT) if the device with the sequential data shares an I/O path with a device containing data used by an online process like CICS, the longer path-busy time required to transfer the large records for the sequential process can interfere with the online process.
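
To put numbers on "1/2 track": a 3390 track holds 56,664 bytes, so half-track blocking means blocks of up to 27,998 bytes; with RECFM=FB,LRECL=80 the system-determined block size works out to 27,920. A sketch of such an allocation (SMS-managed, dataset name invented):
Code:
//SEQOUT   DD DSN=HLQ.SOME.SEQ.FILE,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(100,10),RLSE),
//            RECFM=FB,LRECL=80,BLKSIZE=0
//* BLKSIZE=0 (or omitting BLKSIZE) lets the system pick the half-track size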
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

PostPosted: Mon Jul 14, 2014 6:22 pm

Have you also thought about using PAV?
Especially for the most utilised volumes.

Maybe SDS (Sequential Data Striping) could also help by spreading the naughty dataset over multiple volumes.
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

PostPosted: Tue Jul 15, 2014 6:23 pm

Quote:
Have you also thought about using PAV ?

I am not sure if PAV is enabled for these volumes; I will check. Thank you.

Quote:
sequential Data Striping

Is dataset striping still required? I remember reading that with the latest DASD boxes like the DS8000, and the RAID & rank architecture, DASD boxes by default stripe a dataset at the hardware level across multiple physical disks.
64 physical volumes for RAID 5 with 8 ranks.

Regards,
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

PostPosted: Tue Jul 15, 2014 6:51 pm

SDS is for logical volumes. I used this about 10 years ago for poor response times and it worked rather well.

If you stripe the dataset over, say, XX logical volumes, you can have up to XX simultaneous accesses to the dataset, which has been spread over that number of logical volumes with one access to each of them. SMS sort of knows where all of the data is, so it's pretty easy to implement and use.
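
If the SMS constructs are already in place, asking for a striped allocation from batch is trivial - a rough sketch (DCSTRIPE and SCSTRIPE are invented class names; your storage admin defines the real ones, and the data class has to give you an extended-format dataset for striping to happen):
Code:
//STRIPED  DD DSN=PROD.BIG.SEQ.FILE,DISP=(NEW,CATLG,DELETE),
//            DATACLAS=DCSTRIPE,STORCLAS=SCSTRIPE,
//            SPACE=(CYL,(500,50),RLSE),RECFM=FB,LRECL=80
//* DSN is invented too - the point is just the DATACLAS/STORCLAS request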

It's been quite a while since I've been heavily involved with DASD farming so things may well have changed.
Paul Voyner

New User


Joined: 26 Nov 2012
Posts: 52
Location: UK

PostPosted: Wed Jul 23, 2014 12:26 pm

Striping works very well when the need is to improve performance for specific datasets. I did a test a while ago with a dataset which sustained an I/O rate of 5,000 on a single disk, but 20,000 as a 6-way stripe.
But I'd guess that Vasanth's problem with high IOSQ is more likely to be caused by high contention for a volume by a large number of users, e.g. a TSO work volume. That won't be helped by striping.
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

PostPosted: Wed Jul 23, 2014 1:56 pm

Paul, I've not come across a situation before where SDS hasn't been beneficial.

Even if the volumes are heavily used, splitting the dataset over a number of volumes spreads and reduces the contention too.
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

PostPosted: Wed Jul 23, 2014 4:25 pm

Hi,

Thank you Paul & Expat for your views.

Please feel free to correct me if the statements below are wrong. I've been doing some reading, and below is my understanding.

In modern DASD boxes, whether a dataset resides on a single volume or on multiple volumes does not matter much, due to hardware striping with RAID and ranks.

Even a dataset on a single logical volume may actually reside on multiple physical volumes in the box. So the response time of a single logical volume is really the combined response time of a number of physical volumes.

The logical volume comes into the picture only when there is contention for the UCB. Since the dataset is spread over multiple physical volumes, or is available straight from cache, the hardware can service multiple concurrent reads of a dataset, but z/OS is unable to do concurrent reads/writes even though the hardware supports it. So the concept of PAV was implemented to make z/OS think it is writing to multiple UCBs, when it actually writes to only a single logical volume. HyperPAV gives the most performance by dynamically allocating multiple UCBs to a particular logical volume.

IOSQ delay occurs only when there is a shortage of UCBs.
If we enable HyperPAV then we could possibly mitigate the IOSQ delay.

Is that correct, or a bunch of nonsense? :S

In our shop we have HyperPAV enabled. Could you please let me know if it is possible to determine the utilization of UCBs, so we can tell whether there is a shortage of UCBs in HyperPAV's alias pool?

Thanks & Regards,
Paul Voyner

New User


Joined: 26 Nov 2012
Posts: 52
Location: UK

PostPosted: Wed Jul 23, 2014 4:41 pm

Vasanth - 10/10 for your summing up. You've nailed it. The only quibble I have is the wording "IOSQ delay occurs only when there is a shortage of UCBs", when I think it should be something like "IOSQ delay occurs when the UCB is busy because an I/O is already active to the device".
Of course, enabling PAV isn't something you can do overnight. Striping is easier and can be implemented in SMS routines with minimal risk. Or, easier still, you could simply move some of the most heavily used datasets to another volume, if you know which they are (and that's not easy to find out without a nice tool like Omegamon).
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

PostPosted: Wed Jul 23, 2014 4:44 pm

Hi Vasanthz,

Hardware striping and Sequential Data Striping are two completely different beasties. SDS stripes the dataset across a number of volumes, making a larger UCB range for accessing it. SMS keeps a map of what data is where; it isn't as straightforward as putting nn GB on one volume and then the next nn GB onto the next volume. It writes a chunk on the first volume, then the next volume, until it reaches the specified stripe count, and then starts again from volume 1 through to volume nn.

It really is worth investigating with your DASD farmers.

As for HyperPAV - it may be installed at your site but possibly not available to the volumes that you are having grief with. That's something else you will need to find out from your shop.
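
If you can issue display commands (from a console or SDSF), these will tell you whether HyperPAV is active and whether aliases are being given to your problem volume - dddd below stands for the device number of one of the bad volumes:
Code:
D IOS,HYPERPAV        (is HyperPAV mode active on this system?)
D M=DEV(dddd)         (device status for the base volume; the response
                       includes the HyperPAV alias information)

The RMF device activity report also carries a PAV count per volume, showing how many exposures (base plus aliases) the volume had during the interval.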

It might also be beneficial to take a look at the volume(s) that are causing the problems to see what else is on the volume and how heavily used it is. In the past I've done this analysis and moved a few datasets about and improved the situation greatly.

I know how much you love sifting through SMF and RMF data :lol:

Good luck
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

PostPosted: Wed Jul 23, 2014 5:37 pm

Quote:
you could simply move some of the most heavily used datasets to another volume,


Quote:
It might also be beneficial to take a look at the volume(s) that are causing the problems to see what else is on the volume and how heavily used it is. In the past I've done this analysis and moved a few datasets about and improved the situation greatly.


Thank you, I will do that study.

Quote:
I know how much you love sifting through SMF and RMF data
SMF - Definitely fun :-)
Pete Wilson

Active Member


Joined: 31 Dec 2009
Posts: 580
Location: London

PostPosted: Wed Jul 23, 2014 11:56 pm

You need to speak to your hardware support and/or vendor people to establish if there's a HyperPAV/UCB issue, if channels are busy or degraded, or if your PPRC/XRC mirrors are struggling to keep up due to link/network errors etc., all of which can contribute to response times. GRS setup can still have an effect if volume reserves are not all converted to global enqueues.
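
Two quick checks on the GRS side, if you want to rule that out (standard display commands):
Code:
D GRS,RNL=CONVERSION   (lists the resource names whose hardware RESERVEs
                        are being converted to global enqueues)
D GRS,C                (shows any ENQ/reserve contention at that moment)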

In the meantime, expat's suggestion of striping and/or moving contentious data around other volumes/pools probably wouldn't go amiss, but that would be your call based on your better knowledge of the data.