VSAM File free space

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

I need to identify the VSAM files that are close to full capacity in terms of storage space. This needs to run once a day.

I tested this by using LISTCAT with the LVL parameter to identify all the VSAM files, then running the EXAMINE command on the VSAM clusters to identify the free space available. Run in batch mode, this process takes close to 1 hour for about 500 VSAM files. Are there any alternatives that are less time-consuming?

Robert Sample · Posted: Fri Mar 30, 2018 1:27 am

EXAMINE is an integrity verification tool more than a free space finder. Why not just use LISTCAT and extract the free space and allocated values from the LISTCAT output and do a little programming to calculate the free space percentage?

Personally, I think the whole approach is fundamentally flawed. A VSAM data set may have plenty of free space today but due to a batch of record insertions be completely out of free space tomorrow. You have to look at how dynamic the data in the VSAM data set is to know which ones are in danger of running out. One data set may have 0 free space but it never has any additional records inserted since it is a static table and hence that would never require any free space.
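A minimal Python sketch of the LISTCAT extraction suggested in this post; it was not posted in the thread. It assumes the LISTCAT report has been captured as text, that fields appear in the usual `KEYWORD----value` layout, and that the text covers a single DATA component. `SAMPLE_LISTCAT`, `FIELD`, and `free_space_percent` are hypothetical names, and the sample excerpt is made up around the two values quoted later in the thread.

```python
import re

# Hypothetical excerpt in the usual LISTCAT "KEYWORD----value" layout.
# The two numbers are the ones quoted later in this thread; the exact layout
# of a real report is an assumption, so adjust the pattern if yours differs.
SAMPLE_LISTCAT = """\
     ALLOCATION
       SPACE-TYPE------CYLINDER     HI-A-RBA------1179648000
     STATISTICS
       FREESPC--------410521600
"""

# Match only the two fields needed for the percentage.
FIELD = re.compile(r"(FREESPC|HI-A-RBA)-+(\d+)")


def free_space_percent(listcat_text: str) -> float:
    """Return FREESPC as a percentage of HI-A-RBA for one DATA component."""
    fields = {name: int(value) for name, value in FIELD.findall(listcat_text)}
    return 100.0 * fields["FREESPC"] / fields["HI-A-RBA"]


print(f"{free_space_percent(SAMPLE_LISTCAT):.1f}% free")
```

If the captured text holds several components (DATA and INDEX, or many clusters), it would need to be split per component first, since later values overwrite earlier ones in this sketch.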

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

I am willing to use LISTCAT to calculate the free space. I have pasted below the attributes for the data portion of the VSAM file. Can you help me with a formula to calculate free space?

Robert Sample · Posted: Fri Mar 30, 2018 2:20 am

High allocated RBA is 1179648000 and free space is 410521600 so do the division and get about 34.8% free space. There is no formula involved -- at this point it is straight division.

Bonus comment: this data set (z/OS only has files on tape or in Unix) has a pretty poor definition. CI size of 4096 with 10% CI free space leaves about 3679 (plus or minus -- I didn't look up the exact formula) usable bytes per CI. With records of 760 bytes, you're storing 4 records per CI and the remaining space cannot be used. So basically over 1,000 bytes of each 4,096-byte CI is unused when the data set is loaded. Inserts can use the free space but I don't see inserts in the LISTCAT. The impact is that it took over 886 million bytes to store 578 million bytes of data (761,097 records times 760 bytes per record).

Many sites use 4096 bytes for the CI size without considering the impact -- for some data sets 4096 is a good value; for other data sets 4096 is a bad value. For this data set it is eating up extra space.
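The CI arithmetic above can be sketched roughly as follows. This is an illustration under assumptions, not the exact VSAM formula: it assumes 10 bytes of control information per CI (a 4-byte CIDF plus two 3-byte RDFs for fixed-length records) and that the CI free space percentage is applied to the full CI size and rounded down. `ci_packing` and its parameters are hypothetical names.

```python
def ci_packing(ci_size: int, record_len: int, ci_free_pct: int,
               control_bytes: int = 10) -> tuple[int, int]:
    """Return (records per CI at load time, bytes of the CI not holding data).

    control_bytes=10 is an assumption (4-byte CIDF + two 3-byte RDFs).
    """
    reserved = ci_size * ci_free_pct // 100      # bytes held back as CI free space
    usable = ci_size - control_bytes - reserved  # bytes available at load time
    records = usable // record_len               # whole records only
    unused = ci_size - records * record_len      # free space, control info, slack
    return records, unused


# Compare a few candidate CI sizes for the 760-byte records discussed above.
for size in (4096, 8192, 16384):
    records, unused = ci_packing(size, 760, 10)
    print(f"CISZ {size}: {records} records, {unused} bytes "
          f"({100 * unused / size:.1f}%) not holding data at load")
```

With these assumptions the 4096-byte case gives 4 records per CI and 1,056 bytes not holding data, in line with the "over 1,000 bytes" estimate.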

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

Thanks! When I run EXAMINE on the cluster for the same data set, I get the following in SYSPRINT:

IDC01722I 51 PERCENT FREE SPACE

What does this 51 percent free space denote? How is it different from the 34.8% we got from our calculation using LISTCAT above?

Robert Sample · Posted: Fri Mar 30, 2018 6:33 pm

Total space is 1,179,648,000 and bytes for data is 578,433,720. Subtraction gives 601,214,280 bytes. Divide that by 1,179,648,000 and you get 50.9% (or 51%). EXAMINE looks at every CI; using the LISTCAT output will be faster but less accurate. If you want to spend the time, use EXAMINE but otherwise LISTCAT output is going to be good enough in almost every case.

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

We decided to go ahead with calculating free space based on LISTCAT. EXAMINE seems to take much longer. We want to run it multiple times a day to capture any sudden growth in data sets.

How did you arrive at the bytes-of-data figure of 578,433,720?

Robert Sample · Posted: Wed Apr 04, 2018 7:41 pm

I misread the LISTCAT output to get my number, so my apologies. EXAMINE will definitely take longer than a LISTCAT -- EXAMINE is doing much more.

Why do you think you will have sudden growth in data sets? Unless data comes in from a third party, it is quite rare for there to be drastic jumps in the amount of data processed each day. The way to handle such a case is to have the VSAM data set big enough to hold a daily spurt, and then offload / delete / define / reload the data every night if it is required. Using LISTCAT (or EXAMINE) multiple times a day seems like a lot of wasted effort.

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

This will be a short-term activity for some new projects coming in, for which we are expecting more volume. It will be turned off after a few weeks.

Regarding the numbers, are you suggesting that it is not possible to calculate an accurate free space percentage based on LISTCAT results?

Robert Sample · Posted: Wed Apr 04, 2018 8:32 pm

LISTCAT gives you accurate free space numbers. HOWEVER, you need to understand that free space is not necessarily the whole picture. Your LISTCAT output shows HI-A-RBA as 1,179,648,000 bytes and free space as 410,521,600 bytes. Subtraction gives 769,126,400 bytes of used space. 751,097 records times 750 bytes is 563,322,750 bytes so there is a difference of 205,803,650 bytes between the amount of used space in this VSAM data set and the number of bytes a sequential data set would require. Some of the difference is in the index component (which you did not provide), but some of the difference will be in the data component. I've seen cases where the data set is 2 to 3 times larger than required because the index CI size was poorly chosen and prevents access to all of the data CI in the CA (note I'm not saying that your data set has this problem but it is a problem I've seen before).
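The subtraction in this post can be checked with a few lines of Python. The four inputs are the figures quoted in the post itself; the variable names are hypothetical.

```python
# Figures as quoted in the post above (test data set, DATA component).
HI_A_RBA = 1_179_648_000   # high allocated RBA, in bytes
FREESPC = 410_521_600      # free space reported by LISTCAT, in bytes
REC_TOTAL = 751_097        # record count used in the post
RECORD_LEN = 750           # record length used in the post

used_bytes = HI_A_RBA - FREESPC          # space LISTCAT counts as used
raw_bytes = REC_TOTAL * RECORD_LEN       # what a sequential copy would need
overhead_bytes = used_bytes - raw_bytes  # space not explained by record data

print(f"used:     {used_bytes:,}")
print(f"raw data: {raw_bytes:,}")
print(f"overhead: {overhead_bytes:,}")
```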

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

Thanks for the explanation. The data set I gave was from TEST. Here are the attributes from production. EXAMINE gives free space as 6%, as shown below. I am assuming EXAMINE gives a more accurate representation of free space. Is this correct?

Based on your response, even if I sum up the INDEX and DATA attributes, I still end up with 3.4% free space based on the FREESPC and HI-A-RBA numbers below. Is it even possible to account for the remaining 2.6% of free space based on the LISTCAT attributes?

EXAMINE :

Robert Sample · Posted: Wed Apr 04, 2018 9:59 pm

Yes, EXAMINE is looking at every CI so it will be the most accurate accounting of free space. It is not possible to reconcile the LISTCAT output with the EXAMINE output all the time -- this data set shows why. Even though the AVGLRECL and MAXLRECL are 4,000 (implying a fixed record length), when I divide the HI-U-RBA by the 8,472,700 record count the actual average record length is 1,296 (1295.65 rounded up) and EXAMINE states the longest record is 4,003 bytes. Hence some (maybe all) of the 535,992 CI have unused space in them -- EXAMINE will count this as free space even though LISTCAT will not.

sancraig16 · New User Joined: 27 Mar 2018 Posts: 26 Location: usa

Are you suggesting that the free space reported by EXAMINE also includes the unused space in each record? But this space is not usable. Can I say that LISTCAT gives accurate USABLE free space as compared to EXAMINE? Ideally I would need the free space percentage in terms of the number of records that I can still insert into the VSAM file. How can I arrive at this number? Will a calculation based on (FREESPC / HI-A-RBA) * 100 be a good representation of actual usable free space?

Robert Sample · Posted: Wed Apr 04, 2018 11:55 pm

Did you notice that your LISTCAT indicates free space of 389,283,840 while the difference between HI-A-RBA and HI-U-RBA is 388,730,880? That is a difference of 552,960 bytes more free space -- and I have no idea why the difference (maybe due to the striping?). EXAMINE looks at each CI, so it can determine free space per CI whether or not that free space is usable.

Broadly speaking, AS LONG AS CI AND CA FREE SPACE PERCENTAGES ARE ZERO, you can divide the free space by the average record length (actual, not the LISTCAT value) to know about how many records could be inserted at the end of the KSDS. However, if CI free space or CA free space is not zero, you would have to factor that into your calculation. And this assumes that you do not have a high-values key in the KSDS.

Furthermore, since VSAM handles CI / CA splits automatically, it is entirely possible for a single insert into your KSDS to cause a change in the free space. It is not possible to say your KSDS could accommodate so many more records, since the location of the inserted keys determines whether or not splits occur.
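As a rough illustration of the estimate described above, a sketch in Python using the production figures quoted in this thread (free space of 389,283,840 bytes and an actual average record length of 1,296 bytes). It is only meaningful under the stated conditions: CI and CA free space percentages of zero, no high-values key, and records added at the end of the KSDS. It ignores CI control information and any splits. `estimated_inserts` is a hypothetical name.

```python
def estimated_inserts(free_bytes: int, avg_record_len: int) -> int:
    """Rough upper bound on records that fit in the reported free space.

    Assumes zero CI/CA free space percentages and inserts at the end of
    the KSDS; it does not model CI/CA splits or per-CI control bytes.
    """
    if avg_record_len <= 0:
        raise ValueError("average record length must be positive")
    return free_bytes // avg_record_len


print(estimated_inserts(389_283_840, 1_296))  # about 300 thousand records
```

Treat the result as an order-of-magnitude figure rather than a count to rely on.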

If you have not read the IBM Redbook VSAM Demystified I strongly recommend you download and read it. It may not answer all of your questions but it would be a good place to start.

Pete Wilson · Posted: Sat Jun 09, 2018 10:05 pm

I think it would be advisable to check the DATACLAS assigned to each data set, as that can have a major bearing on how (or whether) the data sets can cope with additional data, and you might find there won't be an issue. The top part of the LISTCAT output shows the DATACLAS name, if applicable; in ISMF option 4 you can then display the DATACLAS to see what attributes are applied.

Depending on the DATACLAS settings they can allow further volumes to be dynamically added so the dataset can extend as required up to the DYNVOL COUNT setting in the DATACLAS. This can be a maximum of 59 volumes, but unless the DATACLAS also has the Extended Addressability attribute defined the maximum size a VSAM dataset can reach is 4GB, so that can be a limiting factor.
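To see how close a data set is to that 4GB ceiling, a small Python sketch can compare HI-A-RBA against the limit. It assumes "4GB" means 2**32 bytes and that the data set does not have Extended Addressability; `NON_EA_LIMIT` and `headroom_without_ea` are hypothetical names, and the example value is the HI-A-RBA of the test data set quoted earlier in the thread.

```python
NON_EA_LIMIT = 2**32  # assumed meaning of the 4GB limit, in bytes


def headroom_without_ea(hi_a_rba: int) -> int:
    """Bytes a non-EA VSAM data set could still grow before the 4GB limit."""
    return max(NON_EA_LIMIT - hi_a_rba, 0)


print(f"{headroom_without_ea(1_179_648_000):,} bytes of headroom")
```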

For datasets that do not have a DATACLAS you do have the option to do an IDCAMS ALTER ADDVOLUMES to assign more Candidate volumes for the dataset to extend to if required.