Joined: 30 Nov 2013 Posts: 917 Location: The Universe
Over in the CICS forum there is a post that mentions the use of the GETMAIN and FREEMAIN macros and then goes on to claim the STORAGE macro is faster. This is a lie.
The following program produced these messages -
STORAGE REQUIRED 10% MORE CPU TIME THAN GETMAIN/FREEMAIN
GETMTIME = 6.540824, STORTIME = 7.228047
Code:
GETMTEST CSECT
USING *,12
SAVE (14,12),,*
LR 12,15
LA 15,SAVEAREA
ST 13,4(,15)
ST 15,8(,13)
LR 13,15
TIMEUSED STORADR=GETMSTRT,CPU=MIC,LINKAGE=SYSTEM
L 2,=A(64*1024) Number of areas to obtain
LA 3,128 Length of each area
GETMLOOP GETMAIN RU,LV=(3) Address of new area returned in R1
MVC 0(4,1),GETMHDR Chain previous area into the new one
ST 1,GETMHDR New area becomes the list head
BCT 2,GETMLOOP
FREELOOP ICM 1,B'1111',GETMHDR Pick up the list head
BZ GETTIME List empty - all areas freed
MVC GETMHDR,0(1) Unchain the area being freed
FREEMAIN RU,LV=(3),A=(1)
B FREELOOP
GETTIME TIMEUSED STORADR=GETMEND,CPU=MIC,LINKAGE=SYSTEM
L 2,=A(64*1024)
OBTGET STORAGE OBTAIN,LENGTH=(3)
MVC 0(4,1),GETMHDR
ST 1,GETMHDR
BCT 2,OBTGET
OBTFREE ICM 1,B'1111',GETMHDR
BZ OBTTIME
MVC GETMHDR,0(1)
STORAGE RELEASE,LENGTH=(3),ADDR=(1)
B OBTFREE
OBTTIME TIMEUSED STORADR=OBTEND,CPU=MIC,LINKAGE=SYSTEM
LG 1,GETMEND
SG 1,GETMSTRT
STG 1,GETMTIME
LG 0,OBTEND
SG 0,STORSTRT
STG 0,STORTIME
SGR 1,0
SR 0,0
M 0,=F'100'
D 0,GETMTIME+4
ST 1,PERCENT
CVD 1,DWORK
ED PCT,DWORK+6
L 1,GETMTIME+4
CVD 1,DWORK
ED GETMD,DWORK+3
L 1,STORTIME+4
CVD 1,DWORK
ED STORD,DWORK+3
OPEN (PRINT,OUTPUT)
PUT PRINT,MSG
PUT PRINT,MSG2
CLOSE PRINT
L 13,4(,13)
RETURN (14,12),RC=0
SAVEAREA DC 9D'0'
GETMSTRT DC FD'0' Start of GETMAIN/FREEMAIN test
GETMEND DC 0FD'0' End of GETMAIN test; zero duplication, overlays STORSTRT
STORSTRT DC FD'0' Start of STORAGE test (same location as GETMEND)
OBTEND DC FD'0' End of STORAGE test
GETMTIME DC FD'0' GETMAIN/FREEMAIN CPU microseconds
STORTIME DC FD'0' STORAGE OBTAIN/RELEASE CPU microseconds
DWORK DC PL8'0'
PERCENT DC F'0'
GETMHDR DC A(*-*)
PRINT DCB DSORG=PS,MACRF=PM,DDNAME=SYSPRINT,RECFM=VBA,LRECL=126
DC 0D'0'
LTORG ,
MSG DC AL2(MSGL,0),C' STORAGE REQUIRED'
PCT DC 0C' NNN',C' ',X'202120',C'% MORE CPU TIME THAN GETMAIN/F>
REEMAIN'
MSGL EQU *-MSG
MSG2 DC AL2(MSG2L,0),C' GETMTIME ='
GETMD DC 0C' NNN.NNNNNN',C' ',X'202120',C'.',6X'20',C', '
DC C'STORTIME ='
STORD DC 0C' NNN.NNNNNN',C' ',X'202120',C'.',6X'20'
MSG2L EQU *-MSG2
END GETMTEST
The program was run under Hercules, not real hardware. It would be interesting to run it on a regular z/Architecture machine, though I doubt anyone will bother.
The decision about whether to use GETMAIN or STORAGE OBTAIN to obtain virtual storage and FREEMAIN or STORAGE RELEASE to release the storage depends on several conditions:
The address space control (ASC) mode of your program. If it is in AR mode, use the STORAGE macro.
The address space that contains the storage your program wants to obtain or release. If the storage is in an address space other than the primary, use the STORAGE macro.
Whether the program requires a branch entry or a stacking PC entry to the macro service. Using the branch entry on the GETMAIN or FREEMAIN macro is more difficult than using the STORAGE macro. Therefore, you might use STORAGE OBTAIN instead of GETMAIN for ease of coding, for example, when your program:
Is in SRB mode
Is in cross memory mode
Is running with an enabled, unlocked, task mode (EUT) FRR
The branch entry (BRANCH parameter on GETMAIN or FREEMAIN) requires that your program hold certain locks. STORAGE does not have any locking requirement.
If your program runs in an environment where it can issue the FREEMAIN macro (as specified by the conditions listed above), you can use FREEMAIN to free storage that was originally obtained using STORAGE OBTAIN. You can also use STORAGE RELEASE to release storage that was originally obtained using GETMAIN.
Well, of course your decision depends on many things.
Many times the decision comes down to whether some mechanism can be put in place to avoid the use of MVS storage management entirely. Most of my private service functions, for example, require the caller to supply a small (as in 100- or 200-byte) work area, or, by minimizing register usage, use parts of the caller-provided register save area as a work area. Here the decision about storage management is put off to some other agent!
Several years ago I had a program that allocated - like my test program - many thousands of small storage areas that were freed in a block at the end of the program. In testing, this process seemed to require a long time, so I instrumented it using TIMEUSED, as in my test program, and found it was using close to a minute of CPU time! At the time my solution was to dust off a private function that would suballocate little storage areas in larger - 4K - storage blocks, and free those 4K blocks in a group at the end. The one minute of CPU time went down to less than a second!
Years ago, when the BAKR instruction was published as a possible alternative to the IBM "standard" subroutine entry/exit convention, I tried it and found it was much worse! After some thought I realized it was saving 16 32-bit registers AND 16 32-bit access registers in ESA/390. In z/Architecture it saves 16 64-bit registers AND 16 32-bit access registers. Thanks IBM, I'll stick with conventional mechanisms!
In the fine print associated with the development of XPLINK back in the 1990s, you'll notice there was no mention of BAKR, no doubt for the reasons I had developed in my earlier tests. XPLINK did make the point that the "standard" linkage convention required too many registers to be saved and restored, something I now think about in my own work; this often frees up tiny amounts of register save area storage for use as a work area.
Just last week I researched the now hopelessly obsolete Burroughs B5500, a very innovative machine in the 1960s. The discussion started me thinking about register-oriented architectures like System/360 compared to the nearly universal stack architectures we see now. Back then my thought was that a stack machine used portions of the stack as pseudo-registers in storage, which seemed to me at the time slower than real registers. Now I realize stack architectures do not have a wealth of registers, which simplifies (and, for that reason, speeds up) status saving (as in interrupts, or subroutine calls), and this may be why they are now so popular. Well, just a thought.
Thanks a lot for these detailed descriptions and your elaborations.
Just like you, years ago when IBM announced some new instructions regarding 64-bit and z/Arch, I took a shot at some of those things, but decided to keep to my conventional methods.
I don't see a need for relative branch instructions like BRAS/BRXH, or multiply/divide instructions like ML/DL, when programming normal commercial applications, as complex as they may be. I use the same proven techniques - highly structured, well designed - as when programming in COBOL.
As I saw in your profile, you're a retired professional. My deep respect that you still keep yourself busy with application development and systems engineering.
I retired two years ago and now I'm back again on some migration projects involving assembler programming. It's somewhat addictive.
Especially when a snug income is assured.
Seems to me that assembler is not dead at all, as was announced so many years ago. Most of the banks here in Germany still have some programs in stock.
When the relative branch instructions came out in the 1990s, my analysis was that the main beneficiary would be the compilers, which do tend to write relatively large blocks of code compared to Assembler programmers. For a couple of years, say from 2010 to 2013, I got on a jag where I just used relative branch instructions, but then I mostly switched back to conventional BC instructions.
In 1991, when I was unemployed for a considerable period, I learned C, and learned the C qsort library function. By the end of 1991 I was back on real machines and using Assembler again. In 1993 I wrote QSORT. Like qsort, it uses a compare function. After a while, I realized that by using BRC instructions I did not need a base register for the compare function, so I could cut down the registers I used, and the registers I had to save and restore. Shades of XPLINK. QSORT calls the compare function using conventional linkage -
CALL compare,(address1,address2)
Most programs that use QSORT call it multiple times, with multiple compare functions. The compare functions pretty much use a common startup sequence, usually -
SAVE 14
LM 14,15,0(1)
LR 1,15
LA 15,1
* Insert compare code here.
Register 15 has an initial return code, the qsort staples - negative, 0, positive, with the same meanings.
All of the compare functions generally share a common suffix. Sometimes you get to SC0100 as a fall-through when there is no less-than or greater-than branch; SC0200 and SC0300 are generally branched to by JL SC0200 or JH SC0300. The compare function is performance critical, so everything I can save is a step in the right direction. I do not save and restore registers 0 and 1 because QSORT doesn't need them. The actual CALL macro in QSORT is
LR R15,R5
CALL (15),((R9),(R10)),MF=(E,PARMLIST)
Registers 14, 15 and 1 are trashed by the macro, and QSORT does not actually use register 0. Register 5 has the address of the compare routine. Obviously registers 2 through 13 cannot be altered by the compare function, but it doesn't use them anyway.
Now if the compare function needs its own base register, it would be coded differently, probably
USING *,2
SAVE (14,2)
LR 2,15
LA 15,1
...
RETURN (14,2),RC=(15)
Those fractions of seconds do add up. And, surprise, surprise, they're only noticed on mainframes. On toy and baby machines, no one attempts to measure this kind of stuff, so no one cares, because it isn't measured. This is one reason projects tend to gas out more often on toy and baby machines: no one checks this in advance. Measurement tools, though they are disappointing in their repeatability, have been on mainframes for decades, though the discipline to actually use them seems to have disappeared, probably because it is not in CS curricula (because it can't be done on toy and baby machines). These guys seem to depend on Moore's "law" to cover up their lapses.
The raw outer shell in my QSORT isn't all that great. Actually, I stole it from K & R, where it was simplified for demonstration purposes. But I can partly make up for it by being efficient in other places, like the compare function.
What I was trying to demonstrate is YOU HAVE TO NOTICE THE SMALL STUFF. ST 14 is a lot faster than STM 14,2. L 14 is much faster than L 14/LM 0,2. You will see a difference if you call the compare function 100,000 times.
I think this is still an endless debate. I still see an unavoidable necessity to keep a close watch on the processing time of large and complex SQL statements. That could be more critical than some assembler instructions.
These can be measured with Strobe and Omegamon.
But like I said, I'm not the guy who cares about a number of bytes or a few seconds at run time. I don't use bit switches with TM anymore. I see no need to care about that when doing commercial programming. A few seconds, give or take, doesn't earn a victory or a golden watch.
Have a nice weekend, regards, UmeySan
Good day, Peter
Many thanks for the encouragement.
Pleasant weekend, regards, UmeySan