Assembler - Performance of Compare Instructions


Binop B
Posted: Mon Aug 01, 2011 12:31 pm

Hi All,

It's been some time since I was last here... Hope everyone is doing well :)

Query
I have a doubt about the use of compare instructions in Assembler programming. I'm not sure, but I think somewhere in my college days we were taught that a COMPARE instruction takes longer to execute than a normal MOVE instruction. Is this true? I can't remember the reason, but it was something related to the use of the NSI in the PSW. I have searched Google for the last two days but could not find anything concrete...

Scenario
We have coded a program (subroutine) that checks a particular field. The field can hold A1, A2, A3 or A4, and the processing is slightly different for each value. Would it be a good idea to recode the check so that each call uses the second byte (1, 2, 3, 4) to branch to the right logic? Is it worth the effort, or would it be better to simply leave it as it is? For a big transaction (functionality-wise), this routine could be called thousands of times during a single task (online transaction).

Bill Woodger
Posted: Mon Aug 01, 2011 1:05 pm

Does the transaction appear "slow" when it is used? If it is getting a "normal" sort of response (hit the key causing transmission, and it is available for input almost straight away), then it's not worth doing anything to improve it; no one will notice.

If the response is "slow", the users want something done about it, and you've been given the task, then you have a different question: "why does it take so long?" You start looking at it from the highest level and narrow it down with whatever tools you have available.

These days, the sort of change you are talking about is unlikely to make any difference unless you already know that the code in question is part of something very heavily used.

You are right, arranging branching on the 2nd byte would be "faster". With today's machines, it is much better to concentrate on understandability and maintainability than to look for performance which no-one is ever going to notice.

Do it how everyone else does it at your site (which probably means leaving the code alone). The thinking, coding, admin, testing, etc is not going to be worth it, then someone is going to come along on maintenance, not recognise it for what it is, and screw up, or waste time until they understand it.

Binop B
Posted: Mon Aug 01, 2011 1:37 pm

Thanks a lot, Bill, for your guidance :)

As of now, the users have some concern about the response time, but they haven't raised any "official concern" or request as such. As suggested, I will leave the code as it is for now and come back to it if questions start to pop up ;)

Also, thanks again for the confirmation that, for the above scenario, the branching would be faster than the comparison logic.

Bill Woodger
Posted: Mon Aug 01, 2011 1:53 pm

Binop,

If you pick up the performance thing in the future, don't go straight for that bit of code. Unless it sits inside a huge iteration, the cause is more likely to be something else, most likely I/O related.

PeterHolland
Posted: Mon Aug 01, 2011 1:55 pm

Whatever you do, you have to do some comparing, by means of a CLC or TM, to trigger your branch.
The only way to branch without comparing is to set up a branch table; see: en.wikibooks.org/wiki/360_Assembly/Branch_Instructions

You need then 4 instructions to execute your branch, instead of 2 for CLC or TM.
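
For reference, the compare-and-branch approach being discussed might look roughly like the fragment below. This is only a sketch: the names CODE, PROCA1 to PROCA4 and BADCODE are hypothetical, and addressability (USING) is assumed to be set up already.

```hlasm
*        Hypothetical names: CODE, PROCA1-PROCA4, BADCODE.
*        Addressability (USING) is assumed to be established.
         CLC   CODE,=C'A1'        is it A1?
         BE    PROCA1             yes, go process A1
         CLC   CODE,=C'A2'        is it A2?
         BE    PROCA2
         CLC   CODE,=C'A3'        is it A3?
         BE    PROCA3
         CLC   CODE,=C'A4'        is it A4?
         BE    PROCA4
         B     BADCODE            none of the four values
*
*        Data area (place outside the instruction stream)
CODE     DS    CL2                field being tested
```

Each test costs the two instructions mentioned above (CLC plus BE), so this chain executes two instructions for A1 and eight for A4.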

Binop B
Posted: Mon Aug 01, 2011 1:58 pm

Sure Bill... will keep that in mind.
There are a lot of I/O operations in the routine, but while coding we tried to arrange them in the best way we could think of... :P

Binop B
Posted: Mon Aug 01, 2011 2:09 pm

Hi Peter... Hope you are doing well :)

Quote:
The only way to branch without comparing is to set up a branch table; see: http://en.wikibooks.org/wiki/360_Assembly/Branch_Instructions
Thanks Peter... this is the reference I was looking for. I had something like this in mind, as I have seen similar logic in one of the programs here...

Quote:
You need then 4 instructions to execute your branch, instead of 2 for CLC or TM.
This should be okay, right? The performance should still be better, if I understood correctly. As Robert says, "TANSTAAFL"... A few extra lines of code are probably the price I need to pay for the extra performance ;)

Thanks again for the link ...

dick scherrer
Posted: Mon Aug 01, 2011 8:13 pm

Hi Binop,

Yes, the compare should take a bit more CPU than the move.

Having said that, you will not be able to measure the difference unless you write one loop that does a billion compares and another that does a billion moves. Display the time at the beginning of the first loop, between the loops, and after the second loop to see how little difference there is.

If you have a situation that has performance problems, this would be one of the last places to spend time on (imho).
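
A rough sketch of such a measurement follows. It assumes a z/Architecture machine (for the 64-bit LG/SLG/SRLG instructions), register equates R2, R4 and R5, established addressability, and hypothetical field names. STCK gives elapsed time rather than CPU time, so results on a busy system will be noisy.

```hlasm
*        Hypothetical names: FLDA, FLDB, TSTART, TMID, TEND.
*        Note: LG/SLG/SRLG use all 64 bits of R4 and R5.
         STCK  TSTART             TOD clock before the compare loop
         L     R2,=F'1000000000'  loop count: one billion
CMPLOOP  CLC   FLDA,FLDB          instruction under test
         BCT   R2,CMPLOOP         decrement R2, loop until zero
         STCK  TMID               TOD clock between the loops
         L     R2,=F'1000000000'  same count for the move loop
MVCLOOP  MVC   FLDA,FLDB          instruction under test
         BCT   R2,MVCLOOP         decrement R2, loop until zero
         STCK  TEND               TOD clock after the move loop
*
         LG    R4,TMID            elapsed time of the compare loop
         SLG   R4,TSTART
         SRLG  R4,R4,12           TOD bit 51 = 1 microsecond
         LG    R5,TEND            elapsed time of the move loop
         SLG   R5,TMID
         SRLG  R5,R5,12           R4 and R5 now hold microseconds
*
*        Data areas (place outside the instruction stream)
FLDA     DS    CL8
FLDB     DS    CL8
TSTART   DS    D                  doubleword for STCK
TMID     DS    D
TEND     DS    D
```

Both loops include the BCT overhead, so only the difference between R4 and R5 says anything about CLC versus MVC.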

Bill O'Boyle
Posted: Mon Aug 01, 2011 8:54 pm

Binop,

You could verify the contents with three CLI's. The first one for an 'A' and the second and third for not less than C'1' and not greater than C'4'.

If none of the above are false, then store the 2nd-byte (STC) in a work-register and "AND" this register with a F'15' (result is F'1' through F'4') and you're ready for branching. This will remove the need for a PACK and CVB. An SLL by 2 will then result in a F'4' through F'16' in the work-register, if so desired.

The above instructions are very cheap.

Also, for testing a label (from 2-256 bytes in length) for X'00's, consider the OC (into itself) with a BZ indicating X'00's.

As an alternative, to load a 4-byte label into a register and test it for X'00's without having to clear the register beforehand, take a look at the ICM instruction, with BZ indicating X'00's.

ICM's and STCM's are used by COBOL for COMP-5 internal manipulation.

HTH....

Bill
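
The two zero tests Bill mentions could be sketched as follows. FIELD, WORD and ALLZERO are hypothetical names, and the R1 equate and addressability are assumed.

```hlasm
*        Hypothetical names: FIELD, WORD, ALLZERO.
         OC    FIELD,FIELD        OR the field with itself: the data
*                                 is unchanged, CC=0 if all X'00'
         BZ    ALLZERO            branch when every byte is X'00'
*
         ICM   R1,B'1111',WORD    insert all 4 bytes and set the CC
         BZ    ALLZERO            CC=0: all inserted bits are zero
*
*        Data areas (place outside the instruction stream)
FIELD    DS    CL16               length attribute sets the OC length
WORD     DS    F
```

With a mask of B'1111', ICM replaces the whole register, which is why no clearing instruction is needed first.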

Bill O'Boyle
Posted: Tue Aug 02, 2011 8:58 am

Quote:

If none of the above are false, then store the 2nd-byte (STC) in a work-register

An STC is used to store the low-order contents of a work-register into a single byte.

I meant to say use an "IC" (Insert Character) to insert the 2nd-byte in the low-order of a work-register. Then continue with the "AND" using an F'15' mask, which will clear the work-register, except the low-order four-bit nibble. ;)

Bill
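
Putting the corrected steps together (three CLIs, then IC, AND and a shift), a minimal sketch might look like this. The names CODE, PROCA1 to PROCA4, BADCODE and BRTAB and the R1 equate are assumptions. The shift count of 2 is chosen so that 1 through 4 become 4 through 16, matching the four-byte length of each branch instruction in the table.

```hlasm
*        Hypothetical names: CODE, PROCA1-PROCA4, BADCODE, BRTAB.
         CLI   CODE,C'A'          first byte must be 'A'
         BNE   BADCODE
         CLI   CODE+1,C'1'        second byte not less than '1'
         BL    BADCODE
         CLI   CODE+1,C'4'        second byte not greater than '4'
         BH    BADCODE
         IC    R1,CODE+1          low byte of R1 = X'F1' to X'F4'
         N     R1,=F'15'          keep the low nibble: R1 = 1 to 4
         SLL   R1,2               multiply by 4: R1 = 4 to 16
         B     BRTAB-4(R1)        index into the branch table
BRTAB    B     PROCA1             offset 4
         B     PROCA2             offset 8
         B     PROCA3             offset 12
         B     PROCA4             offset 16
*
*        Data area (place outside the instruction stream)
CODE     DS    CL2                field being tested
```

Every valid code executes the same eleven instructions on this path (three CLI/branch pairs, IC, N, SLL and two branches), whereas a chain of four CLC/BE pairs executes between two and eight.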

Binop B
Posted: Tue Aug 02, 2011 3:24 pm

Quote:
If you have a situation that has performance problems, this would be one of the last places to spend time on (imho).
Thanks Dick... I will surely keep this in mind. Frankly speaking, if any performance issue does come up, this is the only place that comes to mind right now... :P

Quote:
You could verify the contents with three CLI's. The first one for an 'A' and the second and third for not less than C'1' and not greater than C'4'.
Thanks Bill... I had forgotten about the field validation. That would mean the 4 CLCs in my current code get changed to 3 CLIs, an IC, an N, an SLL and a B instruction...

Binop B
Posted: Tue Aug 02, 2011 3:44 pm

Out of curiosity...

A quote from the link that Peter shared:
Quote:
A conditional branch instruction causes the location counter in the PSW to be set to the address specified in the register or the register plus a 12-bit offset
Does this mean that, if we have a condition check that will almost never be TRUE, it is better performance-wise to code a BNE rather than a BE after a CLC (Compare) instruction?

Option 1
         CLC   <condition>
         BE    <label2>
label1   ..
         :: ::
label2   ..
         :: ::

Option 2
         CLC   <condition>
         BNE   <label1>
         B     <label2>
label1   ..
         :: ::
label2   ..
         :: ::

If <condition> is almost never going to be satisfied, isn't Option 2 a better option than Option 1?
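
To make the two options concrete, here is a sketch with hypothetical names (FLDA, FLDB, COMMON1, COMMON2, RARE1, RARE2) that counts the instructions executed on each path. It only counts instructions; whether a taken or a not-taken branch is cheaper on a particular processor is something that would have to be measured.

```hlasm
*        Option 1: rare case branches away, common case falls through
         CLC   FLDA,FLDB          equal only in the rare case
         BE    RARE1              usually NOT taken
COMMON1  DS    0H                 common path: 2 instructions so far
*        (common-path processing would go here)
RARE1    DS    0H                 rare path: also 2 instructions
*        (rare-path processing would go here)
*
*        Option 2: common case branches, rare case needs a second B
         CLC   FLDA,FLDB          equal only in the rare case
         BNE   COMMON2            usually taken
         B     RARE2              rare path: 3 instructions
COMMON2  DS    0H                 common path: 2 instructions so far
*        (common-path processing would go here)
RARE2    DS    0H
*        (rare-path processing would go here)
*
*        Data areas (place outside the instruction stream)
FLDA     DS    CL8
FLDB     DS    CL8
```

In the common case both options execute two instructions before reaching the common path; Option 2 adds one instruction only in the rare case.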

Robert Sample
Posted: Tue Aug 02, 2011 4:09 pm

Quote:
For a big transaction (functionality wise) this routine could be called thousands of times during a single task (online transaction).
Personally, I think you're looking in the wrong area. For you to see any difference due to your assembler instructions, those instructions are going to have to execute BILLIONS or HUNDREDS OF BILLIONS of times -- not mere thousands. Code optimization can improve performance, yes, but these days it is rare and those cases tend to be more obvious since the code is looping a lot.

Slow transaction response time is much more likely to be due to database usage, browsing VSAM files inappropriately, or using LINK or XCTL to jump between programs like a rabbit on meth -- looking at the instructions being used is almost always a waste of time.

dbzTHEdinosauer
Posted: Tue Aug 02, 2011 5:06 pm

Not mentioned so far is the fact that, nowadays, batch functions/processes seem to be performed in CICS regions.

CLUE: CALLed thousands of times....

Bill Woodger
Posted: Tue Aug 02, 2011 5:18 pm

Binop B wrote:
[...]
Frankly speaking, if any performance issue does come up - this is the only place that is coming to my mind now ... [...]


Binop,

Look back through all the posts, and read them again. Nobody has suggested you will get far by trying to tune code at this level, with regard to the performance of your transaction. Unless this section of code is used an enormous number of times, you will not notice the difference, and you would do your analysis, coding, testing, cycling and implementation for no noticeable outcome.

Don't look at the machine code level. Most likely it is IO related. If not, the next most likely cause is the program doing something in a dumb way that still works, and so on down.

Start at the top, not at the bottom.

Binop B
Posted: Tue Aug 02, 2011 5:19 pm

Thanks, Robert, for your input :)

Quote:
database usage, browsing VSAM files inappropriately, or using LINK or XCTL to jump between programs like a rabbit on meth
Yes Robert... We are in exactly this situation: we have a very large number of LINK / XCTL calls. I'm not sure how to avoid them, though. Keeping the LINK / XCTL in place, is there anything else we can do... something like the RESIDENT option, maybe?

Bill Woodger
Posted: Tue Aug 02, 2011 5:52 pm

Binop B wrote:
[...]
If <condition> is almost never going to get satisfied, then isn't Option2 a better option than Option1.


Yes, marginally. It is "old-school" to code for the minimal number of instructions executed, because it used to matter. Also techniques for using the least amount of storage for the program code itself, things like that, because it used to matter.

If you are going to pick up performance analysis on this, do some thinking about it.

Imagine the typical batch program with a look-up file, say a KSDS.

The first big improvement is if you are able to tune the file for general system-wide use, because it was defined poorly.

The next is if you can put the whole thing in a COBOL table.

The next is to choose the optimal method to search the data, depending on the data itself: if a large majority of "hits" are on a small number of keys, arrange the table in "hit" order and use a serial search; if "hits" have a reasonably even distribution, keep it in key order and use a binary search (SEARCH ALL, for instance).

There are other possibilities to consider for any of the above, but the biggest chunk of saving is getting rid of the IO as much as possible.

If you have a big ugly program running like a dog with no legs, your best bet will be a full redesign and rewrite, rather than tinkering with the machine code. If you want to/have to live with the horrible, look to the IO, look to the dumb.

dick scherrer
Posted: Tue Aug 02, 2011 7:50 pm

Hello,

If the code really does "run too slow" one of the big ways to waste resources is to do things that aren't needed (again).

Often code will re-read the exact same record(s) it read on the previous iteration. Why not use the ones already read?

Code often searches an array when the search value is the same as the previous time through the search.

Look in the code for steps that really contribute nothing to the process and merely waste resources.

The work to flip-flop between a compare and a move is nothing useful - might be entertaining and even educational, but will not help the performance.
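
As an illustration of not repeating a search for the same value, a serial table search could remember its last key, along these lines. All names are hypothetical; TABLE and TABENTS are assumed to be defined elsewhere, each entry is assumed to start with an 8-byte key, and LASTKEY and LASTENT are assumed to live in storage that persists between calls.

```hlasm
*        Hypothetical names throughout. R3 and R4 equates assumed.
         CLC   KEY,LASTKEY        same key as the previous search?
         BE    SAMEKEY            yes, skip the search entirely
         LA    R3,TABLE           R3 -> first table entry
         LA    R4,TABENTS         R4 = number of entries (EQU value)
SRCHLOOP CLC   KEY,0(R3)          does this entry match the key?
         BE    HIT
         LA    R3,ENTLEN(,R3)     step to the next entry
         BCT   R4,SRCHLOOP        loop until every entry is checked
         B     NOTFOUND
HIT      MVC   LASTKEY,KEY        remember the key
         ST    R3,LASTENT         and where it was found
         B     FOUND
SAMEKEY  L     R3,LASTENT         R3 -> entry found last time
         B     FOUND
*
*        Data areas (place outside the instruction stream)
KEY      DS    CL8                search argument
LASTKEY  DC    XL8'FFFFFFFFFFFFFFFF'  assumed never a valid key
LASTENT  DS    A                  address of the last entry found
ENTLEN   EQU   16                 assumed length of one table entry
```

When consecutive calls tend to carry the same key, the two-instruction shortcut at the top replaces the whole loop.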

Binop B
Posted: Tue Aug 02, 2011 10:04 pm

Quote:
but the biggest chunk of saving is getting rid of the IO as much as possible
Surely Bill... I will always keep this point in mind. As I said, while coding we tried our best to reduce the I/O operations. If a situation comes up, we will review the code again... Thanks.

Quote:
The work to flip-flop between a compare and a move is nothing useful - might be entertaining and even educational, but will not help the performance.
Thanks Dick... I guess my fear of the last couple of days has just come true :(
All times are GMT + 6 Hours