Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
This topic has received some coverage in this forum before but, given some recent questions (why did this not S0C7?) and some idiotic code which "works", I think it is worth looking at again.
Why? Because anyone with an ounce of programming-self-respect "feels" for the system when bad data gets into it.
Compiler option NUMPROC with NOPFD makes me sick to my stomach, at least initially.
First of all, what is PFD/NOPFD? PFD means "preferred", and NOPFD means "non-preferred" - however, don't take that as indicating the use of one over the other. What is "preferred", or not, is the sign symbol.
Ignoring "sign is separate", IBM zoned and packed decimals have a "preferred" sign of C (positive), D (negative) or F (unsigned, assumed positive). "Non-preferred" opens up the possibility of other sign values, in fact A, B, C, D, E and F. Why? TBD.
Now, if you had a whole load of different sign values for your program to cater for, things would get more complicated. So, what NOPFD does is to "fix" the sign for you, so it can then only be a C, D or F when it is actually used for anything (IF, MOVE, calculation, any sort of reference) in your program. It doesn't "fix" the sign in the DATA DIVISION location, but in the program's temporary storage area. Meaning that it does this each time any zoned (PIC (S)9(n) USAGE DISPLAY) or packed (COMP-3) field is referenced.
The way PFD (preferred sign) behaves is to assume that the DATA DIVISION fields all have a correct sign (C or D in PIC S9(n) and F in PIC 9(n)) with respect to the PICTURE.
The zoned-decimal looks like this - ZNZNSN
The packed-decimal looks like this - NNNS
Z = Zone = Value Irrelevant
S = Sign (C, D, F valid or A, B, C, D, E, F valid)
N = Number (0-9 valid)
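To make the two layouts concrete, here is a rough Python sketch (not anything the compiler does) that splits a field into its nibbles. The function names and the sample values are made up for illustration.

```python
def describe_zoned(field: bytes) -> dict:
    """Split a zoned decimal (ZNZN...SN) into zones, sign and digits."""
    highs = [b >> 4 for b in field]      # high-order nibble of each byte
    lows = [b & 0x0F for b in field]     # low-order nibble of each byte
    return {
        "zones": highs[:-1],             # every byte but the last carries a zone
        "sign": highs[-1],               # the last byte carries the sign instead
        "digits": lows,                  # every byte carries one digit
    }


def describe_packed(field: bytes) -> dict:
    """Split a packed decimal (NN...NS) into digits and sign."""
    nibbles = []
    for b in field:
        nibbles.extend((b >> 4, b & 0x0F))
    return {"digits": nibbles[:-1], "sign": nibbles[-1]}


zoned = bytes.fromhex("F1F2F3C4")    # hypothetical PIC S9(4) DISPLAY holding +1234
packed = bytes.fromhex("01234C")     # hypothetical PIC S9(4) COMP-3 holding +1234
print(describe_zoned(zoned))
print(describe_packed(packed))
```

The sign is the high-order nibble of the last byte for zoned, and the low-order nibble of the last byte for packed.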
At the moment, restricting the discussion to zoned-decimals, it goes like this.
With NUMPROC(NOPFD)
Valid for Sign = Any single-byte binary value where low-order nibble is numeric. 160 distinct values.
Valid for Zone = Any single-byte binary value where low-order nibble is numeric. 160 distinct values.
Any receiving field, result of a MOVE, calculation, will have a "preferred sign" C, D or F. An unsigned source to a signed receiving field, will have a C.
NUMPROC(PFD)
Valid for Sign = Any single-byte binary value where high-order nibble is C, D or F and low-order nibble is numeric. 30 distinct values.
Valid for Zone = Any single-byte binary value where low-order nibble is numeric. 160 distinct values.
Any receiving field will have its sign propagated from the source field (except in the case of an unsigned source to a signed receiving field, which will have a C).
What does all this mean?
Firstly, fewer S0C7s with NOPFD. Not good.
Secondly, potential erroneous processing with PFD if a signed field has an F for a sign, ie it contains an unsigned value. Not good.
For example, a zoned-decimal containing spaces, or low-values, will be treated as zero with NOPFD. When processed to a packed-decimal receiving field it will look like a completely valid field. With PFD it will S0C7.
For example, for PFD, a signed packed-decimal with an F sign will not compare correctly to another equal-length packed-decimal if a Compare Packed is not generated (ie if Compare Logical Characters is generated by the compiler). With NOPFD the comparison will always be "correct".
So, what do you do? If all your signs conform to the PICTURE, then PFD will always work, produce less code, and optimise better.
If all your signs do not conform to the PICTURE, NOPFD will "fix them" in receiving fields and all processing will be consistent - but a lot more "invalid" data could be made "valid". NOPFD will produce more code, for every reference to zoned- and packed-decimal fields in your program, and will not optimise as well.
So, it seems a bit sloppy. I prefer the sound of PFD so far.
For initial testing, I would use PFD for definite. More likely to get S0C7s. Beyond initial testing, I'd stick to whatever was set in production.
At the end of the day, with proper validation of all data entering the system, they should be equivalent. So, again, I'd go for PFD.
What I'd think long and hard about would be changing the value in production. Could be a nightmare waiting to happen.
Oh, and if you are calling FORTRAN or PL/I from your Cobol program, IBM says you should use NOPFD. I wonder what they get up to? Even there, I'd prefer to handle the individual fields going to those programs, and still use PFD.
Any comments, questions, thoughts, suggestions, attempts to sway me, are welcome.
Pretty straightforward. Pack 'em into temporary storage, packed compare, branch on condition. Although the PACK leaves the signs alone, the CP "knows" about non-preferred signs. So here, no "sign-fixing" at all.
Now it gets more interesting. Pack the field with the potential non-preferred sign into Temporary Storage. Then ZAP (Zero and Add Packed) it to itself! What does that do? Well, the "ADD" part is what is crucial here, because, from the POP:
Quote:
The preferred sign codes are 1100 for plus and 1101 for minus. These are the sign codes generated for the results of the decimal-arithmetic instructions and the CONVERT TO DECIMAL instruction.
Alternate sign codes are also recognized as valid in the sign position: 1010, 1110, and 1111 are alternate codes for plus, and 1011 is an alternate code for minus. Alternate sign codes are accepted for any decimal source operand, but are not generated in the completed result of a decimal-arithmetic instruction or CONVERT TO DECIMAL. This is true even when an operand remains otherwise unchanged, such as when adding zero to a number. An alternate sign code is, however, left unchanged by MOVE NUMERICS, MOVE WITH OFFSET, MOVE ZONES, PACK, and UNPACK.
(my emphasis added).
So the ZAP is how NOPFD does the "sign-fixing" in this case (two signed four byte, zoned decimals, in a MOVE).
Having ZAP'd it, unpack it in to the receiving field.
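The PACK, ZAP-to-itself, UNPACK sequence can be modelled roughly in Python. This is only a sketch of the behaviour described in the POP quote, not the generated code; the function names, the DataException class (standing in for the S0C7) and the sample value are all made up for illustration.

```python
MINUS_CODES = {0xB, 0xD}    # sign codes 0xA-0xF are valid; B and D mean minus


class DataException(Exception):
    """Stands in for the S0C7 (data exception) abend."""


def pack(zoned: bytes, length: int) -> bytes:
    """Model of PACK: drop the zones, swap the nibbles of the last byte. No checking."""
    nibbles = [b & 0x0F for b in zoned[:-1]]              # digits of all but the last byte
    nibbles += [zoned[-1] & 0x0F, zoned[-1] >> 4]         # last digit, then the sign
    nibbles = [0] * (2 * length - len(nibbles)) + nibbles # pad with zero digits on the left
    nibbles = nibbles[-2 * length:]                       # or truncate on the left
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


def zap_to_itself(packed: bytes) -> bytes:
    """Model of ZAP of a field to itself: validate, then emit a preferred sign (C or D)."""
    text = packed.hex()
    digits, sign = text[:-1], int(text[-1], 16)
    if not digits.isdigit() or sign < 0xA:
        raise DataException(f"invalid packed decimal X'{text.upper()}'")
    # A zero result is positive; otherwise minus codes become D, plus codes become C.
    negative = sign in MINUS_CODES and int(digits) != 0
    return bytes.fromhex(digits + ("d" if negative else "c"))


def unpack(packed: bytes, length: int) -> bytes:
    """Model of UNPACK: zone F on every digit, sign swapped back into the last byte."""
    nibbles = []
    for b in packed:
        nibbles.extend((b >> 4, b & 0x0F))
    digits, sign = nibbles[:-1], nibbles[-1]
    digits = ([0] * length + digits)[-length:]
    return bytes([0xF0 | d for d in digits[:-1]] + [(sign << 4) | digits[-1]])


source = bytes.fromhex("F1F2F3A4")      # +1234 with non-preferred plus sign A
temp = pack(source, 3)                  # X'01234A', sign untouched by PACK
temp = zap_to_itself(temp)              # X'01234C', sign now preferred
print(unpack(temp, 4).hex().upper())    # F1F2F3C4
```

Only the ZAP step touches the sign. PACK and UNPACK just shuffle nibbles around, which is why they can carry a non-preferred sign through unchanged.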
For PFD, by comparison:
Code:
000052 IF
00035E D503 A000 A040 CLC 0(4,10),64(10) W-DISPNUM-SIZE4-S W-DISPNUM-SIZE4-ST
000364 4770 B142 BC 7,322(0,11) GN=5(00036E)
Now, this is doing a logical compare. If one of the signs is actually non-preferred, then the compare is not going to work properly. More on that later.
Joined: 03 Oct 2009 Posts: 1788 Location: Bloomington, IL
Bill Woodger wrote:
Ignoring "sign is separate", IBM zoned and packed decimals have a "preferred" sign of C (positive), D (negative) or F (unsigned, assumed positive). "Non-preferred" opens up the possibility of other sign values, in fact A, B, C, D, E and F. Why? TBD.
It may be worth noting that an overpunched positive sign (punch in zone 12) is interpreted as { and A-I (EBCDIC C0-C9) and an overpunched negative sign (punch in zone 11) is interpreted as } and J-R (EBCDIC D0-D9), which is probably why C, D, and F are "preferred".
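The overpunch mapping is easy to check with Python's built-in EBCDIC codec. A small sketch, assuming code page 037 is representative (the brace positions can differ in other EBCDIC code pages); the function name is made up for illustration.

```python
def ebcdic_nibbles(ch):
    """Return (zone, digit) nibbles of a character encoded in EBCDIC code page 037."""
    byte = ch.encode("cp037")[0]
    return byte >> 4, byte & 0x0F


positive = "{ABCDEFGHI"   # zone-12 overpunch on the digits 0-9
negative = "}JKLMNOPQR"   # zone-11 overpunch on the digits 0-9

for label, chars in (("positive", positive), ("negative", negative)):
    for ch in chars:
        zone, digit = ebcdic_nibbles(ch)
        print(f"{label}: {ch!r} -> zone {zone:X}, digit {digit}")
```

Every character in the first string comes out with zone C and every one in the second with zone D, with the digit in the low-order nibble.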
Overpunching goes back to 1928, IIRC; the association between overpunched digits and alphas is no later than the introduction of the 360 in 1964, and probably older. Unless IBM recorded the reasons (or lack thereof), anyone who could tell us the "whys" has probably long since shuffled off this mortal coil.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Thanks Mr Akatsukami. Understood. Any ideas why the potential to encounter non-preferred signs exists in Cobol to the extent that IBM added the NUMPROC compiler option? VS Cobol had no special processing for them. I guess if they had appeared, they would have failed NUMERIC tests, but otherwise would have got along reasonably well. I have never seen one. I just assumed it was something "assembler guys" used to do, but we didn't need to. Now, there it is. Is it because of the Cobol 85 standard? Is it because of the Web stuff now available? Some managerial nonsense, having found it on some "to do" list from 1964?
Other than deliberately making one, has anyone ever seen a "non-preferred" sign? If they are not scattered all over DASD, why do we have a "new" compiler option for them?
This time an MVC is used, and the sign is "fixed" for the destination. Can we deduce, from this and the signed-to-signed example previously, that the method for the MOVE is dependent on how the sign is "fixed"? PACK-ZAP-UNPACK for the signed field, because the ZAP is doing the fixing, MVC for the unsigned field because the OI will do the fixing.
Which is the same as for NOPFD except for the assumption that the unsigned field has an F sign, rather than forcing it (in NOPFD) for the compare. CLC cannot be used, as the signs should be different.
Again, like the example previously posted, other than using OPT storage to save the value for later (elsewhere in the program this time) the IF is the same as the unoptimised one.
Both sets of code mirror the non-optimised PFD, except for the preparatory storage for use later.
When comparing the same length signed zoned to unsigned zoned, the code generated for NOPFD and PFD is very similar (one extra OI for NOPFD) and is only open to any optimisation at all where a non-receiving field is used more than once.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Bill Woodger wrote:
[...]
Other than deliberately making one, has anyone ever seen a "non-preferred" sign? If they are not scattered all over DASD, why do we have a "new" compiler option for them?
I may have been looking at this the wrong way. What NUMPROC(PFD) does is change the way the Cobol compiler generates code for decimals in some common instances. For instance, using CLC instead of CP, MVC instead of ZAP. Goes much faster, well, faster, but your data has to conform to the "preferred" signs. A CP can compare sign F to sign C and find them equal (if the amounts are equal, of course). A CLC cannot.
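The difference between the two compares can be sketched in Python. This is a rough model only, not the hardware: cp compares by numeric value and rejects invalid data, clc just compares bytes. The names and the DataException class (standing in for the S0C7) are made up for illustration.

```python
MINUS_CODES = {0xB, 0xD}    # sign codes 0xA-0xF are valid; B and D mean minus


class DataException(Exception):
    """Stands in for the S0C7 (data exception) abend."""


def packed_value(field: bytes) -> int:
    """Decode a packed decimal, rejecting invalid digits or sign."""
    text = field.hex()
    digits, sign = text[:-1], int(text[-1], 16)
    if not digits.isdigit() or sign < 0xA:
        raise DataException(f"invalid packed decimal X'{text.upper()}'")
    value = int(digits)
    return -value if sign in MINUS_CODES else value


def cp(a: bytes, b: bytes) -> int:
    """Model of Compare Packed: -1, 0 or 1 by numeric value, any valid sign code accepted."""
    va, vb = packed_value(a), packed_value(b)
    return (va > vb) - (va < vb)


def clc(a: bytes, b: bytes) -> int:
    """Model of Compare Logical Characters: -1, 0 or 1 by raw byte order, no validation."""
    return (a > b) - (a < b)


signed_c = bytes.fromhex("01234C")    # +1234 with preferred sign C
signed_f = bytes.fromhex("01234F")    # 1234 with sign F sitting in a signed field

print(cp(signed_c, signed_f))     # 0: equal as numbers
print(clc(signed_c, signed_f))    # -1: not equal as bytes
```

The clc function will also happily "compare" fields full of spaces without complaint, where cp raises the data exception.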
So maybe NUMPROC(PFD) is NUMPROC(GO-FASTER) and NUMPROC(NOPFD) is NUMPROC(SLOW-DOWN-THE-SIGNS-MIGHT-BE-ODD).
So it is not really all the other odd signs getting in the way, it is when you get an F in a signed field. Not difficult to do (if you want to do that), but when it happens in error it is difficult to get through testing. Well, it should be.
Again, a straight PACK, PACK, Compare Packed. Just to emphasise, as is indicated in the manual, NOPFD does not always generate "sign fixing" code. Here it does not. It doesn't need to, because the CP can understand all the possible signs and do a correct comparison.
Here, an interesting development. Packing both fields to four-bytes and then a logical compare. Why not pad with three leading zeros the four-byte zoned field in temporary storage and then a CLC? I don't know.
Here, the four-byte signed, zoned, field has already been allocated space in the Optimisation area, ie in OPT=0 there is a packed representation of the field, three bytes long. So, then compare packed. Wait, though, although it doesn't matter, what is the sign-type of OPT=0. Let's look back at the last post with code in. OK, so it is still potentially non-preferred.
Now, we ZAP the OPT=0, and unpack it to the receiving field. Both OPT=0 and the receiving field are now preferred signs. The optimiser has removed two pack instructions in total, by using the OPT=0 field established earlier.
Sticking with the theme, pack both fields to four bytes, and do a logical compare. Note that W-DISPNUM-SIZE4-S is now in two OPT locations, firstly as three bytes packed, secondly as four bytes packed.
Identical to the un-optimised MOVE. Three zeros padded on the front, then an MVC.
When comparing two signed zoned fields, the code for NOPFD differs substantially from PFD. NOPFD sticks to decimal instructions, PFD uses CLC and a couple of MVC's.
Hang about. Why is that setting OPT=0 again? Well, I missed it earlier, but what is happening is that every time the compiler generates the ZAP-to-itself as part of the MOVE, to get the "preferred sign", then it has to reset OPT=0 (in this case). The result being that for NOPFD the OPT fields for packed data have less of a life-span. The compiler knows the value has changed, if it has carried out "sign fixing", so cannot use it again without re-setting it first. Yet, if it did use it again without resetting, it would make no difference at all to the results of anything. The CP does not mind about the signs as long as they are valid. The receiving field always contains a preferred sign. The manual hints at problems like this when discussing performance.
There is nothing remarkable about the MOVE, as it is the same as the two above.
For OPT, PFD one instruction is saved through the use of the 3-byte packed OPT=0, otherwise the code is the same as for NOOPT, PFD.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Having looked at a 4-byte signed zoned decimal being compared with 4- and 7-byte signed and unsigned zoned decimals, let's have a look at comparing it to a packed decimal.
There is nothing unusual here, except, why is it doing the ZAP? OK, it does the ZAP to get a preferred sign, but why does it want a preferred sign, the CP can handle non-preferred as well. In none of the code generated for the zoned fields, which were all packed before doing a CP, did the ZAP-to-itself get done.
Overall, for this packed field, the only unexpected thing is the ZAP-in-place for the IF with NOPFD, both NOOPT and OPT. Still can't think why it does it.
000087 IF
000304 D202 D0F8 A058 MVC 248(3,13),88(10) TS2=8 W-COMP3PD-SIZE4-UT
00030A 960F D0FA OI 250(13),X'0F' TS2=10
00030E F922 D0F8 D100 CP 248(3,13),256(3,13) TS2=8 OPT=0
000314 4770 B10A BC 7,266(0,11) GN=10(000322)
MVC and OI for the packed unsigned field, then compare packed to the OPT=0 field previously stored. Curiously, this time the OPT=0 hasn't been refreshed, even though the current value is a preferred sign.
Because it is interesting above that an OPT=0 with a preferred sign is used for OPT, NOPFD, let's just check by looking at where it is established as well.
The ZAP-to-itself at displacement 0002E8 is the last instruction which updates OPT=0. Out of the ZAP the sign is preferred. When we go to the next IF instruction, the already preferred OPT=0 is used (even though the sign doesn't matter, as long as valid, for the CP).
This does not follow the previous usage of OPT=0 in these examples. There, if the sign was "rectified", the OPT=0 would be refreshed. Here, not.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
I suppose I am a little "over interested"? When I see those tantalising little bits in the manual, I can't help but wonder why? Sometimes helps. Never hurts (me :-) ).
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
OK, starting to wind this up.
NUMPROC(NOPFD) does "sign fixing" (another term for this is "sign rectification"). This is not as bad as it sounded to me originally. Previously I had known "sign fixing" in Cobol as the extra code generated to ensure that an unsigned field remained unsigned, so I was concerned about how NOPFD did this for potential "non-preferred signs".
From the Principles of Operation (POP).
Quote:
The preferred sign codes are 1100 for plus and 1101 for minus. These are the sign codes generated for the results of the decimal-arithmetic instructions and the CONVERT TO DECIMAL instruction.
The result generated, by the CPU, before anyone can get their sticky fingers on it, can only be sign C or D. ZAP is a decimal-arithmetic instruction. ZAP-to-itself is possible (operates right-to-left in this case). So a field which might potentially contain a non-preferred sign can have it "rectified" by a ZAP-to-itself. Compare Packed knows about non-preferred signs, and will compare them correctly.
One other "rectification" is carried out with NOPFD. If a field is unsigned, NOPFD will ensure that it is unsigned for all references to that field (source or receiver).
Two other rectifications are not to do with the NOPFD option, but are the way that Cobol works.
An alphanumeric (PIC X(n)) as a sending field always results in a positive (C) in a signed field (and an F in an unsigned field).
An unsigned field as the source always gives a positive (C) in a signed receiving field.
For a packed receiver this is achieved by using MVN (Move Numerics) on the last byte of the field (ironic, because in that byte the "numeric" is the sign, but it achieves the effect).
For a zoned receiver this is achieved by Or Immediate (OI) to get the F (unsigned) and OI followed by And Immediate (NI) to get the positive (C).
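The bit-twiddling on the last byte can be sketched in Python. The OI X'0F' on a packed unsigned field appears in the listing earlier in the thread; the other masks (X'F0', X'CF') and the MVN source byte are my assumptions about what would achieve the described effect, not values taken from generated code.

```python
def oi(byte, mask):
    """Model of OR IMMEDIATE on one byte."""
    return byte | mask


def ni(byte, mask):
    """Model of AND IMMEDIATE on one byte."""
    return byte & mask


def mvn(target, source):
    """Model of a one-byte MOVE NUMERICS: copy only the low-order nibble."""
    return (target & 0xF0) | (source & 0x0F)


# Zoned receiver: the sign is the high-order nibble of the last byte.
last_zoned = 0xD4                                # digit 4 with a D sign
zoned_unsigned = oi(last_zoned, 0xF0)            # assumed mask X'F0' forces the F
zoned_positive = ni(oi(last_zoned, 0xF0), 0xCF)  # assumed mask X'CF' turns F into C

# Packed receiver: the sign is the low-order nibble of the last byte.
last_packed = 0x4D                               # digit 4 with a D sign
packed_positive = mvn(last_packed, 0x0C)         # assumed source byte ending in C
packed_unsigned = oi(last_packed, 0x0F)          # OI X'0F', as in the listing

print(f"{zoned_unsigned:02X} {zoned_positive:02X} {packed_positive:02X} {packed_unsigned:02X}")
# F4 C4 4C 4F
```

The OI first and NI second ordering means the result is a C whatever the sign nibble was to start with.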
NOPFD relies heavily on Compare Packed (CP) for "IF tests". This is traditionally the case. With decimal fields, you expect a decimal compare. You can also expect a S0C7 if either field is invalid.
NOPFD will use Compare Logical (CLC) for some compares. If you are comparing two unsigned decimals, either zoned or packed or mixed, and of equal or different size, NOPFD will generate a CLC. You cannot expect an S0C7 (or any abend) in this instance if either field is invalid.
NUMPROC(PFD) does not do "sign fixing", except where required to obey the Cobol rules, as NOPFD does. PFD assumes that a signed field can only contain a C or D sign (F may not give valid results) and an unsigned field can only contain an F sign.
PFD uses Compare Logical (CLC) for "IF tests" more heavily than does NOPFD. You cannot expect an S0C7 (or any abend) in this instance if either field is invalid.
PFD will use Compare Packed (CP) for some compares. If the fields are not the same length by the time a compare of some sort is going to be generated, or one field is signed and the other not, PFD will generate a CP (yes, there is a chicken-or-the-egg situation that is resolved for this). You can expect a S0C7 if either field is invalid.
Both NOPFD and PFD generate CLCs. Both generate CPs. NOPFD will likely generate fewer CLCs.
NOPFD will generate more code, and will tend to generate code options which take longer to execute (because the need for non-preferred processing precludes shorter execution time options).
NOPFD does not optimise very well. Using the ZAP for sign-rectification reduces optimisation options, even the use of OPT fields to retain a value which can be used more than once in a block of code (if a field theoretically has to be packed five times in a block of code, but its value does not change, then it only needs to be packed once and then the value is already available for the other four places).
PFD already produces fairly "good" code from a performance point of view, but still has more possibilities for optimisation than NOPFD.
For decimal fields, PFD OPT is up to 20% faster than NOPFD OPT. This does not mean your program will be up to 20% faster, because your program will be doing many things other than processing decimal fields. If you are processing a lot of decimal fields a lot of the time you would expect to be able to notice the difference in CPU usage between the two options.
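As a back-of-the-envelope illustration of why the whole program gains less, here is a tiny Python sketch. It assumes "20% faster" means the decimal-handling part takes 20% less CPU time, and the shares of CPU spent on decimal work are invented numbers, not measurements.

```python
def overall_cpu_saving(decimal_share, decimal_saving=0.20):
    """Fraction of total CPU saved when only the decimal-handling share gets cheaper."""
    return decimal_share * decimal_saving


# Hypothetical shares of CPU time spent on decimal fields.
for share in (0.10, 0.30, 0.60):
    saving = overall_cpu_saving(share)
    print(f"decimal work {share:.0%} of CPU -> whole program saves about {saving:.0%}")
```

So even a program spending well over half its CPU on decimal fields would see an overall saving noticeably below 20%.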
Remember, you can't just decide to use a different option for your development cycle. Use what you use in production for that. Only change an option in production after very long consideration of potential impacts. PFD works if all your fields are signed correctly, NOPFD works if some (or even all) of your fields have non-preferred signs.
The worst culprits for the amount of code generated are unsigned fields. If you are calculating with the field, give it a sign. If you are not calculating with it, unsigned makes sense. The Oldies amongst us know this from way back. However, with NOPFD, even testing an unsigned field gets you sign-rectification. As does moving to an unsigned field.
Last thought for this post. A non-preferred sign that you do anything to as a receiving field, becomes a preferred sign, with NOPFD. With PFD, if a field contains a non-preferred or F sign, any calculation (including ZAP) will "rectify" it.