prino
Senior Member
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Apologies for this rant on Christmas day...
Let's assume you gate-crash an IBM GSE meeting, way back in 2010, and talk to an IBM developer about optimizing code, and he tells you that IBM is well ahead (by "at least" five years) of the Open Source community!
So on this Christmas Day, having nothing better to do than listen to your relatives blabbering away in Lithuanian (the language of your wife of fifteen years, but one that you still do not speak), you decide to look at some more code emitted by one of IBM's Enterprise PL/I compilers, admittedly not the latest and greatest (V4.3.0), and compare it to the code of the long out-of-support V2.3.0 OS compiler.
Now, you've already had, more than once, discussions with one of its lead developers about the quality of the code emitted by the EPLI backend, after earlier remarks that it looks pretty bad. But with said developer having access to the latest technology, there's little you can put up against his arguments, although a claim that a loop to initialize a PL/I array is faster than using MVC or overlapping MVC on the latest z13 systems still makes you wonder.
So instead, let's look at some very simple PL/I code, a very frequently called routine that calculates average times based on two inputs, seconds and a divisor, which should result in some pretty simple assembler code.
Here is the output of the compiler. Let's start with the code generated by the old OS compiler; for what it's worth, w_hh and w_mm are defined externally as "fixed dec (3)".
Code: |
3829 AVERAGE_TIME: PROC(SECONDS, N);
3830 DCL SECONDS FIXED DEC (15,6);
3831 DCL N FIXED DEC (5);
3833 DCL D FIXED DEC (9);
3834 DCL S FIXED DEC (15,6) INIT (SECONDS);
3841 W_HH = TRUNC(SECONDS / (N * 3600));
3842 SECONDS = SECONDS - W_HH * (N * 3600);
3843 W_MM = TRUNC(SECONDS / (N * 60));
3844 SECONDS = SECONDS - W_MM * (N * 60);
|
And I've added the code below (statements 3847 to 3852) to help the compiler eliminate common sub-expressions, as the V2.3.0 compiler wasn't all that optimizing, especially for large programs, as indicated by the two messages that precede it:
Code: |
IEL0919I W 2105 VARIABLES IN PROGRAM. GLOBAL OPTIMIZATION PERFORMED FOR 255 VARIABLES. LOCAL OPTIMIZATION PERFORMED ON REMAINDER.
IEL0917I W 1 BLOCK CONTAINS 424 FLOW UNITS. GLOBAL OPTIMIZATION PERFORMED ONLY IN 'DO' GROUPS.
3847 D = N * 3600;
3848 W_HH = TRUNC(S / D);
3849 S = S - W_HH * D;
3850 D = N * 60;
3851 W_MM = TRUNC(S / D);
3852 S = S - W_MM * D;
|
And what comes out, slightly edited (removal of spaces and flowing code on single lines) to show statement and generated code next to each other:
Code: |
3841 W_HH = TRUNC(SECONDS / (N * 3600));
* STATEMENT NUMBER 3841
01D7E0 58 40 D 0D4 L 4,212(0,13)
01D7E4 F8 52 D 098 4 000 ZAP WKSP.78+32(6),N(3)
01D7EA FC 52 D 098 8 BDD MP WKSP.78+32(6),3037(3,8)
01D7F0 D2 05 D 128 D 098 MVC 296(6,13),WKSP.78+32
01D7F6 58 90 D 0D0 L 9,208(0,13)
01D7FA F8 D7 D 098 9 000 ZAP WKSP.78+32(14),SECONDS(8)
01D800 FD D5 D 098 D 128 DP WKSP.78+32(14),296(6,13)
01D806 D2 07 D 12E D 098 MVC 302(8,13),WKSP.78+32
01D80C 58 70 6 0CC L 7,204(0,6)
01D810 D2 01 7 57C D 131 MVC LIFT_WORK.WT_AVG.W_HH(2),305(13)
01D816 D1 00 7 57D D 135 MVN LIFT_WORK.WT_AVG.W_HH+1(1),309(13)
3842 SECONDS = SECONDS - W_HH * (N * 3600);
* STATEMENT NUMBER 3842
01D81C F8 52 D 098 4 000 ZAP WKSP.78+32(6),N(3)
01D822 FC 52 D 098 8 BDD MP WKSP.78+32(6),3037(3,8)
01D828 D2 05 D 128 D 098 MVC 296(6,13),WKSP.78+32
01D82E F8 71 D 098 7 57C ZAP WKSP.78+32(8),LIFT_WORK.WT_AVG.W_HH(2)
01D834 FC 75 D 098 D 128 MP WKSP.78+32(8),296(6,13)
01D83A D2 07 D 12E D 098 MVC 302(8,13),WKSP.78+32
01D840 D2 07 D 098 D 12E MVC WKSP.78+32(8),302(13)
01D846 94 F0 D 09F NI WKSP.78+39,X'F0'
01D84A D7 02 D 0A0 D 0A0 XC WKSP.78+40(3),WKSP.78+40
01D850 D1 00 D 0A2 D 135 MVN WKSP.78+42(1),309(13)
01D856 FB A7 D 098 9 000 SP WKSP.78+32(11),SECONDS(8)
01D85C F8 7A D 09B D 098 ZAP WKSP.78+35(8),WKSP.78+32(11)
01D862 97 01 D 0A2 XI WKSP.78+42,X'01'
01D866 D2 07 9 000 D 09B MVC SECONDS(8),WKSP.78+35
3843 W_MM = TRUNC(SECONDS / (N * 60));
* STATEMENT NUMBER 3843
01D86C F8 42 D 098 4 000 ZAP WKSP.78+32(5),N(3)
01D872 FC 41 D 098 8 91B MP WKSP.78+32(5),2331(2,8)
01D878 D2 04 D 128 D 098 MVC 296(5,13),WKSP.78+32
01D87E F8 C7 D 098 9 000 ZAP WKSP.78+32(13),SECONDS(8)
01D884 FD C4 D 098 D 128 DP WKSP.78+32(13),296(5,13)
01D88A D2 07 D 12D D 098 MVC 301(8,13),WKSP.78+32
01D890 D2 01 7 57F D 130 MVC LIFT_WORK.WT_AVG.W_MM(2),304(13)
01D896 D1 00 7 580 D 134 MVN LIFT_WORK.WT_AVG.W_MM+1(1),308(13)
3844 SECONDS = SECONDS - W_MM * (N * 60);
* STATEMENT NUMBER 3844
01D89C F8 42 D 098 4 000 ZAP WKSP.78+32(5),N(3)
01D8A2 FC 41 D 098 8 91B MP WKSP.78+32(5),2331(2,8)
01D8A8 D2 04 D 128 D 098 MVC 296(5,13),WKSP.78+32
01D8AE F8 61 D 098 7 57F ZAP WKSP.78+32(7),LIFT_WORK.WT_AVG.W_MM(2)
01D8B4 FC 64 D 098 D 128 MP WKSP.78+32(7),296(5,13)
01D8BA D2 06 D 12D D 098 MVC 301(7,13),WKSP.78+32
01D8C0 D2 06 D 098 D 12D MVC WKSP.78+32(7),301(13)
01D8C6 94 F0 D 09E NI WKSP.78+38,X'F0'
01D8CA D7 02 D 09F D 09F XC WKSP.78+39(3),WKSP.78+39
01D8D0 D1 00 D 0A1 D 133 MVN WKSP.78+41(1),307(13)
01D8D6 FB 97 D 098 9 000 SP WKSP.78+32(10),SECONDS(8)
01D8DC F8 79 D 09A D 098 ZAP WKSP.78+34(8),WKSP.78+32(10)
01D8E2 97 01 D 0A1 XI WKSP.78+41,X'01'
01D8E6 D2 07 9 000 D 09A MVC SECONDS(8),WKSP.78+34
3847 D = N * 3600;
* STATEMENT NUMBER 3847
01D976 F8 52 D 098 4 000 ZAP WKSP.78+32(6),N(3)
01D97C FC 52 D 098 8 BDD MP WKSP.78+32(6),3037(3,8)
01D982 D2 04 D 0C0 D 099 MVC D(5),WKSP.78+33
3848 W_HH = TRUNC(S / D);
* STATEMENT NUMBER 3848
01D988 F8 C7 D 098 D 0B8 ZAP WKSP.78+32(13),S(8)
01D98E FD C4 D 098 D 0C0 DP WKSP.78+32(13),D(5)
01D994 D2 07 D 128 D 098 MVC 296(8,13),WKSP.78+32
01D99A D2 01 7 57C D 12B MVC LIFT_WORK.WT_AVG.W_HH(2),299(13)
01D9A0 D1 00 7 57D D 12F MVN LIFT_WORK.WT_AVG.W_HH+1(1),303(13)
3849 S = S - W_HH * D;
* STATEMENT NUMBER 3849
01D9A6 F8 61 D 128 7 57C ZAP 296(7,13),LIFT_WORK.WT_AVG.W_HH(2)
01D9AC FC 64 D 128 D 0C0 MP 296(7,13),D(5)
01D9B2 D7 0A D 098 D 098 XC WKSP.78+32(11),WKSP.78+32
01D9B8 D2 06 D 099 D 128 MVC WKSP.78+33(7),296(13)
01D9BE 94 F0 D 09F NI WKSP.78+39,X'F0'
01D9C2 D1 00 D 0A2 D 12E MVN WKSP.78+42(1),302(13)
01D9C8 FB A7 D 098 D 0B8 SP WKSP.78+32(11),S(8)
01D9CE F8 7A D 09B D 098 ZAP WKSP.78+35(8),WKSP.78+32(11)
01D9D4 97 01 D 0A2 XI WKSP.78+42,X'01'
01D9D8 D2 07 D 0B8 D 09B MVC S(8),WKSP.78+35
3850 D = N * 60;
* STATEMENT NUMBER 3850
01D9DE F8 42 D 0C0 4 000 ZAP D(5),N(3)
01D9E4 FC 41 D 0C0 8 91B MP D(5),2331(2,8)
3851 W_MM = TRUNC(S / D);
* STATEMENT NUMBER 3851
01D9EA F8 C7 D 098 D 0B8 ZAP WKSP.78+32(13),S(8)
01D9F0 FD C4 D 098 D 0C0 DP WKSP.78+32(13),D(5)
01D9F6 D2 07 D 128 D 098 MVC 296(8,13),WKSP.78+32
01D9FC D2 01 7 57F D 12B MVC LIFT_WORK.WT_AVG.W_MM(2),299(13)
01DA02 D1 00 7 580 D 12F MVN LIFT_WORK.WT_AVG.W_MM+1(1),303(13)
3852 S = S - W_MM * D;
* STATEMENT NUMBER 3852
01DA08 F8 61 D 128 7 57F ZAP 296(7,13),LIFT_WORK.WT_AVG.W_MM(2)
01DA0E FC 64 D 128 D 0C0 MP 296(7,13),D(5)
01DA14 D7 0A D 098 D 098 XC WKSP.78+32(11),WKSP.78+32
01DA1A D2 06 D 099 D 128 MVC WKSP.78+33(7),296(13)
01DA20 94 F0 D 09F NI WKSP.78+39,X'F0'
01DA24 D1 00 D 0A2 D 12E MVN WKSP.78+42(1),302(13)
01DA2A FB A7 D 098 D 0B8 SP WKSP.78+32(11),S(8)
01DA30 F8 7A D 09B D 098 ZAP WKSP.78+35(8),WKSP.78+32(11)
01DA36 97 01 D 0A2 XI WKSP.78+42,X'01'
01DA3A D2 07 D 0B8 D 09B MVC S(8),WKSP.78+35
|
Code: |
L : 3
ZAP : 18
MP : 10
MVC : 23
DP : 4
MVN : 8
NI : 4
XC : 4
XI : 4
SP : 4
---
82 instructions, 470 bytes of code
|
And obviously common-subexpression elimination didn't make it, or, as the above messages indicated, the source was simply too complex: this is a procedure in a PL/I program of more than 13K lines.
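For readers who don't speak PL/I, the hand-applied transformation in statements 3847 to 3852 can be sketched in C. This is only an illustration under stated assumptions: `split_t`, `split_repeated` and `split_hoisted` are made-up names, and the FIXED DEC (15,6) value is modelled as a count of microseconds in a 64-bit integer, so none of the packed-decimal behaviour is reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: whole hours, whole minutes, and what is left
   of the input, in microseconds (stand-in for FIXED DEC (15,6)). */
typedef struct { int64_t hh, mm, rest_us; } split_t;

/* Shape of statements 3841-3844: N * 3600 and N * 60 are written out
   twice each, so a compiler without CSE multiplies four times. */
split_t split_repeated(int64_t seconds_us, int64_t n)
{
    split_t r;
    assert(n > 0);
    r.hh = seconds_us / (n * 3600 * 1000000LL);   /* integer division truncates */
    seconds_us -= r.hh * (n * 3600 * 1000000LL);
    r.mm = seconds_us / (n * 60 * 1000000LL);
    seconds_us -= r.mm * (n * 60 * 1000000LL);
    r.rest_us = seconds_us;
    return r;
}

/* Shape of statements 3847-3852: each product is computed once into d
   and then reused for the divide and for the subtract. */
split_t split_hoisted(int64_t seconds_us, int64_t n)
{
    split_t r;
    int64_t d;
    assert(n > 0);
    d = n * 3600 * 1000000LL;
    r.hh = seconds_us / d;
    seconds_us -= r.hh * d;
    d = n * 60 * 1000000LL;
    r.mm = seconds_us / d;
    seconds_us -= r.mm * d;
    r.rest_us = seconds_us;
    return r;
}
```

Both functions return the same values for the same input; the second merely does by hand what common sub-expression elimination is supposed to do automatically.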
Jump forward two (or even more) decades to 2014-03-10, the version of Enterprise PL/I V4.3.0 on the system where this test was done, and weep?
Code: |
11200.0 average_time: proc(seconds, n);
11201.0 dcl seconds fixed (15,6);
11202.0 dcl n fixed (5);
11205.0 dcl d fixed (9);
11206.0 dcl s fixed (15,6) init (seconds);
11216.0 w_hh = trunc(seconds / (n * 3600));
11217.0 seconds = seconds - w_hh * (n * 3600);
11218.0 w_mm = trunc(seconds / (n * 60));
11219.0 seconds = seconds - w_mm * (n * 60);
11224.0 d = n * 3600;
11225.0 w_hh = trunc(s / d);
11226.0 s = s - w_hh * d;
11228.0 d = n * 60;
11229.0 w_mm = trunc(s / d);
11230.0 s = s - w_mm * d;
|
Note: despite some heavy editing (removing spaces) some lines below will quite likely still wrap, sorry about that.
Code: |
AVERAGE_TIME
00C0 D207 D188 D160 11216 | MVC #pd32554_16(8,r13,392),#pd20033_16(r13,352)
00C6 D202 D16B 2F80 11216 | MVC #pd34285_16(3,r13,363),+CONSTANT_AREA(r2,3968)
00CC D702 D168 D168 11216 | XC #pd34285_16(3,r13,360),#pd34285_16(r13,360)
00D2 D202 D1AB 2F80 11217 | MVC #pd34287_16(3,r13,427),+CONSTANT_AREA(r2,3968)
00D8 FC52 D168 D158 11216 | MP #pd34285_16(6,r13,360),#pd10660_16(3,r13,344)
00DE D205 D168 D168 11216 | MVC #pd34285_16(6,r13,360),#pd34285_16(r13,360)
00E4 F845 D170 D168 11216 | ZAP #pd20035_16(5,r13,368),#pd34285_16(6,r13,360)
00EA D204 D170 D170 11216 | MVC #pd20035_16(5,r13,368),#pd20035_16(r13,368)
00F0 F854 D178 D170 11216 | ZAP #pd20036_16(6,r13,376),#pd20035_16(5,r13,368)
00F6 D707 D180 D180 11216 | XC #pd32554_16(8,r13,384),#pd32554_16(r13,384)
00FC D702 D199 D199 11216 | XC #pd34286_16(3,r13,409),#pd34286_16(r13,409)
0102 FDF5 D180 D178 11216 | DP #pd32554_16(16,r13,384),#pd20036_16(6,r13,376)
0108 A75A 3FC0 11216 | AHI r5,H'16320'
010C D208 D190 D181 11216 | MVC #pd20039_16(9,r13,400),#pd32554_16(r13,385)
0112 D205 D19C D190 11216 | MVC #pd34286_16(6,r13,412),#pd20039_16(r13,400)
0118 D702 D1A8 D1A8 11217 | XC #pd34287_16(3,r13,424),#pd34287_16(r13,424)
011E D100 D1A1 D198 11216 | MVN #pd34286_16(1,r13,417),#pd20039_16(r13,408)
0124 D201 5444 D1A0 11216 | MVC W_HH(2,r5,1092),#pd34286_16(r13,416)
012A FC52 D1A8 4000 11217 | MP #pd34287_16(6,r13,424),_shadow13(3,r4,0)
0130 D205 D1A8 D1A8 11217 | MVC #pd34287_16(6,r13,424),#pd34287_16(r13,424)
0136 F845 D1B0 D1A8 11217 | ZAP #pd34251_16(5,r13,432),#pd34287_16(6,r13,424)
013C F854 D1B8 D1B0 11217 | ZAP #pd34252_16(6,r13,440),#pd34251_16(5,r13,432)
0142 D205 D1C2 D1B8 11217 | MVC #pd20044_16(6,r13,450),#pd34252_16(r13,440)
0148 D701 D1C0 D1C0 11217 | XC #pd20044_16(2,r13,448),#pd20044_16(r13,448)
014E FC71 D1C0 5444 11217 | MP #pd20044_16(8,r13,448),W_HH(2,r5,1092)
0154 D207 D1C8 D1C0 11217 | MVC #pd20045_16(8,r13,456),#pd20044_16(r13,448)
015A D207 D1DF E000 11217 | MVC #pd20047_16(8,r13,479),_shadow11(r14,0)
0160 91F0 D1C8 11217 | TM #pd20045_16(r13,456),240
0164 A784 0008 11217 | JE @16L10461
0168 F070 D1C8 0001 11217 | SRP #pd20045_16(8,r13,456),1,0
016E F070 D1C8 0FFF 11217 | SRP #pd20045_16(8,r13,456),-1,0
0174 11217 | @16L10461DS 0H
0174 D207 D1D0 D1C8 11217 | MVC #pd20046_16(8,r13,464),#pd20045_16(r13,456)
017A D100 D1DA D1CF 11217 | MVN #pd20046_16(1,r13,474),#pd20045_16(r13,463)
0180 D40A D1D0 2F84 11217 | NC #pd20046_16(11,r13,464),+CONSTANT_AREA(r2,3972)
0186 D702 D1DC D1DC 11217 | XC #pd20047_16(3,r13,476),#pd20047_16(r13,476)
018C FBAA D1DC D1D0 11217 | SP #pd20047_16(11,r13,476),#pd20046_16(11,r13,464)
0192 F87A D1E8 D1DC 11217 | ZAP #pd10676_16(8,r13,488),#pd20047_16(11,r13,476)
0198 D207 E000 D1E8 11217 | MVC _shadow11(8,r14,0),#pd10676_16(r13,488)
019E D202 D1F2 4000 11218 | MVC #pd34288_16(3,r13,498),_shadow13(r4,0)
01A4 D701 D1F0 D1F0 11218 | XC #pd34288_16(2,r13,496),#pd34288_16(r13,496)
01AA D207 D188 E000 11218 | MVC #pd32554_16(8,r13,392),_shadow11(r14,0)
01B0 FC41 D1F0 2E40 11218 | MP #pd34288_16(5,r13,496),+CONSTANT_AREA(2,r2,3648)
01B6 D204 D1F0 D1F0 11218 | MVC #pd34288_16(5,r13,496),#pd34288_16(r13,496)
01BC F834 D1F8 D1F0 11218 | ZAP #pd20049_16(4,r13,504),#pd34288_16(5,r13,496)
01C2 D203 D1F8 D1F8 11218 | MVC #pd20049_16(4,r13,504),#pd20049_16(r13,504)
01C8 F843 D200 D1F8 11218 | ZAP #pd20050_16(5,r13,512),#pd20049_16(4,r13,504)
01CE D707 D180 D180 11218 | XC #pd32554_16(8,r13,384),#pd32554_16(r13,384)
01D4 D702 D20E D20E 11218 | XC #pd34289_16(3,r13,526),#pd34289_16(r13,526)
01DA FDF4 D180 D200 11218 | DP #pd32554_16(16,r13,384),#pd20050_16(5,r13,512)
01E0 D208 D205 D182 11218 | MVC #pd20053_16(9,r13,517),#pd32554_16(r13,386)
01E6 D205 D211 D205 11218 | MVC #pd34289_16(6,r13,529),#pd20053_16(r13,517)
01EC D100 D216 D20D 11218 | MVN #pd34289_16(1,r13,534),#pd20053_16(r13,525)
01F2 D201 5447 D215 11218 | MVC W_MM(2,r5,1095),#pd34289_16(r13,533)
01F8 D202 D21A 4000 11219 | MVC #pd34290_16(3,r13,538),_shadow13(r4,0)
01FE D701 D218 D218 11219 | XC #pd34290_16(2,r13,536),#pd34290_16(r13,536)
0204 FC41 D218 2E40 11219 | MP #pd34290_16(5,r13,536),+CONSTANT_AREA(2,r2,3648)
020A D204 D218 D218 11219 | MVC #pd34290_16(5,r13,536),#pd34290_16(r13,536)
0210 F834 D220 D218 11219 | ZAP #pd34257_16(4,r13,544),#pd34290_16(5,r13,536)
0216 F843 D228 D220 11219 | ZAP #pd34258_16(5,r13,552),#pd34257_16(4,r13,544)
021C D204 D232 D228 11219 | MVC #pd20058_16(5,r13,562),#pd34258_16(r13,552)
0222 D701 D230 D230 11219 | XC #pd20058_16(2,r13,560),#pd20058_16(r13,560)
0228 FC61 D230 5447 11219 | MP #pd20058_16(7,r13,560),W_MM(2,r5,1095)
022E D207 D24C E000 11219 | MVC #pd20061_16(8,r13,588),_shadow11(r14,0)
0234 D206 D238 D230 11219 | MVC #pd20059_16(7,r13,568),#pd20058_16(r13,560)
023A 91F0 D238 11219 | TM #pd20059_16(r13,568),240
023E A784 0008 11219 | JE @16L10462
0242 F060 D238 0001 11219 | SRP #pd20059_16(7,r13,568),1,0
0248 F060 D238 0FFF 11219 | SRP #pd20059_16(7,r13,568),-1,0
024E 11219 | @16L10462DS 0H
024E D206 D240 D238 11219 | MVC #pd20060_16(7,r13,576),#pd20059_16(r13,568)
0254 D100 D249 D23E 11219 | MVN #pd20060_16(1,r13,585),#pd20059_16(r13,574)
025A D409 D240 2F90 11219 | NC #pd20060_16(10,r13,576),+CONSTANT_AREA(r2,3984)
0260 D701 D24A D24A 11219 | XC #pd20061_16(2,r13,586),#pd20061_16(r13,586)
0266 FB99 D24A D240 11219 | SP #pd20061_16(10,r13,586),#pd20060_16(10,r13,576)
026C F879 D258 D24A 11219 | ZAP #pd10691_16(8,r13,600),#pd20061_16(10,r13,586)
0272 D207 E000 D258 11219 | MVC _shadow11(8,r14,0),#pd10691_16(r13,600)
0348 D207 D188 D0D4 11225 | MVC #pd32554_16(8,r13,392),S(r13,212)
034E D202 D293 2F80 11224 | MVC #pd34293_16(3,r13,659),+CONSTANT_AREA(r2,3968)
0354 D702 D290 D290 11224 | XC #pd34293_16(3,r13,656),#pd34293_16(r13,656)
035A D207 D2C7 D0D4 11226 | MVC #pd20080_16(8,r13,711),S(r13,212)
0360 FC52 D290 4000 11224 | MP #pd34293_16(6,r13,656),_shadow13(3,r4,0)
0366 D205 D290 D290 11224 | MVC #pd34293_16(6,r13,656),#pd34293_16(r13,656)
036C F845 D298 D290 11224 | ZAP #pd34262_16(5,r13,664),#pd34293_16(6,r13,656)
0372 D204 D298 D298 11224 | MVC #pd34262_16(5,r13,664),#pd34262_16(r13,664)
0378 D204 D0C8 D298 11224 | MVC D(5,r13,200),#pd34262_16(r13,664)
037E D707 D180 D180 11225 | XC #pd32554_16(8,r13,384),#pd32554_16(r13,384)
0384 D702 D2A6 D2A6 11225 | XC #pd34294_16(3,r13,678),#pd34294_16(r13,678)
038A FDF4 D180 D0C8 11225 | DP #pd32554_16(16,r13,384),D(5,r13,200)
0390 D701 D2B0 D2B0 11226 | XC #pd20077_16(2,r13,688),#pd20077_16(r13,688)
0396 D208 D29D D182 11225 | MVC #pd20073_16(9,r13,669),#pd32554_16(r13,386)
039C D204 D2B2 D0C8 11226 | MVC #pd20077_16(5,r13,690),D(r13,200)
03A2 D205 D2A9 D29D 11225 | MVC #pd34294_16(6,r13,681),#pd20073_16(r13,669)
03A8 D100 D2AE D2A5 11225 | MVN #pd34294_16(1,r13,686),#pd20073_16(r13,677)
03AE D201 5444 D2AD 11225 | MVC W_HH(2,r5,1092),#pd34294_16(r13,685)
03B4 FC61 D2B0 5444 11226 | MP #pd20077_16(7,r13,688),W_HH(2,r5,1092)
03BA D206 D2B8 D2B0 11226 | MVC #pd20079_16(7,r13,696),#pd20077_16(r13,688)
03C0 D202 D2DA 4000 11228 | MVC #pd34295_16(3,r13,730),_shadow13(r4,0)
03C6 D100 D2C1 D2B6 11226 | MVN #pd20079_16(1,r13,705),#pd20077_16(r13,694)
03CC D409 D2B8 2F90 11226 | NC #pd20079_16(10,r13,696),+CONSTANT_AREA(r2,3984)
03D2 D702 D2C4 D2C4 11226 | XC #pd20080_16(3,r13,708),#pd20080_16(r13,708)
03D8 FBA9 D2C4 D2B8 11226 | SP #pd20080_16(11,r13,708),#pd20079_16(10,r13,696)
03DE F87A D2D0 D2C4 11226 | ZAP #pd10718_16(8,r13,720),#pd20080_16(11,r13,708)
03E4 D207 D0D4 D2D0 11226 | MVC S(8,r13,212),#pd10718_16(r13,720)
03EA D701 D2D8 D2D8 11228 | XC #pd34295_16(2,r13,728),#pd34295_16(r13,728)
03F0 D207 D188 D0D4 11229 | MVC #pd32554_16(8,r13,392),S(r13,212)
03F6 FC41 D2D8 2E40 11228 | MP #pd34295_16(5,r13,728),+CONSTANT_AREA(2,r2,3648)
03FC D204 D2D8 D2D8 11228 | MVC #pd34295_16(5,r13,728),#pd34295_16(r13,728)
0402 F834 D2E0 D2D8 11228 | ZAP #pd34265_16(4,r13,736),#pd34295_16(5,r13,728)
0408 D203 D2E0 D2E0 11228 | MVC #pd34265_16(4,r13,736),#pd34265_16(r13,736)
040E F843 D2E8 D2E0 11228 | ZAP #pd34266_16(5,r13,744),#pd34265_16(4,r13,736)
0414 D204 D0C8 D2E8 11228 | MVC D(5,r13,200),#pd34266_16(r13,744)
041A D707 D180 D180 11229 | XC #pd32554_16(8,r13,384),#pd32554_16(r13,384)
0424 D207 D317 D0D4 11230 | MVC #pd20085_16(8,r13,791),S(r13,212)
042A FDF4 D180 D0C8 11229 | DP #pd32554_16(16,r13,384),D(5,r13,200)
0430 D702 D2F6 D2F6 11229 | XC #pd34296_16(3,r13,758),#pd34296_16(r13,758)
0436 D701 D300 D300 11230 | XC #pd20082_16(2,r13,768),#pd20082_16(r13,768)
043C D208 D2ED D182 11229 | MVC #pd34271_16(9,r13,749),#pd32554_16(r13,386)
0442 D204 D302 D0C8 11230 | MVC #pd20082_16(5,r13,770),D(r13,200)
0448 D205 D2F9 D2ED 11229 | MVC #pd34296_16(6,r13,761),#pd34271_16(r13,749)
044E D100 D2FE D2F5 11229 | MVN #pd34296_16(1,r13,766),#pd34271_16(r13,757)
0454 D201 5447 D2FD 11229 | MVC W_MM(2,r5,1095),#pd34296_16(r13,765)
045A FC61 D300 5447 11230 | MP #pd20082_16(7,r13,768),W_MM(2,r5,1095)
0460 D206 D308 D300 11230 | MVC #pd20084_16(7,r13,776),#pd20082_16(r13,768)
0466 D100 D311 D306 11230 | MVN #pd20084_16(1,r13,785),#pd20082_16(r13,774)
046C D409 D308 2F90 11230 | NC #pd20084_16(10,r13,776),+CONSTANT_AREA(r2,3984)
0472 D702 D314 D314 11230 | XC #pd20085_16(3,r13,788),#pd20085_16(r13,788)
0478 FBA9 D314 D308 11230 | SP #pd20085_16(11,r13,788),#pd20084_16(10,r13,776)
047E F87A D320 D314 11230 | ZAP #pd10723_16(8,r13,800),#pd20085_16(11,r13,788)
0484 D207 D0D4 D320 11230 | MVC S(8,r13,212),#pd10723_16(r13,800)
|
The first thing that's noticeable is that the code has become totally unreadable to those with only a little knowledge of z/OS assembly language: everything is #pdnnnnn and variable names are mostly gone. And, hey, it looks like it's a bit longer, so let's count:
Code: |
ZAP : 15
MP : 10
MVC : 52
DP : 4
MVN : 8
AHI : 1
XC : 22
TM : 2
SP : 4
NC : 4
SRP : 4
JE : 2
---
128 instructions, 758 bytes of code
|
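The byte totals in both tallies can be cross-checked from the instruction counts alone, because on z/Architecture the two leftmost bits of the first opcode byte give the instruction length (00 is 2 bytes, 01 or 10 is 4 bytes, 11 is 6 bytes). A small sketch in C, with the counts copied from the two tallies above and the opcodes taken from the first byte shown in the listings; the names `tally_t`, `insn_length`, `total_bytes`, `os_v23` and `epli_v43` are invented for this illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Instruction length from the two leftmost bits of the first opcode byte. */
unsigned insn_length(unsigned char opcode)
{
    switch (opcode >> 6) {
    case 0:  return 2;   /* 00 */
    case 3:  return 6;   /* 11 */
    default: return 4;   /* 01 and 10 */
    }
}

typedef struct { unsigned char opcode; unsigned count; } tally_t;

/* Sum of count * length over one tally. */
unsigned total_bytes(const tally_t *t, size_t n)
{
    unsigned bytes = 0;
    size_t i;
    assert(t != NULL);
    for (i = 0; i < n; i++)
        bytes += t[i].count * insn_length(t[i].opcode);
    return bytes;
}

/* OS PL/I V2.3.0 tally. */
const tally_t os_v23[] = {
    {0x58,  3}, /* L   */ {0xF8, 18}, /* ZAP */ {0xFC, 10}, /* MP  */
    {0xD2, 23}, /* MVC */ {0xFD,  4}, /* DP  */ {0xD1,  8}, /* MVN */
    {0x94,  4}, /* NI  */ {0xD7,  4}, /* XC  */ {0x97,  4}, /* XI  */
    {0xFB,  4}, /* SP  */
};

/* Enterprise PL/I V4.3.0 tally. */
const tally_t epli_v43[] = {
    {0xF8, 15}, /* ZAP */ {0xFC, 10}, /* MP  */ {0xD2, 52}, /* MVC */
    {0xFD,  4}, /* DP  */ {0xD1,  8}, /* MVN */ {0xA7,  1}, /* AHI */
    {0xD7, 22}, /* XC  */ {0x91,  2}, /* TM  */ {0xFB,  4}, /* SP  */
    {0xD4,  4}, /* NC  */ {0xF0,  4}, /* SRP */ {0xA7,  2}, /* JE  */
};
```

Calling `total_bytes` on the two tables gives 470 and 758, matching the figures above: 82 instructions against 128, with the newer compiler needing roughly 60% more code for the same ten source statements.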
And the count, or rather the act of counting, immediately raises five very, very significant questions:
- What is the performance of the old code on a new z13 system compared to the code that is now emitted?
- There are still 10 MP instructions, and inspection of the code, which was compiled ARCH(10) OPT(3) (in other words, as optimal as possible on the hardware I have access to), reveals that Enterprise PL/I AD 2014 still doesn't seem to know anything about common sub-expression elimination.
- Why the abso-eff-ing hell (sorry if these words cause some offence) are there two JE instructions in this code, as any conditional jump has the ability to cause significant stalls by breaking the pipeline? We are TRUNCATING, not rounding!
- Why the flucking 'ell does the added multiply of "n * 3600" take only three instructions using the OS compiler, but no fewer than 7 (seven, SEVEN, S*E*V*E*N!) using Enterprise PL/I? And the "n * 60" multiply is even worse, two versus eight!
- Any compiler worth the adjective "Optimizing" knows about the underlying hardware, and this is what the POP tells me about the DP instruction:
Principles of Operation - IBM wrote: |
The first operand (the dividend) is divided by the
second operand (the divisor). The resulting quotient
***and*** remainder are placed at the first-operand
location. The operands and results are in the
packed format. (Emphasis added)
|
So given that, and the fact that what I'm doing here is calculating both a quotient and a remainder, the compiler should be able to generate rather more efficient code. Maybe someone can code the same in C (on z/OS, and with GCC or MSVC) and see how that comes out. Or maybe the maintainers of PL/I should give some serious thought to a QUOTREM() builtin function...
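As a starting point for anyone who wants to try the C comparison, here is a minimal sketch. It is not a translation of the real routine: `hms_t` and `average_time` are illustrative names, and the FIXED DEC (15,6) input is assumed to be held as microseconds in a `long long`. The one standard piece is `lldiv` from `<stdlib.h>` (C99), which returns quotient and remainder from a single call, so no multiply-and-subtract is needed to get the remainder back.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type: whole hours, whole minutes, and the rest
   of the input in microseconds. */
typedef struct { long long hh, mm, rest_us; } hms_t;

/* seconds_us: time in microseconds (assumed stand-in for FIXED DEC (15,6)).
   n: the divisor, as in the PL/I routine. */
hms_t average_time(long long seconds_us, long long n)
{
    hms_t r;
    lldiv_t qr;

    assert(n > 0);

    /* One division yields the truncated hour count and the remainder. */
    qr = lldiv(seconds_us, n * 3600LL * 1000000LL);
    r.hh = qr.quot;

    /* The remainder feeds straight into the minutes division. */
    qr = lldiv(qr.rem, n * 60LL * 1000000LL);
    r.mm = qr.quot;
    r.rest_us = qr.rem;
    return r;
}
```

For example, `average_time(7384500000LL, 1)` gives 2 hours, 3 minutes and 4500000 microseconds. Whether a given compiler turns each `lldiv` call into a single hardware divide is exactly the kind of thing one would have to check in the generated code.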
Hell, there are probably still posts on Google in comp.lang.pascal.borland dating back to the mid-1980s where people were told that retrieving DX after an "A DIV B" operation would give the remainder in just three bytes of code, without having to do a multiply, as the x86 DIV instruction (and probably/possibly many others) exhibits exactly the same behaviour, in that both a quotient *and* a remainder are calculated at the same time.
I've been out of a job for many years, and don't expect to ever work again. The PL/I market, even here in Europe, is small, and being in the second half of my fifties doesn't really help either, but some of you will be working at sites where PL/I is in use. Just do yourself and your employer/client a favour: have a look at what kind of code this compiler and its newer siblings generate, and then ask IBM why you are paying a large amount of money for a compiler that generates code that may well significantly negate the increased speed your latest shiny z13 system is supposed to deliver.
It all brings back memories of The Bloatware Debate ...
Merry Christmas,
Robert
PS: I promise that in 2017, 25 years after I made it public (at least within Wills, Faber and Dumas), I'll post "RAP00100" and all of its companions, panels, skeletons, PL/I code and what have you @ Formatted Browse, and I hope that some of you will be able to help me to make it capable of handling some of the more esoteric features of PL/I structures. *You* might even add a RAP00120 exec to handle COBOL, or a RAP00130 for C...