Record should be sorted as a group

sushanth bobby · Posted: Thu Aug 06, 2015 2:38 pm

Hi,

I couldn't find sorting similar to this.

I have input file as below

Bill Woodger · Posted: Thu Aug 06, 2015 3:17 pm

INREC operates before SORT. So any manipulation required to allow a specific SORT is done in INREC.

You will need to use WHEN=GROUP with BEGIN= for your C'S:' in 1,2. Then PUSH the information required for sequencing so that it appears, in the same place, on each record of the group.

Then you SORT on that new key, with OPTION EQUALS (or EQUALS on the SORT) to preserve the original order of records.

Then, with OUTREC or OUTFIL you use BUILD to return the records to their original size and content.

That's the general outline.
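The outline above can be modelled outside DFSORT to check the logic. The following is only a rough Python sketch, not DFSORT control statements: it assumes each group starts with a record beginning `S:`, and the key location (`key_start`, `key_len`) is a hypothetical choice, since the real layout depends on your data.

```python
def sort_groups(records, key_start=2, key_len=10):
    """Model of INREC WHEN=GROUP/PUSH, SORT with EQUALS, then BUILD."""
    extended = []
    current_key = " " * key_len  # records before the first header get blanks
    for rec in records:
        # BEGIN= condition: a record starting with 'S:' opens a new group
        if rec.startswith("S:"):
            # hypothetical key location on the header record
            current_key = rec[key_start:key_start + key_len].ljust(key_len)
        # PUSH: every record in the group carries the header's key
        extended.append((current_key, rec))
    # Python's sort is stable, which plays the role of EQUALS here
    extended.sort(key=lambda pair: pair[0])
    # BUILD: drop the temporary key, returning the original records
    return [rec for _, rec in extended]


records = ["S:ZEBRA", "detail z1", "detail z2", "S:APPLE", "detail a1"]
print(sort_groups(records))
```

Each group moves as a unit, and the records within a group keep their original order because the sort is stable.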

You have an issue, because it looks like the second part of your key may be variably-located.

This means you have to use IFTHEN=(WHEN=INIT,...) in INREC first to prepare the key, with PARSE to get the second element of the key.

The first element of the key is fixed-position and fixed-length, so you use OVERLAY in another WHEN=INIT to extend the record to include both the fixed-position value and the variable-position value. You're going to have to decide on a maximum length for the latter, the logical maximum of that content.

Then with the GROUP mentioned earlier, you PUSH the two elements to themselves, thus marking each record in the group with the same key.

Where to extend depends on whether the records are fixed-length or variable-length: for fixed-length, at the end of the record; for variable-length, at the beginning (like BUILD=(1,4,15X,5) to temporarily extend, BUILD=(1,4,20) to return to original size).
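To see what the two variable-length BUILDs do to a record, here is a small Python model. It is a sketch under assumptions: the record is a byte string with a 4-byte RDW (2-byte big-endian length, then 2 bytes of zeros), the function names are invented for illustration, and blanks are ASCII rather than EBCDIC for readability.

```python
import struct


def extend_vb(record: bytes, pad: int = 15) -> bytes:
    """Model of BUILD=(1,4,15X,5): RDW, 15 blanks, then data from position 5."""
    data = record[4:]  # position 5 onward (1-based) is index 4 onward
    # the length in the RDW grows to cover the inserted blanks
    return struct.pack(">HH", 4 + pad + len(data), 0) + b" " * pad + data


def restore_vb(record: bytes, pad: int = 15) -> bytes:
    """Model of BUILD=(1,4,20): RDW, then data from position 20 onward."""
    data = record[4 + pad:]  # position 20 (1-based) is index 19
    return struct.pack(">HH", 4 + len(data), 0) + data


original = struct.pack(">HH", 4 + 5, 0) + b"HELLO"
extended = extend_vb(original)
assert restore_vb(extended) == original
```

The 15 bytes between the RDW and the original data are the temporary work area where the pushed key would live during the sort.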

sqlcode1 · Active Member Joined: 08 Apr 2010 Posts: 577 Location: USA

Assuming 80 FB, see if below works...

Bill Woodger · Posted: Thu Aug 06, 2015 8:35 pm

With the EQUALS needed somewhere.

Otherwise fine as long as the second field always starts there. Simple enough for the variable-length version as well.

Long time no see. Nice to see you back.

sqlcode1

DFSORT ships with EQUALS as the default, and the same is the case here at my site, but otherwise I would agree, it's needed.

Just coming back

Thanks

Bill Woodger · Posted: Thu Aug 06, 2015 9:44 pm

So the document says, EQUALS is the delivered default. I never realised that, having never seen a site with EQUALS=Y as the default :-)

There's a performance penalty, additional storage requirements and a minor reduction in the maximum total length of keys.

Of course, if you have EQUALS=Y as the default, don't just change it. Changing the installation default can/will change the order of output of existing jobs (for non-unique keys only). Instead, use OPTION NOEQUALS, or NOEQUALS on the SORT/MERGE statement, for individual steps that do not need it.

sushanth bobby · Posted: Thu Aug 06, 2015 10:36 pm

Thank you very much, Bill, for taking the time to explain step by step which statements I should code, and in what order. I am pretty much a beginner with SORT tools. When I finished the below part,

Bill Woodger · Posted: Mon Aug 10, 2015 9:17 pm

I've been doing some poking about.

Firstly, I misread the Customisation and Installation Guide. Penalty for looking positively, rather than negatively.

Secondly, here is what happens.

The IBM-supplied default value for the EQUALS installation option is VBLKSET.

This means that operations which use the BLOCKSET technique (recommended to use where possible) would use EQUALS=Y for variable-length records.

Operations which enforce the use of EQUALS (JOINKEYS, ICETOOL operators) do so irrespective of installation default.

Assuming that the options are listed per run, look at the SYSOUT for SORT steps. If it says EQUALS=Y, confirm that this is required, and consider the use of NOEQUALS for the step if EQUALS is not required.

What does EQUALS do?

It makes a new low-order key-field for SORT/MERGE, a sequence number. This requires more storage for the key. The sequence number is four bytes long. So an existing four-byte key will be doubled in size. CPU time is increased also, to process the new part of the key.
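The effect of that extra low-order key can be illustrated with a short Python sketch. This models the idea only and is not how DFSORT is implemented: a heap-based sort, which on its own does not preserve input order for equal keys, is made to do so by appending the input sequence number to each key. The function name is invented for illustration.

```python
import heapq


def sort_with_sequence(records, key):
    """Sort by key, breaking ties with an input sequence number."""
    # (key, seq) is unique per record, so equal keys fall back to input order
    decorated = [(key(rec), seq, rec) for seq, rec in enumerate(records)]
    heapq.heapify(decorated)
    return [heapq.heappop(decorated)[2] for _ in range(len(decorated))]


records = ["B2", "A1", "B1", "A2"]
# the key is the first character only, so there are duplicate keys
print(sort_with_sequence(records, key=lambda rec: rec[0]))
# prints ['A1', 'A2', 'B2', 'B1']
```

B2 stays ahead of B1 because it came first in the input, which is what EQUALS guarantees. The cost is visible too: every comparison now works on a longer key.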

When is EQUALS useful?

When you can have duplicate keys. No other time. If keys are unique, then using EQUALS is wasteful. If keys are not unique, but the order of the output (including SUMming) is not relevant, then EQUALS is wasteful. If keys are not unique, and the order of the input needs to be preserved through to the output, then use EQUALS.

In technical terms, EQUALS produces a "stable sort" when there are duplicate keys. With unique keys, a sort is always stable.

Why do JOINKEYS and various ICETOOL operators force the use of EQUALS?

To ensure they work as described.