SUM FIELDS=NONE with INCLUDE

kranthikumarb · Posted: Mon Dec 28, 2015 12:18 pm

Hello Experts,

Would you please suggest on how this can be done. I have the below input file: Position 1 to 5 is acc number and 6 to 7 is tran code. I need to get only one record per account number. So I used sum fields=none. But requirement changed: I need to pick records selectively - if an account tran code is 01 or 02 or 51 pick any one of those records. If an account doesn't have any of these 3, then pick any record. I.e the prioritizing while removing duplicates. I know include condition is executed before sum fiels=none. So I am not getting the correct results. Please suggest how to get this.

kranthikumarb · Posted: Mon Dec 28, 2015 1:22 pm

Please note I have a way to do it in 3 steps. But wanted to know if there is a way to do it in 1 or 2 steps.

Pandora-Box · Posted: Mon Dec 28, 2015 1:45 pm

It is better to have a two step solution than a single step.

kranthikumarb · Posted: Mon Dec 28, 2015 1:48 pm

I was thinking to split the input file into 2
1. File 1 with 01, 02, 51 records, with sum fields=none
2. File 2 with sum fields=none.

then overlay record from file 1 if it is found in file 2, if not leave file 2 record as is. This seems little round about way. So, I was thinking if there is any other easy way to do it.

Pandora-Box · Posted: Mon Dec 28, 2015 2:04 pm

Try this

Step 1
1.Overlay when records has 01,02,51 with a value say '01' position 20
2.Sort the file based on Account asc,Position 20 desc and tran code asc

Step 2
1.Elimnate duplicates by sorting based on account

Bill Woodger · Posted: Mon Dec 28, 2015 2:57 pm

boyti ko · New User Joined: 03 Nov 2014 Posts: 78 Location: Malaysia

Try this one. I have tested it only for the input you have given.
You may enhance it to give much better approach or if I have missed something.

Pandora-Box · Posted: Mon Dec 28, 2015 3:01 pm

Bill,

I used better in this context as to say prefer to have 2 step than having 1 complicated step.

And Yes Boyti provided one step solution

Bill Woodger · Posted: Mon Dec 28, 2015 3:21 pm

kranthikumarb,

You show your data to be already in sequence.

So any solution involving SORT is absurd: wasteful of resources. Unless absolutely necessary :-)

For information, you can use MERGE for a single file if you want to insist on using SUM, but OUTFIL reporting functions REMOVECC,NODETAIL and SECTIONS with TRAILER3 will do what you wanted previously.

Can you confirm that the file is in order by the data?

Nic Clouston · Posted: Mon Dec 28, 2015 3:53 pm

Kranthikumarb,
In your first post you have a section "My JCL". In this section you have absolutely no JCL at all. What you do have is sort control cards. Please use the correct terminology. JCL is defined in the JCL language reference manual, sort control cards in the sort manual.

Rohit Umarjikar · Posted: Mon Dec 28, 2015 10:48 pm

I am sure Bill wouldn't advise this card but you can take a look for performance, I see EXCP is lesser for ICETOOL compared to Boyti ko card.

kranthikumarb · Posted: Tue Dec 29, 2015 9:57 am

Thanks everyone for ideas.

@Boyti ko, I am sorry. I should've specified more clearly that all the accounts may not have only 5 transactions. So, restart (1,5) may not be used. After looking at your solution, I am thinking, If I can establish groups and attach seq numbers based on groups and then filter only top row of the group just as you did. But I used begin and end group to create groups. Can we form/create a group when ever account number at position 1,5 is changed? I mean how to make all transations related to one account a group.

On the other hand, I was able to do it with ICETOOL in 1 step. Here it is.

kranthikumarb · Posted: Tue Dec 29, 2015 10:08 am

@ Nic Clouston - Sure I will use proper wording going forward. I wanted to copy paste entire JCL initially, later I thought sort cards alone would do. That was where it went wrong :-)

But what you said is right. I should've used correct wording.

@ Bill, Data is in sorted order in the file. But, some times we may have to pick 1st record in the group, some times 2nd record. Sometimes nth record. We need to sort the data based on the priority of tran codes. If tran codes 01,02, 51 exist, pick anyone of them, if they are not present, pick anyone of the transaction. Also, as I mentioned in my previous reply, an account doesn't always have only 5 transactions. I am very interested to know how this can be achieved using the below.

kranthikumarb · Posted: Tue Dec 29, 2015 11:19 am

@ boyti ko. You are correct. I have a wrong understanding on restrat=(1,5). It worked perfectly

Rohit Umarjikar · Posted: Tue Dec 29, 2015 11:27 am

why do you not read manuals or other posts on sections and think of logic by yourself? why would you expect everything coded?if you're interested to learn then step one is to do research which I don't see from you.

kranthikumarb · Posted: Tue Dec 29, 2015 1:33 pm

@ Rohit, I didn't ask for logic. I have a logic to achieve it. But I was looking for any other ways to do the same ( to reduce the coding from 2 steps to 1 step). Please see the entire thread.

Rohit Umarjikar · Posted: Tue Dec 29, 2015 8:52 pm

I have seen the thread; Since I am a regular member "Sections" in DFSORT has been recently discussed, I will let you find out. The point if you are interested then why you want Bill or others to provide everything ready. You got to search on Google , read manuals or forum and find everything about "Sections", think and wrap your logic.

Bill Woodger · Posted: Fri Jan 01, 2016 8:51 am

You go away for a few days...

Design your solution first. This means, "how do I turn this input into this output". Once you've worked that out, you're 90% of the way there.

Case 1. When there are a mix of types, take any of 01, 02, 51.

Case 2. When there is only 01, 02, 51, take any of 01, 02, 51.

Case 3. Take any of the records (there are no 01, 02, 51 in this).

Because for all cases any of the relevant records can be taken, it seems sensible to decide that one particular record should be taken. Can't take the first, because you won't know in the case of the mixed. So take the last.

Taking the last record is easy, with SECTIONS and TRAILER3 with REMOVECC and NODETAIL in OUTFIL.

As long as you can identify the single group of record-types, that's fine. What about the "mixed" record-types? Only 01, 02, 51 are needed from them, so get rid of the rest with OUTFIL INCLUDE=/OMIT=.

INCLUDE for Case 1 and Case 2, all 01, 02, 51. For Case 3 all records.

Now you just need to arrange the input so that the INCLUDE will work, and you're home free.

A simple JOINKEYS, using the same dataset for both inputs. In JNF2CNTL INCLUDE just 01, 02, 51.

The key for the join is the key for the file. SORTED,NOSEQCK of course (data - assumed, not confirmed by TS/OP - in correct order).

JOIN UNPAIRED,F1

There will be up to three records on F2 for a key, so multiple REFORMAT records may be created. This does not particularly matter, as only the last will be output. Include the match-marker in the REFORMAT record.

For Case 1 and Case 2, include B (match marker) and type 01, 02, 51. For case three, include 1 (match marker).

I'm not sure what would be so complex about such a solution.

A single-step non-JOINKEYS WHEN=GROUP solution is also possible. Is it too difficult to understand? If you never try, then yes. Otherwise, no.

And yes, you can make variations to the selection.

What should always be avoided, where possible, if the data is already in order, is to re-sort. Which mostly precludes a SORT with SUM FIELDS=NONE. MERGE is available, as is SECTIONS with either HEADER3 (first record available) or TRAILER3 (last record available). There's ICETOOL's SELECT operator.

Here's a rule of thumb. If the SORT solution requires more than two passes of the data, except on rare occasions when three passes are required, then it is not a good SORT solution. If the only possible solution requires three (mostly) or more passes of the data, then write a program, including as a first consideration (if to be included in an existing SORT, then for sure) write a SORT EXIT.

Please don't post SORT "solutions" with three or more passes of the data. Those are not solutions, they are... thoughtless. They will be deleted from here.

Please don't post SORT "solutions" which SORT data which is already in order. They will be deleted from here.

Now, I think at least a couple of you could come up with a one-step WHEN=GROUP solution. Want to try? It may take some time, but breaking through on how to think about SORT solutions is well worth it.

If you want to try, come up first with some really good test-data. Because there is no reliable END to a GROUP, watch for overflow from one key to the next. Think of sequence numbers, RESTART, SEQ, multiple conditions and PUSH to multiple locations.