IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

SORT CARD to remove duplicates & add new column with cou


IBM Mainframe Forums -> DFSORT/ICETOOL
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
mkk157

Active User


Joined: 17 May 2006
Posts: 310

PostPosted: Tue Jul 08, 2014 9:04 pm
Reply with quote

Subject :
SORT CARD to remove duplicates and add new column with record count

I have a input file with following format.

File format = VB
LRECL = 189

Out of 189 characters, I'm interested in positions 1-8, 16-8 and 27-7.

If any 2 records have same values at these positions, Output file should retain the first record and count the total number of records in a new column removing the duplicate records.

Example
Code:
<.1-8..><.9-15.><16-23 ><-><27-7 ><..34---189.....>
ABCDABCDxxxxxxxxAABBCCDDxxxAGHJKIMxxxxxxxxxxxxxxxxx
DBGHETTAxxxxxxxxBHUDUDFHxxxAGHJKIMxxxxxxxxxxxxxxxxx
ABCDABCDxxxxxxxxAABBCCDDxxxAGHJKIMxxxxxxxxxxxxxxxxx
ABCDABCDxxxxxxxxAABBCCDDxxxAGHJKIMxxxxxxxxxxxxxxxxx
ABCDABCDxxxxxxxxAABBCCDDxxxAGHJKIMxxxxxxxxxxxxxxxxx
DBGHETTAxxxxxxxxBHUDUDFHxxxAGHJKIMxxxxxxxxxxxxxxxxx
AAAAAAAAxxxxxxxxDDDDDDDDxxxCCCCCCCxxxxxxxxxxxxxxxxx
ABCDABCDxxxxxxxxAABBCCDDxxxAGHJKIMxxxxxxxxxxxxxxxxx
AAAAAAAAxxxxxxxxDDDDDDDDxxxCCCCCCCxxxxxxxxxxxxxxxxx


Output
Code:
<.1-8..><.9-15.><16-23 ><-><27-7 ><..34---189.....><190-195>
ABCDABCDxxxxxxxxAABBCCDDxxxAGHJKIMxxxxxxxxxxxxxxxxx000000005
DBGHETTAxxxxxxxxBHUDUDFHxxxAGHJKIMxxxxxxxxxxxxxxxxx000000002
AAAAAAAAxxxxxxxxDDDDDDDDxxxCCCCCCCxxxxxxxxxxxxxxxxx000000002


Please suggest me the sort card for this requirement.
Back to top
View user's profile Send private message
mkk157

Active User


Joined: 17 May 2006
Posts: 310

PostPosted: Tue Jul 08, 2014 9:41 pm
Reply with quote

Iam using below SORT CARD ,,

SORT FIELDS=(5,8,CH,A,16,8,CH,A,27,03,CH,A)
SUM FIELDS=NONE
OUTFIL REMOVECC,NODETAIL,
TRAILER1=(32,5,COUNT=(M11,LENGTH=5))
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Tue Jul 08, 2014 10:03 pm
Reply with quote

Untested, but please try

Code:
//STEP0001 EXEC PGM=SORT                                               
//SYSOUT   DD SYSOUT=*                                                 
//SORTIN   DD *                                                         
your data                                               
//SYSIN    DD *                                                         
  SORT FIELDS=COPY                                                     
  OUTFIL REMOVECC,NODETAIL,                                             
    SECTIONS=(N,M,                                                     
       TRAILER3=(N,M,COUNT=(M10,LENGTH=3)))         
/* 


N and M are the sort poistions, please cascade that as per the requirement
Back to top
View user's profile Send private message
mkk157

Active User


Joined: 17 May 2006
Posts: 310

PostPosted: Tue Jul 08, 2014 10:08 pm
Reply with quote

I have Multple Positions to be verified.....You mean like below...
SECTIONS=(5,8,16,8,28,03,
TRAILER3=(5,8,16,8,28,03,COUNT=(M10,LENGTH=3)))
Back to top
View user's profile Send private message
mkk157

Active User


Joined: 17 May 2006
Posts: 310

PostPosted: Tue Jul 08, 2014 10:20 pm
Reply with quote

Hi Rohit,

thank you for the reply,,,, It didnt worked as per the requirement....

any other sugetions...
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Tue Jul 08, 2014 10:47 pm
Reply with quote

In what way did it not work (I can see two, one you can deal with, the other not)?

Is the data already in sequence, or do you have to SORT it?
Back to top
View user's profile Send private message
mkk157

Active User


Joined: 17 May 2006
Posts: 310

PostPosted: Tue Jul 08, 2014 11:46 pm
Reply with quote

DATA need to be sorted ,,,,,,


I required to sort them based on 3 positions 5,8 and 16,8 and 27,3

when i am using the above Information ,,

count is not working properly... Duplicates are still there ....
Back to top
View user's profile Send private message
jerryte

Active User


Joined: 29 Oct 2010
Posts: 202
Location: Toronto, ON, Canada

PostPosted: Wed Jul 09, 2014 1:57 am
Reply with quote

Do you need the data that is in between each sort field? If not then use an ICETOOL SELECT statement
Code:

OCCUR FROM(FILEIN) LIST(REPORT) -
ON(5,8) HEADER('FIELD1') -
ON(20,8) HEADER('FIELD2') -
ON(31,8) HEADER('FIELD3') -
ON(VALCNT,A1) HEADER('COUNT')

This will generate a report showing each field with a record.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Jul 09, 2014 5:21 am
Reply with quote

Jerrytem you write SELECT but code OCCUR.

mkk157,

Specify OPTION EQUALS.

Use INREC to extend you record a field large enough to hold the maximum value you want to count, and give it a value of one (1).

Use SORT, for the obvious purpose.

Use SUM to sum your records with the value of 1.
Back to top
View user's profile Send private message
mkk157

Active User


Joined: 17 May 2006
Posts: 310

PostPosted: Wed Jul 09, 2014 5:36 pm
Reply with quote

Finally i Got the Out put.. Below is the card that worked,,,,,


SORT FIELDS=(5,8,CH,A,16,8,CH,A,27,03,CH,A)
OUTFIL REMOVECC,NODETAIL,
SECTIONS=(5,8,16,8,28,03,
TRAILER3=(5,36,2x,COUNT=(M11,LENGTH=3)))
_________________

Thanks every one for help and sugetions....
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Jul 09, 2014 6:05 pm
Reply with quote

Really? What about the part of the requirement to base the record-from-duplicates on the first input record?
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Wed Jul 09, 2014 9:25 pm
Reply with quote

Bill, the above card would retain one amongst the duplicate with the count, but is that what you mean here or want to ask?
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Jul 09, 2014 10:06 pm
Reply with quote

Quote:
If any 2 records have same values at these positions, Output file should retain the first record and count the total number of records in a new column removing the duplicate records.


The TRAILER3 will use the final OUTFIL record which is duplicate according to the SECTIONS description.

With NOEQUALS, this will sometimes be the first input record of that key, and sometimes not. With EQUALS, it will never be the first input record of that key.

With SUM (since the SORT is needed) and NOEQUALS it will sometimes be the first input record of that key, and sometimes not, that is the base. With SUM and EQUALS it will always be the first input record of that key that is the base.
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Thu Jul 10, 2014 12:33 am
Reply with quote

Right, So TS might also wants to add EQUALS if he is really interested in Base record. Thanks
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu Jul 10, 2014 2:07 am
Reply with quote

EQUALS, and can't use TRAILER3.
Back to top
View user's profile Send private message
Sumit Gupta

New User


Joined: 01 May 2014
Posts: 4
Location: India

PostPosted: Mon Jul 14, 2014 11:51 am
Reply with quote

I have come up with one more solution for it but this requires 2 sort cards, I have tried this for FB file but I am sure you can modify the starting positions and use the same for VB file also,

first sort card would be like
SORT FIELDS=COPY
OUTREC FIELDS=(1:1,186,187,3,C'001')

Use output file of first sort and use this sort card
SORT FIELDS=(1,8,CH,A,16,8,CH,A,27,03,CH,A)
SUM FIELDS=(190,3,ZD)
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Mon Jul 14, 2014 12:16 pm
Reply with quote

Sumit,

Please don't use FIELDS when using OUTREC (or INREC, or OUTFIL OUTREC), use BUILD, which reduces the possibility of FIELDS (or OUTFIL OUTREC) getting confused with the other uses of FIELDS (you have two more in your example). In this case, OVERLAY is clearer anyway.

If you use INREC to process before the SORT, you don't need the two steps/two passes of the file. You also need EQUALS to ensue the first record read is the first record retained by SUM when there are duplicate keys. Unless "leaving spaces" on something (like on the OVERLAY), there is no need to specify column numbers.

Code:
 OPTION EQUALS
 
 INREC OVERLAY=(190:C'001')
 
 SORT FIELDS=(1,8,CH,A,16,8,CH,A,27,03,CH,A)

 SUM FIELDS=(190,3,ZD)


It should be rare that you need to read an entire file twice with SORT, and such cases will be for very complicated processing. If you have something simple with multiple passes of the data, put more work into it, there will be a simple one-pass solution.
Back to top
View user's profile Send private message
Sumit Gupta

New User


Joined: 01 May 2014
Posts: 4
Location: India

PostPosted: Mon Jul 14, 2014 3:00 pm
Reply with quote

Sure Bill...
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DFSORT/ICETOOL

 


Similar Topics
Topic Forum Replies
No new posts Sortjoin and Search for a String and ... DFSORT/ICETOOL 1
No new posts JCL sort card - get first day and las... JCL & VSAM 9
No new posts How to load to DB2 with column level ... DB2 6
No new posts RC query -Time column CA Products 3
No new posts Sort First/last record of a subset th... DFSORT/ICETOOL 7
Search our Forums:

Back to Top