IBM Mainframe Forum Index
 
How to extract the first 2 group occurrences from a list


IBM Mainframe Forums -> DFSORT/ICETOOL
pcsingh_2000

New User


Joined: 26 May 2004
Posts: 6
Location: Kolkata

Posted: Thu Nov 12, 2009 9:53 pm

Hi,

I am trying to produce the following output from an input file.

Input file: fixed-length records, record length 213.
The program name is at position 8.
The timestamp is at position 177.

I have sorted the file by program name and timestamp.

If I consider program name and timestamp together as a unique group, I want the first two occurrences of each group in the output file. If a group occurs only once, that group also needs to be in the output file.

Input file:
==========
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA015B.................2003-03-06-14.44.19.347273.....
.....ABA015B.................2003-03-06-14.44.19.347273.....
.....ABA015B.................2007-09-07-19.53.11.345000.....
.....ABA015B.................2007-09-07-19.53.11.345000.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....

Output file:
========
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA015B.................2003-03-06-14.44.19.347273.....
.....ABA015B.................2003-03-06-14.44.19.347273.....
.....ABA015B.................2007-09-07-19.53.11.345000.....
.....ABA015B.................2007-09-07-19.53.11.345000.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....


Regards,
Prakash
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu Nov 12, 2009 11:14 pm

Huh? The output you show does NOT match the rules you gave. In fact, your output is identical to your input, except that you removed these records for some unknown reason:

.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA012B.................2005-08-22-16.30.08.640333.....

If you really want the first two occurrences, you can use a DFSORT/ICETOOL job like the following:

Code:

//S1   EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN DD DSN=... input file (FB/213)
//OUT DD DSN=...  output file (FB/213)
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(6,8,CH) ON(177,26,CH) FIRST(2)
/*


OUT would have:

Code:

.....ABA012B.................2004-10-26-14.52.22.743163
.....ABA012B.................2004-10-26-14.52.22.743163
.....ABA012B.................2005-07-24-15.25.08.639561
.....ABA012B.................2005-07-24-15.25.08.639561
.....ABA012B.................2005-08-22-16.30.08.640333
.....ABA012B.................2005-08-22-16.30.08.640333
.....ABA015B.................2003-03-06-14.44.19.347273
.....ABA015B.................2003-03-06-14.44.19.347273
.....ABA015B.................2007-09-07-19.53.11.345000
.....ABA015B.................2007-09-07-19.53.11.345000
.....ABA019B.................2004-10-26-14.52.24.215062
.....ABA019B.................2004-10-26-14.52.24.215062


If that's not what you want, then you need to explain clearly the rules for what you do want with a matching example of input and output.
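To make the FIRST(2) behaviour concrete, here is a minimal Python sketch of the rule "keep at most two records for each ON-field value". It is a model of the selection logic only, not DFSORT syntax. The names `select_first`, `sample`, and `records` are hypothetical, and each record is reduced to its (program name, timestamp) key.

```python
from collections import Counter

def select_first(records, n=2):
    """Sort by the full key, then keep at most n records per key value."""
    taken = Counter()
    kept = []
    for rec in sorted(records):  # stable sort on (program, timestamp)
        if taken[rec] < n:
            kept.append(rec)
            taken[rec] += 1
    return kept

# (program, timestamp, copies) summarising the sample input in the thread
sample = [
    ("ABA012B", "2004-10-26-14.52.22.743163", 3),
    ("ABA012B", "2005-07-24-15.25.08.639561", 3),
    ("ABA012B", "2005-08-22-16.30.08.640333", 3),
    ("ABA015B", "2003-03-06-14.44.19.347273", 2),
    ("ABA015B", "2007-09-07-19.53.11.345000", 2),
    ("ABA019B", "2004-10-26-14.52.24.215062", 4),
]
records = [(p, t) for p, t, copies in sample for _ in range(copies)]

result = select_first(records)
print(len(records), len(result))  # 17 input records, 12 kept
```

Every (program, timestamp) pair survives, trimmed to two records, which matches the 12-record OUT listing above.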
pcsingh_2000


Posted: Fri Nov 13, 2009 1:24 pm

Hi Frank,

Sorry for not being clear enough in my explanation.

I am thinking of a unique group as a combination of program name and timestamp. For example,
for program ABA012B there are 3 groups, for ABA015B 2 groups, and for ABA019B 1 group.
first group is
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
second group is
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
third group is
.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA012B.................2005-08-22-16.30.08.640333.....
.....ABA012B.................2005-08-22-16.30.08.640333.....
4th group is
.....ABA015B.................2003-03-06-14.44.19.347273.....
.....ABA015B.................2003-03-06-14.44.19.347273.....
5th group is
.....ABA015B.................2007-09-07-19.53.11.345000.....
.....ABA015B.................2007-09-07-19.53.11.345000.....
6th group is
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....

My input file has millions of records, comprising many groups like those above, and each group may contain 1 or more records. Each program can have any number of groups.

I am trying to extract the first 2 groups of each program.
For example, for program ABA012B, I want to extract only
first group is
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
.....ABA012B.................2004-10-26-14.52.22.743163.....
second group
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....
.....ABA012B.................2005-07-24-15.25.08.639561.....

For program ABA015B, I want to capture
4th group
.....ABA015B.................2003-03-06-14.44.19.347273.....
.....ABA015B.................2003-03-06-14.44.19.347273.....
5th group
.....ABA015B.................2007-09-07-19.53.11.345000.....
.....ABA015B.................2007-09-07-19.53.11.345000.....

For program ABA019B, I want to capture the records below, since it has only 1 group.
6th group
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....
.....ABA019B.................2004-10-26-14.52.24.215062.....

That's why I removed group 3 from my output.
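The group numbering described above can be sketched in a few lines of Python. This is only an illustration of the requirement, assuming the input is already sorted so that equal (program, timestamp) records are adjacent. The function names are hypothetical, and records are reduced to their two key fields.

```python
from itertools import groupby

def numbered_groups(records):
    """Number each run of identical (program, timestamp) records, starting at 1."""
    for number, ((program, timestamp), run) in enumerate(groupby(records), start=1):
        yield number, program, timestamp, sum(1 for _ in run)

def kept_group_numbers(records, per_program=2):
    """Return the numbers of the first `per_program` groups of each program."""
    kept = []
    by_program = groupby(numbered_groups(records), key=lambda g: g[1])
    for _program, groups in by_program:
        kept.extend(g[0] for g in list(groups)[:per_program])
    return kept

# (program, timestamp, copies) summarising the sample input in the thread
sample = [
    ("ABA012B", "2004-10-26-14.52.22.743163", 3),
    ("ABA012B", "2005-07-24-15.25.08.639561", 3),
    ("ABA012B", "2005-08-22-16.30.08.640333", 3),
    ("ABA015B", "2003-03-06-14.44.19.347273", 2),
    ("ABA015B", "2007-09-07-19.53.11.345000", 2),
    ("ABA019B", "2004-10-26-14.52.24.215062", 4),
]
records = [(p, t) for p, t, copies in sample for _ in range(copies)]

print(kept_group_numbers(records))  # [1, 2, 4, 5, 6]
```

Group 3 is the only one dropped, because it is the third group of ABA012B.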

Regards,
Prakash
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Fri Nov 13, 2009 10:18 pm

pcsingh_2000,

The following DFSORT JCL will give you the desired results:

Code:

//STEP0100 EXEC PGM=SORT                                             
//SYSOUT   DD SYSOUT=*                                               
//SORTIN   DD DSN=Your input FB 213 byte file,DISP=SHR
//SORTOUT  DD SYSOUT=*                                               
//SYSIN    DD *                                                     
  SORT FIELDS=COPY                                                   
  INREC IFTHEN=(WHEN=INIT,                                           
  OVERLAY=(214:8,8,177,26,SEQNUM,8,ZD,RESTART=(214,34),             
           SEQNUM,8,ZD,RESTART=(214,8))),                           
  IFTHEN=(WHEN=GROUP,BEGIN=(248,8,ZD,EQ,1),PUSH=(264:ID=8),HIT=NEXT),
  IFTHEN=(WHEN=GROUP,BEGIN=(256,8,ZD,EQ,1),PUSH=(272:264,8))         
                                                                     
  OUTREC BUILD=(1,213,264,8,ZD,SUB,272,8,ZD,M11,LENGTH=8)           
  OUTFIL INCLUDE=(214,8,ZD,LE,1),BUILD=(1,213)                       
//*
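One way to read the control statements above is as a single pass that carries four work fields per record. The Python sketch below models that reading. It is an interpretation of the job, not DFSORT syntax, and the variable names are hypothetical labels for the work fields at positions 248, 256, 264, and 272.

```python
def first_two_groups(records):
    """Keep the records of the first two (program, timestamp) groups per program.

    Assumes records are sorted so that equal keys are adjacent.
    """
    kept = []
    prev_program = prev_key = None
    seq_group = seq_program = 0   # like the two SEQNUM fields (248 and 256)
    group_id = base_id = 0        # like the pushed ID (264) and its copy (272)
    for program, timestamp in records:
        key = (program, timestamp)
        # SEQNUM restarts at 1 whenever its RESTART field changes
        seq_group = 1 if key != prev_key else seq_group + 1
        seq_program = 1 if program != prev_program else seq_program + 1
        if seq_group == 1:        # first WHEN=GROUP: new group gets the next ID
            group_id += 1
        if seq_program == 1:      # second WHEN=GROUP: remember the program's first ID
            base_id = group_id
        if group_id - base_id <= 1:   # difference 0 or 1 means first or second group
            kept.append((program, timestamp))
        prev_program, prev_key = program, key
    return kept

# (program, timestamp, copies) summarising the sample input in the thread
sample = [
    ("ABA012B", "2004-10-26-14.52.22.743163", 3),
    ("ABA012B", "2005-07-24-15.25.08.639561", 3),
    ("ABA012B", "2005-08-22-16.30.08.640333", 3),
    ("ABA015B", "2003-03-06-14.44.19.347273", 2),
    ("ABA015B", "2007-09-07-19.53.11.345000", 2),
    ("ABA019B", "2004-10-26-14.52.24.215062", 4),
]
records = [(p, t) for p, t, copies in sample for _ in range(copies)]

result = first_two_groups(records)
print(len(result))  # 14
```

The subtraction is what makes the test independent of how many groups came before: each program's groups are measured relative to that program's first group ID.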
pcsingh_2000


Posted: Thu Nov 19, 2009 5:32 pm

Hi,

I tried to follow your instructions by remapping the positions, as I have a different layout this time. The record length is now 279. The timestamp is at position 243 and the program name is at position 8, as before. While building the file, I sorted by program name ascending and timestamp descending. But I am still not able to get the latest 2 timestamps of each program. In fact, I am getting more occurrences of each group.

I am pasting the JCL below:
Code:

//STEP0100 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=TSG10.SPLAN.DATA,DISP=SHR
//SORTOUT  DD DSN=TSG10.SPLAN.DATA.MOD,
//            DISP=(,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(5,10))
//SYSIN    DD *
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,
  OVERLAY=(280:8,8,243,26,SEQNUM,8,ZD,RESTART=(280,34),
           SEQNUM,8,ZD,RESTART=(280,8))),
  IFTHEN=(WHEN=GROUP,BEGIN=(314,8,ZD,EQ,1),PUSH=(330:ID=8),HIT=NEXT),
  IFTHEN=(WHEN=GROUP,BEGIN=(322,8,ZD,EQ,1),PUSH=(338:330,8))
  OUTREC BUILD=(1,279,330,8,ZD,SUB,338,8,ZD,M11,LENGTH=8)
  OUTFIL INCLUDE=(280,8,ZD,LE,1),BUILD=(1,279)
/*
pcsingh_2000


Posted: Thu Nov 19, 2009 5:55 pm

Hi,

Please ignore my last message. I am getting the expected result by mapping Skolusu's instructions to my layout. The program name actually starts at position 18. As the new DFSORT functions are quite interesting, I want to download the documentation from the link given by Frank, but it leads to an FTP link which is not working for me. Is there any other way I can download the PDF?

Thanks a lot for the help!!!
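The position remapping between the 213-byte and 279-byte layouts can be checked with a small calculation. The sketch below assumes the work fields are laid out as in the jobs posted in this thread: an 8-byte program name and 26-byte timestamp copied just past the end of the record, followed by four 8-byte fields. The function and field names are hypothetical.

```python
def work_field_positions(lrecl, name_len=8, ts_len=26, field_len=8):
    """Return the 1-based start position of each appended work field."""
    key_copy = lrecl + 1                       # copy of program name + timestamp
    seq_group = key_copy + name_len + ts_len   # SEQNUM restarting per name + timestamp
    seq_program = seq_group + field_len        # SEQNUM restarting per program name
    group_id = seq_program + field_len         # ID pushed at the start of each group
    base_id = group_id + field_len             # ID of the program's first group
    return {
        "key_copy": key_copy,
        "seq_group": seq_group,
        "seq_program": seq_program,
        "group_id": group_id,
        "base_id": base_id,
    }

print(work_field_positions(213))
print(work_field_positions(279))
```

For a record length of 279 this gives 280, 314, 322, 330, and 338, the same positions used in the job posted for that layout. These positions depend only on the record length, so a program name starting at 18 instead of 8 would change only the source position of the name in the OVERLAY, not the work-field positions.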
Skolusu


Posted: Thu Nov 19, 2009 9:41 pm

pcsingh_2000,

The DFSORT FTP site is accessible to everyone. Unless you work at a shop that blocks FTP sites, you shouldn't have a problem downloading the documents.
Frank Yaeger


Posted: Thu Nov 19, 2009 11:09 pm

Prakash,

If you can't download the pdf, send me an e-mail offline (yaeger@us.ibm.com) requesting it and I'll send it to you directly.
All times are GMT + 6 Hours