How to split a file in 2 with hierarchy


IBM Mainframe Forums -> SYNCSORT
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Tue Jan 14, 2020 9:15 pm

Hi all. I have a file with an unknown number of records (my test file has 254,950,232 records). I would like to split this file into 2 parts, but it has a hierarchy that must be respected, so I cannot use SPLIT1R to do it.
This is my input file (VB2004):
Code:

----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
010...................00                                                     012
103...000311494998300005000000710.600000669777MT056000860001713............0101.
110...000311494998300005000000710.600000669777MT056000860001713............0201.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
109...000043274993800005000000710.600000712525MT056000860201542............0601.


The first record (the one that begins with "010") must be written to both files, which is simple to do with an INCLUDE condition.
The subsequent records have a key in columns 25 to 63 (39 bytes). The hierarchy is defined in columns 76-77, so to split the file in 2 I have to write all records with the same key (25-63), with hierarchy from 01 to 06, to the same file and in the same order as they are read.
The wanted files look like these:

Code:

----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
010...................00                                                     012
103...000311494998300005000000710.600000669777MT056000860001713............0101.
110...000311494998300005000000710.600000669777MT056000860001713............0201.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.

Code:

----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
010...................00                                                     012
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
109...000043274993800005000000710.600000712525MT056000860201542............0601.


I've tried to use SEQNUM to find which records have the same key, with a sort like this:
Code:
INREC IFTHEN=(WHEN=INIT,OVERLAY=(2005:SEQNUM,10,ZD,RESTART=(29,39))),
      IFTHEN=(WHEN=GROUP,BEGIN=(2005,10,ZD,EQ,1),PUSH=(2005:ID=9))   
SORT FIELDS=COPY                                                     

But now I don't know how to continue. I've tried to code this:
Code:

OUTFIL FNAMES=SORTOU1,                     
      INCLUDE=(5,3,CH,EQ,C'010',OR,2005,10,CH,LE,seqnum,DIV,+2)
OUTFIL FNAMES=SORTOU2,                     
      INCLUDE=(5,3,CH,EQ,C'010',OR,2005,10,CH,GT,seqnum,DIV,+2) 

but SEQNUM cannot be used any more after INREC.

I'm using SYNCSORT FOR Z/OS 2.1.7.0N. Do you have any ideas?
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Tue Jan 14, 2020 10:18 pm

Obviously the output records have to be reformatted with a BUILD=(1,2004) to delete the seqnum, but this is not a problem for now.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2023
Location: USA

Posted: Tue Jan 14, 2020 11:14 pm

Dr_Halo wrote:
I'm using SYNCSORT FOR Z/OS 2.1.7.0N.

It doesn't matter.

The problem is, it's very difficult to understand the idea of your logic...

1) When does a new "hierarchy" begin? Starting from each '01' value? Or just after each '06' value? Or whenever the next "hierarchy" is less than the previous one? Or something else?

2) What if there are more than 2 groups of "hierarchy"? Do we need to split into more files? Or return to the first file? Or continue to the second file? Or ignore all subsequent "hierarchies"? Or something else?
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Wed Jan 15, 2020 12:21 am

sergeyken wrote:

1) When does a new "hierarchy" begin? Starting from each '01' value?
Yes!
sergeyken wrote:
2) What if there are more than 2 groups of "hierarchy"? Do we need to split into more files? Or return to the first file? Or continue to the second file? Or ignore all subsequent "hierarchies"? Or something else?
My goal is to divide the input file in 2 so that the parts can be processed by 2 parallel jobs. The output files have to maintain the original format.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2023
Location: USA

Posted: Wed Jan 15, 2020 12:33 am

Dr_Halo wrote:
sergeyken wrote:
2) What if there are more than 2 groups of "hierarchy"? Do we need to split into more files? Or return to the first file? Or continue to the second file? Or ignore all subsequent "hierarchies"? Or something else?
My goal is to divide the input file in 2 so that the parts can be processed by 2 parallel jobs. The output files have to maintain the original format.

What if the input contains these "hierarchies":
01
02
03
04
05
06
01
02
01
02
01
02
01
06

Divide them roughly 50/50, starting a new group at each 01?
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2023
Location: USA

Posted: Wed Jan 15, 2020 1:31 am

It looks like we can rely on the "hierarchy" code alone. Can we suppose that it strictly follows the groups defined by the key field (25,39)?
Code:
//HGROUPS  EXEC PGM=SORT                                                       
//*                                                                             
//SYSOUT   DD  SYSOUT=*                                                         
//*                                                                             
//SORTIN   DD  *                                                               
010...................00                                                     012
103...000311494998300005000000710.600000669777MT056000860001713............0101.
110...000311494998300005000000710.600000669777MT056000860001713............0201.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
110...000311494998300005000000710.600000669777MT056000860001713............0201.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
//*-+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8 
//*                                                                     
//SORTOU1  DD  SYSOUT=*                                                 
//SORTOU2  DD  SYSOUT=*                                                 
//*                                                                     
//SYSIN    DD  *                                                       
 INREC IFTHEN=(WHEN=GROUP,                                             
               BEGIN=(76,2,CH,EQ,C'01'),                               
               PUSH=(81:ID=1))                                         
 SORT FIELDS=COPY                                                       
*
 OUTFIL FNAMES=SORTOU1,                                                 
        INCLUDE=(1,3,CH,EQ,C'010',                                     
              OR,81,1,BI,BO,X'01'),                                     
        BUILD=(1,80)                                                   
 OUTFIL FNAMES=SORTOU2,                                                 
        INCLUDE=(1,3,CH,EQ,C'010',                                     
              OR,81,1,BI,BZ,X'01'),                                     
        BUILD=(1,80)                                                   
 END                                                                   
//*                                                                     


SORTOU1
Code:
010...................00                                                     012
103...000311494998300005000000710.600000669777MT056000860001713............0101.
110...000311494998300005000000710.600000669777MT056000860001713............0201.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
109...000043274993800005000000710.600000712525MT056000860201542............0601.


SORTOU2
Code:
010...................00                                                     012
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000311494998300005000000710.600000669777MT056000860001713............0101.
110...000311494998300005000000710.600000669777MT056000860001713............0201.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
109...000311494998300005000000710.600000669777MT056000860001713............0601.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
110...000043274993800005000000710.600000712525MT056000860201542............0201.
103...000043274993800005000000710.600000712525MT056000860201542............0101.
109...000043274993800005000000710.600000712525MT056000860201542............0601.
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Wed Jan 15, 2020 1:37 am

It would be nice if I could count all the "01" records and divide them between the 2 outputs, together with the other records that have the same key (it is not important what hierarchy they have, but it is important to maintain the ascending order). The number of records in the two files would then be approximately the same, so the CPU time would be split about 50-50.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2023
Location: USA

Posted: Wed Jan 15, 2020 1:48 am

Dr_Halo wrote:
It would be nice if I could count all the "01" records and divide them between the 2 outputs, together with the other records that have the same key (it is not important what hierarchy they have, but it is important to maintain the ascending order). The number of records in the two files would then be approximately the same, so the CPU time would be split about 50-50.

The sample above does exactly that: it numbers the "hierarchy" groups (each one starting at an '01' record) and sends the odd-numbered and even-numbered groups alternately to the two output files.
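
To make the even/odd split easier to follow off the mainframe, here is a minimal Python sketch of the same idea. It is only a simulation, not SyncSort code: `split_by_group_parity` and `make_record` are hypothetical helper names, the 80-byte layout is a simplification of the sample data, and the wrap of the one-digit group ID from 9 to 0 is an assumption inferred from the sample output above (the tenth group lands in SORTOU2).

```python
def make_record(rec_type, hierarchy):
    # Hypothetical 80-byte test record: type in columns 1-3,
    # hierarchy in columns 76-77, filler everywhere else.
    return rec_type.ljust(75, ".") + hierarchy + "01."


def split_by_group_parity(records, hier_start=75, header_prefix="010"):
    """Simulate WHEN=GROUP,BEGIN=(76,2,CH,EQ,C'01'),PUSH=(81:ID=1)
    followed by two OUTFILs that test the low-order bit of the ID."""
    out1, out2 = [], []
    group_id = None  # no group has started yet (header area)
    for rec in records:
        if rec[hier_start:hier_start + 2] == "01":
            # Assumed behaviour: a one-digit ID counts 1..9 and then
            # wraps to 0, so odd/even still alternates across the wrap.
            group_id = 1 if group_id is None else (group_id + 1) % 10
        if rec.startswith(header_prefix):
            # The header record is copied to both outputs.
            out1.append(rec)
            out2.append(rec)
        elif group_id is not None and group_id % 2 == 1:
            out1.append(rec)   # odd group  -> first file
        else:
            out2.append(rec)   # even group (or no group) -> second file
    return out1, out2


if __name__ == "__main__":
    header = "010".ljust(77) + "012"
    hier = ["01", "02", "03", "04", "05", "06",
            "01", "02", "01", "02", "01", "02", "01", "06"]
    recs = [header] + [make_record("103", h) for h in hier]
    first, second = split_by_group_parity(recs)
    print(len(first), len(second))
```

With the hierarchy sequence asked about earlier in the thread (01 to 06, then 01 02 three times, then 01 06), this puts 10 detail records in the first file and 4 in the second, plus the header in each. Alternating groups balances the number of groups, not the number of records, so the 50-50 split only holds when group sizes are similar on average.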
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Wed Jan 15, 2020 2:04 am

Thanks, now I'm stuck with a "short record"...
The result is:

Code:

SYSIN :                                                                       
 INREC  IFTHEN=(WHEN=GROUP,BEGIN=(80,2,CH,EQ,C'01'),PUSH=(2005:ID=1))         
*                                                                             
 SORT FIELDS=COPY                                                             
*                                                                             
 OUTFIL FNAMES=SORTOU1,                                                       
        INCLUDE=(5,3,CH,EQ,C'010',OR,2005,1,BI,BO,X'01'),                     
        BUILD=(1,2004)                                                         
 OUTFIL FNAMES=SORTOU2,                                                       
        INCLUDE=(5,3,CH,EQ,C'010',OR,2005,1,BI,BZ,X'01'),                     
        BUILD=(1,2004)                                                         
 END                                                                           
WER813I  INSTALLATION OPTIONS IN MFX LOAD LIBRARY WILL BE USED                 
WER550I  ZPCOPY EXECUTED - TYPICAL SAVINGS ARE UP TO 95% TCB TIME AND 20% ELAPS
WER276B  SYSDIAG= 3588534, 39573819, 39573819, 51273420                       
WER164B  31,216K BYTES OF VIRTUAL STORAGE AVAILABLE, MAX REQUESTED,           
WER164B     100K BYTES RESERVE REQUESTED, 27,872K BYTES USED                   
WER146B  64K BYTES OF EMERGENCY SPACE ALLOCATED                               
WER108I  SORTIN   : RECFM=VB   ; LRECL=  2004; BLKSIZE= 32760                 
WER073I  SORTIN   : DSNAME=SDS.DGP0AX.TS0O0.F1A0001.AXS0D.APRD2               
WER257I  INREC RECORD LENGTH =  2005                                           
WER238I  POTENTIALLY INEFFICIENT USE OF INREC                                 
WER110I  SORTOU1  : RECFM=VB   ; LRECL=  2004; BLKSIZE= 32760                 
WER110I  SORTOU2  : RECFM=VB   ; LRECL=  2004; BLKSIZE= 32760                 
WER074I  SORTOU1  : DSNAME=AM15.TRY1.ST                                       
WER074I  SORTOU2  : DSNAME=AM15.TRY2.ND                                       
WER410B  29,164K BYTES OF VIRTUAL STORAGE AVAILABLE ABOVE THE 16-MEGABYTE LINE,
WER410B     0 BYTES RESERVE REQUESTED, 27,656K BYTES USED                     
WER244A  SORTOU1  OUTREC - SHORT RECORD                                       
WER493I  ZIIP PROCESSOR USED                                                   
WER211B  SYNCSMF  CALLED BY SYNCSORT; RC=0000                                 


Can you help me? I thought of using VLSHRT after every OUTFIL, but it gives me another error...
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 420
Location: Inside the SPEW (Southwest Ohio, USA)

Posted: Wed Jan 15, 2020 2:17 am

What does the short record contain? What is its size?
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Wed Jan 15, 2020 2:32 am

I want VB2004 output, but with the PUSH would it become VB2005? Or VB2006?
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Wed Jan 15, 2020 4:04 pm

Finally:

Code:

//P010    EXEC PGM=SORT,PARM='VLTESTI=1'                             
//SYSUDUMP  DD SYSOUT=D                                               
//SYSOUT    DD SYSOUT=*                                               
//SORTIN    DD DSN=SDS.DGP0AX.TS0O0.F1A0001.AXS0D.APRD2,DISP=SHR     
//SORTOU1   DD DSN=AM15.TRY1.ST,DISP=(,CATLG,),                       
//             SPACE=(CYL,(10,10)),LRECL=2004,RECFM=VB               
//SORTOU2   DD DSN=AM15.TRY2.ND,DISP=(,CATLG,),                       
//             SPACE=(CYL,(10,10)),LRECL=2004,RECFM=VB               
//SYSIN     DD *                                                     
 INREC  IFTHEN=(WHEN=GROUP,BEGIN=(80,2,CH,EQ,C'01'),PUSH=(2005:ID=1)),
        IFTHEN=(WHEN=(5,3,CH,EQ,C'010'),OVERLAY=(2005:C'T'))         
*                                                                     
 SORT FIELDS=COPY                                                     
*                                                                     
 OUTFIL FNAMES=SORTOU1,VLTRIM=C' ',                                   
        INCLUDE=(2005,1,CH,EQ,C'T',OR,2005,1,BI,BO,X'01'),           
        OUTREC=(1,2004)                                               
 OUTFIL FNAMES=SORTOU2,VLTRIM=C' ',                                   
        INCLUDE=(2005,1,CH,EQ,C'T',OR,2005,1,BI,BZ,X'01'),           
        OUTREC=(1,2004)                                               
 END                                                                 
//*                                                                   

It seems to work. Thank you all.
Dr_Halo

New User


Joined: 14 Jan 2020
Posts: 8
Location: italy

Posted: Wed Jan 15, 2020 7:14 pm

Better:

Code:
//P010    EXEC PGM=SORT                                             
//SYSUDUMP  DD SYSOUT=D                                             
//SYSOUT    DD SYSOUT=*                                             
//SORTIN    DD DSN=SDS.DGP0AX.TS0O0.F1A0001.AXS0D.APRD2,DISP=SHR   
//SORTOU1   DD DSN=AM15.TRY3.RD,DISP=(,CATLG,),                     
//             SPACE=(CYL,(10,10)),LRECL=2004,RECFM=VB             
//SORTOU2   DD DSN=AM15.TRY4.TH,DISP=(,CATLG,),                     
//             SPACE=(CYL,(10,10)),LRECL=2004,RECFM=VB             
//SYSIN     DD *                                                   
 INREC  IFTHEN=(WHEN=INIT,BUILD=(1,4,C'-',5)),                   
        IFTHEN=(WHEN=GROUP,BEGIN=(81,2,CH,EQ,C'01'),PUSH=(5:ID=1))
*                                                                 
 SORT FIELDS=COPY                                                 
*                                                                 
 OUTFIL FNAMES=SORTOU1,                                           
        INCLUDE=(6,3,CH,EQ,C'010',OR,5,1,BI,BO,X'01'),           
        OUTREC=(1,4,6)                                           
 OUTFIL FNAMES=SORTOU2,                                           
        INCLUDE=(6,3,CH,EQ,C'010',OR,5,1,BI,BZ,X'01'),           
        OUTREC=(1,4,6)                                           
 END                                                             

This way I don't have to use PARM='VLTESTI=1' and VLTRIM=C' '.
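
The position arithmetic behind this last version can be checked with a small Python sketch. It is a rough model under stated assumptions, not SyncSort itself: a variable-length record is modelled as a 4-byte RDW followed by the data, ASCII bytes stand in for EBCDIC, and `make_vb`, `field`, `insert_flag` and `drop_flag` are hypothetical helper names. It shows why the hierarchy in data columns 76-77 is addressed as 80,2 on the VB input and as 81,2 once a one-byte flag is inserted right after the RDW, and why removing that byte returns the record at its original length with no padding.

```python
import struct


def make_vb(data: bytes) -> bytes:
    """Prefix data with a 4-byte RDW: 2-byte length (RDW included) + 2 zero bytes."""
    return struct.pack(">HH", len(data) + 4, 0) + data


def field(rec: bytes, pos: int, length: int) -> bytes:
    """Return the bytes at 1-based position pos, like a p,m field in a sort card."""
    return rec[pos - 1:pos - 1 + length]


def insert_flag(rec: bytes, flag: bytes = b"-") -> bytes:
    """Rough analogue of BUILD=(1,4,C'-',5): RDW, one flag byte, rest of the record."""
    body = flag + rec[4:]
    return struct.pack(">HH", len(body) + 4, 0) + body


def drop_flag(rec: bytes) -> bytes:
    """Rough analogue of OUTREC=(1,4,6): RDW, then everything from position 6 on."""
    body = rec[5:]
    return struct.pack(">HH", len(body) + 4, 0) + body


if __name__ == "__main__":
    data = b"103" + b"." * 72 + b"01" + b"01."   # hierarchy in data columns 76-77
    rec = make_vb(data)
    flagged = insert_flag(rec)
    print(field(rec, 80, 2), field(flagged, 81, 2), drop_flag(flagged) == rec)
```

Because the flag sits at a fixed position directly behind the RDW, short records never have to be extended to position 2005, which appears to be what made VLTRIM necessary in the earlier version.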
All times are GMT + 6 Hours