How to handle different file length by DFSORT?


javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Mon Nov 26, 2018 12:13 pm

Dear all,

I would like to achieve one specific function with DFSORT, but I don't know how. Could you please help with this?
Thank you in advance.

There are several input files, each with a different length:
name: length
File1 20
File2 300
File3 1000
.......
FileN nnnn

1. Add a sequence number to each record in every one of these files.
2. Write the new records to a new file. How to handle the different file lengths is the difficult point.

For example:
input:

FILE1:AAAAAAAAAAAAAAAAAAAA (20 char length)
FILE2:BBBBBBBBBBBBBBBBBBBBBB.....BBBB (100 char length)
FILE3:CCCCCCCCCCCCCCCCCCCCCC.....CCCC (1000 char length)

output:

FILE1:00000001AAAAAAAAAAAAAAAAAAAA (28 char length)
FILE2:00000001BBBBBBBBBBBBBBBBBBBBBB.....BBBB (108 char length)
FILE3:00000001CCCCCCCCCCCCCCCCCCCCCC.....CCCC (1008 char length)

In summary, every record is output with its original contents, just with the sequence number added in front. I don't know how to do this.

Thank you very much
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

Posted: Mon Nov 26, 2018 12:48 pm

Some questions - just to clarify

Are the data sets shown a concatenation, or will they be treated individually?
Does the sequence number run across all the files shown, or does it restart within each data set?
javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Mon Nov 26, 2018 1:37 pm

Thank you for asking!

1. Each file is an individual file; they are not related to each other in any way. But they all need to be processed by the function of this DFSORT.
2. The sequence number is generated within each file.
for example:
File1:
00000001AAA...
00000002AAA...
00000003AAA...
......
00000100AAA...

File2:
00000001BBB...
00000002BBB...
00000003BBB...
......
00000100BBB...

File3:
00000001CCC...
00000002CCC...
00000003CCC...
......
00000100CCC...
Nic Clouston

Global Moderator


Joined: 10 May 2007
Posts: 2455
Location: Hampshire, UK

Posted: Mon Nov 26, 2018 3:26 pm

Sort only has one input DDNAME, so your data sets (they are NOT files) would have to be concatenated. That means there must be some way of distinguishing between the records of one data set and the records of the next. This identifier must be in the same position for each data set. If there is one, you can use that information to reset the sequence number between data sets.

If you cannot do this then you will have to run the sort for each data set.

Other information needed - what is the RECFM of these data sets?
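
As a rough illustration of the reset idea described above (a sketch only, not taken from the thread): DFSORT's SEQNUM accepts a RESTART=(p,m) parameter that starts the numbering over whenever the field at position p, length m changes from the previous record. The layout used here is purely an assumption: concatenated FB data sets that all have LRECL=80 and carry a 2-byte data set identifier in columns 1-2.
Code:
//SYSIN    DD *
* ASSUMED LAYOUT: LRECL=80, DATA SET IDENTIFIER IN COLUMNS 1-2
  OPTION COPY
  OUTREC BUILD=(SEQNUM,8,ZD,RESTART=(1,2),1,80)
/*

This would only help when the concatenated data sets share one record length; as far as I know, FB data sets with different LRECLs cannot simply be concatenated on SORTIN.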
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2010
Location: USA

Posted: Mon Nov 26, 2018 7:37 pm

It's not easy to understand what this secret "function" might be:
Quote:
they all need to be processed by the function of this DFSORT

1) If all three files have RECFM=VB, then a trick can be done (assuming FILE3 has the longest LRECL):
Code:
//SORTIN DD DISP=SHR,DSN=FILE1,
//          DCB=(FILE3)
//       DD DISP=SHR,DSN=FILE2,
//          DCB=(FILE3)
//       DD DISP=SHR,DSN=FILE3

This is possible because DCB parameters on the DD statement take precedence over the DSCB (the "dataset label", so to speak).

2) For this specific need SYNCSORT might be more useful; it supports multiple input files through its MULTIIN option.

P.S.
You do not have "different file sizes", but "different record lengths". File size is irrelevant in your case.
javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Mon Nov 26, 2018 10:01 pm

Sorry for the misunderstanding.

RECFM is FB, but each file has its own file length.
For example, File1 length is 100, File2 length is 20, File3 length is 1200.
All I need is: seq number + original file record = output.

File1 -> seq num + original File1 file record => Output1
File2 -> seq num + original File2 file record => Output2
File3 -> seq num + original File3 file record => Output3
.......
FileN -> seq num + original FileN file record => OutputN

The issue I am facing is that all these files come from upstream, which are all different companies/departments. The files arrive, we process them, and we put them into DB2 tables.

The difficult part is that I don't know how to deal with the different file lengths.
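
For a single FB data set whose record length is known in advance, the requirement above (seq num + original record) can be sketched as follows. This is only an illustration under assumptions: the data set names are hypothetical, and the input LRECL is taken to be 20, so the 1,20 in the BUILD would have to change for every other record length, which is exactly the difficulty being described.
Code:
//STEP1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* HYPOTHETICAL DATA SET NAMES, INPUT ASSUMED FB WITH LRECL=20
//SORTIN   DD DISP=SHR,DSN=YOUR.FB20.INPUT
//SORTOUT  DD DSN=YOUR.FB28.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(10,10),RLSE)
//SYSIN    DD *
* 8-BYTE SEQUENCE NUMBER FOLLOWED BY THE 20-BYTE INPUT RECORD
  OPTION COPY
  OUTREC BUILD=(SEQNUM,8,ZD,1,20)
/*

Since no LRECL is coded on SORTOUT, the sort product should derive the output record length (28 here) from the reformatted record.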
javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Mon Nov 26, 2018 10:06 pm

Correct! Different record length.

You can see my last reply for a better understanding. Thank you.
javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Mon Nov 26, 2018 10:10 pm

Nic Clouston wrote:
Sort only has one input DDNAME, so your data sets (they are NOT files) would have to be concatenated. That means there must be some way of distinguishing between the records of one data set and the records of the next. This identifier must be in the same position for each data set. If there is one, you can use that information to reset the sequence number between data sets.

If you cannot do this then you will have to run the sort for each data set.

Other information needed - what is the RECFM of these data sets?


Thank you.

RECFM=FB.

Actually, it is not one sort processing multiple files.

The job containing this sort will be scheduled in OPC with a file trigger.
We will set it up so that OPC is triggered by the arrival of an upstream file, and that file is then processed.
The input file name will be passed in as the sort input.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2010
Location: USA

Posted: Mon Nov 26, 2018 11:26 pm

javen777 wrote:
correct! different record length

you can see my last reply to have a better understanding, thank you


OMG!!!

The main question is: what are you going to do with the THREE datasets?
    SORT?
    JOIN?
    MERGE?
    SELECT specific records/fields?
    REFORMAT?
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Tue Nov 27, 2018 12:47 am

javen777 wrote:
RECFM is FB, but each file has its own file length.
for example, File1 length is 100, file2 length is 20, file 3 length is 1200.
All i need is, seq number + original file record = output
javen777,

You could achieve this by listing the data set attributes, i.e., get the LRECL using the TSO LISTDS command, then build a control card dynamically that substitutes the LRECL along with the SEQNUM. Finally, run your input data set through this control card.

Something like this. Good luck.
Code:
//*
// SET DSNAME='Your FB input data-set'
//*
//* STEP01: LIST THE DATA SET ATTRIBUTES (RECFM, LRECL, ...) WITH LISTDS
//STEP01   EXEC PGM=IKJEFT01,PARM='LISTDS ''&DSNAME'''
//SYSTSPRT DD DSN=&L1,DISP=(,PASS),UNIT=SYSDA,DCB=(RECFM=FB,LRECL=80),
//            SPACE=(TRK,(1,1),RLSE)
//SYSTSIN  DD DUMMY
//*
//* STEP02: BUILD THE CONTROL CARDS FOR STEP03 FROM THE LISTDS OUTPUT
//STEP02   EXEC PGM=SORT
//SORTIN   DD DSN=&L1,DISP=SHR
//SYSOUT   DD SYSOUT=*
//SORTOUT  DD DSN=&C1,DISP=(,PASS),UNIT=SYSDA,DCB=(RECFM=FB,LRECL=80),
//            SPACE=(TRK,(1,1),RLSE)
//SYSIN    DD *
* KEEP ONLY THE ATTRIBUTE LINE (RECFM STARTING WITH F IN COLUMN 3);
* THE LRECL VALUE IS READ FROM COLUMNS 9-14 OF THAT LINE.
  OPTION COPY
  INCLUDE COND=(3,1,CH,EQ,C'F')
  OUTFIL BUILD=(C' OPTION COPY ',80:X,/,
                C' OUTREC BUILD=(SEQNUM,8,ZD,1,',
                   9,6,UFF,M11,LENGTH=5,C')')
//*
//* STEP03: COPY THE INPUT, PREFIXING EACH RECORD WITH A SEQUENCE NUMBER
//STEP03   EXEC PGM=SORT
//SORTIN   DD DISP=SHR,DSN=&DSNAME
//SYSOUT   DD SYSOUT=*
//SORTOUT  DD SYSOUT=*
//SYSIN    DD DSN=&C1,DISP=SHR
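
To make the generated control card concrete: assuming the input data set is FB with LRECL=20 (an assumed value for illustration), the two records that STEP02 writes to &C1 and that STEP03 reads as SYSIN should look roughly like this, with the LRECL edited to five digits by M11,LENGTH=5:
Code:
 OPTION COPY
 OUTREC BUILD=(SEQNUM,8,ZD,1,00020)

STEP03 would then write 28-byte records: the 8-byte sequence number followed by the original 20 bytes.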
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

Posted: Tue Nov 27, 2018 1:50 am

Quote:
the issue i am facing is, all these files are from the upstream which are all different companies/departments. Files arrives in and we process them and put them into db2 tables.

the difficult part is, i don't know how to deal with different file length.
Don't you have a handshake with the upstream teams? And how can anyone sign off on a design like this without having a solution? Why do you need to add a sequence number at the beginning, and how does it help you? What are you trying to achieve here? Can't upstream add it, rather than you doing this patchwork?

Moreover, how do you load them into different DB2 tables when the input data set changes?
javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Tue Nov 27, 2018 7:20 am

Arun Raj wrote:

You could achieve this by listing the data set attributes, i.e., get the LRECL using the TSO LISTDS command, then build a control card dynamically that substitutes the LRECL along with the SEQNUM. Finally, run your input data set through this control card.


That's exactly what I wanted!
Thank you very very very much!!!
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2010
Location: USA

Posted: Tue Nov 27, 2018 9:03 am

This is one more example of reaching a simple goal by using a complex approach/tool.

The main problem is not solved: with a really large total number of datasets, how do you guess in advance which DSN has the maximum LRECL?

I can suggest a seriously simpler approach. Just convert ALL files, in OUTFIL statements, to RECFM=VB,LRECL=maxvalue, such as 32000 or another acceptable value (this may depend on the further processing required). This DCB would also be the most compatible with DB2 load/unload.
javen777

New User


Joined: 06 Mar 2015
Posts: 31
Location: china

Posted: Tue Nov 27, 2018 10:09 am

Thanks! Very thoughtful!

But may I ask how to fetch the whole original record if it is VB?
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2010
Location: USA

Posted: Tue Nov 27, 2018 7:25 pm

javen777 wrote:
But how to fetch the whole original record if VB may i ask?


I have no DFSORT available here to copy from; these are the same operations with SYNCSORT:

Code:
//*====================================================
//SEQNUMVB PROC INPUT=NULLFILE                         
//*                                                   
//FB2VB    EXEC PGM=SORT,COND=(0,NE)                   
//SYNOUT   DD SYSOUT=*                                 
//*                                                   
//SORTIN   DD DISP=(OLD,PASS),DSN=&INPUT               
//*                                                   
//SORTOUT DD  DISP=(NEW,PASS),                         
//            SPACE=(TRK,(10,10),RLSE),               
//            DCB=(RECFM=VB,LRECL=32000,BLKSIZE=0),   
//            DSN=&INPUT..VB                           
//*                                                   
//SYSIN   DD  *                                       
 SORT FIELDS=COPY                                     
 OUTFIL FTOV                                           
 END                                                   
//*----------------------------------------------------
//RENUMBER EXEC PGM=SORT,COND=(0,NE)                   
//*                                                   
//SYNOUT   DD SYSOUT=*                                 
//*                                                   
//SORTIN   DD DISP=OLD,DSN=&INPUT..VB                 
//*                                                   
//LISTING  DD SYSOUT=*                                 
//OUTFILE  DD DISP=(NEW,PASS),                         
//            SPACE=(TRK,(100,100),RLSE),             
//            DCB=(&INPUT..VB),                       
//            DSN=&INPUT..NUMBERED                     
//*                                                   
//SYSIN   DD  *                                       
 SORT FIELDS=COPY                                     
 OUTREC BUILD=(1,4,            RDW                     
               SEQNUM,8,ZD,    SEQUENCE NUMBER         
               5)              INPUT RECORD BODY       
 OUTFIL FTOV,FNAMES=(LISTING,OUTFILE)                 
 END                                                   
//*                                                   
//*====================================================
//         PEND                                       
//*==============================================
//VB20     EXEC SEQNUMVB,INPUT=&SYSUID..FB20     
//VB40     EXEC SEQNUMVB,INPUT=&SYSUID..FB40     
//VB60     EXEC SEQNUMVB,INPUT=&SYSUID..FB60     
//*==============================================

Code:
********************************* TOP OF DATA *****
00000001AAAAAAAAAABBBBBBBBBB                       
00000002DDDDDDDDDDAAAAAAAAAA                       
00000003CCCCCCCCCCDDDDDDDDDD                       
00000004BBBBBBBBBBCCCCCCCCCC                       
******************************** BOTTOM OF DATA ***

Code:
********************************* TOP OF DATA **
00000001AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDD
00000002DDDDDDDDDDAAAAAAAAAABBBBBBBBBBCCCCCCCCCC
00000003CCCCCCCCCCDDDDDDDDDDAAAAAAAAAABBBBBBBBBB
00000004BBBBBBBBBBCCCCCCCCCCDDDDDDDDDDAAAAAAAAAA
******************************** BOTTOM OF DATA

Code:
********************************* TOP OF DATA ***********************
00000001AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDDEEEEEEEEEEFFFFFFFFFF
00000002DDDDDDDDDDAAAAAAAAAABBBBBBBBBBCCCCCCCCCCGGGGGGGGGGHHHHHHHHHH
00000003CCCCCCCCCCDDDDDDDDDDAAAAAAAAAABBBBBBBBBBIIIIIIIIIIJJJJJJJJJJ
00000004BBBBBBBBBBCCCCCCCCCCDDDDDDDDDDAAAAAAAAAAKKKKKKKKKKLLLLLLLLLL
******************************** BOTTOM OF DATA *********************
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Tue Nov 27, 2018 7:49 pm

sergeyken wrote:
This is one more example of reaching the simple goal by using a complex approach/tool.

The main problem is not solved: for really significant total number of datasets - how to guess in advance the DSN with maximal LRECL?
The solution posted earlier was quite simple.

The first step finds the LRECL of the input data set and a subsequent step uses it to copy the input data set ONCE.

There is no need to guess the max LRECL or copy the data twice.
All times are GMT + 6 Hours