ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Hi,

Please provide your suggestions/solutions to achieve the following:

A production job runs daily and creates a huge file with 'n' number of records. Without using REXX, I want to use a utility (assuming SYNCSORT with COUNT) to find the number of records in this file, and then split the file into equal output files of 1,00,000 records each. How can this be done dynamically when the record count varies daily? On one day we may get 5,00,000 records and on another 8,00,000. So, depending on the count, I need to split the input file into 5 or 8 pieces for further processing. After this processing (say, by a COBOL program) I will again have 5 or 8 files, which I need to merge into a single file and FTP to the remote customer server. How can this be automated?

Please provide your suggestions/solutions/ideas for this problem, and let me know if you need more details.

Note: REXX is ruled out by the customer.

Thanks for your time.
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
I ran the following SYNCSORT step to get the count of records from the file.
Code:
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//SSMSG DD SYSOUT=*
//IN DD DSN=INPUT.FILE.TO.BE.SPLITTED,
// DISP=SHR
//TOOLIN DD *
COUNT FROM(IN)
/*
//
I got the following display in TOOLMSG:

Code:
SYT000I SYNCTOOL RELEASE 1.4D - COPYRIGHT 2003 SYNCSORT INC.
SYT001I INITIAL PROCESSING MODE IS "STOP"
SYT002I "TOOLIN" INTERFACE BEING USED
COUNT FROM(IN)
SYT020I SYNCSORT CALLED WITH IDENTIFIER "0001"
SYT031I NUMBER OF RECORDS PROCESSED: 000000000400000
SYT030I OPERATION COMPLETED WITH RETURN CODE 0
SYT004I SYNCTOOL PROCESSING COMPLETED WITH RETURN CODE 0
So, now, depending on the count (4,00,000), which utility can we use to dynamically split this file into 4 equal pieces (each file having 1,00,000 records)?
DavidatK
Active Member

Joined: 22 Nov 2005 Posts: 700 Location: Troy, Michigan USA
Hi Ranga,
Quote:
A production job runs daily and creates a huge file with 'n' number of records. ... So, depending on the count I need to split the input file into 5 or 8 pieces for further processing. ... So, I need to merge all these files as single file and need to FTP it to remote customer server. How to automatize this situation?
I'm not clear on how many records you're talking about here: 100,000 or 1,000,000?

What is the requirement to have only 100,000 records in each split of the file? Even 8,000,000 records shouldn't be an excessive burden on the system. And remember, to split the master file into multiple files, you still have to make a pass through the entire file. Why not just process it?

But that's off the subject you posted.

I think you are taking the wrong approach to this. Having a variable number of files produced each night creates a problem for scheduling: every night a variable number of jobs needs to be scheduled.

I think a better way is to have a constant number of files, enough to contain the absolute maximum number of records possible (at least in the foreseeable future), and write your 100,000 records to each of these files. When you run out of records, the remaining files will simply be empty. This means that the schedule each night is constant; someone doesn't have to be changing the job stream every night.

If your process cannot handle an empty file, you can check for the empty condition and use COND= to skip the process.

Then you concatenate all the files together for the FTP.

Please come back with comments or questions,

Dave
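The constant-file-count idea can be sketched in ordinary code. This is a Python illustration of the logic only, not the actual mainframe step; MAX_FILES and CHUNK are assumed values for the example:

```python
# Always produce MAX_FILES outputs so the nightly job stream never changes;
# files beyond the available data simply stay empty.

MAX_FILES = 8        # assumed foreseeable maximum number of pieces
CHUNK = 100_000      # records per piece

def split_fixed(records):
    """Always return MAX_FILES lists; unused trailing ones are empty."""
    out = [[] for _ in range(MAX_FILES)]
    for i, rec in enumerate(records):
        out[i // CHUNK].append(rec)
    return out

files = split_fixed(range(250_000))
print([len(f) for f in files])  # -> [100000, 100000, 50000, 0, 0, 0, 0, 0]
```

A downstream step would then check each piece for the empty condition before processing it, as suggested above.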
superk
Global Moderator

Joined: 26 Apr 2004 Posts: 4652 Location: Raleigh, NC, USA
Dave, the O/P may be representing the record counts in either lakhs or crores. Look up the definition of each and you'll see what I mean.
DavidatK
Active Member

Joined: 22 Nov 2005 Posts: 700 Location: Troy, Michigan USA
superk,

I had not been exposed to unit counts in lakhs or crores before. My horizons have been expanded. Thanks.

Dave
manyone
New User
Joined: 09 Mar 2006 Posts: 21
Code:
//* all gdg's are 1-entry each
//SORT1 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTMSG DD SYSOUT=*
//SORTIN DD DISP=SHR,DSN=TCLM.CQ000154.ECF.LATRAN.X060123
//SORTOUT DD DSN=TEMP.M4J6060.W1.DATA(+1),
// UNIT=TEMP,DISP=(,CATLG,DELETE),
// SPACE=(TRK,(10,10),RLSE)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,20))
//SYSIN DD *
* prefix a 7-digit number to each record
 SORT FIELDS=COPY
 OUTFIL OUTREC=(SEQNUM,7,ZD,X,1,329)
//*
//SORT2 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTMSG DD SYSOUT=*
//SORTIN DD DISP=SHR,DSN=TEMP.M4J6060.W1.DATA
//SORTOF1 DD DSN=TEMP.M4J6060.V1.DATA(+1),
// UNIT=TEMP,DISP=(,CATLG,DELETE),
// SPACE=(TRK,(10,10),RLSE)
//SORTOF2 DD DSN=TEMP.M4J6060.V2.DATA(+1),
// UNIT=TEMP,DISP=(,CATLG,DELETE),
// SPACE=(TRK,(10,10),RLSE)
//SORTOF3 DD DSN=TEMP.M4J6060.V3.DATA(+1),
// UNIT=TEMP,DISP=(,CATLG,DELETE),
// SPACE=(TRK,(10,10),RLSE)
//SORTOF4 DD DSN=TEMP.M4J6060.V4.DATA(+1),
// UNIT=TEMP,DISP=(,CATLG,DELETE),
// SPACE=(TRK,(10,10),RLSE)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,20))
//SYSIN DD *
 SORT FIELDS=COPY
 OUTFIL FILES=1,INCLUDE=(1,7,ZD,GT,00,AND,1,7,ZD,LE,10),OUTREC=(9,329)
 OUTFIL FILES=2,INCLUDE=(1,7,ZD,GT,10,AND,1,7,ZD,LE,20),OUTREC=(9,329)
 OUTFIL FILES=3,INCLUDE=(1,7,ZD,GT,20,AND,1,7,ZD,LE,30),OUTREC=(9,329)
 OUTFIL FILES=4,INCLUDE=(1,7,ZD,GT,30,AND,1,7,ZD,LE,40),OUTREC=(9,329)
//
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Hi DavidatK,

Yes, it is 100,000 records. The COBOL program logic processes only 100,000 input records; if there are more, it will abend. We conveyed this approach to the customer (as pseudo code) and they agreed to it. The thing is that the input file will have a different number of records each day, so we need to act accordingly. They are not bothered about the scheduling because this is on-request work, so this solution will be used only once in a while. But we are not able to achieve it, and I seek help from you experts.

I have not understood the info provided by manyone. Is it a solution to my problem?

Any inputs? Please help.
manyone
New User
Joined: 09 Mar 2006 Posts: 21
The first sort prefixes a sequence number (automatically incremented) to each record, so that positions 1-7 are the sequence number, followed by 1 blank, then my record (329 bytes).

The second sort splits the file from above into 4 parts according to the value of the first seven bytes (i.e., the sequence number). Numbers 1 to 10 (in my example) go to file 1, 11 to 20 go to file 2, etc. Simply change 10 to 100000 for your use. The result is 4 files (or however many) that can be input to 4 jobs.

I hope this is what you were looking for.
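The two-pass scheme described above (number the records, then route them by sequence-number range) can be sketched in ordinary code. This is a Python illustration of the logic only, not the SYNCSORT job itself:

```python
# Pass 1 prefixes a sequence number; pass 2 routes each record to an
# output "file" by sequence-number range, then drops the number again.

def split_records(records, chunk_size):
    """Split records into fixed-size chunks, preserving order."""
    # Pass 1: prefix each record with a 1-based sequence number.
    numbered = [(seq, rec) for seq, rec in enumerate(records, start=1)]
    # Pass 2: route by sequence-number range (seq 1..N -> file 1, etc.).
    chunks = []
    for seq, rec in numbered:
        idx = (seq - 1) // chunk_size
        if idx == len(chunks):
            chunks.append([])
        chunks[idx].append(rec)  # keep only the record, not the number
    return chunks

chunks = split_records([f"rec{i}" for i in range(1, 28)], 10)
print([len(c) for c in chunks])  # 27 records in chunks of 10 -> [10, 10, 7]
```

For the poster's case, chunk_size would be 100000 instead of 10, exactly as the post says.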
i413678
Active User

Joined: 19 Feb 2005 Posts: 112 Location: chennai
Hi manyone,

You have given an excellent solution. But ranga_subham asked to repeat this process for 'n' number of records. Is that right, ranga_subham?

Thanks in advance,
pavan
martin9
Active User

Joined: 01 Mar 2006 Posts: 290 Location: Basel, Switzerland
Hi,

A possible solution is to make a driver job. That means job 1 analyzes your input file and creates as many jobs as you need to get scheduled. Send all the jobs to a DD which redirects them to the internal reader:

Code:
//JOB DD SYSOUT=(x,INTRDR)

martin9
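The driver idea (count the input, then generate and submit one job per piece) can be sketched like this. It is a Python illustration only; the job template, step names and dataset names below are made up, not anything from the actual system:

```python
# Generate one job's worth of JCL text per 100,000-record piece.
# In the real driver these strings would be written to the INTRDR DD.

JOB_TEMPLATE = """\
//SPLIT{n:02d} JOB (ACCT),'SPLIT PIECE {n}'
//STEP1   EXEC PGM=MYCOBOL
//INFILE  DD DSN=INPUT.PIECE{n:02d},DISP=SHR
"""

def build_jobs(record_count, chunk_size):
    """One job per chunk of records; the last chunk may be short."""
    n_jobs = (record_count + chunk_size - 1) // chunk_size  # ceiling division
    return [JOB_TEMPLATE.format(n=i + 1) for i in range(n_jobs)]

jobs = build_jobs(400_000, 100_000)
print(len(jobs))  # 400,000 records in 100,000-record pieces -> 4 jobs
```

Because the driver counts the records itself, the number of generated jobs tracks the input automatically, which is the "dynamic" part the original question asked for.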
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Hi manyone,

The first step gave the following error in the sysout:

Code:
SYSIN :
SORT FIELDS=COPY
*
OUTFIL OUTREC=(SEQNUM,7,ZD,X,1,200)
*
WER275A NO KEYWORDS FOUND ON CONTROL STATEMENT
WER268A OUTREC STATEMENT : SYNTAX ERROR
WER449I SYNCSORT GLOBAL DSM SUBSYSTEM ACTIVE
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Hi Martin,

Would you please convert this pseudo code into actual code? That would be a great help to me.

Thanks for your time.
martin9
Active User

Joined: 01 Mar 2006 Posts: 290 Location: Basel, Switzerland
Hi ranga,

You write a program which reads the entire input file; after every 100000 records (or fewer for the last portion), you create a single job, writing it to a DD which is defined as input to the internal reader, so it is submitted immediately.

The created jobs should consist of:

1. Splitting the file into a useful portion (i.e. 100000 records), using IDCAMS:

Code:
REPRO INFILE(IN1) OUTFILE(OUT1) COUNT(100000) {SKIP(n)}

2. Running your COBOL program.

Note: all the names/variables can be made variable.

More details?

martin9
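The REPRO COUNT/SKIP windowing above amounts to slicing: each generated job skips the records already handled and copies the next chunk. A Python sketch of that logic, for illustration only:

```python
# Mimic IDCAMS REPRO SKIP(skip) COUNT(count): copy `count` records
# after skipping the first `skip` of the input.

def window(records, skip, count):
    """Return the slice of `records` that one generated job would copy."""
    return records[skip:skip + count]

data = list(range(250))
pieces = [window(data, skip, 100) for skip in range(0, len(data), 100)]
print([len(p) for p in pieces])  # -> [100, 100, 50]
```

The driver would compute the SKIP value for each job from the running record count, so the last piece naturally comes out short.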
manyone
New User
Joined: 09 Mar 2006 Posts: 21
ranga,

Sort control commands have to start after column 1. If a command starts in column 1, it is treated as a label, so please shift the commands right by 1 or 2 columns.

Thanks
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Hi Manyone,

From the example you've provided here I've created my job like this. Please let me know if the job is correct. Sorry to bother you again and again. I keep getting an "ABENDED S000 U0016 CN(INTERNAL)" abend.

Code:
//SORT01 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTMSG DD SYSOUT=*
//SORTIN DD DSN=INPUT.FILE,
// DISP=(SHR,KEEP,KEEP)
//SORTOUT DD DSN=OUTPUT.FILE.SORT,
// UNIT=SYSDA,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(1,1),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=00000)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,20))
//** PREFIX A 7-DIGIT NUMBER TO EACH RECORD **//
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL OUTREC=(SEQNUM,7,ZD,X,1,200)
//*
//SORT02 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTMSG DD SYSOUT=*
//SORTIN DD DSN=OUTPUT.FILE.SORT,
// DISP=SHR
//SORTOF1 DD DSN=OUTPUT.FILE.NEW.SORT1,
// UNIT=SYSDA,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(1,1),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=00000)
//SORTOF2 DD DSN=OUTPUT.FILE.NEW.SORT2,
// UNIT=SYSDA,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(1,1),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=00000)
//SORTOF3 DD DSN=OUTPUT.FILE.NEW.SORT3,
// UNIT=SYSDA,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(1,1),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=00000)
//SORTOF4 DD DSN=OUTPUT.FILE.NEW.SORT4,
// UNIT=SYSDA,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(1,1),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=00000)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,20))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(10,20))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(10,20))
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FILES=1,INCLUDE=(1,7,ZD,GT,00,AND,1,7,ZD,LE,10),OUTREC=(9,200)
OUTFIL FILES=2,INCLUDE=(1,7,ZD,GT,10,AND,1,7,ZD,LE,20),OUTREC=(9,200)
OUTFIL FILES=3,INCLUDE=(1,7,ZD,GT,20,AND,1,7,ZD,LE,30),OUTREC=(9,200)
OUTFIL FILES=4,INCLUDE=(1,7,ZD,GT,30,AND,1,7,ZD,LE,40),OUTREC=(9,200)
//
Please suggest.
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Thanks Martin. I will try that.
rohit jaiswal
New User

Joined: 09 Mar 2006 Posts: 36 Location: hyderabad,A.P
Hi,

Can you please send me the details of it?
manyone
New User
Joined: 09 Mar 2006 Posts: 21
I think you were getting cond 16 because the SORT1 output is not LRECL 210. Please understand that my example was for a max count of 10 for each file, hence file 1 will get 1 to 10, file 2 will get 11 to 20, etc.

Also, I'm trying to understand your process. I'm guessing that you are modifying the records as you process each sub-file, and that you need to reconstruct the modified files back in the original sequence for the later FTP. If this is the case, I suggest you keep the sequence number intact (change your SORT2 to OUTREC=(1,210)) and pass it around; in the end, simply re-sort everything by the sequence number but drop it during OUTREC.

Also note that all the files may not always have data (e.g. if the original has 27 records, the outputs will have 10, 10, 7 and 0 records in files 1, 2, 3 and 4), so your processing should handle empty files accordingly.

I hope this will help.
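The reconstruction step suggested above (carry the sequence number through processing, then re-sort the combined output by it and drop it) can be sketched like this; a Python illustration of the logic only:

```python
# Each processed piece still carries its sequence numbers, so re-sorting
# the concatenation by sequence number restores the original order,
# after which the number can be dropped.

def merge_pieces(pieces):
    """pieces: lists of (seq, record) pairs; returns records in order."""
    combined = [pair for piece in pieces for pair in piece]
    combined.sort(key=lambda pair: pair[0])   # re-sort by sequence number
    return [rec for _, rec in combined]       # drop the sequence number

p1 = [(1, "a"), (2, "b")]
p2 = [(3, "c"), (4, "d")]
print(merge_pieces([p2, p1]))  # -> ['a', 'b', 'c', 'd']
```

Note that the pieces can arrive in any order (as they would from independently scheduled jobs); the sequence number alone determines the final order.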
ranga_subham
New User

Joined: 01 Jul 2005 Posts: 51
Ok, I will do as you said. Thanks for the details. All in all, it is very useful info. Thanks for sharing.
martin9
Active User

Joined: 01 Mar 2006 Posts: 290 Location: Basel, Switzerland
Hi ranga,

Do you think BLKSIZE=00000 will work?

Please provide the error messages, and also the joblog.

martin9

PS: you wanted a dynamic solution?