Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
kitchu84,
The following DFSORT/ICETOOL JCL will give you the desired results. I assumed that the list file has the DSN in the first 44 bytes. I also limited the job to 1000 datasets, even though the maximum number of DD statements per job step is 3273. This job takes care of both FB and VB files.
The JCL is generated in step0200. Look at the output from SORTOUT of that step. It should have generated the JCL needed to count the records from each file. If the generated JCL looks good, then change the following statement and resubmit the job.
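For anyone who doesn't have the earlier post handy, a counting step for a single dataset can look roughly like the sketch below. This is only an illustration using ICETOOL's COUNT operator, not necessarily the exact JCL the generated job uses, and the DD and dataset names are placeholders:

Code:

//STEP0001 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* One input DD per dataset to be counted (placeholder DSN)
//IN001    DD DISP=SHR,DSN=YOUR.INPUT.DATASET
//* One output DD per dataset to receive the count record
//CT001    DD DSN=YOUR.COUNT.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(1,1))
//TOOLIN   DD *
* Write 'dataset-name count' to CT001 for the dataset read via IN001
  COUNT FROM(IN001) WRITE(CT001) TEXT('YOUR.INPUT.DATASET') DIGITS(10)
/*

Note that each dataset in the list ends up needing two DD statements (the input and the count output), which is where the STOPAFT=1000 headroom discussed further down comes from.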
Thanks for the solution. I am currently trying to correct the creation of DD names in the dynamic JCL, as it was throwing this error:
JCP0427E DD NAME 'CT000CNTL' MUST BE 8 CHARACTERS OR LESS
JCP0427E DD NAME 'CT001CNTL' MUST BE 8 CHARACTERS OR LESS
JCP0427E DD NAME 'CT002CNTL' MUST BE 8 CHARACTERS OR LESS
JCP0427E DD NAME 'CT003CNTL' MUST BE 8 CHARACTERS OR LESS
JCP0427E DD NAME 'CT004CNTL' MUST BE 8 CHARACTERS OR LESS
Please let me know if there is a way to handle more than 3273 files. Do I need to write another dynamic JCL for that? ... Please suggest.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
While we are back on this, remember some of the things we brought up before.
When you run, you'd like all your files to be from the same production run, but have no way to know this.
When the client gets his output, at least to start with, he's going to look at it. Then you're going to get queries. He's going to say "on this report from three weeks ago..." so make sure you can relate his copy of the report to one you can look at, and all the files.
What about periodic files?
What about starting with the "main" parts of the system, so you are concentrating on the important ones first (if he finds problems)?
How about you spend some time running with the production data before you give the client the first report, so you can check for anything "obvious" before he gets to see it.
It's not just producing the report, it is everything that goes with it.
Do you have a file archiver? When you re-run an old set of reports one time, it'll take hours to get everything back.
Please let me know if there is a way to handle more than 3273 files.
I wonder who is ever going to read the report...
anyway, the 3273 DD names is a JCL constraint.
You will have to analyze a bit and submit in multiple JCLs,
remembering that each dataset to be counted implies two DDs; that's the reason for the STOPAFT=1000.
to squeeze everything out of the JCL You could have used STOPAFT=1500,
but maybe 1000 is easier to remember.
With the proper DFSORT knowledge it will be possible to build, in one pass, as many jobs as needed, each one counting fewer than <somenumber> datasets.
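For the splitting part, a minimal sketch of one way to do it (the list and group dataset names are placeholders, and it assumes the usual 80-byte list file): DFSORT's OUTFIL SPLIT1R deals the list out in contiguous chunks, so each chunk can drive one generated counting job.

Code:

//SPLIT    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=YOUR.DSN.LIST
//GRP01    DD DSN=&&GRP01,DISP=(,PASS),UNIT=SYSDA,SPACE=(TRK,(5,5))
//GRP02    DD DSN=&&GRP02,DISP=(,PASS),UNIT=SYSDA,SPACE=(TRK,(5,5))
//GRP03    DD DSN=&&GRP03,DISP=(,PASS),UNIT=SYSDA,SPACE=(TRK,(5,5))
//SYSIN    DD *
* Copy the DSN list: records 1-1000 go to GRP01, 1001-2000 to GRP02,
* and everything from 2001 onwards to GRP03 (the last FNAMES dataset
* gets the leftovers). Add more GRPnn DDs/FNAMES to make more jobs.
  OPTION COPY
  OUTFIL FNAMES=(GRP01,GRP02,GRP03),SPLIT1R=1000
/*

Each GRPnn file would then feed the generation of one counting job, keeping every job comfortably under the DD limit.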
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
kitchu84 wrote:
[...]
Please let me know if there is a way to handle more than 3273 files. [...]
I missed that. I agree with enrico. If the client gets the report, you definitely won't get the queries until three weeks later (or however long), or until he gets his Excel macro "working", at which point you will get hit with everything daily. Most of the queries you get won't mean much to start with.
Remember, you don't know which file is from which production run. It is just likely that they will be from the same run; you won't know for certain.
How are you getting the list of files which is input for all this? Is there anything you can do to group them together "logically" so you can do some automatic number checking before the client can? In the end, you can give him that report as well, saving you from looking for errors in his Excel macro.
We will run this job periodically, every 30 mins, with the last 30 mins' SAR unload, and we will populate the data in DB2 tables for a particular job run with that particular job ID.
For example: if a job named ABCXXXXX ran with job ID JOB27909,
I will take unloads from SAR for the last 30 mins (the time will be controlled by a parm parameter in a COBOL module) and then I will filter out the file names for all the jobs along with their job IDs.
This will tell me specifically which run of the job had those files and what the count was. All this information will be loaded into a table along with a timestamp. The data from the table will be pulled through Java code and shown on URLs so that we can run some reports to know specifically, for a particular job on a particular date, what the file count was.
@enrico - Sorry I am not clear on this part:
"remembering that each dataset to be counted implies two dd, that' s the reason for the STOPAFT=1000,
to sqeeze everything out of jcl You could have used STOPAFT=1500"
Any pointers to handle more than 3273 files would be helpful.
Any pointers to handle more than 3273 files would be helpful.
did You care to read my previous reply completely?
no way with a single job...
depending on Your skills there might be a REXX alternative!
maybe it would be wiser to wait for Kolusu so that He may suggest how to build multiple Jobs in one pass
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
kitchu84 wrote:
I am currently trying to correct the creation of DD names in the dynamic JCL, as it was throwing this error:
JCP0427E DD NAME 'CT000CNTL' MUST BE 8 CHARACTERS OR LESS
Kitchu84,
As enrico pointed out, I was creating the JCL with CXXXCNTL and not CTXXXCNTL. You seem to have picked up Sqlcode's JCL, and I can't help you fix that.
kitchu84 wrote:
Please let me know if there is a way to handle more than 3273 files. Do I need to write another dynamic jcl for that ... Please suggest.
Just because I said the DD limit is 3273 doesn't mean your shop has the same limit. The limit of 3273 is based on the number of single unit DD statements for a 64K TIOT (task input output table). This limit can be different depending on the installation-defined TIOT size. 32K is the default TIOT size. The limit for a 32K TIOT is 1635. (In a JES3 system, the installation might further reduce the limit.)
kitchu84 wrote:
Sorry I am not clear on this part:
enrico-sorichetti wrote:
remembering that each dataset to be counted implies two dd, that' s the reason for the STOPAFT=1000,to sqeeze everything out of jcl You could have used STOPAFT=1500
The dynamic JCL being generated uses 1 DD name for the input DSN and the other (CXXXCNTL) for writing the count to the output file. So for every record in your list file, 2 DD names are used. If the limit is 3273, you would only be able to process about 3273/2 = 1636 datasets. Enrico rounded it down to 1500, leaving some buffer for other DD names.
However, you simply can't use STOPAFT=1500, because once you cross 999 entries the sequence number becomes 4 digits and the generated DD name would be longer than 8 characters.
Here is a JCL which will read up to a max of 26,000 DSNs and generate dynamic jobs with 1000 DSNs per job. Please don't come back and ask me how to generate JCL for more than 26,000.
I am showing you the submission of 10 jobs; however, you can extend it to 26 jobs, which will process a total of 26,000 DSNs. In order to do that, allocate the output files in step0200 from
JCOUNT01 thru JCOUNT26 and also add the control cards.
The JCL is generated in step0200. Look at the output from JCOUNTnn of that step. It should have generated the JCL needed to count the records from each file. If the generated JCL looks good, then change the following statement and resubmit the job.
If you're not familiar with DFSORT and DFSORT's ICETOOL, I'd suggest reading through "z/OS DFSORT: Getting Started". It's an excellent tutorial, with lots of examples, that will show you how to use DFSORT, DFSORT's ICETOOL and DFSORT Symbols. You can access it online, along with all of the other DFSORT books, from:
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
I'm not sure I have followed this correctly.
A)
1) You run the DFSORT mega-file-counter about every 30 minutes
2) You check on SAR for all jobs completed in that time
3) You extract Job info from SAR, a sub-set of dataset info from the mega-file-counter
4) You load everything into DB2 so that you know which file relates to which job that ran at a particular time
or
B)
1) You check on SAR for all jobs complete in last 30 minutes
2) Create extract file-of-files for the mega-file-counter from the complete jobs in SAR
3) Run the DFSORT mega-file-counter
4) You load everything into DB2 so that you know which file relates to which job that ran at a particular time
or
C)
Something else I've missed completely
If A), is the mega-file-counter going to run in less than 30 minutes? That is only 1800 seconds, and it seems you might have more than 6500 DDs to open/close and read. Plus more questions if A) is confirmed.
If B), why do you feel you need so many files? There will not be more than 3000 files created in a 30-minute window, will there? Plus more questions if B) is confirmed.
However, you simply can't use STOPAFT=1500, because once you cross 999 entries the sequence number becomes 4 digits and the generated DD name would be longer than 8 characters.
I will take unloads from SAR for the last 30 mins (the time will be controlled by a parm parameter in a COBOL module) and then I will filter out the file names for all the jobs along with their job IDs.
This will tell me specifically which run of the job had those files and what the count was. All this information will be loaded into a table along with a timestamp. The data from the table will be pulled through Java code and shown on URLs so that we can run some reports to know specifically, for a particular job on a particular date, what the file count was.
and since there will be quite a number of read-only datasets, You will keep wasting resources counting something that did not change
(as You know, You cannot determine just by looking at the JCL whether a dataset is input or output)
and what about the datasets on tape? heck of a mount activity, every half an hour
and what about the possible GDGs?
it would be wiser for the powers of Your organization to review the whole process
Also, since we need to run the jobs automatically through CA7: suppose we submitted 5 dynamic JCLs, each of them creating a different file with counts. I have a challenge to merge these and then use them in another job.
Say there are 5000 file names. Hence the main job creates 5 dynamic JCLs and submits them, each of which in turn creates one output file of file counts.
Now, since these dynamic JCLs are not defined in CA7, I need a way to put a dependency of all these jobs/output files on another final job which merges the 5 output files and then uses the data. I cannot put the dependency of the main job on the final job, because the final job might abend due to "file not found" if the dynamic JCLs are still running.
Also, is there a possibility that it might take the previous version of the file into consideration?
I am sorry if this isn't the right place to ask this query.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
kitchu84 wrote:
I have a challenge to merge and then use it in another Job.
Well, I am not Bill, but since I was the one who provided you with the dynamic JCL solution, I am gonna answer it. You don't have a challenge; it is quite simple. Change my job to have the same job card on all the dynamically created JCLs, and no matter how many jobs you create, they will simply be queued up one after another. They will run sequentially, and you can use just 1 output file. You don't need another job to merge them. Before you ask: I am not going to help you with that. It is a simple change, and you should be able to do it now that you have the necessary framework.
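To make the idea concrete, a rough sketch of what that could look like (job name, dataset names, and step contents are placeholders; it assumes your installation lets JES serialize duplicate job names, which is the usual default):

Code:

//COUNTJOB JOB (ACCT),'FILE COUNTS',CLASS=A,MSGCLASS=X
//* Every generated job carries this identical job card, so JES will
//* not start two of them at the same time; they queue and run one
//* after another in the order they were submitted.
//STEP0001 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN001    DD DISP=SHR,DSN=ONE.INPUT.DATASET
//* Every generated job points its count output at the same dataset.
//* DISP=MOD appends (allocate the dataset empty once, up front), so
//* when the last job ends, one dataset holds all the counts and no
//* separate merge job is needed.
//OUT      DD DISP=MOD,DSN=YOUR.COUNT.OUTPUT
//TOOLIN   DD *
  COUNT FROM(IN001) WRITE(OUT) TEXT('ONE.INPUT.DATASET') DIGITS(10)
/*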
kitchu84 wrote:
Also, is there a possibility that it might take previous version of the file into consideration?
You need to at least read the comments in the generated JCL. The 7th line of the generated JCL will have this comment:
Code:
//**********************************************************
//* DELETE THE OUTPUT COUNT DATASET IF EXISTED *
//**********************************************************
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Skolusu wrote:
[...]
Well, I am not Bill
[...]
I don't think the whole thing was directed at me.
[...]
kitchu84 wrote:
We will run this job periodically, every 30 mins, with the last 30 mins' SAR unload, and we will populate the data in DB2 tables for a particular job run with that particular job ID.
For example: if a job named ABCXXXXX ran with job ID JOB27909,
I will take unloads from SAR for the last 30 mins (the time will be controlled by a parm parameter in a COBOL module) and then I will filter out the file names for all the jobs along with their job IDs.
This will tell me specifically which run of the job had those files and what the count was.
[...]
B)
1) You check on SAR for all jobs complete in last 30 minutes
2) Create extract file-of-files for the mega-file-counter from the complete jobs in SAR
3) Run the DFSORT mega-file-counter
4) You load everything into DB2 so that you know which file relates to which job that ran at a particular time
There is something I'm not getting.
The reason I want to know more exactly how you are doing it is that there are different problems for either route. Are you running the full mega-file-counter? If so, how many times a day? Do you run it once and the SAR extract multiple times?
Some problems are the same, whichever route. Like, how do you deal with a re-run from a previous day? Are your timestamps "logical" or actual - "logical" being so that you can identify files from the same batch runs, assuming that you will be running over midnight? Etc.
Do you have something of the actual design that you can share with us?
This type of error suggests that it's because WHEN=GROUP is not available in the current release we are using ... Could you please suggest an alternative?
Hi Bill: we will run the file counter every time after we run the SAR unloads. The SAR unloads run every 30 mins (for the entire 24 hrs). Regarding the rerun, the timestamps are actual ... we are planning to load the data at that particular instant of time. So say a job runs at 12:30 pm: the SAR unload running at 1 pm will pick up that job's details and we will run the file counter for the job. If the job runs again at, say, 2:50 pm with a different job ID, the next SAR unload will pick up the details of the job and the file counter will count the records at that instant of time, which will be loaded with that particular timestamp.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
kitchu84 wrote:
[...]
Hi Bill: we will run the file counter every time after we run the SAR unloads. The SAR unloads run every 30 mins (for the entire 24 hrs). Regarding the rerun, the timestamps are actual ... we are planning to load the data at that particular instant of time. So say a job runs at 12:30 pm: the SAR unload running at 1 pm will pick up that job's details and we will run the file counter for the job. If the job runs again at, say, 2:50 pm with a different job ID, the next SAR unload will pick up the details of the job and the file counter will count the records at that instant of time, which will be loaded with that particular timestamp.
Sorry if I am not clear. Please let me know ...
Hi Kitchu84,
Sorry, but again, is this the full file counter that you are talking about? No, I'm not clear.
How long is the file counter going to take to run, every 30 minutes?
If the full run is running 48 times a day (theoretically), then 47 times for each dataset it is not needed (roughly speaking).
Are you using a SAR time loaded or the Job-finished time for your selection? If the latter, is there any "latency" between a Job finishing and appearing in SAR? If so, you will miss Jobs occasionally.
What DISP are your production jobs running with for output files, generally speaking? I get to this sort of point, and I think yet again I must be missing something crucial.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
enrico-sorichetti wrote:
Bill, looks like we are just wasting time here, as does the TS's organization by counting things every half an hour
they made up their mind/bed, let them sleep in it
enrico, I get that feeling as well. Extracting job/file information from one source, but not using that to trigger the counts? So, mismatched timings for sure. Won't run inside 24 hours, won't get all the jobs updated, will lock production jobs, doesn't know about business days, mixes re-runs with current data. Etc. And uses the wrong sort package...