ca 11 restart but get new generations?

jasorn · Active User Joined: 12 Jul 2006 Posts: 191 Location: USA

Here's the situation at our shop:
Lots of jobs write to the same GDG base. By lots, I mean 300+. The problem arises when one of those jobs abends and another one runs before the abended job is restarted: a generation gets overwritten. I'm curious whether there is a way to tell CA-11 to restart but pick up new generations.

So...
1. joba is creating +1(G0006V00) and abends in step3.
2. while joba is down, jobb runs and also creates +1(G0006V00) and runs to completion.
3. abend for joba is resolved and joba is restarted in step3, creates +1(G0006V00) overwriting the +1(G0006V00) jobb just created, and runs to completion.
4. doh! the generation jobb created is gone.

Is there a way to tell CA-11 to restart in step3 but not use G0006V00, and instead use whatever is now the proper +1? That'd be G0007V00 in this case.

Currently, we have a means to check after the fact whether any generations were overwritten, and we just process those again. We're thinking of splitting this into two jobs. The first will write to a flat file and the second will copy it to the GDG. While we'd still have the possibility of overwriting a generation, the occurrences would be much fewer since the copy step hardly ever abends.

But I'd like to know if we could somehow be smarter about how we restart in the event of an abend before we go splitting 300+ jobs into 2.

Anuj Dhawan · Posted: Fri Mar 14, 2008 12:09 am

Hi,

dick scherrer · Posted: Fri Mar 14, 2008 5:07 am

Hello,

If the job only creates one +1 generation per run, you could simply create the actual +1 as the last step of the job.

A good overall solution is to modify the process so that the +1 from an abended execution is never preserved. That way, there is no concern about over-writing.

jasorn · Active User Joined: 12 Jul 2006 Posts: 191 Location: USA

I must not have presented the situation clearly. Let me try again.

First, to address Dick's suggestion that the generation from an abended execution not be preserved.

As explained below, it's not being preserved. But upon restart the CMT still has the old g00v00 number, which is no longer valid.

Now a restatement of the situation.

(Reading the replies, I'm guessing the confusion is in thinking JOBA and JOBB are 2 different instances of the same job. They're not. They are 2 completely different jobs but write their outputs to the same gdg base.)

The jobs
JOBA
. some steps that do stuff so that a complete rerun isn't desirable.
.
.
.
STEP21 - Create FILEA
STEP22 - Copy FILEA to GDG.DATASETC(+1) <=== Note: same GDG base as JOBB

JOBB
STEP1 - Create FILEB
STEP2 - Copy FILEB to GDG.DATASETC(+1) <=== Note: same GDG base as JOBA

The flow of the situation in question
(0) generation = GDG.DATASETC.G0007V00
1. JOBA abends in STEP21. The +1 generation isn't created, but the CMT knows G0008V00 is the one it's going to create.

2. JOBA is still down when JOBB starts.

3. JOBB runs successfully. Since the (0) generation is GDG.DATASETC.G0007V00, it creates GDG.DATASETC.G0008V00.

4. The issue causing JOBA's abend is resolved and JOBA is restarted in STEP21. This is the problem: the CMT thinks G0008V00 is still the (+1) generation. It's not, because JOBB created G0008V00 before JOBA was restarted.

5. The restarted JOBA runs to end of job successfully. Since the CMT thinks G0008V00 is the (+1) generation, the restarted JOBA copies FILEA to GDG.DATASETC.G0008V00 instead of G0009V00, overwriting the GDG.DATASETC.G0008V00 that JOBB created while the abend of JOBA was being resolved.

So, the question is: can we tell CA-11 in step 4 to restart but create G0009V00 instead of overwriting G0008V00?
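
For anyone following along, the sequence above can be replayed with a small toy model. This is only a sketch of the scenario as described in this post: the `GdgBase` class and the "remembered" generation number are hypothetical stand-ins for the catalog and the CMT, and nothing here reflects how CA-11 is actually implemented.

```python
# Toy model only. GdgBase and joba_remembered are hypothetical stand-ins
# for the catalog and the CMT entry; this is not CA-11 or z/OS code.

class GdgBase:
    """Maps absolute generation numbers to the data written to them."""

    def __init__(self, current):
        self.generations = {current: "existing data"}

    def current(self):
        # The (0) generation is the highest one cataloged.
        return max(self.generations)

    def resolve_plus_one(self):
        # A relative (+1) is resolved against the current (0) generation.
        return self.current() + 1

    def write(self, gen, data):
        # Returns whatever was there before, or None if nothing was lost.
        overwritten = self.generations.get(gen)
        self.generations[gen] = data
        return overwritten


def name(gen):
    return f"GDG.DATASETC.G{gen:04d}V00"


base = GdgBase(current=7)

# Step 1: JOBA resolves (+1) and the number is remembered, then JOBA abends
# before anything is written.
joba_remembered = base.resolve_plus_one()

# Step 3: JOBB runs to completion and resolves (+1) to the same number.
jobb_gen = base.resolve_plus_one()
base.write(jobb_gen, "FILEB")

# Step 5: JOBA is restarted and reuses the remembered number.
lost = base.write(joba_remembered, "FILEA")

# What is being asked for: resolve (+1) again at restart time.
wanted = base.resolve_plus_one()

print("JOBA wrote to:", name(joba_remembered), "and overwrote:", lost)
print("Generation a fresh (+1) would have used:", name(wanted))
```

The point of the model is just that the stale number and a freshly resolved (+1) diverge as soon as any other job catalogs a generation in between.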

dick scherrer · Posted: Fri Mar 14, 2008 7:58 am

Hello,

jasorn · Active User Joined: 12 Jul 2006 Posts: 191 Location: USA

Dick,
For lots of reasons, resubmitting isn't desirable.

Restarting but clearing the CMT is what we really want. If that's not possible, we have the workaround of waiting until all the generations are deleted (we copy them and delete them twice a day). At that point, restarting is fine: since the generation it thinks it should create doesn't exist, it gets a correct g00v00 number.

I just figured this would be a common enough situation, there must be a parm to pass ca11 to tell it to clear the cmt. And I don't remember having this issue at other shops so there must be some good solution.

Guess it's time to look at the manual

jasorn · Active User Joined: 12 Jul 2006 Posts: 191 Location: USA

fyi, here's what i settled on.

After a lot of searching and talking to experts here is an approach that seems to work.

First, a side note. The systems guy who deals with CA-11 at our shop told me he doesn't know of a way to prevent generations from being overwritten in this situation. He has even requested modifications from CA, and their response was that we shouldn't have multiple jobs writing to the same GDG base in this manner.

Anyway here's how I handled this situation.

1. Before writing data to the +1 generation, copy an empty dataset to the +1 generation. This is similar to touching a file in unix. I used disp=(mod,catlg,catlg) for this. This step is important because if there wasn't an abend, the +1 generation wouldn't exist and step 2 would get a JCL error.

2. Check the +1 generation to see if it's empty. Since I just touched it, it should be empty if we can use it, or have data if another job created it while this job was down.

3. If the +1 generation has data, it can't be used, so I try an arbitrary generation. I chose the +111 generation. I touch it as described above and check whether it's empty; if it is, I use it. If it's not empty, I take an abend rather than repeat this cycle, as this situation would be highly unlikely and probably should have human intervention.

4. If the +1 generation is empty, it can be written to using disp=(old,catlg,delete)
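
The decision in steps 1 through 4 can be sketched as below. This is only an illustration of the logic, not the JCL I run: `choose_generation` and the `is_empty` callable are hypothetical names, and `is_empty(rel)` is assumed to report whether the already-touched relative generation holds no data.

```python
def choose_generation(is_empty, fallback=111):
    """Pick the relative generation to write to.

    is_empty is a hypothetical callable: is_empty(rel) returns True when
    the touched relative generation `rel` contains no data.
    """
    # Steps 2 and 4: an empty +1 is safe to write to.
    if is_empty(1):
        return 1
    # Step 3: +1 already has data from another job, so try the fallback.
    if is_empty(fallback):
        return fallback
    # Both hold data: stop and let a human look at it.
    raise RuntimeError("+1 and fallback generation both hold data")


# Normal run or clean restart: +1 is empty.
print(choose_generation(lambda rel: True))

# Restart after another job filled +1: only the fallback is empty.
print(choose_generation(lambda rel: rel == 111))
```

Raising an error in the last case mirrors taking the abend instead of cycling through more generations.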

This is the ONLY solution I could think of. Nobody offered anything that accomplishes this without either maintaining a schedule that would be cumbersome for us, or splitting each of these jobs into one that writes to a flat file and another that copies it to the GDG.

Using the scheduler is problematic in that these are predecessor/successors of one another and there are enough that we exceed the limit on conflicts.