dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
In all of the forums I visit I see a growing requirement for EMAIL notifications.
The actual 'sending' of the email is pretty standard, and is not often the main topic.
Most of the time the question is how to access/extract portions of the JOB details/files
in SDSF or SAR or some other spool repository.
The implementation (or plague) of ITIL
(it is a good idea and a way of structuring a poorly managed company)
has let loose the idiot managers with poorly conceived attempts
to meet the reporting requirements
(keeping track of all problems/automatic notification of problems, etc)
which usually involve some grandiose scheme
whereby the sysout is extracted and attached/embedded in an email.
what a waste of time.
normally, more than what is contained in the spool output
is required to solve (identify) the problem.
it is unfortunate that these schemes are not designed
with less loftier mechanisms which would achieve the goal.
an email notification should contain
JOB xxx ABENDED, FIX-IT!
and nothing else.
But such an automated solution does not make for an impressive presentation,
though in reality, that is all that is needed.....
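To show how little such a notification needs, here is a minimal Python sketch that builds the one-line message. The function name `build_abend_notice`, the job name, and the addresses are hypothetical, and actually sending the message (for example over SMTP) is deliberately left out.

```python
from email.message import EmailMessage


def build_abend_notice(job_name: str, sender: str, recipient: str) -> EmailMessage:
    """Build the one-line notification: the job name and nothing else."""
    text = f"JOB {job_name} ABENDED, FIX-IT!"
    msg = EmailMessage()
    msg["From"] = sender        # hypothetical address supplied by the caller
    msg["To"] = recipient       # hypothetical address supplied by the caller
    msg["Subject"] = text       # the subject alone already says everything
    msg.set_content(text)
    return msg


if __name__ == "__main__":
    notice = build_abend_notice("PAYROLL1", "batch@example.com", "oncall@example.com")
    print(notice["Subject"])
```

Putting the whole message in the subject line means the recipient does not even have to open the mail.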
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1742 Location: Tirupur, India
Hello Dick,
Quote:
"such an automated solution does not make for an impressive presentation,"
True. Very true in a services-based company.
In our shop we have an email-sending REXX step on all FTP jobs (100+). In case of FTP failures, the step captures the FTP output from the spool and sends it along with the email.
During off hours, we look at the email on a BlackBerry; if the failure is a simple one, we give the scheduler team restart instructions over the phone.
Just saying that in some rare cases the job output could be useful.
Regards,
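The FTP case could be sketched roughly as follows, in Python rather than the REXX the shop actually uses. The function `notice_body` and the `max_lines` cap are assumptions added for illustration; the post does not say whether the captured output is trimmed.

```python
from typing import Optional


def notice_body(job_name: str, ftp_output: Optional[str] = None, max_lines: int = 50) -> str:
    """Build the email body; append captured FTP output only when it is supplied."""
    lines = [f"JOB {job_name} ABENDED, FIX-IT!"]
    if ftp_output:
        # Assumption: keep only the tail, since the failure is usually near the end.
        tail = ftp_output.splitlines()[-max_lines:]
        lines.append("")
        lines.append(f"Last {len(tail)} line(s) of FTP output:")
        lines.extend(tail)
    return "\n".join(lines)


if __name__ == "__main__":
    # The output text here is a placeholder, not real FTP client output.
    print(notice_body("FTPJOB01", "line1\nline2\nline3", max_lines=2))
```

Jobs other than the FTP ones would call `notice_body` without output and get the bare one-liner.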
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
vasanthz wrote:
"Just saying that in some rare cases the job output could be useful."
point taken, thx for the response.
i made the mistake of making a general statement.
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2547 Location: Silicon Valley
Quote:
"it is unfortunate that these schemes are not designed
with less loftier mechanisms which would achieve the goal."
(Not sure that I do not misunderstand the double negative...) What information would you include?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hi Pedro,
I believe only this:
Quote:
"JOB xxx ABENDED, FIX-IT!"
or something similar. . .
Peter Nancollis
New User
Joined: 15 Mar 2011 Posts: 47 Location: UK
Thoughts...
What is the value of more info over what Dick has suggested ... it's bust, let's get it fixed?
email/sms/pager [remember them?] alerts - good
but then don't you enter a dialog with the Ops?
<sweeping generalisation>
If they can't provide the further info you need, they won't be able to action what you ask [or would you be happy for them to?].
</>
The issue [as I see it] isn't raising the alert [with whatever info is reqd] - but the interaction to get the problem fixed.
Experience and communication on both sides... always was, always will be.
...If not, just code the automation ....ha ha!
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
E-mail as dbz suggested.
Allow "reply" to request job output with known job number. Sufficient, whilst being simpler to implement, and not jamming inboxes of people who neither know nor care what all the output means.
Yes, it loses initial contact with OPS, but having the output in front of you whilst talking to them would be of use, I think. Saves the "Are you sure it says ICE, not IEC?" sort of dialogue.
Might be "fun" if someone leaves a few DISPLAYs in a loop :-)
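The "reply to request the output" idea could look something like this minimal Python sketch. It assumes the original notification carried a JES-style job number of the form `JOBnnnnn` in its subject; the function name and the subject layout are hypothetical, and fetching the output from the spool is not shown.

```python
import re
from typing import Optional

# Assumption: the job number appears in the subject as "JOB" plus five digits.
JOB_ID = re.compile(r"\b(JOB\d{5})\b")


def requested_job_id(reply_subject: str) -> Optional[str]:
    """Return the job number found in a reply's subject line, or None."""
    match = JOB_ID.search(reply_subject.upper())
    return match.group(1) if match else None


if __name__ == "__main__":
    print(requested_job_id("Re: JOB PAYROLL1 (JOB01234) ABENDED, FIX-IT!"))
```

Only replies that carry a recognisable job number would trigger an output mail, so nobody receives sysout they did not ask for. Capping the size of what is sent back would guard against the DISPLAYs-in-a-loop case.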
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
every site that has the fortune (some have said misfortune)
to have me grace their environs
has always had as part of production turn-over documentation,
a description of every step in a job
and detailed notes as to what error may occur
and the action that ops is to take upon encountering error in the job run.
e.g.: restart instructions, 'bad' rc's from a step, etc.
vasanthz provided a good example of what is to be done.
The RBS insurance company had as detailed instructions for operations:
if any step did not end with one of the following RCs:....
operations was to contact (phone) the programmer on call,
and then wait for further instructions.
obviously, much depends on the experience/expertise level of the operations staff.
(as well as the programming staff).
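A run-book rule of this kind can be sketched as a simple lookup, shown here in Python. The step names and return-code sets are invented for illustration (the actual list of RCs is not given above), and treating unlisted steps as "RC 0 only" is an assumption.

```python
# Hypothetical run-book: acceptable return codes per step (values invented).
ACCEPTABLE_RCS = {
    "STEP010": {0, 4},
    "STEP020": {0},
}

CALL_PROGRAMMER = "phone the on-call programmer and wait for further instructions"


def ops_action(step: str, rc: int) -> str:
    """Return what operations should do for a given step and return code."""
    # Assumption: a step missing from the run-book accepts only RC 0.
    allowed = ACCEPTABLE_RCS.get(step, {0})
    return "continue" if rc in allowed else CALL_PROGRAMMER


if __name__ == "__main__":
    print(ops_action("STEP010", 4))
    print(ops_action("STEP020", 4))
```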
The basic premise of my original post
(that i obscured with my double negatives)
is that expending so much effort on creating a 'sophisticated' email
is not necessary most of the time.
for the FTP jobs that vasanthz mentioned,
what was required for a determination by the programmer was a known entity
and in this specialized case,
i would agree, was properly done.
my complaint centered on the general inclusion of sysout
as part of the email for every abended job,
which is not necessary
and, in my experience, not always the definitive piece of info necessary for problem resolution.
My opinion:
send the job name/number/date to the responsible problem resolver.
the fact that the developer of such emails
posts in an internet forum for code to extract all this garbage
means the job of generating the email
fell into the wrong hands.
Plus my inherent prejudice concerning managers
who cause generation of junk to justify their own positions.
Ed Goodman
Active Member
Joined: 08 Jun 2011 Posts: 556 Location: USA
As a developer, I have been evolving the level of information that I want when something breaks.
Normally, I'm dealing with testing schedules that I have set up to run while I attend to other duties at work or at home.
In my case, I want to know the return code and the job name most of the time. If we have an application program fail, I want the standardized application error output from the job.
Reason: I want to be able to thumbnail estimate how long it will take to solve the issue. If it's "just" a space abend, I know I can get it working in a few minutes. If it's some sort of "data has bad values" problem, I know it's going to take some serious debugging.
I want to know if I can continue watching the movie with the kids, or if I need to get signed in and start working.
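That kind of thumbnail estimate could be sketched as a small lookup on the abend code, shown here in Python. The grouping is only an illustration: the Sx37 codes are the usual out-of-space abends and S0C7 is a data exception, but the effort labels and the function name `triage` are assumptions.

```python
# Illustrative grouping only; a real shop would maintain its own table.
SPACE_ABENDS = {"SB37", "SD37", "SE37"}   # out-of-space conditions
DATA_ABENDS = {"S0C7"}                    # data exception (bad numeric data)


def triage(abend_code: str) -> str:
    """Give a rough effort estimate for a system abend code."""
    code = abend_code.strip().upper()
    if code in SPACE_ABENDS:
        return "quick fix: keep watching the movie"
    if code in DATA_ABENDS:
        return "serious debugging: sign in"
    return "unknown: look at the application error output"


if __name__ == "__main__":
    for code in ("SB37", "S0C7", "U4038"):
        print(code, "->", triage(code))
```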