Re: despair



On Sep 16, 4:56 pm, "David P. Murphy" <dpm_goo...@xxxxxxxxx> wrote:
On Sep 16, 8:39 am, AEF <spamsink2...@xxxxxxxxx> wrote:

I've been at my current job for 7 years and have been a VMS admin for
13 years and have not had any problems like this. I don't write code
like this. I did inherit some code I wasn't happy with, and I fixed
what I could with what time I had (what I could get authorized by
management to do) and things have been okay.

Let me clarify this point: I am a little upset that this code was
written, but I am much more upset that it has been in place for
over ten years without being corrected.

ok
dpm

Management is often hesitant to change anything. "If it ain't broke,
don't fix it!" Well, you still have to do your periodic oil changes
and inspections. And of course, sometimes it is "broke" even if things
have been running smoothly for a while.

Two jobs ago I was finding problems and struggling for permission to
fix them. One was a DCL code snippet that one of the handover admins
added at the last minute thinking he missed an iteration of
calculating what day monthly reports should be scheduled for. I
analyzed it and told mgmt that the next time such and such fell on a
Tuesday (I think such and such was the first of the month) we would see
a repeat of the near disaster that I had averted from home, at the last
minute, on a previous occasion: the flaw would start report jobs before
the "run commit" had finished or something like that (I don't
presently recall the details), which would of course corrupt the
database, resulting in the need to do a shift-long restore from 8mm
tapes. I was told to leave it alone. Finally, several months later,
when it was getting close to the "doomsday month", I said such and
such will do so and so next month. Will you please let me fix it?
Mgmt. then said yes. Our client was safe again.

[The handover admins -- who were actually also the previous admins --
told us that the admins prior to them told them it was too complicated
to automate the report jobs and Run Commits and such. I guess they
took this as a challenge and, well, missed a few things here and there
that would only crop up later at unexpected times. Hence the illusion
of "working system" I mention below.]

I remember when starting this same job mgmt told me "You're getting a
working system. Don't mess with it." (My title _was_ senior operator,
but I did know _some_ system manager stuff at the time.) Well, I had
to mess with it several times to save the day and prevent other fires.
Another time the header of INDEXF.SYS filled up on a certain critical
non-database, non-system disk. I was there at 6:30 in the morning by myself
and I was able to determine that the system couldn't be opened for the
end users at 7:00 a.m. because of this. I couldn't reach mgmt or the
handover admins (who were training us on the system and app). So on my
own initiative I copied the disk to tape and back. It took a long time, but
it worked. While the restore was going, I was told I shouldn't have
done that. I said I had no choice and I was sure it would fix the
problem. (Well, I _did_ "cross my fingers"!) The restore completed,
the app worked fine, and I had saved the day. One of the handover
admins told me that whenever that happened they would archive some
invoices. Well, I was very new to the job and didn't know about that
manual task yet. I didn't know which files were "expendable" or
archive-able. So I did what I thought was best. The handover admins
who used to run the system apparently didn't know how to fix the
dreaded HEADERFULL problem for good. (I've never had a "repacked" disk
experience a subsequent HEADERFULL event.)

One job ago I was pretty much given a free hand on most things but the
boss insisted that the VAX be rebooted every Friday evening before the
backups were run!!! He was very insistent and must have thought that
disaster would strike if the VAX wasn't rebooted every week. They had
always done that, I think, and were afraid to stop (this was before
Windows took over the desktop at this place). I was told that this was
something carried over from something with their IBM mainframe. I
don't recall any more details. In fact, the head operator said once
that he was worried no one could reboot the machine during Christmas
week. I said, look, it doesn't matter. It doesn't have to be rebooted.
I won't tell anyone. No one will know. It'll be fine. I said if the
operator wants the time off, fine. If he wants to come in to reboot it
to make some overtime, fine. I forget which happened, but I know at
least once it didn't get rebooted and I finally had an uptime of over
7 days. Of course nothing bad happened because of it.

There are probably more similar stories I can't think of offhand. I
feel your pain!

BREAKING NEWS!!! or FOX NEWS ALERT!!! (imagine the FOX Gong here): I
just thought of another disaster I averted at the two-jobs-ago job!
More on that later; I've got to get back to other things. But I'll say
at this point that it involved tape drives!!! And it was a fascinating
time-bomb bug!!!

Before you go blaming me for all these problems: ***I didn't write the
code. I inherited it!*** OK?

Back to a Paul Lynde question.

AEF
