And I got in early
today; the SOGSMGR (the system manager for the SOGS computer system used
by the Operations Center here) won't be in for another 40 minutes or so.
I begin prepping the windows in the workstation that is responsive,
only to discover two of them are hung! Wunnerful. Okay, so there are two
windows I can't work with just yet; I put those away. In one of the windows
that I can use I read the shift report from last night. Hmmmm....new datadisk
for data evaluation was put in...they received science data and processed
it...no new engineering telemetry received last night...oops, problems
with processing one of last night's observations...hmmmm, problems with
repairing another observation that went to trouble (make a note of that
to work on today).... problems with the pipeline cleanup last night, need
to investigate that this morning, too...and problems with one of our software
tools that needs trouble-shooting. Well, this could prove to be a fun
morning.
I start up the shift report, check the disk space, and check when the
STRs are (STR = Science Tape Recorder dumps; when the spacecraft dumps
the recorded science data to the ground and ultimately up to us). Hmmmm...it
seems that the new datadisk isn't showing up on our little disk-check
tool (it actually has another name, but for simplicity we'll just refer
to it as the "disk-check" tool). A quick investigation revealed that our
disk-check tool wasn't properly updated when the new disk was installed,
and therefore the disk-check tool isn't picking up on the new disk. Great,
only a minor problem! A simple fix.
Slowly my co-workers filter in, and the SOGSMGR arrives! Woo hoo! I
immediately corner her about the problem with the hung workstation and
with the hung windows in the other workstation. She gets back to us a
few minutes later. Apparently the workstation that's hung has gotten itself
into a Bad State (tm) due to over-allocation of resources. I try logging
out and back into the workstation, but this doesn't alleviate the problem.
The workstation's resources are still overtaxed. This is Not Good (tm),
as we need this workstation to assist in processing the data. I'm expecting
to get a dump of science data in the next half hour or so. We try logging
out of and back into the workstation again, but this doesn't help. Okay, the
SOGSMGR decides to perform the ultimate act, and reboots the machine.
This works!
Quickly I bring up the workstation, and get the process manager up and
going. The process manager is the pipeline through which our science data
is processed; it needs to be up and operating to process any data
we might get. And a few minutes later, the data from the spacecraft hits
our system and processes through, with nary a hitch.
While that occurs, I turn my attentions next to the data that is sitting
in trouble, waiting to be fixed (you see, when an observation fails to
process properly, for whatever reason, it is sent to a trouble area for
us to attend to as time will allow; we try to do this promptly so we
can get the data out to the person who originally requested it as
soon as possible). A couple of the support team members join in and we
investigate the different observations in trouble for an hour or so. A
couple of them we were able to readily fix and reinsert for processing.
They processed just fine. The others had a problem that needed more detailed
investigation. The two support team members drifted off in thought, saying
they would try to get back to me on those later. On to the next item on the
list!
The SOGSMGR came back to tell me that the problem with the two hung
windows had been corrected (there was a problem with the disk driver,
which was hung). My two frozen windows were freed up. Great! More windows
to work with (I just love multi-tasking).
I reread the shift report, to make sure I didn't miss anything. And
then I read the message waiting from PASSOPS (down at Goddard) that there
is more engineering telemetry data from last night ready to be copied.
PASSOPS deals with the engineering telemetry data processing for us, in
addition to a half dozen other things (such as satellite uplink/downlink
requests). Okay, I call up another window that I'm not using and start
the tool to copy up the data. Unfortunately, I get the following message:
-SYSTEM-F-UNREACHABLE, remote node is not currently reachable
Ack!! Okay, from past experience I know something's up with the line. I jog
over to the SOGSMGR area to ask if she knows anything about line problems,
and overhear her on the phone with someone else, explaining to them that
there is a problem with the data line and the repair crew from Bell Atlantic
is looking into it. Okay, so, no data copying for me at the moment. I'll
check back on the line status later.
I return my attentions to one of the observations in trouble as one
of the support guys comes back with a potential solution. We implement
his solution, and try reinserting the one observation...and wait...in
the background I can hear the SOGSMGR discussing with the system manager
down at Goddard the line problem...the data enters the pipeline...processes...and
goes through! Whew.
Well, that was the morning. It's time for lunch!
I strolled out with a couple other co-workers and wandered onto campus
for a quick pizza lunch. After enjoying the warm spring sun and discussing
various classes we had taken in our past (or currently were taking), both
good and bad aspects, we returned to work.
I slid back in and coordinated with one of the support team members
about a potential fix for the other observations that were in trouble.
I spent the next hour or so going through the repair procedure and...voilà!
Data processed.
Throughout the day I kept tabs on the status of the lines between here
and Goddard. If they're down, that means no data. No data means...well,
you can figure that out. But finally the lines came up, and I began copying
up the engineering telemetry data. At the same time I checked what data
had gotten through the DADS archive system and could be archived. Quite
a bit of it, it turned out. So I started a batch of data archiving.
Finally I gathered my notes from the shift, and finished editing my
shift report, and handed everything over to the evening shift. I let them
know that the PASSOPS data was still coming up, and that archiving was
going on, and that they should expect to have an otherwise quiet evening.
All the problems from last night and today amazingly were taken care of;
nothing for the evening shift to do but normal activities.
Instead of going straight home, being that it was *such* a nice day,
I opted to go out for an hour or so climbing at a local crag. After that,
I returned home, logged on to check what email I had accumulated during
the evening, and began dinner. Oh, yeah, foooood....mmmmm!
Finally I kicked back to relax and watch the latest episode of Babylon
5 that I recorded last week (and hadn't had a chance to view yet), and
then turned in to sleep. It was late, and I had to be back in early again
tomorrow...