What was I thinking?


So yesterday was Friday the 13th... and I stupidly rebooted the main server at work in the morning. I broke two of my rules: 1. Never touch production boxes on a Friday. 2. Never, ever do anything in the production environment on Friday the 13th. I knew I had to reboot the server so I could work on some new software that I had installed the day before, so I got up early, logged in remotely and restarted the server. While it was “rebooting” I jumped in the shower, figuring I would tweak the new software when I got out and head to work a little later. I got out of the shower and tried to log back into the server... no go. I tried pinging it and still got no response, so I got ready nice and fast and headed to work. Did I mention it was 5 am? I got to work and found the server in a continual reboot cycle. It would get to the login screen and then reboot. I just stood there for about 3 cycles saying every swear word I could think of...

I decided to first do a parallel install of the OS and see if I could get to the registry of the bad install and get the new software uninstalled. After I got the new install done, I jumped into the registry of the bad install and found it totally messed up. So that idea went nowhere and cost me about 40 minutes. Now, this server is the one that holds all the work data for the company: every drawing, memo and piece of documentation for all our current and past jobs. I had to get it back up and running so I didn’t have managers yelling at me that their people weren’t working and the projects weren’t getting done.

I checked the RAID container that holds all the data and saw it was okay. So I sat down and figured that the only data I needed from the OS RAID container was the databases for the NetBackup software. I copied the whole backup software folder to the data RAID and then started a fresh install of the OS. While that was going on, I jumped on the phone with Veritas to find out which files and directories to put back once I got the server back up and the backup software installed. I ended up talking to Simon in London, who was a great help. I then found that I was unable to call England from my phone system, so when I hadn’t called Simon back when I told him I would, he called me directly. I told him about my calling problems and let him know that I wasn’t done with the OS install, so he said he would call me back in an hour. Once I got all the drivers and software installed and the files copied back, Simon called right on time to walk me through a test and to see if all the metadata for all my backups was intact... yeah, it was... and the backups went through last night.

So after many reboots and share setups, the server was back up and running with a fresh OS install, all the data was safe and sound, and I didn’t lose any of my backup information.

The lesson learned – never touch anything on a Friday, especially Friday the 13th 😉