Arya: Emergency maintenance

1:42PM Apache on this server needs to be updated and restarted urgently. Sites on this server might go offline for up to 5 minutes at a time until maintenance is completed. Apologies for the unavoidable inconvenience.

2:10PM Maintenance has completed successfully.

Emergency maintenance: Arya

UPDATE: The time for maintenance has changed to 17:30. The server will go offline at 17:25.


The Hetzner datacentre has informed me that the RAID alarm for the server Arya is currently sounding.

This means that either one of the hard drives has failed and requires replacing, or that something has gone wrong with the RAID configuration.

I have scheduled emergency maintenance for 5PM this afternoon, 25 January 2017.

The server will go offline at 4:55PM and should be offline for no longer than 40 minutes while the technicians work on it.

I will make a full server backup before the server goes offline, so that no data is lost.

Your emails will not be affected in any way as they are hosted on a separate mailserver.

Please accept my sincere apologies for any inconvenience this may cause you.

Arya: Emergency maintenance

10 February 2016

8:00PM Arya is back online. Sites will be slow for the next hour or so while the RAID rebuilds.

7:15PM The server is being shut down now.

7:10PM Backup has completed

5:15PM Starting a full server backup now.

5PM Hetzner has just informed me that the RAID alarm for the server Arya is sounding.
This means that either a hard drive has failed (and requires replacing) or there is something wrong with the RAID configuration.
I have given them the go-ahead to take the server offline to diagnose and fix the problem.
I expect up to 30 minutes of downtime while they do this, for which I apologise.

Emergency maintenance

Tue 11 August 4:40PM Hetzner has informed us that the RAID alarm is currently sounding, and that Keats needs to be switched off to diagnose and fix the problem.

5:28PM Keats is being shut down now.

6:02PM Response from Hetzner:

This mail serves to confirm that the maintenance on your server tex001_truservcomm_jhb1_009, was completed successfully. 
SDA was swapped, and RAID is currently rebuilding.

6:15PM The RAID rebuild is at 24%

6:27PM RAID rebuild is at 30%

6:41PM 41%

6:53PM RAID rebuild is at 50%

7:19PM 65%

7:51PM 79%

8:07PM RAID rebuild is at 90%

10:36PM Hetzner says that the server is “fixed”. Unfortunately, it won’t boot. I am therefore going to reinstall the server, and then restore all hosting accounts from backup. Please accept my sincere apologies for this. I will work through the night and tomorrow to get all sites up as soon as possible.

2AM Wed 12 August The OS has been rebuilt, and cPanel and CloudLinux have been installed. Restoration of hosting accounts is starting now.

4:30AM All accounts have been restored from backup.

Ongoing maintenance: Tyrion

10:05 AM Tyrion will be offline for short periods of time throughout the day as Hetzner technicians attempt to troubleshoot a hardware error.

Most clients have been moved off Tyrion and onto Lannister, so this will affect only the 10 or so domains which are still pointing to Tyrion.

13:15 Troubleshooting has completed. Tyrion will go offline at 6PM tonight (14 July 2015) for up to 12 hours while the OS is reinstalled.