
*
2007/07/27 22:06:35

Moving mailboxes

Mailbox migration time as a function of size and message count

And so the big migration to Exchange starts. Of course, pretty graphs are required for analyzing how fast mailboxes migrate, in order to estimate the load. What better way to do so than to send a Mathematica notebook to the migration team? Perhaps one that pulls data live from the server, so they can get the stats themselves whenever desired (and rotate the graph too). Unfortunately I doubt many of you have Mathematica 6 (needed for transparency), and I'm not going to release the raw stats data either, so a picture will have to do.

The cyan plot is a quadratic fit; the red plot is a mesh of the actual data points.
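For anyone without Mathematica, the kind of fit shown in the graph can be sketched in plain Python. This is only an illustration under stated assumptions: it assumes the "quadratic fit" is a full second-order polynomial in both mailbox size and message count, it uses ordinary least squares via the normal equations, and the sample data is synthetic (generated from made-up coefficients), since the real stats are not published. The function names and units are hypothetical.

```python
# Least-squares fit of t = a + b*s + c*n + d*s^2 + e*s*n + f*n^2,
# where s is mailbox size and n is message count (units are arbitrary here).

def quad_features(size, count):
    """Terms of a full quadratic in two variables."""
    return [1.0, size, count, size * size, size * count, count * count]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; the data needs more variety")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_quadratic(samples):
    """samples: list of (size, count, seconds). Returns the 6 coefficients."""
    rows = [(quad_features(s, n), t) for s, n, t in samples]
    k = 6
    # Normal equations: (A^T A) x = A^T t
    ata = [[sum(f[i] * f[j] for f, _ in rows) for j in range(k)] for i in range(k)]
    atb = [sum(f[i] * t for f, t in rows) for i in range(k)]
    return solve_linear(ata, atb)

def predict(coeffs, size, count):
    """Evaluate the fitted surface at one (size, count) point."""
    return sum(c * f for c, f in zip(coeffs, quad_features(size, count)))

# Synthetic stand-in data: made-up coefficients sampled on a small grid.
TRUE_COEFFS = [30.0, 120.0, 45.0, 8.0, 3.0, 2.0]
GRID = [0.0, 0.5, 1.0, 1.5, 2.0]
samples = [(s, n, predict(TRUE_COEFFS, s, n)) for s in GRID for n in GRID]

coeffs = fit_quadratic(samples)
print("fitted coefficients:", [round(c, 3) for c in coeffs])
print("estimate at size=1.2, count=0.7:", round(predict(coeffs, 1.2, 0.7), 3))
```

With real measurements in place of the synthetic grid, `predict` would give the load estimate for a mailbox of a given size and message count. Rescaling the inputs to small numbers (say, GB and tens of thousands of messages) keeps the normal equations well conditioned.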

*
2007/06/18 23:51:59

25 drives in 2U

HP MSA70 SAS enclosure

That's 25 146G 2.5" 10k SAS drives, or 3.6TB raw. The enclosure is a somewhat oversized holder for the drives, too: one layer up front going back maybe 6", and then just power, fans, and interface behind for another foot or so. The drives only list 5W on 5V and 3.6W on 12V, so it's pretty reasonable on power too.
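The arithmetic behind those numbers, as a quick sketch. The only inputs are the figures quoted above. Treating the two label ratings as additive, and the result as a rough ceiling for the drives alone (fans, interface, and power supply losses excluded), is my assumption.

```python
# Back-of-the-envelope capacity and drive power for a full enclosure.
DRIVES = 25
GB_PER_DRIVE = 146   # decimal gigabytes, as marketed
WATTS_5V = 5.0       # listed draw on the 5V rail, per drive
WATTS_12V = 3.6      # listed draw on the 12V rail, per drive

raw_gb = DRIVES * GB_PER_DRIVE
raw_tb = raw_gb / 1000
watts_per_drive = WATTS_5V + WATTS_12V
total_watts = DRIVES * watts_per_drive

print(f"raw capacity: {raw_gb} GB ({raw_tb:.2f} TB)")
print(f"drive power:  {watts_per_drive:.1f} W each, {total_watts:.0f} W total")
```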

Seagate ST9146802SS 10K 146G 2.5" SAS drive

*
2007/04/07 00:42:29

The week from hell

I started off the week at a point where I needed a vacation, looking forward to it being a short week followed by a 4 day weekend. Last week was busy and ended with a long evening of taking everything down to replace a cache controller on a SAN. Unfortunately, despite a good start this week of hiding at work and getting things done for 3 hours, it just went downhill.

The problems started with runaway stuff filling up a disk (of course on a volume that happened to be missed when setting up monitoring). Then there's the fun issue of Linux drivers turning SCSI_STATUS_BUSY (retry) into BUS_BUSY (abort) to fix some specific bug. Unfortunately, when that meets the contention and delays that the combination of ESX and a shared SAN causes, devices end up locked in read-only mode. Neither the filesystem nor the database engines like that much, and the "fix" is currently a hack in the HBA driver (to roll back the vendor's change).

Then came finding random corruption in a database (on Windows this time, unrelated to the Linux driver issue). After trying to track that down, we discovered that the timing correlated with a drive issue behind a SAN controller, which apparently leaked through the redundancy without being caught. Because the problem started on the weekend, and log replays require downtime, recovery meant replaying transaction logs from Saturday through yesterday, which took hours in itself, not counting the staging/testing, pulling backup copies, and restoring files at various states.

Then I had a spam filter randomly die. It turns out the CPU fan failed, so it would run for 5-10 minutes and then go into thermal protect. Luckily, once I knew the problem, I was able to safely dequeue the mail without issue. After I sent a screenshot and checked the motherboard model (for which they had me ignore the "Warranty void if removed" sticker), a replacement part was on the way. Not the instant replacement swap-out I expected from most reports in support forums, but swapping a fan and heatsink is quicker and easier for me than a backup/restore. The replacement arrived the next day, along with a replacement warranty sticker and an extra one (which amused me), although I'm not thrilled with the design of the replacement part even though it seems to work.

All of this had to be worked around 7 hours of meetings and the usual daily stuff. Plus I ended up rescheduling my Monday vacation day, between not really getting the whole weekend anyway and the practicalities of scheduling. Hopefully that turns into next weekend being 4 days, though. I did end up with a few minutes of extra time while at the mall over lunch on Thursday, which resulted in toy shopping (which will be another post), so at least there was something fun in the week. I'm just glad there was one fewer day for things to go wrong, and it's now the weekend.

*
2006/10/07 22:22:21

Self-destruct in 5 seconds. Have a nice day...

Ah, the joys of Linux. Not exactly something you want to see your server (or any computer for that matter) say...

Self-destruct in 5 seconds. Have a nice day...
*
2006/09/26 16:33:02

Success...Maybe

The joys of alert messages...

Success The certificate for the server certificate chain authority has been installed. The server will not accept client certificates issued by this authority.