On the JPMC outage

The blogosphere is abuzz about the JPMC outage (1, 2, 3). The main reason people cite for the long recovery time is a big, ambitious database design: everything, including less critical data, was stuffed into one database, which then took a long time to recover.

The reason the outage occurred in the first place is a software bug: Oracle corrupted some files. Worse, the corruption reached the mirror image too, so the tape backup had to be brought in.

I was wondering whether it would have helped if the standby mirror were a versioning volume/filesystem, so that the corruption could be discarded and an old copy restored almost immediately. Is there any difficulty with that? I am sure this versioning can be handled without exposing any extra detail to the higher layers.
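
To make the idea concrete, here is a minimal sketch in Python of what I mean by a versioning mirror. It is a toy model, not how Oracle or any real storage product works: the class `VersionedVolume` and its methods `write`, `read`, `snapshot` and `rollback` are names I made up for illustration. The point is only that the higher layers see plain block reads and writes, while the volume quietly keeps older versions it can return to.

```python
class VersionedVolume:
    """Toy block volume that keeps point-in-time versions (hypothetical API)."""

    def __init__(self, block_size=8):
        self.block_size = block_size
        self._blocks = {}      # current contents: block index -> bytes
        self._snapshots = []   # one saved block map per version

    def write(self, index, data):
        # Higher layers write whole blocks and never see versions.
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self._blocks[index] = bytes(data)

    def read(self, index):
        # Blocks that were never written read back as zeros.
        return self._blocks.get(index, bytes(self.block_size))

    def snapshot(self):
        # Copy only the block map; the immutable block data is shared
        # between versions, so a snapshot is cheap.
        self._snapshots.append(dict(self._blocks))
        return len(self._snapshots) - 1

    def rollback(self, version):
        # Discard the current contents and return to a saved version.
        self._blocks = dict(self._snapshots[version])


mirror = VersionedVolume()
mirror.write(0, b"GOODDATA")
good = mirror.snapshot()             # version taken before the bug strikes

mirror.write(0, b"\x00BADBAD\x00")   # corruption replicated to the mirror
mirror.rollback(good)                # restore without going to tape
restored = mirror.read(0)
```

In this sketch the rollback is just a change of block map, which is why the restore could be nearly immediate. A real system would also have to decide how often to take versions and how long to keep them, since a corruption that goes unnoticed for longer than the retention window would still require tape.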
