On the JPMC outage

The blogosphere is abuzz about the JPMC outage (1, 2, 3). The main reason people cite for the long recovery time is a big, ambitious database design: everything, even less critical data, was stuffed into one database, which then took a long time to recover.

The outage occurred in the first place because of a software bug: Oracle corrupted some files. Worse, the corruption reached the mirror image too, so the tape backup had to be brought in.

I was wondering whether it would have helped if the standby mirror were a versioning volume/filesystem, so that the corruption could be gotten rid of and an old copy restored almost immediately. Is there any difficulty with that? I am sure this versioning can be handled without exposing any extra detail to the higher layers.
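To make the idea concrete, here is a toy, in-memory sketch of a versioning mirror. Everything in it (the class `VersionedVolume`, its methods, the block layout) is hypothetical and invented for illustration; it says nothing about how Oracle or any real storage product works. It only shows that a corrupt write which reaches the mirror can be undone by rolling back to an earlier snapshot, while reads and writes keep the same simple interface.

```python
class VersionedVolume:
    """Toy mirror that keeps point-in-time snapshots (hypothetical design)."""

    def __init__(self):
        self._blocks = {}     # live state: block number -> bytes
        self._snapshots = []  # frozen copies of the live state

    def write(self, block_no, data):
        # Higher layers see an ordinary block write; no version details leak out.
        self._blocks[block_no] = bytes(data)

    def read(self, block_no):
        return self._blocks[block_no]

    def snapshot(self):
        # Copy the block map; bytes are immutable, so a shallow copy is enough.
        self._snapshots.append(dict(self._blocks))
        return len(self._snapshots) - 1

    def rollback(self, version):
        # Discard the live state and restore the chosen snapshot.
        self._blocks = dict(self._snapshots[version])


vol = VersionedVolume()
vol.write(0, b"good")
clean = vol.snapshot()      # taken before the bug strikes
vol.write(0, b"corrupt")    # the corruption is mirrored faithfully
vol.rollback(clean)         # restore without going to tape
restored = vol.read(0)
```

A real volume would use copy-on-write rather than full copies, and would have to decide how often to snapshot and how long to keep old versions; the sketch ignores all of that.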


Notes from an Interviewer at IBM

Some key points on the criteria for selecting a candidate:

The coding assignment was a huge asset in determining people’s real abilities. Some of the nicest assignments came from those with academically weaker performances. You could clearly see who the hackers and potential future computer science professors were. Without the assignment, the selection process would have been much harder, so I’m glad that it was something we required the applicants to do.

Some key points that indicate a good programmer:

  • Open Source contributions
  • Social media involvement in the programming community. Did they have a StackOverflow profile? A Twitter account related to programming? Perhaps a blog dedicated to the subject? While these points alone certainly don’t mean someone is a good programmer, they are a decent indicator that a person is at least passionate about the subject.
  • Knowledge of unusual programming languages. Again, this is no guarantee that someone is a good programmer, but chances are that they have an above average interest in our profession.

Non-Homogeneous String Burning Puzzle

This variation adds the restriction that the burning is non-homogeneous. That is, after 30 minutes, the flame need not be exactly at the middle of the string.

The basic aha! moment for this puzzle is that if you light the string at both ends, then by the time the two flames meet and fizzle out, 30 minutes will have elapsed (assuming the string burns in one hour). It takes some time to grasp this point. After that, it generalizes to strings with longer or shorter burn times: if you light a string at both ends, half of its total burn time will have elapsed when the flames meet.

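Measuring 45 minutes with two one-hour strings is then easy: light the first string at both ends and the second at one end. When the first burns out after 30 minutes, light the other end of the second, which burns out 15 minutes later.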
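A small simulation can help convince oneself. The sketch below models a string as a list of segments, each with its own burn time in minutes; this piecewise model, the function names and the sample strings are all assumptions made for illustration. Each flame consumes one minute of burn time per minute, so two flames on one string finish it in half the total, wherever they happen to meet. The 45-minute procedure simulated here is: light string A at both ends and string B at one end, then light B's other end when A burns out.

```python
from collections import deque
from fractions import Fraction


def burn_both_ends(segments):
    """Minutes until a string lit at both ends has burned out."""
    remaining = deque(Fraction(s) for s in segments)
    elapsed = Fraction(0)
    while remaining:
        if len(remaining) == 1:
            # Both flames are in the same segment, so it burns twice as fast.
            elapsed += remaining.pop() / 2
        else:
            # Advance until one of the flames finishes its current segment.
            step = min(remaining[0], remaining[-1])
            elapsed += step
            remaining[0] -= step
            remaining[-1] -= step
            if remaining[-1] == 0:
                remaining.pop()
            if remaining and remaining[0] == 0:
                remaining.popleft()
    return elapsed


def burn_one_end(segments, minutes):
    """Segments left after burning from the first end for `minutes`."""
    remaining = deque(Fraction(s) for s in segments)
    left = Fraction(minutes)
    while remaining and left > 0:
        step = min(remaining[0], left)
        remaining[0] -= step
        left -= step
        if remaining[0] == 0:
            remaining.popleft()
    return list(remaining)


def measure_with_two_strings(a, b):
    """A lit at both ends, B at one end; B's other end is lit when A is gone."""
    first = burn_both_ends(a)
    rest_of_b = burn_one_end(b, first)
    return first + burn_both_ends(rest_of_b)


# Two very uneven one-hour strings (made-up segment burn times, in minutes).
string_a = [5, 40, 15]
string_b = [50, 2, 8]
total = measure_with_two_strings(string_a, string_b)
```

With these inputs `total` comes out to 45, and changing how the 60 minutes are distributed among the segments does not change the result.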