Posted in innovation

How a Startup Evolves

I stumbled upon a very nice article that explains the phases an idea goes through on its way to making money, and then more money; that is, the phases in which a startup evolves.

What is particularly interesting is the kind of people involved in the different phases of the company. The first phase consists of commandos, who work very diligently to give life to the idea. The second phase consists of infantry, who are individually far less productive but specialize in doing things systematically.

To see how much they differ, here is a snippet from the article:

Slashdot was also founded by two college students, Rob Malda and Jeff Bates. When Andover acquired Slashdot, Rob and Jeff handed off the administrative portion of their duties in order to concentrate on their creative work. They were amused that the jobs they used to do as a part-time sideline were now handled by more than a dozen full-time staff members at Andover. This is the normal productivity difference between commandos and infantry. The difference is you can hire more infantry fairly easily, and order the ones you have to do specific tasks. Commandos just don’t work that way.


Posted in design, innovation

On the Ownership of an Idea

Malcolm Gladwell has an article in The New Yorker titled “Creation Myth” that highlights how an idea, to be successful, has to be owned by different people who are good at doing different things with it. For example, the idea of the personal computer and its peripherals – the central theme of the article – had many takers:

  • The one who conceptualized the idea – Douglas Engelbart.
  • The one who built the first implementation of the idea – Xerox PARC.
  • The one who took the idea further, improved it, and made it much more usable – Apple.

So, who does the credit go to? As he says, the truth is complicated. 🙂 Each of these steps was critical to the ultimate success of the idea. The attempts that failed should not be discarded, because they paved the path to success.

Posted in software

We Can Increase Our Intelligence

So says a Scientific American article, which suggests doing the following things:

  • Openness to new experience or learning new things constantly
  • Challenging oneself with tougher things:

Efficiency is not your friend when it comes to cognitive growth. In order to keep your brain making new connections and keeping them active, you need to keep moving on to another challenging activity as soon as you reach the point of mastery in the one you are engaging in.

  • Think creatively and divergently
  • Do things the hard way – because efficiency is not our friend when increasing our intelligence
  • Networking – because it is the arena where all the above bullets can be easily exercised. 🙂
Posted in engineering

HTTP and Page Load Times

Some salient points to note from this article on improving page load times:

  • Having KeepAlive on has two advantages: the extra time for the TCP three-way handshake is not needed again, and slow start won't happen again – which means the current bandwidth window is used to get the data from the server.
  • With ADSL connections (typical downstream-to-upstream ratios being 5:1), if requests are large, the upload bandwidth can become a bottleneck, which means the page load time will be longer.
  • If we have pipelining enabled, the latency part of the pipe between the client and server can be reasonably hidden.
  • By having more than one connection between the client and server, content can be downloaded in parallel.
  • Use of AJAX will reduce the request/response sizes and hence things will download faster.
  • The disadvantage of a KeepAlive connection is at the server side – the connection resources (which are limited) are held up there.
  • KeepAlive is an HTTP-layer concept – so the server maintains a timer to kill connections once the KeepAlive period expires.
  • Serving static content from a separate server meant for it will take the load off the dynamic content server.
  • There is a technique called CSS Sprites that can be used to combine multiple small images into a single file so that all the images can be downloaded in one request thus taking latency out of the picture to some extent.
  • When different hostnames are used (even if the IP address backing them is the same), browsers have a tendency to open separate connections for each hostname. So, addressing different resources with different hostnames will also reduce the page load time – the perceived latency is divided across as many hostnames as are used in the webpage.
  • Preferably, an object already loaded from a specific hostname should be loaded again from that same hostname, because the content may already be cached.
  • Content can not only be cached at the server side, but can also be cached at the browser side.
  • Apparently the “Expires” header can be used to say how long an object can be cached, but which objects it should be used for is something that needs to be figured out.
  • In general, a “?” found in the URL will make caches decline to cache it.
  • Setting another domain to serve static content will also make the headers small because cookies and other such stuff need not be sent along to this domain.
  • Conditional GETs also exist – object-specific metadata (such as a last-modified timestamp) exchanged between the browser and the server lets the server reply with “304 Not Modified”, so the browser need not download the whole object again.
  • On the whole, the much-desired pipelining feature is disabled by default in browsers (and Chrome is a bit worse – it does not support it at all) for reasons unknown.
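
The Expires-header and conditional-GET points above can be sketched in a few lines of Python. This is a toy model of the browser-side caching decision, not any real browser's implementation; the function names and the four-step logic are my own illustration:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh(expires_header, now):
    # A cached copy can be reused without contacting the server
    # until the Expires timestamp passes.
    try:
        return now < parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # missing or malformed Expires: revalidate

def conditional_get_headers(last_modified):
    # Sent on revalidation; if the server answers "304 Not Modified",
    # the browser keeps its cached copy instead of re-downloading.
    return {"If-Modified-Since": last_modified}

now = datetime(2011, 6, 1, 10, 0, tzinfo=timezone.utc)
print(is_fresh("Wed, 01 Jun 2011 12:00:00 GMT", now))  # True: serve from cache
print(is_fresh("Tue, 31 May 2011 12:00:00 GMT", now))  # False: revalidate
print(conditional_get_headers("Tue, 31 May 2011 12:00:00 GMT"))
```

Note how the two mechanisms compose: a fresh object costs zero requests, a stale-but-unmodified object costs one small request/response (the 304), and only a genuinely changed object costs a full download.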
Posted in tips, wisdom

Best Advice on Getting Things Done

Tony Schwartz gives the following best advice on getting things done:

The answer, surprisingly, is not that they have more will or discipline than you do. The counterintuitive secret to getting things done is to make them more automatic, so they require less energy.

It turns out we each have one reservoir of will and discipline, and it gets progressively depleted by any act of conscious self-regulation. In other words, if you spend energy trying to resist a fragrant chocolate chip cookie, you’ll have less energy left over to solve a difficult problem. Will and discipline decline inexorably as the day wears on.

“It is a profoundly erroneous truism that we should cultivate the habit of thinking of what we are doing,” the philosopher A.N. Whitehead explained back in 1911. “The precise opposite is the case. Civilization advances by extending the number of operations we can perform without thinking about them.”

A ritual, consciously created, is an expression of fierce intentionality. Nothing less will do, if you’re truly determined to take control of your life.

The good news is that once you’ve got a ritual in place, it truly takes on a life of its own.

Posted in engineering, storage

SSD controllers and File Systems

ACM queue carries an interesting article on the effects of deduplication on file system reliability.

The main problem is that the flash disk controller does deduplication to avoid writing the same block twice on the disk. This is beneficial because it saves space and also reduces the number of write operations to the flash, directly improving its lifespan.

The glitch concerns file systems that store redundant copies of the superblock and other metadata blocks on the disk for reliability – in case one copy goes bad, another copy can be read. With flash controllers doing deduplication, only one physical copy is present on the disk. Which means if that copy goes bad, all logical copies go bad. Which is bad.

One possible solution to this problem would be to have something in each duplicate copy that makes it differ from the others – this makes file system operations a bit slower, but it is still achievable, in my opinion.
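
A toy model of the problem and the proposed fix. The content-hash store below stands in for a dedup-ing flash controller; the "copy0:"/"copy1:" tags are my own stand-in for whatever per-copy distinguishing data a real file system might embed:

```python
import hashlib

def write_blocks(blocks):
    # A content-addressed store, as a dedup-ing controller might keep one:
    # identical blocks collapse to a single physical copy.
    store = {}
    for b in blocks:
        store[hashlib.sha256(b).hexdigest()] = b
    return store

superblock = b"...superblock metadata..."

# Two "redundant" copies silently dedupe into ONE physical block:
plain = write_blocks([superblock, superblock])
print(len(plain))  # 1 -- the redundancy is gone

# Embedding a per-copy marker makes the copies physically distinct again:
salted = write_blocks([b"copy0:" + superblock, b"copy1:" + superblock])
print(len(salted))  # 2 -- both survive dedup
```

The cost hinted at in the text is visible here too: the file system must strip or account for the per-copy marker on every read and write of those metadata blocks.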

So, hardware-based deduplication has its own share of issues that need to be tackled at the file system layer.

Posted in design, engineering, innovation

Browser Security from Google Chrome

ACM Queue has an interesting article on the security measures taken in the Google Chrome web browser to thwart attempts to attack a browser and exploit its weaknesses.

The article nicely summarizes the three main things to achieve the above goal:

  • Mitigating or nullifying the actions that are caused by vulnerabilities.
  • Push updates frequently.
  • Warn users about malicious sites with the help of a global database of malicious sites.

The first part consists of two things: try to prevent the damage in the first place, and if damage does happen, keep it isolated so that it won't have any side effects. Measures can be taken to prevent malicious code execution with the help of the OS/hardware/tools. Techniques include:

  • Data Execution Prevention: Mark the NX [not executable] flag on pages that hold the heap/stack etc., so that when buffer overflows and other such flaws are exploited to drop code into the stack or heap, execution of that code can be prevented. The process will just crash.
  • Stack overflow check: A small random value is placed between the top of the stack frame and the return address. On return, that small value is checked; if it is not intact, that is a case of a stack overflow. This feature is provided by the compiler. It is so simple a technique that I wonder why modern compilers do not enable it by default.
  • Address Space Layout Randomization: This seems to be a new feature where the data/stack/heap sections start at different addresses each run, unlike the current way of starting them at well-known virtual addresses in the process address space. This makes identifying those sections difficult.
  • Heap Corruption Detection: This is not very cleanly achievable unless the virtual machine supports it as a native feature.
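
The stack-canary idea from the second bullet can be simulated in a few lines. This is purely a conceptual model (real canaries are planted and checked by compiler-generated code in the function prologue/epilogue, e.g. GCC's -fstack-protector); the frame layout and function names here are my own illustration:

```python
import os

CANARY = os.urandom(4)  # random guard value, chosen once per process

def build_frame(buf, ret_addr):
    # Simplified stack frame layout: buffer, then canary, then return address.
    return buf + CANARY + ret_addr

def check_frame(frame, buf_len):
    # Before "returning", verify the canary survived. An overflow that
    # ran past the buffer toward the return address must have clobbered
    # the canary on its way, so a mismatch means: abort, don't return.
    return frame[buf_len:buf_len + len(CANARY)] == CANARY

frame = build_frame(b"A" * 16, b"RET!")
print(check_frame(frame, 16))   # True: canary intact, safe to return

smashed = b"B" * 20 + frame[20:]  # overflow spills 4 bytes past the buffer
print(check_frame(smashed, 16))   # False: canary clobbered, abort
```

The key design point is ordering: because the canary sits between the buffer and the return address, an attacker cannot reach the return address via a linear overflow without disturbing the canary.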

The main security vulnerabilities seem to crop up in the rendering engine, where JavaScript is executed, page rendering is done, etc. Chrome runs that inside a sandbox so nothing explodes out of it. This is another way to prevent vulnerabilities from causing side effects.

That completes the first part. The second part is about pushing patches painlessly to clients. While it is still not possible to apply patches without restarting the browser (what, huh! Linux has a way to apply kernel patches without rebooting the kernel), Google has still come a long way in making it simpler. The updates that are pushed are incredibly small – thanks to their smart diff tool, Courgette. The net effect is that updates can be pushed faster and more often, which means vulnerabilities are fixed sooner.

The last part of the job is to inform the user beforehand about visiting a potentially malicious site. This job is technically simple compared to the above two. Collaborate with a service that keeps an updated list of malicious sites. There is no need to push the user's URL to that service; the browser can download the list (or a hashed form of the list) and check whether the user is entering a malicious website. This is the simplest of the three jobs. Prevention is better than handling, which is better than cure.
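
The "hashed form of the list" idea can be sketched as a local hash-prefix check. This mirrors the general shape of Safe Browsing-style lookups, not the exact protocol; the example hostnames and the 4-byte prefix length are made up for illustration:

```python
import hashlib

def url_hash(url):
    return hashlib.sha256(url.encode()).digest()

# The browser periodically downloads short prefixes of hashed bad URLs.
# Checking happens locally, so browsing history never leaves the machine.
blocklist_prefixes = {
    url_hash(u)[:4]
    for u in ["http://malware.example/", "http://phish.example/login"]
}

def probably_malicious(url):
    # A prefix hit is only "probably" bad: a real implementation would
    # then fetch the full hashes for that prefix to confirm the match.
    return url_hash(url)[:4] in blocklist_prefixes

print(probably_malicious("http://malware.example/"))  # True: warn the user
print(probably_malicious("http://example.com/"))      # False: let it load
```

Storing only prefixes keeps the downloaded list small while still letting the browser rule out the vast majority of URLs without any network round trip.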

One thing worth noting is the extent of automated testing done by Chrome engineers to assure the quality of the product. In their own words:

The Google Chrome team has put significant effort into automating step 3 as much as possible. The team has inherited more than 10,000 tests from the WebKit project that ensure the Web platform features are working properly. These tests, along with thousands of other tests for browser-level features, are run after every change to the browser's source code.

In addition to these regression tests, browser builds are tested on 1 million Web sites in a virtual-machine farm called ChromeBot. ChromeBot monitors the rendering of these sites for memory errors, crashes, and hangs. Running a browser build through ChromeBot often exposes subtle race conditions and other low-probability events before shipping the build to users.

All in all, professional act!

Posted in software


Beautiful Words, from this guy, need to be etched in our brains:

“The amount of skill that you have in a certain area is proportional to the amount of work that you put into it. There is no such thing as a ‘creative’ or ‘technical’ type. The reason I was bad at art starting out is the same reason we are bad at anything starting out.”

Posted in software

SuperComputer on AWS Cloud!

This article talks about how one company built a 10,000-core compute cluster on the AWS cloud for its client.

It’s quite wonderful to see these kinds of cloud computing applications being developed. For example, here are some important facts to note:

  • Cycle Computing boasted that the cluster was roughly equivalent to the 114th fastest supercomputer in the world on the Top 500 list, which hit about 66 teraflops.
  • Cycle and Genentech instead opted for a “standard vanilla CentOS” Linux cluster to save money, according to Stowe.
  • The 10,000 cores were composed of 1,250 instances with eight cores each, along with 8.75TB of RAM and 2PB of disk space.
  • Scaling up a couple of thousand cores at a time, it took 45 minutes to provision the whole cluster. There were no problems. “When we requested the 10,000th core, we got it,” Stowe said.
  • The cluster ran for eight hours at a cost of $8,500.
  • For Genentech, this was cheap and easy compared to the alternative of buying 10,000 cores for its own data center and having them idle away with no work for most of their lives, Corn says.
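
A quick back-of-envelope on the numbers quoted above, to see what the run works out to per core-hour:

```python
# Figures from the article: 10,000 cores, 8 hours, $8,500 total.
cores, hours, total_cost = 10_000, 8, 8_500

per_core_hour = total_cost / (cores * hours)
print(f"${per_core_hour:.3f} per core-hour")  # $0.106 per core-hour
```

At roughly a dime per core-hour, with zero idle time afterward, it is easy to see why this beat buying and hosting the hardware.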