Things I would love to buy for my kid

  • Lego
  • Arduino
  • Squishy Circuits
  • DK History of the World

The Long Tail of Media in the Indian Market?

A Netflix-like service in India is still far away because of the thin internet pipes that run to houses; the experience would be horrible. Are there any other kinds of services out there that help users pick movies of their choice the way Netflix does? I guess so. Currently, they are in their crudest form, mostly because of the short-sightedness of the people running those services. For example, Tata Sky’s Showcase and Dish TV’s Movie on Demand are baby steps in that direction.

The cost is a bit on the higher end and the movie selection is utterly pathetic, with no real choice at all. Let’s hope this improves (and evolves) over time, so that renting movies becomes cheaper and the selection becomes potentially unlimited. Let’s hope the people running those businesses realize the value of the Long Tail.

Calibrating Loops Per Jiffy in Linux Kernel

I was going through the kernel code that calibrates the delay loop and was just amazed at the beauty of it, so I thought I would share my joy here. Here is the source code for the warm-up before the calculation:

        while ((loops_per_jiffy <<= 1) != 0) {
            /* wait for "start of" clock tick */
            ticks = jiffies;
            while (ticks == jiffies)
                /* nothing */;
            /* Go .. */
            ticks = jiffies;
            __delay(loops_per_jiffy);
            ticks = jiffies - ticks;
            if (ticks)
                break;
        }

The above loop finds the lowest bit position in the loops_per_jiffy variable such that the resulting number, when passed to the __delay function, keeps __delay running long enough for a jiffy to tick over. In other words, starting from the LSB, it moves the bit to the left (in effect doubling loops_per_jiffy) in every iteration until the number is big enough for __delay to cross a jiffy boundary. The empty inner while loop is there to let go of the current jiffy, that is, to start at the beginning of the next jiffy window. We don’t know where in the current jiffy window we are, so if we started __delay right away, our loops_per_jiffy calculation would be skewed.

Let’s introduce a small notation here: lpj_x is the loops_per_jiffy variable with only bit x set, and LPJ is the value of the variable that makes the __delay function take exactly one jiffy.

When we come out of the loop, the following assertion holds: lpj_x causes __delay to cross a jiffy and lpj_(x-1) does not. Does this mean that lpj_x is the exact delay that causes a jiffy to pass? No. The exact LPJ is somewhere between lpj_(x-1) and lpj_x. So, starting from here, we have to refine the estimate further.

And herein comes another piece of beauty.

        loops_per_jiffy >>= 1;
        loopbit = loops_per_jiffy;
        while (lps_precision-- && (loopbit >>= 1)) {
            loops_per_jiffy |= loopbit;
            ticks = jiffies;
            while (ticks == jiffies)
                /* nothing */;
            ticks = jiffies;
            __delay(loops_per_jiffy);
            if (jiffies != ticks)   /* longer than 1 tick */
                loops_per_jiffy &= ~loopbit;
        }

Since lpj_x is only a loose upper bound on LPJ and lpj_(x-1) a loose lower bound, we proceed to narrow the gap, which is what the above loop does. To understand this, consider x to be 3. Then lpj_3 = 8, which is 1000 in binary, and lpj_2 = 4, which is 100. The more precise LPJ is between lpj_2 and lpj_3. We start from lpj_2 and try setting each lower bit in turn, keeping a bit only if __delay still finishes within a jiffy. That is, we start with 4 (100) and try 6 (110); if 6 takes too long we drop back to 4 and try 5 (101), otherwise we keep 6 and try 7 (111). At most lps_precision bits below the leading one are tested, so for large values the lowest bits stay zero; this is apparently considered a good enough approximation.

That is exactly what the loop does. As usual, the empty inner while loop is there to align ourselves to the beginning of a jiffy window.

At the end, loops_per_jiffy will be close to LPJ.
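To see the two phases working together, here is a small user-space sketch. It is not the kernel code: the jiffies counter and __delay are replaced by a hypothetical ticks_elapsed() model in which a delay that starts on a tick boundary spans loops / true_lpj ticks, and the starting value of 1 for loops_per_jiffy is my assumption, since the snippets above do not show the initialisation.

```c
/* Hypothetical stand-in for "run __delay(loops) and see how far jiffies moved".
 * Assumes the delay starts exactly on a tick boundary and that one jiffy
 * lasts true_lpj loop iterations (true_lpj must be nonzero). */
static unsigned long ticks_elapsed(unsigned long loops, unsigned long true_lpj)
{
    return loops / true_lpj;
}

/* Same control flow as the two kernel loops, run against the model above. */
unsigned long calibrate(unsigned long true_lpj, int lps_precision)
{
    unsigned long loops_per_jiffy = 1;  /* assumed starting value */
    unsigned long loopbit;

    /* Coarse phase: double until the delay crosses a tick. */
    while ((loops_per_jiffy <<= 1) != 0) {
        if (ticks_elapsed(loops_per_jiffy, true_lpj))
            break;
    }

    /* Fine phase: step back one bit, then try each lower bit in turn. */
    loops_per_jiffy >>= 1;
    loopbit = loops_per_jiffy;
    while (lps_precision-- && (loopbit >>= 1)) {
        loops_per_jiffy |= loopbit;
        if (ticks_elapsed(loops_per_jiffy, true_lpj))  /* longer than 1 tick */
            loops_per_jiffy &= ~loopbit;
    }
    return loops_per_jiffy;
}
```

With true_lpj set to 8, calibrate(8, 8) walks through 4, 6 and 7, as in the example above. With true_lpj set to 6, the value 6 is rejected and the loop settles on 5. For calibrate(1000, 8) the result is 998, because the eight precision steps run out before the last bit is tried.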

The only other magic sauce here seems to be __delay. It is just a routine that keeps the CPU busy (a CPU-only operation) for the given number of iterations. We use it as a tool to tune loops_per_jiffy.

On the JPMC outage

The blogosphere is abuzz about the JPMC outage (1, 2, 3). The basic reason people cite for the long recovery time is a big, ambitious database design: stuffing everything (even the less critical things) into one database, which then takes a long time to recover.

The basic reason the outage occurred in the first place is a software bug: Oracle corrupted some files. Besides, this corruption reached the mirror image too, because of which the tape backup had to be brought in.

I was wondering whether it would have helped if the standby mirror were a versioning volume/filesystem, so that corruptions could be gotten rid of and an old copy restored almost immediately. Is there any difficulty with that? I am sure this versioning can be taken care of without exposing any extra detail to the higher layers.

Non-Homogeneous String Burning Puzzle

This variation poses an additional restriction: the burning is non-homogeneous. That is, after 30 minutes, the flame need not be exactly at the middle of the string.

The basic aha! moment for this puzzle is that if you start burning at both ends, then at the point where both flames meet and fizzle out, 30 minutes will have elapsed (assuming the string burns in one hour). One needs to take some time to grasp this point. After that, it can be generalized to strings of any burn time: if you light a string at both ends, then half of its burn time will have elapsed when the flames meet.

Thus, measuring 45 minutes with two one-hour strings is then easy.
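The 45-minute procedure is: light the first string at both ends and the second at one end; when the first burns out (30 minutes), light the other end of the second, which then lasts another 15 minutes. The sketch below checks that under a simple model of my own: each string is a list of segments, each with its own burn time in minutes, so the burn rate is uneven along the length. The function names and the segment values are made up for illustration.

```c
#include <stddef.h>

/* Burn a string from its left end for `minutes`.
 * rem[k] is the unburnt burn time (in minutes) of segment k, updated in place. */
void burn_one_end(double *rem, size_t n, double minutes)
{
    for (size_t k = 0; k < n && minutes > 0.0; k++) {
        double step = rem[k] < minutes ? rem[k] : minutes;
        rem[k] -= step;
        minutes -= step;
    }
}

/* Light both ends and return the minutes until the two flames meet.
 * rem[] is consumed in the process. */
double burn_both_ends(double *rem, size_t n)
{
    long i = 0, j = (long)n - 1;   /* segments the left and right flames are in */
    double t = 0.0;

    while (i < j) {
        /* Advance time until one flame finishes its current segment. */
        double step = rem[i] < rem[j] ? rem[i] : rem[j];
        t += step;
        rem[i] -= step;
        rem[j] -= step;
        if (rem[i] == 0.0) i++;
        if (rem[j] == 0.0) j--;
    }
    /* Both flames are in the same segment: they share what is left of it. */
    if (i == j)
        t += rem[i] / 2.0;
    return t;
}

/* Total minutes measured by the two-string procedure described above. */
double measure_45(double *a, size_t na, double *b, size_t nb)
{
    double first = burn_both_ends(a, na);  /* string A, lit at both ends */
    burn_one_end(b, nb, first);            /* string B burns from one end meanwhile */
    return first + burn_both_ends(b, nb);  /* then B's other end is lit too */
}
```

For example, with segment burn times {5, 20, 10, 25} for the first string and {12, 28, 3, 17} for the second (both add up to 60 minutes), measure_45 returns 45, even though neither string burns evenly.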

Making Code Slower

Ever seen a situation where the code has to be slowed down for a purpose? Here it is from Microsoft: :-)

Even Office 2007, Microsoft’s most recent version, was nearly saddled with a serious encryption bug. One part of every encryption system is the portion that handles passwords. The relevant code in Office 2007 was simply too well-written; because it worked so quickly, a crypto-attacker could rapidly progress through all possible password variations. Someone caught the error before the product was shipped and rewrote the software to make it slower–50,000 times slower, as it happened. It may be one of the few times in software history that programmers deliberately made a program run slower rather than faster.

A DDoS attack with URL shortener services?

I am wondering if a DDoS attack on URL shortening services could render them unusable. This came to my mind when I gave a dummy URL to bit.ly and it replied with a short URL.

So, if a service becomes popular, then a DDoS attack which tries to get short links for dummy URLs would quickly make the short URL space run out.

Will they, then, increase the shortened URL length? Or periodically clean the database to get rid of dummy URLs?
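How quickly the space runs out depends on the slug length and the attack rate. As a rough back-of-the-envelope sketch, assume (my numbers, not bit.ly’s actual scheme) slugs drawn from 62 characters (a-z, A-Z, 0-9) and an attacker registering 1,000 dummy URLs per second:

```c
#include <stdint.h>

/* Number of distinct slugs of exactly `length` characters over an alphabet
 * of `alphabet` symbols. Returns 0 if the result does not fit in 64 bits. */
uint64_t keyspace(uint64_t alphabet, unsigned length)
{
    uint64_t total = 1;
    for (unsigned k = 0; k < length; k++) {
        if (alphabet != 0 && total > UINT64_MAX / alphabet)
            return 0;
        total *= alphabet;
    }
    return total;
}

/* Whole days needed to use up `space` slugs at `per_second` new slugs
 * per second (per_second must be nonzero). */
uint64_t days_to_exhaust(uint64_t space, uint64_t per_second)
{
    return space / per_second / 86400;
}
```

Under those assumptions a 6-character space (about 56.8 billion slugs) lasts roughly 657 days, and each extra character multiplies that by 62. So lengthening the slug buys a lot of room, but only if the service also limits how fast one client can create links.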

Interview with Hal Varian: Some Nuggets

Nice piece.

On combinatorial innovation (emphasis mine):

We’re in the middle of a period that I refer to as a period of “combinatorial innovation.” So if you look historically, you’ll find periods in history where there would be the availability of a different component parts that innovators could combine or recombine to create new inventions. In the 1800s, it was interchangeable parts. In 1920, it was electronics. In the 1970s, it was integrated circuits.

Now what we see is a period where you have Internet components, where you have software, protocols, languages, and capabilities to combine these component parts in ways that create totally new innovations. The great thing about the current period is that component parts are all bits. That means you never run out of them. You can reproduce them, you can duplicate them, you can spread them around the world, and you can have thousands and tens of thousands of innovators combining or recombining the same component parts to create new innovation. So there’s no shortage. There are no inventory delays. It’s a situation where the components are available for everyone, and so we get this tremendous burst of innovation that we’re seeing.

On how Google gains attention advantage:

We have to look at today’s economy and say, “What is it that’s really scarce in the Internet economy?” And the answer is attention. [Psychologist] Herb Simon recognized this many years ago. He said, “A wealth of information creates a poverty of attention.” So being able to capture someone’s attention at the right time is a very valuable asset. And Google really has built an entire business around this, because we’re capturing your attention when you’re doing a search for something you’re interested in. That’s the ideal time to show you an advertisement for a product that may be related or complimentary to what your search is all about.

Regarding data analysis and future:

The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it.

I think statisticians are part of it, but it’s just a part. You also want to be able to visualize the data, communicate the data, and utilize it effectively. But I do think those skills—of being able to access, understand, and communicate the insights you get from data analysis—are going to be extremely important. Managers need to be able to access and understand the data themselves.

On the positive and negative sides of technology:

…you get a new technology in and people are excited about the positive sides of it. Then you see there are also some negative aspects. And you’ll have a regulatory infrastructure that arises to deal with those. I think everybody is very excited about the intended aspects of this technology—the fact that you can personalize, the fact that you can monitor, the fact that you can provide products that are more closely suited to a consumer’s interests and needs. What people are worried about are the unintended consequences, the downsides, the negative sides, the security, the identity theft, the possibility of extortion or embarrassment. These are the problems: not what people want to do but what could happen if these technologies weren’t appropriately managed.

On Google’s ad business:

I think the people who originally designed the model way back in 2001 had a very, very useful insight. They recognized that the content provider has impressions to sell. So you’ve got some space in your TV show. You’ve got some space on your page. You’ve got some space that’s available to put an ad. But what the advertiser wants to pay for is clicks or conversions or visits. So they don’t really care how many impressions they show. Normally, what they care about is getting people into their store and, ultimately, getting people to purchase. So you have to build a system that allows the publisher to sell impressions but the advertiser to buy clicks. And I think we’ve managed to accomplish that in a nice, elegant way.

Comments on Haskell

From Slashdot:

The thing about functional languages, and strict lazy functional languages like Haskell, is that the underlying principles are quite different from procedural languages like C. In C, you tell the computer to do things. In Haskell, you tell the computer the relationships between things, and it figures out what to do all on its own.

Personally, I suck at Haskell — I’m too much of a procedural programmer. My mind’s stuck in the rails of doing thing procedurally. But I’d very much like to learn it more, *because* it will teach me different ways of thinking about problems. If I can think of an ethernet router as a mapping between an input and output stream of packets rather than as a sequence of discrete events that get processed sequentially, then it may well encourage me to solve the problem in a some better way.

Hascal, and other functional languages may be good for multi-core development. However not to many programmers program in them… Plus I find they do not scale well for larger application. Its good for true computing problem solving. But today most developopment is for larger application which doesn’t necessarly solve problems per-say but create a tool that people can use.

The truth is that the vast majority of the software out there does pretty dull, mostly procedural jobs. That’s why the main languages in use are just dull variations on the procedural, C/Java/Perl style. No matter how much maths geeks go on about functional programming, procedural systems will always be more suited and easier to use for most of the problems out there.

The point is that there’s nothing those languages can do that can’t be done, often more easily, with the current crop of popular languages. Elegance cannot beat convenience in the workplace, or in most at any rate.

Your real problem with Haskell is that it is more complex per written token, and so you have to think more per token. Most people seem to generate some inner fear for things they don’t understand as good as they expect. And that’s the base of all your motivation to find reasons why you dislike Haskell. Of course you could simplify it, and get something like Python. But this is a bad idea on the long run, because then nature will only create bigger idiots. It’s better to wise up a bit, because what you get then, is really really nice!

A pure functional language would be a language where there are no side effects. I.e., you can’t change the state of anything, you can only construct new things out of existing things. As this gives some problems with IO, Haskell, taking purity to the extreme, had to wait for the invention of Monads to be able to do IO. Yes, Haskell was not capable of IO (reading/writing) for years. Functional languages follow this pattern: side-effects are only permitted if there is no other way. Examples are Lisp and Scheme, but also Matlab, Mathematica and Scala. Other languages allow side-effects by default, and have functional aspects in other respects. Examples of these are Javascript, Ruby, Python, Java, C++, C and even assembler (programming without any functional aspects is going to be hard). Quite likely Javascript programming can be done almost purely functionally. But so can C.

Don’t be ridiculous. Functional vs procedural isn’t a matter of intelligence. It’s simply a way of thinking. And the reality is that procedural languages better match the way the human mind works.

IMHO, learning to program in a functional style is like a right-handed person learning to write with their left. Yeah, they can do it, but it requires a ton of work for dubious real-world benefit, and in the end, it’s never really natural, simply because that’s not the way the brain is wired (except for the odd freakish exception ;) .

Whatever happened to reddit?

I am not able to see more than 1000 of my saved links. Is this a new [mis]feature of reddit? I have opened a ticket about it, but since it was the weekend, I did not get any response. I really hope I still have all of my saved reddit links somewhere in the reddit database. If reddit is facing a space crunch, then they should at least notify us so that we can download all our saved links before they purge them.

I really really hope I have my saved links. :-(

Update: I got a response from the reddit folks saying that it is because of a performance issue and should be dealt with asap.
