I am at the FAST 2011 conference this year presenting RAMCloud as a poster. See Poster at USENIX FAST 2011 for what I came up with. There have been some interesting talks at the conference so far with entire sessions devoted to flash/SSDs and data de-duplication.

One of the papers at the conference claimed that 4K was the average size of files on a Windows machine, measured over multiple years at Microsoft. Another suggested that flash lifetimes were so poor that writes should be de-duplicated before storage. Yet another flash paper wanted to use value-locality (just another name for dedup?) to improve the performance of writes.


Lingo learned –

Wear Leveling – The process of spreading writes across different physical locations on a solid-state device to avoid making too many writes to the same physical address. This is required because SSDs support only a limited number of write cycles to any one location before that location becomes unreliable!
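To make the idea concrete for myself, here is a minimal sketch of wear leveling in Python. It is a toy model and not how any real SSD firmware works: the class name `WearLeveledDevice`, its methods, and the "pick the least-written free block" policy are all my own assumptions for illustration.

```python
class WearLeveledDevice:
    """Toy model (hypothetical): logical blocks are remapped on every write."""

    def __init__(self, num_physical_blocks):
        # Number of writes each physical block has absorbed so far.
        self.write_counts = [0] * num_physical_blocks
        # Payload currently stored in each physical block.
        self.data = [None] * num_physical_blocks
        # Logical block number -> physical block number.
        self.mapping = {}
        # Physical blocks that hold no live data.
        self.free = set(range(num_physical_blocks))

    def write(self, logical_block, payload):
        if not self.free:
            raise RuntimeError("no free physical blocks (toy model needs spare blocks)")
        # Send the write to the least-written free block (ties go to the lowest index).
        target = min(self.free, key=lambda p: (self.write_counts[p], p))
        self.free.remove(target)
        self.data[target] = payload
        self.write_counts[target] += 1
        # Release the physical block that held the previous version, if any.
        old = self.mapping.get(logical_block)
        if old is not None:
            self.data[old] = None
            self.free.add(old)
        self.mapping[logical_block] = target

    def read(self, logical_block):
        return self.data[self.mapping[logical_block]]
```

Rewriting the same logical block over and over now lands on a different physical block each time, so no single location takes all the wear. The sketch assumes the device always has at least one spare physical block, which is a simplification.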

Erasure code – An error-correction method that adds redundant bits to the original message so that the message can be reconstructed even when some of its bits are lost. Parity codes are a special case.
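The parity special case is easy to sketch. The Python below is my own illustration, not something taken from any of the talks: it appends one XOR parity block to a set of equal-length data blocks, and any single lost block can then be rebuilt from the survivors. The function names `xor_blocks`, `encode` and `recover` are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a non-empty list of equal-length byte blocks together."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column, one byte position at a time.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by a single XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, missing_index):
    """Rebuild the block at missing_index from all the other blocks in the stripe."""
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)
```

A single parity block only tolerates one lost block. Surviving more losses needs a stronger code, which this sketch does not attempt.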


Interesting aside – several attendees were updating their company wikis with FAST 2011 trip reports as the talks were going on! I was mostly using an emacs buffer.

Data Visualization

After years of working with large data, analytics and business intelligence at Yahoo, I have learned that the hardest part of data analysis is often the last one – the one where the results of the analysis are presented to an audience.

As the analyst you have spent the last few days or weeks living with this data and somehow expect many aspects of this data to be as obvious to your audience as they are to you. This is hardly ever the case and is the reason why your final report or presentation needs to be particularly good. Often the results being presented are complex and contain multiple dimensions which can be hard to explain with words alone.

What if instead you could use a medium that allowed your audience to come to conclusions about this data on their own? It is certainly all right to point them in the right direction, but why not allow them the joy of discovering these insights themselves?

Interactive data visualizations might just be the answer. BI tools like MicroStrategy and Tableau help with this tremendously.

I was curious about constructing visualizations in the right way and was hence drawn to the course I am taking this quarter –

Additionally, I am also working in an inter-disciplinary research team at (with Geoff McGhee) and building visualizations to tell stories with them. See for a wonderful hour-long video from Geoff about journalism in the age of data.

I have the best job in America!

According to CNN money magazine (Nov 2010 issue) –

For me personally, the focus on technical challenges and the ability to work across a wide range of projects were the primary attractors to the architect job at Yahoo. It was hard to give up the direct involvement and control that came with being in engineering management at first, but I have learned so much while being an architect and find it very satisfying.

It is really cool to see David Chaiken, Yahoo’s Chief Architect (and one of my mentors at Yahoo!) featured in this article. It would have been very hard to pick a better representative in my opinion.

Perl – TIMTOWTDI with modules – but which module is best?

Perl/CPAN often leaves you with too many modules as options to accomplish the same task. Here is one way to handle that problem. Choose the CPAN module that the rest of the CPAN community is excited about and uses in other CPAN modules.
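As a rough sketch of that heuristic, the Python below ranks candidate modules by how many other distributions depend on them. Everything in it is an assumption for illustration: the function name `rank_by_dependents`, the shape of the dependency map, and the module names in the usage note are all hypothetical, and it does not query CPAN itself.

```python
from collections import Counter


def rank_by_dependents(dependencies, candidates):
    """Order candidate modules by how many distributions depend on them.

    dependencies: dict mapping a distribution name to the modules it uses
                  (hypothetical input; real data would have to come from CPAN metadata).
    candidates:   the competing modules you are choosing between.
    """
    counts = Counter()
    for dist, deps in dependencies.items():
        # set() so that a distribution listing a module twice still counts once.
        for dep in set(deps):
            if dep in candidates and dep != dist:
                counts[dep] += 1
    # Most depended-upon first; ties are broken alphabetically.
    return sorted(candidates, key=lambda m: (-counts[m], m))
```

For example, with a made-up map in which three distributions use `Date::Alpha` and one uses `Date::Beta`, the function puts `Date::Alpha` first. Dependents are only a proxy for quality, so I would still read the documentation of the top few before committing.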

This is from July 2009 –

So libwww-perl, Catalyst, Test::Exception, WWW::Mechanize and DateTime are all winners.

iPad – what would I do with one?

The close encounter happened on Saturday (launch day). One of the other shooters in the lucky-shot pool tournament had stood in line in the morning and gotten one. As to why he would want to show it off at the pool hall in the evening, I have no clue. Predictably, the device is absolutely gorgeous and a joy to hold and use. Magazines and comics looked great. Most of the app line-up did not feel too different from an iPhone. We tried to play HQ/HD YouTube videos but could not. The keyboard was big and easy to use. So far, my limited imagination has not come up with uses for this beyond:

1. Replace my work laptop in meetings – note taking, surreptitious email-reading, sketching quick diagrams with a stylus

2. Gift for my parents as a wifi video-phone device. No camera yet so have to wait for this. But it will be super-easy to use and carry everywhere.

3. Gym companion – music player, video player, book-reader to keep me from getting bored on the treadmill.


Is that enough justification for $500?

Feeling good…..

Birds flying high you know how I feel
Sun in the sky you know how I feel
Reeds drifting on by you know how I feel

It’s a new dawn 

It’s a new day 

It’s a new life..For me 

And I’m feeling good

These are the words to a magical song. If you haven’t heard it before, start by imagining how it might sound.

Big bombastic horns announcing that you are feeling good? Maybe a swaying chorus of singers with wide smiles, clapping hands and declaring these words. Lots of swagger either way. I expect it to sound like the joyful exultation it reads like.

Now go look for either Nina Simone’s original version or Muse’s take on the song –

Confusing and wonderful at the same time.