This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.

Tuesday, December 28, 2004

How to Solve the Britney Problem

The "Britney Problem" is a common issue in P2P systems that use distributed hashtable (DHTs) to organize their namespaces. In many DHTs a given key, such as "britney.mpg", is mapped to a single node that will store this file and serve all requests to it. What occurs is that popular key's get mapped to single nodes, causing hotspots that get a tremendous number of requests and therefore can not serve all client peers interested in the file.

Paul Campbell recently posted a very interesting proposal for a solution to the Britney Problem in DHTs on the p2p-hackers list:

There's a simple solution to solve the "Britney Spears" problem in my opinion in the DHT model (randomized searches don't suffer from this problem as a result of their structural fault that they handle common files well but don't handle rare files well). Let's first recast your question in aneven worse situation.

Let's say that we directly implement an inverted index (keyword-->file) on the DHT. Even with common word elimination, this certainly and severely generates a very non-linear index space. But it is something that is very obvious and desirable since it also solves a major DHT and randomized P2Pdilema: how to quickly search for files by keyword.

So far, most proponents of DHT's believe that hashing is the answer. While hashing MASKS problems, it doesn't actually solve the problem. All solutions that I've seen appear to be pretty simple. The idea is to spread the location of a particular key out over a wider arc. Conceptually, one can spread it in multiple points (such as spreading it to equadistant points
around the ring) or on multiple hierarchal rings (like the distributed sloppy hash tables...DSHT's although that was done for delay purposes). But the concept is essentially the same.

The routing algorithm remains valid. Since the routing algorithm incrementally searches for the destination. It just remains to send a search that is in the form of incrementally refining an "area" to search in. As the searchrequest passes through potential nodes, the search area gradually narrows.

Storage is slightly more complicated. It is necessary to store a key/value pair in the appropriate AREA of the DHT. The storage targets the area and not the point. The above search mechanism (which has to be used to find the appropriate destination for a key/value pair anyways) also paradoxically
nails down the appropriate area for storage.

thecircle (from your terminology, it appears that's what you've been looking at) attaches a random bit string to the end of the hash to accomplish this and then spreads a file over a small area using that random hash (although a lot of this isn't actually implemented when you look at the code). The various "tree" designs out there just use the hash keys directly and split storage as necessary. I've seen a few designs that attempt to simply delete common files from the DHT and index those using a randomP2P structure (thus taking advantage of both designs).

What I've noticed in general is this. In order for the routing properties of a DHT to be preserved, it is necessary to maintain the node link structure. But that's not to say that you have to maintain the same structure at the key/value pair level of the system...that is, a direct
one-to-one mapping is necessary. The "skip graph" structures specifically acknowledge this. Simply substitute "key/value pair" for "DNS name" and you convert those structures to the idea I'm referring to.

In a similar way, one could simply store "first key/last key" values in addition to "node keys" and decouple the data structure from the DHT link structure. This is similar to index tabs in a matter how many cards are in the rolodex and no matter what the ordering, you maintain labels and you can approximately locate a card using the same search mechanism all the time.

As to the basic problem of "too many keys with the same value", this is also pretty easy to deal with. Extend the key/value pairs so that for certain operations (searching), you use just the key as before. However, optionally treat the keys as "key+value" meta-keys. For instance, when storing a new key/value pair, use a "search-store" function that treats
the key as "key"+"value" and the value as just "value". Then no matter how identical keys there are, there is still a valid ordering and the number of keys with the same values are no longer a limiting factor. Of course, if your system doesn't require a total ordering on all keys and values (the node balancing algorithms more or less do require this at least right now), then you can dispense with the "key+value" idea and simply ignore this issue.

The one remaining problem is node balancing. The mechanisms that are already published seem to be pretty uniform in design. Nodes that undertake rebalancing can do so using one of two mechanisms. Either an empty node (that may have to be created by dumping it's current payload) is moved into position on the ring, or else incremental data movement is undertaken. Since the former incurs a much higher communication cost, the latter is given
preference as much as possible. However, it has been shown that both mechansims are necessary for load balance to remain bounded.

I wonder what Paul meant by "simply store first key/last key values in addition to the node key" to decouple to the data structure ? did he mean use these in the routing table/leafset or was it something else.
Post a Comment

Subscribe to Post Comments [Atom]

Links to this post:

Create a Link

<< Home

This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]