Coding In Paradise

January 09, 2004

There is a WikiWiki with a good introductory page on distributed hash tables (DHTs) at http://www.infoanarchy.org/wiki/wiki.pl?Distributed_Hash_Table
This page gives an overview and then provides further references to research projects that have implemented different kinds of DHTs, such as Chord, Pastry, and Kademlia.
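For readers new to the idea, here is a minimal sketch of the core DHT trick: hash node names and keys onto the same ring, and store each key on the first node at or after it. This is a toy, single-process illustration in the spirit of Chord's successor rule, not the actual Chord protocol (there is no routing, joining, or failure handling). The names `ToyRing`, `ring_id`, and `RING_BITS`, plus the node and host names, are my own assumptions for the example.

```python
import bisect
import hashlib

RING_BITS = 32  # assumption: a small ring keeps the numbers readable


def ring_id(name: str) -> int:
    """Hash a node name or a key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


class ToyRing:
    """Hypothetical in-memory ring; real DHTs discover nodes over the network."""

    def __init__(self, node_names):
        # Keep nodes sorted by ring position so we can binary-search them.
        self._nodes = sorted((ring_id(n), n) for n in node_names)
        self._ids = [node_id for node_id, _ in self._nodes]

    def successor(self, key: str) -> str:
        """Return the node responsible for key: first node at or after it."""
        idx = bisect.bisect_left(self._ids, ring_id(key))
        # Past the last node, wrap around to the start of the ring.
        return self._nodes[idx % len(self._nodes)][1]


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"])
print("www.example.org is stored on", ring.successor("www.example.org"))
```

Any peer that knows the node list computes the same owner for a name, which is what makes this usable as a generic naming substrate.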

As an example of how powerful DHTs can be as a generic naming substrate, see the open-source project named The Circle at http://thecircle.org.au/ . This application uses a DHT for several purposes: sharing files, sending instant messages, IRC-style chat, and putting together a personalized news service.

Here are some other things to see. I run a project named P2P Sockets that includes a simple, non-secure distributed DNS. While it doesn't currently use a DHT (it uses another open-source project named JXTA as its P2P substrate), I plan on transitioning to a DHT in the near future.

The original paper concerning storing DNSSec records in Chord is available at http://www.pdos.lcs.mit.edu/chord/papers/ddns.pdf

There has been a great deal of activity on these systems over the last few years, both at the grass-roots, open-source level and at the academic level. While I don't believe technical solutions can always solve social and political issues, I do believe that an alternative technical approach to the DNS can help ameliorate these problems. P2P/DHT-based approaches might point the way to such a solution.

There are several significant research issues that must be resolved before this is possible, though. These are latency, DoS-style attacks on such a substrate, reliability of the naming records, and how to achieve secure name bindings while also ensuring human-friendly names.

The first issue, latency, seems to be disappearing as newer DHT algorithms are developed. The second issue is DoS-style attacks. If we go with a First Come/First Served (FCFS) system for handing out name bindings, which removes DNS-style registrars from the loop, then assailants can programmatically exhaust the namespace by simply grabbing names. While an FCFS system is attractive because it removes the need to have gatekeepers handing out names, it does open up this problem. One way to solve this is to retain DNS registrars who sign but do not store DNS records; the records themselves are stored in the P2P substrate using the DNSSec standard. If a registrar detects that a peer is attempting to DoS it, the registrar can cut that peer off. Of course, this doesn't protect against distributed DoS attacks, where many peers in the network might be compromised and requesting names. We might have to introduce some "friction" into the system, such as money or hashcash (i.e., clients have to provide proof that they ran some computationally heavy algorithm).
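To make the "friction" idea concrete, here is a rough hashcash-style sketch: before a name is accepted, the client must find a nonce such that the hash of the name plus nonce starts with a given number of zero bits. Verifying costs one hash, while minting costs about 2^difficulty hashes on average. This is not the real hashcash stamp format; the functions `mint` and `verify`, the `name:nonce` encoding, and the difficulty value are assumptions I picked for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _stamp_digest(name: str, nonce: int) -> bytes:
    # Assumed encoding for this sketch: "name:nonce" hashed with SHA-1.
    return hashlib.sha1(f"{name}:{nonce}".encode("utf-8")).digest()


def mint(name: str, difficulty: int) -> int:
    """Brute-force a nonce; expected work doubles with each extra bit."""
    for nonce in count():
        if leading_zero_bits(_stamp_digest(name, nonce)) >= difficulty:
            return nonce


def verify(name: str, nonce: int, difficulty: int) -> bool:
    """Cheap check that the claimed work was actually done for this name."""
    return leading_zero_bits(_stamp_digest(name, nonce)) >= difficulty


nonce = mint("example.p2p", difficulty=12)
print("nonce:", nonce, "valid:", verify("example.p2p", nonce, 12))
```

Because the nonce is bound to the name, an attacker grabbing a million names has to pay the cost a million times, while an honest user registering one name barely notices it.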

The third issue is reliability of the naming records. Chord has its own solution to this problem, as does the OceanStore team. This is a difficult problem without an elegant solution at this point. A good paper comparing some P2P replication schemes is "Erasure Coding vs. Replication: A Quantitative Comparison".
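As a back-of-the-envelope illustration of the trade-off that kind of comparison is about, the sketch below computes the chance that a record is retrievable under whole-copy replication versus an m-of-n erasure code. It assumes each peer is online independently with probability `p`, which is a simplification real networks violate; the function names and the example numbers are my own and are not taken from the paper.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Record is available if at least one full copy is on a live peer."""
    return 1 - (1 - p) ** copies


def erasure_availability(p: float, n: int, m: int) -> float:
    """Record is split into n fragments; any m of them can rebuild it."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m, n + 1))


# Both schemes below store twice the size of the original record.
p = 0.9  # assumed probability that any given peer is online
print("2 full copies:      ", replication_availability(p, copies=2))
print("2-of-4 erasure code:", erasure_availability(p, n=4, m=2))
```

In this toy setting the erasure code comes out ahead for the same storage overhead, but the calculation ignores repair traffic and the extra complexity of coding, which are a large part of why the problem remains hard.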

The final issue is achieving secure name bindings while also ensuring human-friendly names. The problem with achieving these two goals in a distributed, peer-to-peer system is succinctly explained in a position paper by an open-source programmer nicknamed Zooko. This paper is titled "Names: Decentralized, Secure, Human-Meaningful: Choose Two". Some P2P projects have decided to simply abandon human-friendly names, going with secure pointers instead. This is the approach the Freenet project has taken.
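A secure pointer in this sense can be as simple as naming data by its own hash, so that anyone who fetches it can check it without trusting the peer that served it. The sketch below shows the general idea only; it is not Freenet's actual key format, and the in-memory `store` dictionary and function names are hypothetical stand-ins for a real P2P network.

```python
import hashlib


def content_key(data: bytes) -> str:
    """The pointer is just the hash of the content it points to."""
    return hashlib.sha1(data).hexdigest()


def put(store: dict, data: bytes) -> str:
    """Store data under its own hash and hand back the pointer."""
    key = content_key(data)
    store[key] = data
    return key


def fetch_verified(store: dict, key: str) -> bytes:
    """Fetch data and reject it if it does not hash back to the pointer."""
    data = store[key]
    if content_key(data) != key:
        raise ValueError("content does not match its key")
    return data


store = {}  # hypothetical stand-in for untrusted peers
key = put(store, b"www.example.org -> 192.0.2.10")
print("pointer:", key)
print("record: ", fetch_verified(store, key))
```

The pointer is secure and needs no central authority, but it is a 40-character hex string that nobody will ever type from memory.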

Unfortunately, these secure pointers are incomprehensible to ordinary computer users. I think there is value in a global namespace that is human-friendly, such as the current DNS. While I believe that Zooko is correct in identifying that you can't achieve complete decentralization, complete security, and human-friendly names at the same time, I do feel that it is possible to have both security and human-friendly names with a partially decentralized system. The question then becomes how much we can decentralize while still retaining the other two aspects. Perhaps we will be able to decentralize the portions of DNS that are capital-intensive, such as storing records or acting as root servers.
