CS 6901, Projects in Computer Science
Fall 2005
Advisors: Dr. Markus Hofmann, Dr. Henning Schulzrinne, Salman Abdul Baset
NoTorrent is a system that uses peer-to-peer web serving in an attempt to counter the Slashdot Effect (also known as flash crowds or web hotspots). There is a NoTorrent tracker that keeps track of which peers have copies of which resources. There is also a NoTorrent client which is responsible for retrieving a resource that a user requested. The client also serves cached resources to other NoTorrent peers.
NoTorrent is written in Java, so it should run on any platform that has a suitable JVM. It targets Java 1.5, so the JVM must be version 1.5 or later.
Download the NoTorrent program files: notorrent.tgz
Untar that file:
tar xzvf notorrent.tgz
Read the instructions in the README for installing, running, and configuring (via command line options) the tracker and the client.
One way to demo the system is to simulate an origin server becoming inaccessible. Peers that retrieved a resource from the origin server before it went down can serve that content to another peer after the server goes down. To simulate this, do the following:
The program consists of a client and a tracker, which we describe separately below.
The client starts by initializing its cache, spawning several threads to handle proxy connections from the browser, and spawning several threads to handle resource requests from other peers.
The main part of the client cache is a hash table mapping URIs to resources (encoded as byte[]s).
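The cache described above can be sketched as follows. This is a minimal illustration and not the actual NoTorrent code: the class name `ResourceCacheSketch` and its methods are hypothetical, and the use of `ConcurrentHashMap` is an assumption made because several proxy and server threads would share the table.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a URI -> byte[] cache; names are not taken from NoTorrent.
class ResourceCacheSketch {
    // Thread-safe table, assumed because proxy and server threads share the cache.
    private final Map<URI, byte[]> resources = new ConcurrentHashMap<URI, byte[]>();

    // Stores a copy of the resource body so later changes to the caller's array do not leak in.
    void put(URI uri, byte[] body) {
        resources.put(uri, body.clone());
    }

    // Returns a copy of the cached bytes, or null if the resource is not cached.
    byte[] get(URI uri) {
        byte[] body = resources.get(uri);
        return body == null ? null : body.clone();
    }

    boolean contains(URI uri) {
        return resources.containsKey(uri);
    }

    public static void main(String[] args) {
        ResourceCacheSketch cache = new ResourceCacheSketch();
        URI uri = URI.create("http://www.columbia.edu/");
        cache.put(uri, "<html>cached copy</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println("cached: " + cache.contains(uri));
    }
}
```

Copying the array on the way in and out is a design choice of this sketch. A real cache might instead share the arrays to save memory and would also need some eviction policy, which is not shown here.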
The client proxy listens for connections from the client's browser. The connections include HTTP requests for resources. The resources are then retrieved either from the origin server, the cache, or a peer, as described in the Architecture and Strategy section of the NoTorrent main documentation page.
The client server listens for connections from other peers. The requests are XML-encoded messages containing the URL of the requested resource. The client server looks in its cache for the resource. It either returns the resource as a byte[], or it terminates the connection to indicate that it does not have the resource.
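The lookup step of the client server might look roughly like the sketch below. It assumes the XML request has already been decoded into the requested URL (the message format is not reproduced here), and the names `PeerServerSketch` and `serve` are hypothetical. The return value tells the caller whether to keep or terminate the connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the peer-facing lookup; not the actual NoTorrent code.
class PeerServerSketch {
    // Writes the cached resource to the peer and returns true, or returns false
    // when the resource is not cached so the caller can terminate the connection.
    static boolean serve(String requestedUrl, Map<String, byte[]> cache, OutputStream toPeer) {
        byte[] body = cache.get(requestedUrl);
        if (body == null) {
            return false; // no resource: the caller closes the socket
        }
        try {
            toPeer.write(body);
            toPeer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, byte[]> cache = new ConcurrentHashMap<String, byte[]>();
        cache.put("http://www.columbia.edu/", "<html>cached</html>".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(serve("http://www.columbia.edu/", cache, out)); // prints true
        System.out.println(serve("http://example.org/none", cache, out));  // prints false
    }
}
```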
The tracker starts by initializing its StateInfo table and then spawning several tracker threads to handle messages sent by clients.
The tracker maintains a StateInfo object, which is basically a hash table that maps URIs to sets of peers that claim to have the resource associated with that URI.
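A table of this shape could be sketched as below. The class `StateInfoSketch`, its method names, and the representation of a peer as a `"host:port"` string are all assumptions made for illustration; the real StateInfo class may differ.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a URI -> set-of-peers table; not the actual StateInfo class.
class StateInfoSketch {
    // Peers are represented as "host:port" strings, which is an assumption of this sketch.
    private final Map<String, Set<String>> peersByUri = new ConcurrentHashMap<>();

    // Records that a peer claims to have the resource at the given URI.
    void addPeer(String uri, String peer) {
        peersByUri.computeIfAbsent(uri, key -> ConcurrentHashMap.<String>newKeySet()).add(peer);
    }

    // Forgets a peer's claim, for example when it can no longer serve the resource.
    void removePeer(String uri, String peer) {
        Set<String> peers = peersByUri.get(uri);
        if (peers != null) {
            peers.remove(peer);
        }
    }

    // Returns a snapshot of the peers that claim the resource (empty if none are known).
    Set<String> peersFor(String uri) {
        Set<String> peers = peersByUri.get(uri);
        return peers == null ? Collections.<String>emptySet() : new HashSet<String>(peers);
    }

    public static void main(String[] args) {
        StateInfoSketch state = new StateInfoSketch();
        state.addPeer("http://www.columbia.edu/", "192.0.2.10:6881");
        state.addPeer("http://www.columbia.edu/", "192.0.2.11:6881");
        System.out.println(state.peersFor("http://www.columbia.edu/").size()); // prints 2
    }
}
```

Returning a snapshot rather than the live set keeps tracker threads from observing each other's modifications while they build a reply.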
The tracker handler thread receives XML-encoded messages from clients. The main messages the tracker receives are:
Some web requests are not forwarded correctly to the origin server. Currently I parse the user's HTTP request for the URL and then make a new HTTP request for that resource without passing along any of the original message headers. Most pages can be retrieved normally this way. However, some requests that lack the original headers draw HTTP 403 Forbidden responses from the server. This problem could probably be fixed by forwarding the client's HTTP request to the server instead of stripping it of all headers and creating a new HTTP request.
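One way the proposed fix might look is sketched below: rewrite only the request line (a browser sends a proxy the absolute URL, while an origin server expects the path) and pass the remaining headers through. This is an untested idea rather than the NoTorrent implementation. The names `HeaderForwardingSketch` and `toOriginRequest` are hypothetical, the sketch handles only requests without a body, and dropping the `Proxy-Connection` header is an assumption about what should not be passed on.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical sketch: turn a browser-to-proxy request into a request for the origin
// server while keeping the original headers. Not the actual NoTorrent code.
class HeaderForwardingSketch {
    static String toOriginRequest(String proxyRequest) {
        String[] lines = proxyRequest.split("\r\n");
        String[] requestLine = lines[0].split(" "); // method, absolute URL, HTTP version
        URI target = URI.create(requestLine[1]);

        // The origin server expects only the path and query on the request line.
        String path = target.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        if (target.getRawQuery() != null) {
            path += "?" + target.getRawQuery();
        }

        StringBuilder out = new StringBuilder();
        out.append(requestLine[0]).append(' ').append(path).append(' ')
           .append(requestLine[2]).append("\r\n");
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                break; // blank line marks the end of the headers
            }
            // Assumption: this header is meant for the proxy and is not forwarded.
            if (line.toLowerCase(Locale.ROOT).startsWith("proxy-connection:")) {
                continue;
            }
            out.append(line).append("\r\n");
        }
        out.append("\r\n");
        return out.toString();
    }

    public static void main(String[] args) {
        String fromBrowser = "GET http://www.columbia.edu/index.html HTTP/1.1\r\n"
                + "Host: www.columbia.edu\r\n"
                + "User-Agent: Mozilla/5.0\r\n"
                + "Proxy-Connection: keep-alive\r\n"
                + "\r\n";
        System.out.print(toOriginRequest(fromBrowser));
    }
}
```

With this approach headers such as `User-Agent` and `Referer` reach the origin server unchanged, which are the kind of headers whose absence may be triggering the 403 responses.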
Additionally, Yahoo encodes the URLs in its search results with tracking information. For example, a search for 'columbia' returned a result with the following URL:
http://rds.yahoo.com/_ylt=AvJ7YJnlZa45cNDRLKUeG9pXNyoA;_ylu=X3oDMTE2aXNyZXZjBGNvbG8DdwRsA1dTMQRwb3MDMgRzZWMDc3IEdnRpZANGNjcxXzky/SIG=11c9ojmuj/EXP=1135155687/**http%3a//www.columbia.edu/
When not using the NoTorrent client proxy, the browser translates the URL in the browser's address bar to http://www.columbia.edu/. When using NoTorrent's client proxy, the URL remains the same. The pages still load, but this is an abnormality the user should not encounter. As stated in the main NoTorrent documentation page, we wanted to "Do no harm": NoTorrent should never get in the user's way or provide worse content than if it had not been there. It is unclear at this point whether the problem is that I am not correctly handling HTTP 302 redirects or whether this, too, has to do with missing HTTP headers.
The system currently works basically as it should in that the client proxy can retrieve resources from
As mentioned in the main NoTorrent document, this project was originally intended to be a modification of the BitTorrent codebase. That changed, but the overall structure of NoTorrent was more or less based on BitTorrent [1]. In particular, although NoTorrent messages are encoded in XML and BitTorrent messages are encoded in Bencoding ("BEE encoding"), I was heavily researching the BitTorrent protocol [2] [3] when I designed the NoTorrent messages. Thus, the NoTorrent message design was significantly influenced by the BitTorrent message design.
As mentioned earlier, NoTorrent messages are encoded in XML. Encoding and decoding XML messages could potentially be a headache. Thanks to JDOM [4], however, encoding and decoding XML messages was very easy.