Program Documentation for NoTorrent

Howie Vegter
Columbia University
New York, NY 10027
USA
hrv2101 AT columbia DOT edu

CS 6901, Projects in Computer Science
Fall 2005
Advisors: Dr. Markus Hofmann, Dr. Henning Schulzrinne, Salman Abdul Baset

Program Abstract

NoTorrent is a system that uses peer-to-peer web serving in an attempt to counter the Slashdot Effect (also known as flash crowds or web hotspots). There is a NoTorrent tracker that keeps track of which peers have copies of which resources. There is also a NoTorrent client which is responsible for retrieving a resource that a user requested. The client also serves cached resources to other NoTorrent peers.

System Requirements

NoTorrent is written in Java 1.5, so it should run on any platform that has a JVM of version 1.5 or later.

Installation Instructions

Download the NoTorrent program files: notorrent.tgz

Untar that file:

tar xzvf notorrent.tgz

Read the instructions in the README for installing, running, and configuring (via command line options) the tracker and the client.

How to Test NoTorrent and Demonstrate Its Operation

One way to demo the system is to simulate an origin server becoming inaccessible. Peers that retrieved a resource from the origin server before it went down can serve that content to another peer after the server goes down. To simulate this, do the following:

Program Internal Operation

The program consists of a client and a tracker, which we describe separately below.

Client

The client starts by initializing its cache, spawning several threads to handle proxy connections from the browser, and spawning several threads to handle resource requests from other peers.

Client Cache

The main part of the client cache is a hash table mapping URIs to resources (encoded as byte[]s).
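
As a rough illustration of such a cache, here is a minimal sketch; it is not the actual NoTorrent code. The class name ResourceCache and its methods are hypothetical. The synchronized methods are an assumption, made because several proxy and server threads would share one cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a URI -> byte[] cache. Names are illustrative and
// are not taken from the NoTorrent source.
class ResourceCache {
    private final Map<String, byte[]> resources = new HashMap<String, byte[]>();

    // Stores a copy so later changes to the caller's array do not alter the cache.
    synchronized void put(String uri, byte[] body) {
        resources.put(uri, body.clone());
    }

    // Returns a copy of the cached bytes, or null when the URI is not cached.
    synchronized byte[] get(String uri) {
        byte[] body = resources.get(uri);
        return body == null ? null : body.clone();
    }

    // True when a resource is cached under this URI.
    synchronized boolean contains(String uri) {
        return resources.containsKey(uri);
    }
}
```

Copying the arrays on the way in and out costs some memory, but it keeps one thread from modifying bytes that another thread is still serving.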

Client Proxy

The client proxy listens for connections from the client's browser. The connections include HTTP requests for resources. The resources are then retrieved either from the origin server, the cache, or a peer, as described in the Architecture and Strategy section of the NoTorrent main documentation page.
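
A browser that talks to an HTTP proxy normally sends the full URL in the first line of each request, for example "GET http://www.columbia.edu/ HTTP/1.1". The sketch below shows one way the requested URL could be pulled out of that line. The class RequestLine and the method targetOf are hypothetical names, and the actual NoTorrent parsing may differ.

```java
// Hypothetical helper: extracts the target URL from the first line of a
// proxied HTTP request, e.g. "GET http://www.columbia.edu/ HTTP/1.1".
class RequestLine {
    // Returns the request target, or null if the line does not have the
    // form "METHOD target HTTP-version".
    static String targetOf(String requestLine) {
        if (requestLine == null) {
            return null;
        }
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            return null;
        }
        return parts[1];
    }
}
```

The returned string can then serve as the lookup key for the cache, a peer request, or a request to the origin server.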

Client Server

The client server listens for connections from other peers. The requests are XML-encoded messages containing the URL of the requested resource. The client server looks in its cache for the resource. It either returns the resource as a byte[], or it terminates the connection to indicate that it does not have the resource.
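
The lookup step of the client server can be sketched as follows. This is an illustration under stated assumptions, not the NoTorrent implementation: the message layout (a request element containing a url element) is invented for the example, the names PeerRequestHandler, handle, and extractUrl are hypothetical, and the JDK's built-in DOM parser is used in place of JDOM so that the sketch has no external dependencies.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch of the peer-request lookup. Assumed message layout:
//   <request><url>http://www.columbia.edu/</url></request>
class PeerRequestHandler {
    // Returns the cached bytes for the URL named in the request, or null when
    // the request is malformed or the resource is not cached. A null result
    // corresponds to closing the connection without sending the resource.
    static byte[] handle(String xmlRequest, Map<String, byte[]> cache) {
        String url = extractUrl(xmlRequest);
        return url == null ? null : cache.get(url);
    }

    // Parses the message and returns the text of the first <url> element,
    // or null if the message cannot be parsed or has no such element.
    static String extractUrl(String xmlRequest) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xmlRequest)));
            NodeList urls = doc.getElementsByTagName("url");
            if (urls.getLength() == 0) {
                return null;
            }
            String text = urls.item(0).getTextContent();
            return text == null ? null : text.trim();
        } catch (Exception e) {
            return null; // treat any parse failure as "no resource"
        }
    }
}
```

A server thread would call handle with the text read from the peer's socket, write the returned bytes if the result is non-null, and close the socket in either case.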

Tracker

The tracker starts by initializing its StateInfo table and then spawning several tracker threads to handle messages sent by clients.

Tracker's StateInfo

The tracker maintains a StateInfo object, which is basically a hash table that maps URIs to sets of peers that claim to have the resource associated with that URI.
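
A table of this shape can be sketched as below. The class name StateInfoSketch and its methods are hypothetical, and identifying a peer by a "host:port" string is an assumption; the real StateInfo class may store peers differently.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a URI -> set-of-peers table. Peers are identified
// by "host:port" strings here, which is an assumption.
class StateInfoSketch {
    private final Map<String, Set<String>> peersByUri =
        new HashMap<String, Set<String>>();

    // Records that a peer claims to hold the resource for this URI.
    synchronized void addPeer(String uri, String peer) {
        Set<String> peers = peersByUri.get(uri);
        if (peers == null) {
            peers = new HashSet<String>();
            peersByUri.put(uri, peers);
        }
        peers.add(peer);
    }

    // Forgets a peer's claim and drops the URI entry once no peers remain.
    synchronized void removePeer(String uri, String peer) {
        Set<String> peers = peersByUri.get(uri);
        if (peers == null) {
            return;
        }
        peers.remove(peer);
        if (peers.isEmpty()) {
            peersByUri.remove(uri);
        }
    }

    // Returns a snapshot of the peers for this URI (empty if none are known).
    synchronized Set<String> peersFor(String uri) {
        Set<String> peers = peersByUri.get(uri);
        if (peers == null) {
            return Collections.<String>emptySet();
        }
        return new HashSet<String>(peers);
    }
}
```

Using a set rather than a list means that a peer announcing the same resource twice is recorded only once.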

Tracker Handler Thread

The tracker handler thread receives XML-encoded messages from clients. The main messages the tracker receives are:

Things that Do Not Work

Some web requests are not forwarded correctly to the origin server. Currently I parse the user's HTTP request for the URL and then make a new HTTP request for that resource without passing along any of the original message headers. Most pages can be retrieved normally in this manner. However, some servers answer requests that lack the original headers with HTTP 403 Forbidden. This problem could probably be fixed by forwarding the client's HTTP request to the server instead of stripping it of all headers and creating a new HTTP request.
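
One possible shape for that fix is sketched below: collect the header lines of the browser's request and copy them onto the outgoing request. This is only a sketch. The class HeaderForwarding and its methods are hypothetical, and the use of java.net.URLConnection for the outgoing request is an assumption about how the client fetches pages.

```java
import java.net.URLConnection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of forwarding the browser's headers to the origin server.
class HeaderForwarding {
    // Parses the "Name: value" lines that follow the request line, stopping
    // at the blank line that ends the header section. If a header name is
    // repeated, the last value wins in this simplified version.
    static Map<String, String> parseHeaders(String[] lines) {
        Map<String, String> headers = new LinkedHashMap<String, String>();
        for (int i = 1; i < lines.length; i++) { // index 0 is the request line
            String line = lines[i];
            if (line.length() == 0) {
                break;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skip malformed lines
            }
            headers.put(line.substring(0, colon).trim(),
                        line.substring(colon + 1).trim());
        }
        return headers;
    }

    // Copies the headers onto the outgoing request. Must be called before the
    // connection is opened. Headers that describe the browser-to-proxy hop,
    // for example Connection and Proxy-Connection, are not forwarded.
    static void copyTo(URLConnection conn, Map<String, String> headers) {
        for (Map.Entry<String, String> h : headers.entrySet()) {
            String name = h.getKey();
            if (name.equalsIgnoreCase("Connection")
                    || name.equalsIgnoreCase("Proxy-Connection")) {
                continue;
            }
            conn.setRequestProperty(name, h.getValue());
        }
    }
}
```

Headers such as User-Agent and Referer are the ones a server is most likely to inspect before deciding to refuse a request, so forwarding them is a reasonable first experiment.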

Additionally, Yahoo encodes the URLs in its search results with tracking information. For example, a search for 'columbia' returned a result with the following URL:

http://rds.yahoo.com/_ylt=AvJ7YJnlZa45cNDRLKUeG9pXNyoA;_ylu=X3oDMTE2aXNyZXZjBGNvbG8DdwRsA1dTMQRwb3MDMgRzZWMDc3IEdnRpZANGNjcxXzky/SIG=11c9ojmuj/EXP=1135155687/**http%3a//www.columbia.edu/

When the NoTorrent client proxy is not in use, the browser replaces the URL in its address bar with http://www.columbia.edu/. When the NoTorrent client proxy is in use, the address bar keeps the original tracking URL. The pages still load, but this is an abnormality the user should not encounter. As stated in the main NoTorrent documentation page, we wanted to "Do no harm": NoTorrent should never get in the user's way or provide worse content than if it had not been there. It is unclear at this point whether the problem is that I am not correctly handling HTTP 302 redirects or whether this, too, is caused by the missing HTTP headers.
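
If redirect handling turns out to be the cause, one thing to check is whether the proxy follows the redirect itself. java.net.HttpURLConnection follows redirects by default, so a proxy built on it would return the final page under the original URL, and the browser would never see the 302. The sketch below relays the redirect to the browser instead. It assumes the client fetches pages with HttpURLConnection, which the source does not state, and the names RedirectRelay, isRedirect, redirectResponse, and fetchRedirect are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch: pass redirects back to the browser instead of
// following them inside the proxy.
class RedirectRelay {
    // True for the common status codes that carry a Location header.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307;
    }

    // Builds a minimal response that tells the browser to fetch the new
    // location itself. The reason phrase is a generic placeholder.
    static String redirectResponse(int status, String location) {
        return "HTTP/1.1 " + status + " Redirect\r\n"
             + "Location: " + location + "\r\n"
             + "Content-Length: 0\r\n"
             + "\r\n";
    }

    // Requests the URL without following redirects. Returns the response to
    // relay if the server answered with a redirect, or null otherwise.
    static String fetchRedirect(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setInstanceFollowRedirects(false); // the default is to follow
        int status = conn.getResponseCode();
        String location = conn.getHeaderField("Location");
        conn.disconnect();
        if (isRedirect(status) && location != null) {
            return redirectResponse(status, location);
        }
        return null;
    }
}
```

With this approach the browser issues the second request itself, so its address bar shows the redirect target just as it does without a proxy.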

Potential Enhancements

The system currently works largely as intended, in that the client proxy can retrieve resources from

  1. the origin server,
  2. the cache, and
  3. other peers.

However, besides the normal issues of hardening and cleaning the code, there are several other ways the system could be enhanced:

Acknowledgements for Code and Ideas Borrowed

As mentioned in the main NoTorrent document, this project was originally intended to be a modification of the BitTorrent codebase. That changed, but the overall structure of NoTorrent is still loosely based on BitTorrent [1]. In particular, although NoTorrent messages are encoded in XML and BitTorrent messages are encoded in Bencoding ("BEE encoding"), I was studying the BitTorrent protocol [2] [3] closely when I designed the NoTorrent messages. Thus, the NoTorrent message design was significantly influenced by the BitTorrent message design.

As mentioned earlier, NoTorrent messages are encoded in XML. Encoding and decoding XML messages could potentially be a headache. Thanks to JDOM [4], however, encoding and decoding XML messages was very easy.

References

[1] Cohen, Bram. BitTorrent. 19 Dec. 2005 <http://www.bittorrent.com>.
[2] Bittorrent Protocol Specification v1.0. 13 Dec. 2005. 19 Dec. 2005 <http://wiki.theory.org/BitTorrentSpecification>.
[3] BitTorrent - Protocol. 19 Dec. 2005 <http://www.bittorrent.com/protocol.html>.
[4] JDOM. 19 Dec. 2005 <http://www.jdom.org>.
[5] RFC 3489. 19 Dec. 2005 <http://www.faqs.org/rfcs/rfc3489.html>.
[6] STUN. 19 Dec. 2005 <http://en.wikipedia.org/wiki/STUN>.

Last updated: 2005-12-20 by Howie Vegter, hrv2101 AT columbia DOT edu