So, you have the idea for the next killer P2P app. How do you go about building it?
And testing it?
This is a chronicle (of some sorts) where I explore how one might
go about building and testing a P2P app. This is the path that I took and it may/not
be the best way for your needs, but in the least it should serve as a guide for some
of the difficulties and tradeoffs involved.
This is reverse chronological. So first, I describe how one can test and get some concrete
numbers on how the app might behave in the real world. Next (or before), how to simplify
the building process using middleware. And of course, the all important question of
which programming language to use ;-) Other `side' issues will be discussed as and when
appropriate.
Note: I just started writing out this and it'll be continually updated over time (hopefully :)
so, this page has the cliched `under construction indefinitely' tag.
Nothing is as important as getting that warm fuzzy feeling that your app will run
perfectly on an Internet scale. So, short of recruiting zombie machines, how does one
test a distributed app?
Spending a couple of thousand (green) bucks will get you the required hardware to set up
a fairly decent WAN (Wide Area Network) emulator. Such a setup
allows you to run hundreds of your application instances on `virtual nodes' and data traffic
is shaped underneath by the emulator in real time. No change is required on part of the application
to run on the emulator. I describe one such emulator, Modelnet, in more detail
here.
For Modelnet, you will need one beefy machine and as many smaller machines as you can afford. Plus,
a gigabit switch (a decent 16-port one costs around $200). In my current setup I have one Dell 2-cpu,
2.6Ghz machine and 4 Dell PowerEdge machines running at 2.8Ghz. The four Powerdge machines cost less
that $1,200 together! It is also worthwhile to beef up the RAM on the main emulator machine (2-4GB would
be neat). And make sure all machines have gigabit NIC cards.
An important point in deciding how much hardware you need for a good wamm fuzzy feeling is deciding what target machines the final P2P will run it: desktop machines on a LAN, PCs on DSL and cable, PCs on dial-up or all of the above. There are multiple constraints of network bandwidth on one machine, the router, CPU usage and RAM which dictate how many virtual nodes you can run on one physical machine and how many virtual nodes you can run in total. Plus, the nature of the appplication plays a role as well: is the application more I/O bound or CPU bound? There is no easy answer for this but an example may provide some rules of thumb.
Suppose you wanted to emulate nodes that all sit on 10Mbps interfaces. Then, if each machine has a gigabit
interface, you will be able to run 100 instances of your application on one machine (theoretically) before
the NIC bandwidth becomes the bottleneck. More than likely in this situation, running 100 instances of the
application on one machine will invoke the OS scheduler pretty often. This is more than likely to impact the
final results. Moreover, depending upon the application and the RAM space, the OS may actually place some
apps on the swap space: this is disastrous for timing results. So, when you run many instances of your app
on a node, the first thing you want to check is that the swap is not being hit.
So, you now figure out that you can run 25 processes on a node before you hit the swap space. This means,
the total data generated by all the virtual nodes is now 250Mbps; you still have 750Mbps leftover. The next
step is obvious: get three more machines and run 25 apps on each of them: now you are still under
the bandwidth bottlneck, not on the individual NIC cards but of that of the router. Plus, you are not hitting
the swap space.
Next, you check, the CPU usage on each of the individual PCs. Ideally, you would like to stay below 10% of
CPU usage on each machine. If your apps are CPU intensive and your CPU usage is high when you run 25 instances,
then you know what to do: run less number of instances on each machine and buy more machines.
In most cases, P2P apps are more I/O bound than CPU bound i.e. the network bandwidth is the usual bottleneck.
You would easily be able to run 40 application instances without significantly using up CPU. Additionally, the
TCP/IP stacks have improved considerably in scalability and performance; you can do large amounts of
data transfers with very little CPU help.