Simulating Network Problems Using the iptables Command

October 19, 2009

In a distributed system, services communicate with each other constantly, and network issues or hiccups occur frequently.  A well-designed service consumer should be able to handle connection failures and timeouts gracefully.  The problem is magnified many times over when a service client is time-bound and expects responses to come back in milliseconds.  So what are the general guidelines or best practices for dealing with the issues outlined above?

Here is a short list of possible solutions.  One must carefully pick the right solution for one’s specific application needs.

  • Exponential backoff – more info
  • Denied access gate with background pinging thread
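
As a rough illustration of the first option, here is a minimal shell sketch of exponential backoff.  The function names backoff_delay and retry_with_backoff, the 1 second base delay, and the 30 second cap are assumptions made for this example; they do not come from any particular library.

```shell
# Hypothetical helper: print the delay in seconds before retry number
# ATTEMPT (0-based). The delay doubles each time, starting from BASE
# (default 1) and never exceeding MAX (default 30).
backoff_delay() {
  bd_attempt=$1
  bd_base=${2:-1}
  bd_max=${3:-30}
  bd_delay=$(( bd_base * (1 << bd_attempt) ))
  if [ "$bd_delay" -gt "$bd_max" ]; then
    bd_delay=$bd_max
  fi
  echo "$bd_delay"
}

# Hypothetical wrapper: run a command up to MAX_TRIES times, sleeping
# for an exponentially growing delay after each failed try.
retry_with_backoff() {
  rb_max_tries=$1
  shift
  rb_failures=0
  until "$@"; do
    rb_failures=$(( rb_failures + 1 ))
    if [ "$rb_failures" -ge "$rb_max_tries" ]; then
      return 1   # give up: every try failed
    fi
    sleep "$(backoff_delay $(( rb_failures - 1 )))"
  done
  return 0
}
```

For example, retry_with_backoff 5 curl -sf http://<remote host ip>:<remote port>/ would try up to five times, waiting 1, 2, 4, and 8 seconds between tries.  A real client would probably also add some random jitter to the delay so that many clients do not retry in lockstep.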

Regardless of which solution is chosen, what is a good way to simulate the network issues?  This is where the “iptables” command comes in.  Network administrators generally use “iptables” to administer the tables of IP packet filter rules.  The rule target that is useful for our purpose is “DROP”, which drops the packets on the floor.  To simulate a network connection problem, we can set up a filter rule for a specific host so that all packets destined for that host are dropped on the floor.

In short, here is the command to set up such a filter rule:

sudo iptables -A OUTPUT -p tcp -d <remote host ip> --dport <remote port> -j DROP

When testing is done, make sure to remove the packet filter rule with the following command:

sudo iptables -D OUTPUT -p tcp -d <remote host ip> --dport <remote port> -j DROP
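
Since it is easy to forget the cleanup step, the add and delete commands can be paired in a small wrapper.  This is only a sketch: the function name with_dropped_traffic and the IPTABLES override variable are made up for this example, and it assumes sudo is available when no override is given.

```shell
# Command used to invoke iptables. Override it for a dry run, for
# example IPTABLES="echo iptables" (hypothetical convenience variable).
IPTABLES=${IPTABLES:-"sudo iptables"}

# Hypothetical helper: drop outbound TCP traffic to HOST:PORT while a
# test command runs, then remove the rule whether or not the command
# succeeded. Returns the exit status of the test command.
with_dropped_traffic() {
  wd_host=$1
  wd_port=$2
  shift 2
  # $IPTABLES is left unquoted on purpose so "sudo iptables" splits
  # into two words.
  $IPTABLES -A OUTPUT -p tcp -d "$wd_host" --dport "$wd_port" -j DROP || return 1
  "$@"
  wd_rc=$?
  $IPTABLES -D OUTPUT -p tcp -d "$wd_host" --dport "$wd_port" -j DROP
  return "$wd_rc"
}
```

Usage would look like with_dropped_traffic <remote host ip> <remote port> followed by whatever command starts the client under test.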

Now we know how to simulate a network connection issue, and this should help in testing the error handling code for connection failures.


Setting Up Hadoop On Windows

October 1, 2008

I recently had to learn Hadoop, and getting Hadoop running on Windows Vista was not as straightforward as I thought.  I believe getting Hadoop up and running on Linux is much easier.  I would like to use this blog to share my experiences, and hopefully it will help the next person who is trying to do the same thing.

My goal was to be able to:
1) Run some of the examples that came with Hadoop
2) Run a MapReduce Java program in Eclipse and be able to debug it

First Problem:

Hadoop comes with a set of shell scripts, so the first thing to do is download and install Cygwin.  Since the scripts were written for some Unix variant, they will not work out of the box.  When I tried the ‘bin/hadoop’ command, I got the following:


./bin/hadoop
./bin/hadoop: line 18: $'\r': command not found
: No such file or directory./bin
./bin/hadoop: line 21: $'\r': command not found
: No such file or directorydrive/c/tools/hadoop-0.14.4

Apparently this is a common problem, and it is related to the newline differences between Windows and Unix.  Windows uses two characters (\r\n) and Unix uses one character (\n).  Basically you need to run the command ‘dos2unix’ on Hadoop’s scripts or use your favorite Unix command to strip out the ‘\r’, e.g. sed $'s/\r$//' <file name>
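
If ‘dos2unix’ is not installed, something along these lines should do the same job with ‘tr’.  The function name strip_cr is made up for this sketch, and note that it removes every ‘\r’ in the file, not only the ones at line ends, which is fine for shell scripts.

```shell
# Hypothetical helper: rewrite FILE in place with every carriage return
# removed. Writing back with cat keeps the file's existing permissions,
# so executable scripts stay executable.
strip_cr() {
  sc_tmp=$(mktemp) || return 1
  tr -d '\r' < "$1" > "$sc_tmp" && cat "$sc_tmp" > "$1"
  sc_rc=$?
  rm -f "$sc_tmp"
  return "$sc_rc"
}

# Example use from the Hadoop install directory (the bin/ path is an
# assumption about where the scripts live):
#   for f in bin/*; do [ -f "$f" ] && strip_cr "$f"; done
```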

Once this problem was resolved, I was able to run the examples that came with Hadoop, like WordCount or Grep, inside the Cygwin shell.

Second Problem:

My second goal was to run one of Hadoop’s examples inside Eclipse.  When I tried to do this, I got an exception while Hadoop was trying to create a process: ‘CreateProcess error=2’, and the command was something like ‘df -k’.

This was frustrating because I was able to run the examples in the Cygwin shell.  It turned out the MapReduce framework was trying to execute the command ‘df -k’, and error 2 means that Windows could not find the executable.  Once I added the Cygwin path to the Vista PATH environment variable, the problem went away.  It was great!!  Now I can actually step through the code.  As a developer, this is very important.

I am looking forward to sharing my Hadoop experience as I learn more about it.

Hadoop is a very powerful piece of technology and often power comes with complexity.