On the Wire: Network Capture Tools for API Developers

Lane LiaBraaten, Google Data APIs team
June 2007

Introduction

Developing applications that interact with web services poses a unique set of problems. A common source of frustration is not knowing exactly what message was sent to the server, or what response was received. Some of the most difficult bugs to track down are caused by a disconnect between what we think we're sending to the server, and what is actually going across the wire.

This article introduces several tools that can help make the data on the wire more visible and useful. Commonly called "packet sniffers," these tools capture all network packets that move across your network interface. Examining the contents of these packets and the order in which they were sent and received can be a useful debugging technique.

An example: Retrieving a public feed

I'm putting together a cycling team for a charity ride, and have created a calendar for events such as info sessions, team fundraisers, and training rides. I've made this calendar public so team members and other riders can view the calendar and participate in the events. I also want to send out a newsletter with upcoming events, so rather than copying the information from the Google Calendar website, I can use the Google Calendar data API to query this calendar and retrieve events.

The Google Calendar API documentation has information about how to use the RESTful Google Data API to interact with my calendar programmatically. (Editor's Note: as of v3, Google Calendar API no longer uses the Google Data format.) The first thing to do is to get the calendar's event feed URL by clicking on the button on the calendar settings page:

http://www.google.com/calendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic

Using the Google Calendar documentation as a reference, I can retrieve and display calendar events like this, where PUBLIC_FEED_URL holds the event feed URL.

CalendarService myService = new CalendarService("exampleCo-fiddlerExample-1");
final String PUBLIC_FEED_URL = "http://www.google.com/calendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic";
URL feedUrl = new URL(PUBLIC_FEED_URL);
CalendarEventFeed resultFeed = myService.getFeed(feedUrl, CalendarEventFeed.class);

System.out.println("All events on your calendar:");
for (int i = 0; i < resultFeed.getEntries().size(); i++) {
  CalendarEventEntry entry = resultFeed.getEntries().get(i);
  System.out.println("\t" + entry.getTitle().getPlainText());
}
System.out.println();

This produces a basic list of the events on my calendar:

All events on your calendar:
    MS150 Training ride
    Meeting with Nicole
    MS150 Information session

The code snippet above displays the titles of the calendar events, but what about the rest of the data we received from the server? The Java client library doesn't make it simple to output a feed or entry as XML, and even if it did, the XML isn't the whole story. What about the HTTP headers that accompany the request? Was the query proxied or redirected? With more complex operations, these questions become increasingly important, especially when things go wrong and we get errors. Packet sniffing software can answer these questions by revealing the network traffic.

tcpdump

tcpdump is a command line tool that works on Unix-like platforms, but there is also a Windows port called WinDump. Like most packet sniffers, tcpdump puts your network card into promiscuous mode, which requires superuser privileges. To use tcpdump, just specify the network interface to listen on, and the network traffic will be sent to stdout:

sudo tcpdump -i eth0

If you run this command, you'll be bombarded by all sorts of network traffic, some of which you won't even recognize. You could just redirect the output to a file and grep through it later, but that could lead to some very large files. Most packet capture software has some filtering mechanisms built in so you only capture what you need.

tcpdump supports filtering based on various characteristics of network traffic. For example, you could tell tcpdump to only capture the traffic to or from your server on port 80 (HTTP messages) by inserting your server's hostname into the following expression:

dst or src host <hostname> and port 80

For each packet that matches the filter expression, tcpdump will display a timestamp, the source and destination of the packet, and several TCP flags. This information can be valuable because it shows the order in which packets were sent and received.

It is often useful to see the contents of the packets as well. The '-A' flag tells tcpdump to print each packet in ASCII, exposing the HTTP headers and message body. The '-s' flag specifies how many bytes of each packet to capture (where '-s 0' means to capture the whole packet, so the message body is not truncated).

Putting it all together we get the following command:

sudo tcpdump -A -s 0 -i eth0 dst or src host <hostname> and port 80

If you run this command, then execute the short Java example above, you'll see all the network communication involved in this operation. Among the traffic you'll see the HTTP GET request:

22:22:30.870771 IP dellalicious.mshome.net.4520 > po-in-f99.google.com.80: P 1:360(359) ack 1 win 65535
E.....@....\...eH..c...P.=.....zP......GET /calendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic HTTP/1.1
User-Agent: exampleCo-fiddlerExample-1 GCalendar-Java/1.0.6 GData-Java/1.0.10(gzip)
Accept-Encoding: gzip
Cache-Control: no-cache
Pragma: no-cache
Host: www.google.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

You'll also see the 200 OK response message that contains the Google Data feed. Notice that the feed is broken up among four packets:

22:22:31.148789 IP po-in-f99.google.com.80 > dellalicious.mshome.net.4520: . 1:1431(1430) ack 360 win 6432
E...1 ..2.I.H..c...e.P.....z.=.:P..M...HTTP/1.1 200 OK
Content-Type: application/atom+xml; charset=UTF-8
Cache-Control: max-age=0, must-revalidate, private
Last-Modified: Mon, 11 Jun 2007 15:11:40 GMT
Transfer-Encoding: chunked
Date: Sun, 24 Jun 2007 02:22:10 GMT
Server: GFE/1.3

13da
<?xml version='1.0' encoding='UTF-8'?><feed xmlns='http://www.w3.org/2005/Atom'
xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:gCal='http://sc
hemas.google.com/gCal/2005' xmlns:gd='http://schemas.google.com/g/2005'><id>http
://www.google.com/calendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.goo
gle.com/public/basic</id><updated>2007-06-11T15:11:40.000Z</updated><category sc
heme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/g/2
005#event'></category><title type='text'>MS150 Training Schedule</title><subtitl
e type='text'>This calendar is public</subtitle><link rel='http://schemas.google
.com/g/2005#feed' type='application/atom+xml' href='http://www.google.com/calend
ar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic'></
link><link rel='self' type='application/atom+xml' href='http://www.google.com/ca
lendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic
?max-results=25'></link><author><name>Lane LiaBraaten</name><email>api.lliabraa@
gmail.com</email></author><generator version='1.0' uri='http://www.google.com/ca
lendar'>Google Calendar</generator><openSearch:totalRe


22:22:31.151501 IP po-in-f99.google.com.80 > dellalicious.mshome.net.4520: . 1431:2861(1430) ack 360 win 6432
E...1!..2.I.H..c...e.P.......=.:P.. 2...sults>3</openSearch:totalResults><openSe
arch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch
:itemsPerPage><gd:where valueString=''></gd:where><gCal:timezone value='America/
Los_Angeles'></gCal:timezone><entry><id>http://www.google.com/calendar/feeds/24v
j3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic/dgt40022cui2k3j
740hnj46744</id><published>2007-06-11T15:11:05.000Z</published><updated>2007-06-
11T15:11:05.000Z</updated><category scheme='http://schemas.google.com/g/2005#kin
d' term='http://schemas.google.com/g/2005#event'></category><title type='text'>M
S150 Training ride</title><summary type='html'>When: Sat Jun 9, 2007 7am to 10am

&amp;nbsp; PDT&lt;br&gt;   &lt;br&gt;Event Status:     confirmed</summary><conte
nt type='text'>When: Sat Jun 9, 2007 7am to 10am&amp;nbsp; PDT&lt;br&gt;   &lt;b
r&gt;Event Status:     confirmed</content><link rel='alternate' type='text/html'
 href='http://www.google.com/calendar/event?eid=ZGd0NDAwMjJjdWkyazNqNzQwaG5qNDY3
NDQgMjR2ajNtNXBsMTI1YmgyaWpiYm5laDk1M3NAZw' title='alternate'></link><link rel='
self' type='application/atom+xml' href='http://www.google.com/calendar/feeds/24v
j3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic/dgt40022cui2k3j
740hnj46744'></link><author><name>MS150 Training Schedule</name></author><gCal:s
endEventNotifications value='false'></gCal:sendEventNotifications></entry><entry

><id>http://www.google.com/cal

22:22:31.153097 IP po-in-f99.google.com.80 > dellalicious.mshome.net.4520: . 2861:4291(1430) ack 360 win 6432
E...1#..2.I.H..c...e.P.......=.:P.. ....endar/feeds/24vj3m5pl125bh2ijbbneh953s%4
0group.calendar.google.com/public/basic/51d8kh4s3bplqnbf1lp6p0kjp8</id><publishe
d>2007-06-11T15:08:23.000Z</published><updated>2007-06-11T15:10:39.000Z</updated
><category scheme='http://schemas.google.com/g/2005#kind' term='http://schemas.g
oogle.com/g/2005#event'></category><title type='text'>Meeting with Nicole</title

><summary type='html'>When: Mon Jun 4, 2007 10am to 11am&amp;nbsp; PDT&lt;br&gt;
  &lt;br&gt;Where: Conference Room B &lt;br&gt;Event Status:     confirmed</summ
ary><content type='text'>When: Mon Jun 4, 2007 10am to 11am&amp;nbsp; PDT&lt;br&
gt;  &lt;br&gt;Where: Conference Room B &lt;br&gt;Event Status:     confirmed

&lt;br&gt;Event Description: Discuss building cycling team for MS150</content><l
ink rel='alternate' type='text/html' href='http://www.google.com/calendar/event?
eid=NTFkOGtoNHMzYnBscW5iZjFscDZwMGtqcDggMjR2ajNtNXBsMTI1YmgyaWpiYm5laDk1M3NAZw'
title='alternate'></link><link rel='self' type='application/atom+xml' href='http
://www.google.com/calendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.calendar.goo
gle.com/public/basic/51d8kh4s3bplqnbf1lp6p0kjp8'></link><author><name>MS150 Trai
ning Schedule</name></author><gCal:sendEventNotifications value='false'></gCal:s
endEventNotifications></entry><entry><id>http://www.google.com/calendar/feeds/24
vj3m5pl125bh2ijbbneh953s%40group.calendar.google.com/public/basic/va41amq3r08dhh
kpm3lc1abs2o</id><published>20


22:22:31.190244 IP po-in-f99.google.com.80 > dellalicious.mshome.net.4520: P 4291:5346(1055) ack 360 win 6432
E..G1$..2.K.H..c...e.P.....<.=.:P.. ....07-06-11T15:10:08.000Z</published><updat
ed>2007-06-11T15:10:08.000Z</updated><category scheme='http://schemas.google.com
/g/2005#kind' term='http://schemas.google.com/g/2005#event'></category><title ty
pe='text'>MS150 Information session</title><summary type='html'>When: Wed Jun 6,
 2007 4pm to Wed Jun 6, 2007 5pm&amp;nbsp; PDT&lt;br&gt;   &lt;br&gt;Event Statu
s:     confirmed</summary><content type='text'>When: Wed Jun 6, 2007 4pm to Wed
Jun 6, 2007 5pm&amp;nbsp; PDT&lt;br&gt;   &lt;br&gt;Event Status:     confirmed<

/content><link rel='alternate' type='text/html' href='http://www.google.com/cale
ndar/event?eid=dmE0MWFtcTNyMDhkaGhrcG0zbGMxYWJzMm8gMjR2ajNtNXBsMTI1YmgyaWpiYm5la
Dk1M3NAZw' title='alternate'></link><link rel='self' type='application/atom+xml'
 href='http://www.google.com/calendar/feeds/24vj3m5pl125bh2ijbbneh953s%40group.c
alendar.google.com/public/basic/va41amq3r08dhhkpm3lc1abs2o'></link><author><name
>MS150 Training Schedule</name></author><gCal:sendEventNotifications value='fals
e'></gCal:sendEventNotifications></entry></feed>

This output includes all the HTTP headers and content, as well as several cryptic TCP flags. All the data is present here, but it's kind of hard to read and understand. There are several graphical tools that make it easier to view this data.
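
To make the summary lines a little less cryptic, here is a minimal Java sketch that pulls the payload length out of each line in the format shown above and decodes the hexadecimal chunk-size line (13da) that precedes the feed. The class name, method names, and regular expression are assumptions made for illustration. They match the output format captured in this article, and other tcpdump versions may print these fields differently.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CaptureMath {
    // Matches the "start:end(length)" field of a summary line, e.g. " 1:1431(1430) ".
    private static final Pattern SEQ = Pattern.compile("\\s(\\d+):(\\d+)\\((\\d+)\\)\\s");

    /** Returns the TCP payload length reported on one summary line, or 0 if none is found. */
    static int payloadLength(String summaryLine) {
        Matcher m = SEQ.matcher(summaryLine);
        return m.find() ? Integer.parseInt(m.group(3)) : 0;
    }

    /** Adds up the payload lengths of several summary lines. */
    static int totalPayload(List<String> summaryLines) {
        int total = 0;
        for (String line : summaryLines) {
            total += payloadLength(line);
        }
        return total;
    }

    /** Decodes an HTTP/1.1 chunk-size line, which is written in hexadecimal. */
    static int chunkSize(String chunkSizeLine) {
        return Integer.parseInt(chunkSizeLine.trim(), 16);
    }
}
```

For the four response packets above, the payload lengths are 1430 + 1430 + 1430 + 1055 = 5345 bytes, and chunkSize("13da") is 5082. The remaining 263 bytes are accounted for by the response headers and the chunked-encoding framing around the feed.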

Wireshark (formerly Ethereal)

Figure: Screen capture of Wireshark. Wireshark displays network traffic in several ways.

Wireshark is a graphical tool built with libpcap, the same library that tcpdump is built on, and is available on Linux, Mac OS X, and Windows. Wireshark's GUI enables several new ways of interpreting and interacting with packet capture data. For example, as packets are captured from your network interface, they are displayed in different colors based on the protocol they are using. You can also sort the traffic by timestamp, source, destination, and protocol.

If you select a row in the list of packets, Wireshark will display the IP, TCP, and other protocol-specific information from the packet headers in a human-readable tree. The data is also displayed in hex and ASCII at the bottom of the screen.

While the visual nature of Wireshark makes network traffic easier to comprehend, you'll still want to filter network traffic in most cases. Wireshark has robust filtering capabilities, including support for hundreds of protocols.

TIP: To view the available protocols and build complex filters, click the button near the top of the Wireshark window.

To recreate the filter used in the tcpdump example above, you can insert the following expression into the Wireshark filter box:

ip.addr==<your IP address> && tcp.port==80

Or leverage Wireshark's knowledge of HTTP:

ip.addr==<your IP address> && http

This will filter the results of your capture to just the packets involved in this interaction with the Google Calendar server. You can click on each packet to see the contents and piece together the transaction.

TIP: You can right-click on one of the packets and choose "Follow TCP Stream" to display the requests and responses sequentially in a single window.

Wireshark provides several ways to save your capture information. You can save one, some, or all of the packets. If you're viewing a TCP stream, you can simply click the "Save As" button to save only the relevant packets. You can also import the output from a tcpdump capture and view it in Wireshark.

A problem: SSL and encryption

A common shortcoming of packet capture tools is the inability to view data that is encrypted over an SSL connection. The above example accesses a public feed, so SSL isn't necessary. However, if the example accessed a private feed, the client would need to authenticate with the Google authentication service, which does require an SSL connection.

The following snippet is similar to the previous example, but here the CalendarService requests the user's calendar metafeed, which is a private feed that requires authentication. To authenticate, just call the setUserCredentials method. This method triggers an HTTPS request to the ClientLogin service and grabs the authentication token out of the response. The CalendarService object will then include the authentication token in all subsequent requests.

CalendarService myService = new CalendarService("exampleCo-fiddlerSslExample-1");
myService.setUserCredentials(username, userPassword);
final String METAFEED_URL = "http://www.google.com/calendar/feeds/default";
URL feedUrl = new URL(METAFEED_URL);
CalendarFeed resultFeed = myService.getFeed(feedUrl, CalendarFeed.class);

System.out.println("Your calendars:");
for (int i = 0; i < resultFeed.getEntries().size(); i++) {
  CalendarEntry entry = resultFeed.getEntries().get(i);
  System.out.println("\t" + entry.getTitle().getPlainText());
}
System.out.println();

Consider the network traffic required to authenticate and access a private Google Data API feed:

  1. Submit user credentials to ClientLogin service
    • Send an HTTP POST to https://www.google.com/accounts/ClientLogin with the following parameters in the message body:
      • Email - the user's email address.
      • Passwd - the user's password.
      • source - identifies your client application. Should take the form companyName-applicationName-versionID. The example above uses the name exampleCo-fiddlerSslExample-1.
      • service - the Google Calendar service name is 'cl'.
  2. Receive the authorization token
    • If the authentication request fails, you'll receive an HTTP 403 Forbidden status code.
    • If it succeeds, then the response from the service is an HTTP 200 OK status code, plus three long alphanumeric codes in the body of the response: SID, LSID, and Auth. The Auth value is the authorization token.
  3. Request private calendar metafeed
    • Send an HTTP GET to http://www.google.com/calendar/feeds/default with the following header:
      • Authorization: GoogleLogin auth=<yourAuthToken>
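
The pieces of this exchange can be sketched in Java without touching the network. This is only an illustration of the steps above, not the client library's implementation: the class and method names are hypothetical, the request body is assumed to be ordinary form encoding, and the response is assumed to list SID, LSID, and Auth as one key=value pair per line.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class ClientLoginSketch {
    /** Builds the POST body from step 1 (assumed to be form-encoded). */
    static String buildRequestBody(String email, String password, String source, String service) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("Email", email);
        params.put("Passwd", password);
        params.put("source", source);
        params.put("service", service);
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> p : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(p.getKey()).append('=').append(encode(p.getValue()));
        }
        return body.toString();
    }

    /** Extracts the Auth value from step 2; returns null if no Auth line is present. */
    static String parseAuthToken(String responseBody) {
        for (String line : responseBody.split("\\r?\\n")) {
            if (line.startsWith("Auth=")) {
                return line.substring("Auth=".length()).trim();
            }
        }
        return null;
    }

    /** Builds the Authorization header value sent with the GET in step 3. */
    static String authorizationHeader(String authToken) {
        return "GoogleLogin auth=" + authToken;
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Comparing strings like these with what actually appears in a capture is a quick way to spot a mismatch between what you think you're sending and what is on the wire.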
      

Try running this snippet and viewing the network traffic in Wireshark (using 'http || ssl' as a filter). You'll see the SSL and TLS packets involved in the transaction, but the ClientLogin request and response are encrypted inside the "Application Data" packets. Don't worry, next we'll look at a tool that can actually reveal this encrypted information.

Fiddler

Fiddler is another graphical packet sniffing tool, but it behaves quite differently from the tools presented so far. Fiddler acts as a proxy between your application and the remote services that you're interacting with, effectively becoming a man-in-the-middle. Fiddler establishes an SSL connection with both your application and the remote web service, decrypting traffic from one endpoint, capturing the plaintext, and re-encrypting it before sending it on. Unfortunately, Fiddler is only available for Windows (sorry to all you Mac and Linux users).

Note: SSL support requires Fiddler version 2 and the .NET Framework version 2.0.

Viewing network traffic in Fiddler is mostly done through the Session Inspector tab. The sub-tabs most useful for debugging issues with the Google Data APIs are:

  • Headers - shows the HTTP headers in a collapsible tree format.
  • Auth - shows the Authentication headers.
  • Raw - shows the contents of network packets in ASCII text.

TIP: Click the icon in the lower left corner of the Fiddler window to turn capturing on and off.

Fiddler uses the .NET Framework to configure network connections to use Fiddler as a proxy. This means that any connections you make with Internet Explorer or with .NET code will appear in Fiddler by default. However, the traffic from the Java sample above won't show up because Java has a different way of setting up HTTP proxies.

In Java, you can set the HTTP proxy using system properties. Fiddler runs on port 8888, so for a local installation you can make Java code use Fiddler as a proxy for HTTP and HTTPS by adding these lines:

System.setProperty("http.proxyHost", "localhost");
System.setProperty("http.proxyPort", "8888");
System.setProperty("https.proxyHost", "localhost");
System.setProperty("https.proxyPort", "8888");
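
If you only want the proxy while debugging, the same four properties can be wrapped in a small helper. This is a sketch with a hypothetical class name (FiddlerProxy); it uses only the standard System.setProperty and System.clearProperty calls.

```java
final class FiddlerProxy {
    private static final String[] KEYS = {
        "http.proxyHost", "http.proxyPort", "https.proxyHost", "https.proxyPort"
    };

    /** Routes HTTP and HTTPS connections opened by this JVM through the given proxy. */
    static void enable(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", String.valueOf(port));
    }

    /** Clears the proxy properties again. */
    static void disable() {
        for (String key : KEYS) {
            System.clearProperty(key);
        }
    }
}
```

Calling FiddlerProxy.enable("localhost", 8888) before creating the CalendarService sets the same properties as the four lines above. The properties can also be supplied on the command line with -D options (for example -Dhttp.proxyHost=localhost), which leaves the code itself unchanged.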

If you run the sample with these lines, you'll actually get a nasty stack trace from the Java security package:

[java] Caused by:
  sun.security.validator.ValidatorException: PKIX path building failed:
  sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Figure: Screen capture of Fiddler. Fiddler can decrypt and display SSL traffic.

This error occurs when the certificate returned from the server in an SSL connection can't be verified. In this case, the bad certificate is coming from Fiddler, acting as a man-in-the-middle. Fiddler generates certificates on the fly, and since Fiddler isn't a trusted issuer, these certificates will cause Java to fail when setting up the SSL connection.

Note: When Fiddler is running, any SSL connection you make in Internet Explorer will trigger a 'Security Alert' asking if you want to proceed despite the sketchy certificate. You can click 'View Certificate' to see the certificate that Fiddler generated.

So how can you get around this security exception? Basically, you need to reconfigure Java's security framework to trust all certificates. Luckily, you don't have to reinvent the wheel here: check out Francis Labrie's solution and add a call to the SSLUtilities.trustAllHttpsCertificates() method to the example above.
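
For readers curious about what such a utility does, the general technique can be sketched with the standard javax.net.ssl classes. This is not Francis Labrie's code: the class name DebugTrustAll is hypothetical, and the sketch assumes the client opens its connections through HttpsURLConnection. Because it accepts any certificate, it belongs in debugging sessions only.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

final class DebugTrustAll {
    /** A trust manager that accepts every certificate chain without checking it. */
    static final X509TrustManager TRUST_ALL = new X509TrustManager() {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: nothing is rejected.
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: this is what lets Fiddler's generated certificates through.
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    };

    /** Makes HttpsURLConnection use the trust-all manager for connections opened afterwards. */
    static void install() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, new TrustManager[] { TRUST_ALL }, new SecureRandom());
            HttpsURLConnection.setDefaultSSLSocketFactory(context.getSocketFactory());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not install trust-all SSL context", e);
        }
    }
}
```

A call to DebugTrustAll.install() would go before setUserCredentials, since that is the call that triggers the HTTPS request.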

Once you've configured Java to use Fiddler as a proxy and disabled the default certificate verification, you can run the example and see all the traffic that is sent over the wire in plaintext. Don't steal my password!

Remember, this authentication transaction is just one small example of SSL traffic. Some web applications use SSL connections exclusively, so debugging HTTP traffic is out of the question without a way to decrypt the data.

Conclusion

tcpdump is available on Linux and Mac OS X, and on Windows as the WinDump port. It is a great tool when you know what you're looking for and just need a quick capture, although graphical tools present the network traffic in formats that are easier to comprehend. tcpdump has many more options and filtering capabilities than those covered here. For a full description of tcpdump's functionality, type 'man tcpdump' or visit the tcpdump man page online.

Wireshark is also available on Linux, Mac OS X, and Windows. Built-in support for hundreds of protocols makes Wireshark a useful tool for many applications, not just HTTP debugging. This introduction barely scratches the surface of Wireshark's many capabilities. For more information, type "man wireshark" or visit the Wireshark website.

Fiddler also has a lot of great features, but what sets it apart is its ability to decrypt SSL traffic. For more information visit the Fiddler2 website.

These packet sniffing applications are great tools to have in your toolbelt, and observant readers will have noticed that they're all free! Next time you're working with the Google APIs and you see something fishy, pull out one of these network analyzers and take a closer look at what's on the wire. If you can't find the problem, you can always post a question to our discussion group. Including the relevant network messages will help others understand and diagnose your particular issue.

Good luck and happy sniffing!

Resources