Some cool perl and WebScarab Stuff

I was using WebScarab to do some SessionID analysis and I exported its output to a comma-separated file. The output, though, was in a format that Microsoft Excel couldn’t plot. Using perl, I converted WebScarab’s timestamps into more compatible timestamps that I could plot with Excel.

So here’s a couple lines out of that file

1162494041997,11826346672417325953,6c30d8130bc05b6ec381
1162494042104,4008986413070164165,6c3053d2a0cb7e2a20c5
1162494042224,11293771226654801443,6c30cf10801d38497e23

The problem is that the first column, the timestamp, is not parseable by Excel. Worse than that, It’s not even a Unix timestamp. Traditional Unix timestamps are measured in seconds since 00:00 January 1, 1970. This number, 1162494041997 for example, is actually milliseconds since January 1, 1970.

So, in one line of perl, I’m going to convert all the timestamps to a format that Excel can grok. Something like: 11/02/2006 14:00:41.997, which is what 1162494041997 really is.

Here’s the one line of perl. My file from WebScarab is called JSESSIONID.txt:

perl -pi -e 'use POSIX qw(strftime); s/^(\d+)/strftime("%m\/%d\/%Y %H:%M:%S", localtime($1\/1000)) . "." . ($1 % 1000)/e;' JSESSIONID.txt</textarea>

I didn’t say it was a short line.

Here’s an explanation of the key components to that line. You don’t need to know this in order to use this perl code. If you’re wondering what it does, however, this will help you understand.

/^(\d+)/

Match some digits beginning at the beginning of the line.

strftime("%m\/%d\/%Y %H:%M:%S",

This is the format for the time to come out. Notice a couple things: It doesn’t have the fraction of a second (that is done separately) and it has backslashes to escape the forward slashes in the date. Otherwise those forward slashes would confuse the regular expression parser.

localtime($1\/1000)

This is where we take the number we found on the beginning of the line and divide it by 1000. That way, we get Unix seconds, which localtime likes.

. "." . ($1 % 1000)

Take the output of the strftime() call and append some stuff on the end. Namely, a literal “.” and then the milliseconds value from the time.That’s it! It’s not easy stuff, but it is wonderfully effective.

The three lines above get turned into:

11/02/2006 14:00:41.997,11826346672417325953,6c30d8130bc05b6ec381
11/02/2006 14:00:42.104,4008986413070164165,6c3053d2a0cb7e2a20c5
11/02/2006 14:00:42.224,11293771226654801443,6c30cf10801d38497e23

If you’re familiar with Excel, you’ll realize that Excel can handle those date formats just fine. Now you can plot the data in Excel, which gives me a bit more control than WebScarab does directly.