Friday, August 19, 2005

Bandwidth Drain

Over the past couple of months I noticed that my website was using increasingly more bandwidth. In other words, the amount of data served by my site to the world was growing. While I was delighted that I seemed to get more visitors every month, I couldn't really see why. This is just a personal website with very few visitors. The site's usage analysers showed that my main page, blog and master's thesis pages were the three most popular pages, but the hit and byte usage numbers didn't add up.

Last month I hit my 7 GB bandwidth limit. When this happens the site simply ceases to function. So, for about 4 days my website was offline (bad, but not critical) and we (Adele and I) couldn't send or receive email through the pshymorphic.com domain. Since pshymorphic.com is our primary personal email server, this was quite annoying! All messages sent to us just bounced (probably pissed off all the spammers, hehe).

At the start of this month, when the domain was online again, I started monitoring the usage statistics more carefully. The site was transferring about 150-250 MB per day. That is a hell of a lot of data, but neither Urchin nor Webalizer could tell me what the cause was.

About two weeks ago I decided to get to the bottom of this problem. I went through the Webalizer and Urchin reports in great detail and finally found a clue. Urchin reported that files with the "gif/jpeg" extension were contributing the most to bandwidth usage. I made a few calculations (bytes/hits) for this category and realised that the problem must be due to a single file that is at least 500 KB in size. I soon found that the offending picture was my somewhat famous GothicBackground. I had pretty much forgotten that it was on my site, since I posted a link to this file more than two years ago in my old blog (see 10 March 2003 entry). Amazing how cool the net is.
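The bytes/hits reasoning can be sketched in a few lines of Python. The figures below are purely illustrative (the real Urchin numbers are not given in this post), and `mean_bytes_per_hit` is a hypothetical helper name. The idea is that the average transfer per hit in a category is a lower bound on the size of the biggest file in it: if the average is around 500 KB, at least one file must be that large.

```python
def mean_bytes_per_hit(total_bytes: int, hits: int) -> float:
    """Average transfer size per request for one report category."""
    if hits <= 0:
        raise ValueError("hits must be positive")
    return total_bytes / hits


# Hypothetical figures, NOT taken from the actual Urchin report.
category_bytes = 4_000 * 1024 * 1024  # 4,000 MB served as gif/jpeg
category_hits = 8_000                 # requests in that category

average_kb = mean_bytes_per_hit(category_bytes, category_hits) / 1024
print(f"average transfer: {average_kb:.0f} KB per hit")
```

A category full of small icons and thumbnails would normally average only a few KB per hit, so an average this high points at one large file being requested over and over.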

The disappointing thing is that neither Urchin nor Webalizer showed this file in their "Top Pages" or file lists. I'm not sure why; maybe it is a configuration thing. In retrospect, I should have simply monitored the 'access_log' file directly. I would have seen immediately how often the picture was accessed and how big it actually is (560 KB).
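For what it's worth, here is a minimal sketch of what "monitoring the access_log directly" could look like. It assumes the log uses the Apache common log format (host, identity, user, timestamp, request line, status, bytes), which may not match every host's configuration. The sample lines and the path `/pics/big.jpg` are made up for illustration, and `bytes_per_path` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the Apache common log format; combined-format lines also match,
# because anything after the byte count is ignored.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def bytes_per_path(lines):
    """Return (bytes served per path, hits per path) as two Counters."""
    totals = Counter()
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        size = match.group("size")
        totals[match.group("path")] += 0 if size == "-" else int(size)
        hits[match.group("path")] += 1
    return totals, hits


# Made-up sample lines; a real run would pass an open access_log file.
sample = [
    '192.0.2.1 - - [01/Aug/2005:10:00:00 +0200] "GET /pics/big.jpg HTTP/1.1" 200 573440',
    '192.0.2.2 - - [01/Aug/2005:10:00:05 +0200] "GET /index.html HTTP/1.1" 200 4096',
    '192.0.2.3 - - [01/Aug/2005:10:00:09 +0200] "GET /pics/big.jpg HTTP/1.1" 200 573440',
    '192.0.2.4 - - [01/Aug/2005:10:00:12 +0200] "GET /pics/big.jpg HTTP/1.1" 304 -',
]

totals, hits = bytes_per_path(sample)
for path, total in totals.most_common():
    print(f"{path}: {hits[path]} hits, {total / 1024:.0f} KB")
```

Sorting by total bytes rather than by hits is the point here: a single large file can dominate the bandwidth without ever showing up in a "most visited pages" list.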

I now knew what the cause of the problem was, but a new dilemma presented itself. Who the hell is using the picture? The picture was being requested about 400 times a day, and at 560 KB per request that works out to roughly 220 MB a day, right in line with the daily totals. That is obviously not people downloading the picture one by one. Someone, with a fairly popular site, is referencing the picture directly! I could delete the picture to solve the problem, but I didn't want to do that. I wanted to know who's using the picture, but it is impossible (as far as I know) to get the referrer information from the limited data at my disposal.
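Whether the referrer is available depends on how the server writes its log. Apache's "combined" log format appends the Referer header and the user agent to each line, while the plain "common" format does not. The sketch below assumes combined-format lines are available, which does not seem to have been the case here. The sample lines, the path `/pics/big.jpg` and the site `band.example.org` are invented for illustration, and `referers_for` is a hypothetical helper.

```python
import re
from collections import Counter

# Request line, status, byte count, then the quoted Referer and user agent
# that the combined log format adds.
COMBINED = re.compile(
    r'"(?:\S+) (?P<path>\S+)[^"]*" \d{3} (?:\d+|-) '
    r'"(?P<referer>[^"]*)" "[^"]*"'
)


def referers_for(lines, path):
    """Count the Referer values seen on requests for one path."""
    counts = Counter()
    for line in lines:
        match = COMBINED.search(line)
        if match and match.group("path") == path:
            counts[match.group("referer")] += 1
    return counts


# Made-up sample lines in combined format.
sample_combined = [
    '192.0.2.1 - - [01/Aug/2005:10:00:00 +0200] "GET /pics/big.jpg HTTP/1.1" '
    '200 573440 "http://band.example.org/" "Mozilla/5.0"',
    '192.0.2.2 - - [01/Aug/2005:10:00:05 +0200] "GET /pics/big.jpg HTTP/1.1" '
    '200 573440 "http://band.example.org/" "Mozilla/4.0"',
    '192.0.2.3 - - [01/Aug/2005:10:00:09 +0200] "GET /pics/big.jpg HTTP/1.1" '
    '200 573440 "-" "Mozilla/5.0"',
    '192.0.2.4 - - [01/Aug/2005:10:00:12 +0200] "GET /index.html HTTP/1.1" '
    '200 4096 "-" "Mozilla/5.0"',
]

for referer, count in referers_for(sample_combined, "/pics/big.jpg").most_common():
    print(f"{count:4d}  {referer}")
```

A referrer of "-" means the browser sent no Referer header, so the tally only ever gives a partial picture, but a hotlinking site would usually stand out at the top of the list.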

I tried searching for the file on Google, Yahoo and MSN, but no luck. I did find links to the file on a few sites, so a few people liked it enough to download it :)

All is not lost. I'm an engineer and engineers solve problems. I devised a sneaky plan (I can be very sneaky sometimes). I used Gimp to create a new image. This image contains a request to the viewer to email me with information about where the image was seen. I uploaded the picture to my site and anxiously waited for the emails to stream in. Nothing happened :(

Finally, after about a week, a friendly soul (thanks Nicole!) sent me an email with the info I was looking for. The website belongs to a band called "Dimond in the Ruff". I'm surprised that the owner(s) didn't notice the new background.

Now that I've got all the info I need, I must decide what to do. I don't have a problem with someone using the picture, but they could at least download it to their own site. For a minimal fee I might even serve a smaller version of the picture to the site :) My evil side contemplated replacing the image with the infamous Goatse "art work". For the uninitiated, read the story about Goatse at Wikipedia (no picture there, I promise). This plan got a few good laughs from Rob; he thinks I should do it. For now, I'm going to wuss out on that plan. I'll decide what to do over the weekend.
