Archive for the ‘Search Engine Optimization’ Category

Google Analytics, if you’re not using it you should be

Monday, January 9th, 2006

I’ve been using Google Analytics for about a month to track 4 of my sites (and more recently this one as well) and I am impressed! The key (in my mind) to Google Analytics and other high-end commercial solutions like Fireclick and Omniture is that they use client-side, JavaScript-based tracking, which is far less prone to crawler inflation and provides more information (e.g. Flash version, screen resolution, color depth, etc.) than log-based solutions.

Sure, you lose the 1-2% of users that have JavaScript disabled, but that’s a lot more accurate than the 15-30% overcounts I’ve seen in many log-based solutions. If you’ve ever used AWStats on a high-traffic site you’ve likely observed the gross overcounting caused by crawlers that AWStats doesn’t know about, spam bots, etc.

In summary, Google Analytics offers a lot of the features of the high-end commercial solutions, such as site overlay (temporarily disabled), funnel analysis, conversion or goal tracking, cookie-based returning and unique visitor stats, detailed client stats, etc., and best of all it’s free. If you’ve got a personal website or are a small to medium-sized company you’d be crazy not to at least try it!

That said, there are a few features I would like to see in Google Analytics:

1. A Google Analytics desktop dashboard that gives me a quick overview of each of my sites at a glance so I wouldn’t have to log in each time.

2. Live traffic data. I haven’t found a way to get up-to-the-minute page view counts, which is critical for identifying massive traffic spikes when they happen instead of the next day.

3. Similar to 1 but an aggregate report on the web version that gives me an overview of each of my sites next to each other. As it is I have to drill down into each site to get even basic overview info.

4. The one big feature I would like to see is on-the-fly, drag-and-drop pathway analysis like Fireclick and Omniture both offer.

Cloaking, no need to be ashamed

Friday, December 2nd, 2005

Cloaking, the process of showing a user one view of your page and a
search engine crawler another view of your page, is (I believe) a
fairly common practice that’s been hotly debated in the past and
discouraged by search engines.

In reality I’ve seen sites use cloaking to strip navigation elements,
hide form data, disable multi-page results, etc… to increase search
engine ranking while at the same time decreasing page size for crawlers
and decreasing the number of pages that need to be crawled. Given that
crawlers represent over a third of page views for many high traffic sites,
cloaking has its economic advantages as well.

In my opinion there is nothing wrong with cloaking and it can be
used effectively as long as it’s not abused and the basic underlying
content and concept of a page stay the same. It’s a bad thing when used
to artificially increase keyword density or misrepresent the content of the
page to simply get page views.

I’ve always been a little hesitant to talk about effective cloaking
techniques with colleagues at other companies though. Cloaking
seems to be a dirty secret that most people feel they shouldn’t talk
about. Just observe how most people who work for a website that
economically depends on search engine traffic will publicly deny
cloaking.

Anyhow, I was surfing around with my user agent set to Googlebot to
see which other sites are engaging in a little cloakatude. Lo and
behold, Amazon behaves differently when you send it a user agent of
Googlebot. No search box in the top nav… makes sense… crawlers don’t
use forms anyhow, so why spend money on bandwidth to serve them a form
they’ll just ignore?
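
For illustration only, here is a minimal sketch of the kind of user agent check a site might use to decide whether to render the search box. This is not how Amazon or any other site actually does it. The class name, method names, and the short token list are all assumptions made up for this example; a real list of crawler user agents would be longer and would need maintaining.

```java
import java.util.Locale;

final class CrawlerDetector {
    // Hypothetical, deliberately short token list; real crawler
    // user agent strings vary and change over time.
    private static final String[] CRAWLER_TOKENS = {"googlebot", "slurp", "msnbot"};

    // Returns true if the User-Agent header contains a known crawler token.
    static boolean isKnownCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String token : CRAWLER_TOKENS) {
            if (ua.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Crawlers don't submit forms, so the search box is only rendered
    // for everything that is not a known crawler.
    static boolean shouldRenderSearchBox(String userAgent) {
        return !isKnownCrawler(userAgent);
    }
}
```

A page template could call `CrawlerDetector.shouldRenderSearchBox(...)` with the request's User-Agent header and skip the form markup when it returns false. The underlying page content stays the same either way.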

I found other major sites as well that were doing similar cloaking:
not bothering to serve ad code to known crawlers, removing unnecessary
content, etc… Anyhow, it just goes to show that some degree of
cloaking seems to be happening at many major websites. I for one think
it’s fine. Why spend tens of thousands in bandwidth serving up extra
pages or content that’s not relevant to crawlers anyhow?

Search Engine Friendly URLs in Java

Saturday, July 16th, 2005

At work I’ve been looking into doing search engine friendly URLs in Java. For those not in the know, a search engine friendly URL is of the form http://www.domain.com/foo/value1/value2 as opposed to the more typical approach of http://www.domain.com/foo?param1=value1&param2=value2.
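
To make the mapping between the two forms concrete, here is a rough sketch in plain Java that turns the friendly path into the query-string form with a regex. The class and method names are made up for this example, and it assumes exactly two values under `/foo`, as in the URLs above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FriendlyUrlRewriter {
    // Matches /foo/<value1>/<value2> with an optional trailing slash;
    // each value is a single non-empty path segment.
    private static final Pattern FRIENDLY =
            Pattern.compile("^/foo/([^/]+)/([^/]+)/?$");

    // Returns the query-string form of a friendly path, or the path
    // unchanged if it does not match the pattern. Values are copied
    // as-is, so any percent-encoding in the path is preserved.
    static String toQueryForm(String path) {
        Matcher m = FRIENDLY.matcher(path);
        if (!m.matches()) {
            return path;
        }
        return "/foo?param1=" + m.group(1) + "&param2=" + m.group(2);
    }
}
```

With this sketch, `FriendlyUrlRewriter.toQueryForm("/foo/value1/value2")` produces `/foo?param1=value1&param2=value2`, and a path that doesn’t match is passed through untouched.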

In my research so far there seem to be several approaches commonly used to solve this problem.

1. The most common seems to be URL rewriting using Apache’s mod_rewrite. In Java there’s also a servlet filter called the UrlRewriteFilter. I think if you’re going to do this for Java-backed pages, then using a servlet filter is definitely advantageous over using Apache to handle it, because it also works in a plain old Tomcat dev environment.

2. Wildcard servlet mapping is another possibility (without using a servlet filter). By mapping something like /* to your dispatcher servlet, you could use the path info in the MVC controller to pull out the parameters. I’m not crazy about this approach because it requires your MVC controllers to be smart and therefore use a non-standard approach to get request parameters.

3. The last option I’ve heard about is just writing your own servlet filter. For example, if you know that your search engine friendly URLs will always be of the format /foo/param1/value1/param2/value2/etc… then you could simply write a custom servlet filter that dissects the URI and fills in the request parameters. You could probably just do this with a regex using option 1 as well.
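
As a rough sketch of the dissecting step in option 3, the helper below splits a URI of the format /foo/param1/value1/param2/value2 into name/value pairs. It is plain Java with no servlet API so it can be tried on its own. The class name, method name, and `prefix` argument are assumptions for this example, and values are not percent-decoded here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PathParamParser {
    // Splits everything after the prefix into name/value pairs.
    // A trailing name with no value is ignored. Values are left
    // exactly as they appear in the URI (no percent-decoding).
    static Map<String, String> parse(String uri, String prefix) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (uri == null || !uri.startsWith(prefix)) {
            return params;
        }
        String rest = uri.substring(prefix.length());
        if (!rest.isEmpty() && !rest.startsWith("/")) {
            return params; // e.g. "/foobar/..." is not under "/foo"
        }
        // Collect the non-empty path segments after the prefix.
        List<String> segments = new ArrayList<String>();
        for (String segment : rest.split("/")) {
            if (!segment.isEmpty()) {
                segments.add(segment);
            }
        }
        // Pair them up: even positions are names, odd positions are values.
        for (int i = 0; i + 1 < segments.size(); i += 2) {
            params.put(segments.get(i), segments.get(i + 1));
        }
        return params;
    }
}
```

In an actual filter, one way to expose the resulting map would be to wrap the request in an `HttpServletRequestWrapper` whose `getParameter` consults it before passing the request down the chain, so the MVC controllers keep reading regular request parameters. That wiring is left out here to keep the sketch self-contained.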

So far I’m liking options 1 and 3 the most because they take a nice AOP style interception approach that allows you to code MVC controllers expecting regular request parameters. Anyhow, if you’ve heard of how other folks have solved this problem or have suggestions or ideas, I’d be curious to hear about it!