Search Engine Friendly URLs in Java

At work I’ve been looking into doing search engine friendly URLs in Java. For those not in the know, a search engine friendly URL is of the form http://www.domain.com/foo/value1/value2, as opposed to the more typical http://www.domain.com/foo?param1=value1&param2=value2.

In my research so far there seem to be several approaches commonly used to solve this problem.

1. The most common seems to be URL rewriting using Apache’s mod_rewrite. In Java there’s also a servlet filter called UrlRewriteFilter. I think if you’re going to do this for Java-backed pages, using a servlet filter is definitely advantageous over having Apache handle it, because the rewriting then also works in a plain old Tomcat dev environment.

2. Wildcard servlet mapping is another possibility (without using a servlet filter). By mapping something like /* to your dispatcher servlet, you could use the path info in the MVC controller to pull out the parameters. I’m not crazy about this approach because it requires your MVC controllers to be smart and therefore to use a non-standard approach to get request parameters.

3. The last option I’ve heard about is just writing your own servlet filter. For example, if you know that your search engine friendly URLs will always be of the format /foo/param1/value1/param2/value2/etc… then you could simply write a custom servlet filter that dissects the URI and fills in the request parameters. You could probably just do this with a regex using option 1 as well.
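To make option 3 a bit more concrete, here is a minimal sketch of just the path-dissecting part, assuming the /foo/param1/value1/param2/value2 layout described above. The class name FriendlyUrlParser and its parse method are made up for illustration; it deliberately has no servlet dependency, does no URL decoding, and ignores a trailing name that has no value. In a real filter you would presumably call something like this on the request URI and expose the result through a request wrapper (for example by extending HttpServletRequestWrapper and overriding getParameter), but that wiring is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
public class FriendlyUrlParser {

    /**
     * Splits a path such as /foo/param1/value1/param2/value2 into
     * name/value pairs. The first segment ("foo") is treated as the
     * resource name and skipped. Values are NOT URL-decoded.
     */
    public static Map<String, String> parse(String path) {
        Map<String, String> params = new LinkedHashMap<>();
        if (path == null) {
            return params;
        }
        // Drop leading/trailing slashes so split() yields no empty edge segments.
        String trimmed = path.replaceAll("^/+|/+$", "");
        if (trimmed.isEmpty()) {
            return params;
        }
        String[] segments = trimmed.split("/+");
        // segments[0] is the resource; name/value pairs start at index 1.
        // A trailing name without a value is ignored.
        for (int i = 1; i + 1 < segments.length; i += 2) {
            params.put(segments[i], segments[i + 1]);
        }
        return params;
    }

    public static void main(String[] args) {
        // prints {param1=value1, param2=value2}
        System.out.println(parse("/foo/param1/value1/param2/value2"));
    }
}
```

Keeping the parsing in a plain static method like this means it can be unit tested without a servlet container, and the MVC controllers still only ever see ordinary request parameters.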

So far I’m liking options 1 and 3 the most because they take a nice AOP-style interception approach that allows you to code MVC controllers expecting regular request parameters. Anyhow, if you’ve heard how other folks have solved this problem, or have suggestions or ideas, I’d be curious to hear about them!

This entry was posted in Java, Search Engine Optimization.

One Response to Search Engine Friendly URLs in Java

  1. robot says:

    I like Option 1 and I used it temporarily to 301-redirect old URLs which could disappear from SE indexes…
    Using a query string as part of the URL seems more natural. SEs index such pages.
    However, many SEs use specific algos and analyze long-term link statistics. And a static “path” plays a significant role…
    What to do?
    ?Computers is more natural for my site, and even better is
    ?c=12345
    “?c=12345” can end up as long static path:
    /Computer/Notebook/Lenovo/PROM/Microsoft/Office
    (there are products in this category!)
    and even much longer.
    I can’t believe it can play a role for SEO in 2007. Anchor text and URL matter, but not the “path” inside the URL. SEs already know that the URL path is not static, and the URL query is not dynamic, in so many cases.

    Of course I can use /12345 instead of ?c=12345, but I simply don’t have time… in addition to UrlRewrite I would have to modify a lot of tags generating anchors.
    Thanks
