Consistency is more important than "correctness" or using the "best" software

When I started developing software professionally over a decade ago, I would readily diverge from company standards on a whim to use a new language, style of coding, favorite editor/IDE, libraries, different build process, etc. Mainly this was because I wanted to be using new stuff or because I thought a new tool or approach was better than the outdated company standard. Once I started managing engineering, though, I quickly began to realize that this type of whimsical divergence has a very real organizational cost, on both the systems and the software front.

Here’s a real-world example that you see at many organizations. Say I have two web projects at my company, such as an intranet site and a public website. If I put two groups of programmers to work on them, within a week the projects will have very different approaches and tools for build processes, deployment, dependency management, app server, database, possibly language, etc. This lack of standards has a very real dollars-and-cents cost to the company, since it’s now more costly for one person to move from project to project, not to mention the breadth of knowledge required.

So why don’t organizations standardize on language, build process, coding style, package naming, dependency management, app server, IDE, etc.? Because people can’t agree on the “best” or most “correct” software. It’s a very heated and often personal subject, and that’s our (myself included) biggest problem. Consistency is far more important for an organization than “correctness” or using the “best” software. Organizational consistency promotes depth of knowledge rather than requiring an extreme breadth of knowledge.

The gains you may see on your project using Ruby or Python will pale in comparison to the long-term maintenance costs the organization will incur if it has 5 software products in 5 different languages. Now, more than ever, I believe in strict standards in an organization, even down to dictating which IDE is used and using Checkstyle to enforce consistent coding standards. It allows developers to quickly move between projects because they can start by understanding the code rather than having to learn a whole new set of tools.

This raises the question, though, of how an organization can evolve if it sets standards and discourages divergence from them. My answer is that it needs to happen selectively through pilot projects, which when successful are adopted and gradually rolled out company-wide, and otherwise abandoned. Standards also need to be reviewed regularly by sharp people who will be implementing those standards. You need buy-in from the very top for this type of approach rather than bringing new stuff in under the radar, but in the long run it’ll save the company money.

Posted in Software Engineering | Leave a comment

Fed up with Java, switching to a new language

I’ve had it with Java, too much technology to learn with Unit testing, ORM, IoC, AOP, and SOA, not to mention static typing and waiting to compile. That’s why as of right now I’m a converted .BAT programmer and here’s why:

1. .BAT is already installed on 90+% of the computers in the world
2. .BAT supports dynamic typing (set FOO=x, set FOO=1)
3. .BAT is interpreted, no build (and waiting) necessary to run my code
4. .BAT is modular and extendable (FOO.BAT can call BAR.BAT)
5. .BAT supports a variable number of arguments to other .BAT modules (BAR.BAT 1 2 3 4 5 6 7)
6. .BAT has more IDEs than any other language (although my favorite is Notepad)
7. The OS has built in refactoring support allowing me to move and rename my .BAT modules without starting up my favorite IDE (Notepad)
8. .BAT is cross-platform (DOS, Windows 3.1, Windows 95, Windows 98, ME, 2000, XP, 2003)
9. .BAT supports running from a GUI by double clicking on it or using it from the command line
10. .BAT makes accessing environment variables easy so I can move most configuration out of my .BAT module
11. .BAT has built in debugging and logging with PAUSE and ECHO

Things I’m hoping to see in the next release of .BAT (shipping with Longhorn)

1. Improved ODBC driver support in .BAT
2. Improved regular expression support (although their * and ? filesystem regex support is already the best in the business)

I only wish I’d seen the light sooner before I wasted the better part of a decade on Java!

Posted in Java, Software Engineering | 21 Comments

Unit testing with coverage reports from within Eclipse

The ever verbose (if you've seen his blog) Pjammy pointed me to this nifty Eclipse plugin called djUnit. Just right click on a JUnit test, select run with djUnit, and it runs the JUnit test with instrumentation to produce a code coverage report. Here's a screenshot. Not only that, it will mark the lines in the editor pane that don't have unit test coverage making it really easy to see where more thorough testing is needed.

I've gotten accustomed to seeing this type of report in Maven or Cruise Control, but I often wouldn't look at coverage until I was mostly done developing. That's a bad thing because code coverage becomes more of an afterthought or cleanup process and is therefore likely to be neglected. However, getting coverage reports from within Eclipse whenever I run my unit tests allows me to consider it in the midst of my TDD process, resulting in much improved coverage.

Posted in Java | Leave a comment

Static typing increases productivity and reduces errors

I’m going to come right out and say that I have a strong preference for static typing in a programming language, especially as it applies to medium and large sized applications with multiple developers. Here’s why:

1. Refactoring: IDEs make huge refactoring jobs much easier largely because of static typing.
2. Catching errors at compile time rather than runtime: In a larger organization where unit test coverage can vary a bit it’s nice to know these more basic errors will be caught by the compiler.
3. Readability: I like seeing what classes are expected in a method signature, since it makes figuring out a poorly documented API vastly easier. I also like using my IDE to click on a method argument and hit a hot-key to jump over to that argument’s class.

For me static typing is one more level of protection where I incur only a slight penalty in extra keyboard strokes for a huge time savings in refactoring, readability, and catching basic errors in out of the way places in the code base. Sure, I can develop slightly faster in a dynamically typed language but I can maintain and refactor the code faster in a statically typed language.

In most organizations it’s been my experience that unit test coverage will vary between departments. It certainly won’t exercise every single line of code in the system. When a developer refactors class X there will be those classes that use X that don’t get every line of code exercised in a unit test (MVC controllers seem to suffer from this a lot). If they’re in a lesser used area of a system and using a dynamically typed language, they’ll breeze right through QA and into production. I’ve never had this issue with a statically typed language but I’ve definitely seen several production code issues due to a refactoring of a project in a dynamically typed language.

I’ve heard the argument that dynamically typed languages are more productive. In my experience, however, that productivity comes more from the languages being interpreted than from their dynamic typing.

Anyhow, despite my bias for statically typed languages I’m itching for a project to use Ruby. Its extremely compact syntax and great libraries just might make up for its lack of static typing 😉

Posted in Java, Ruby, Software Engineering | Leave a comment

Testing with mock objects

I put together a workshop for the team at work on unit testing and design with mock objects. I opted to show a small sample using both EasyMock and JMock. I'd used JMock before but EasyMock had some nice surprises in store for me. The two things that stood out immediately for me with EasyMock were:

1. No need to subclass my test case. This has always bothered me about JMock because when you have two libraries that both require subclassing a JUnit TestCase such as XMLUnit and JMock, you're up a creek.
2. In the record phase you call the method on the mocked object so you get to use IDE code completion (if the interface already exists), IDE method signature and stub creation, and refactoring abilities. Whereas with JMock you pass the method to call as a string which has two disadvantages in my opinion. First, when practicing TDD I like to call method names that don't exist on my interfaces yet and then have the IDE create the method signature in the interface and method stubs in the implementation classes. Second, after a refactoring you'll either need an IDE that knows when you rename a method to also replace that method name in strings, or you'll have to do a project or workspace wide search and replace.
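To make the second point concrete, here is a minimal sketch of the two styles. It deliberately uses neither EasyMock nor JMock: the recorder below is a hand-rolled stand-in built on java.lang.reflect.Proxy, and FeedDao, loadName, and every other name in it are made up for illustration. The only thing it tries to show is that an expectation recorded by calling the method is checked by the compiler and follows a rename refactoring, while a method name held in a string is not.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class RecordStyleDemo {

    /** Hypothetical collaborator interface, used only for this illustration. */
    public interface FeedDao {
        String loadName(int id);
    }

    /** Hand-rolled recorder: remembers the name of every method invoked on the proxy. */
    public static class Recorder implements InvocationHandler {
        public final List<String> expected = new ArrayList<>();

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            expected.add(method.getName());
            return null; // acceptable here because loadName returns a reference type
        }
    }

    /** Builds a FeedDao whose calls are recorded instead of executed. */
    public static FeedDao recordingMock(Recorder recorder) {
        return (FeedDao) Proxy.newProxyInstance(
                FeedDao.class.getClassLoader(),
                new Class<?>[] {FeedDao.class},
                recorder);
    }

    /** String style: the name can only be checked against the interface at run time. */
    public static boolean methodExists(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Call style: the compiler checks this line, code completion works on it,
        // and an IDE rename of loadName updates it automatically.
        Recorder recorder = new Recorder();
        FeedDao mock = recordingMock(recorder);
        mock.loadName(12);
        System.out.println("recorded: " + recorder.expected);

        // String style: a stale or misspelled name still compiles.
        System.out.println(methodExists(FeedDao.class, "loadName"));
        System.out.println(methodExists(FeedDao.class, "loadNmae"));
    }
}
```

A real mocking library does far more than this (argument matching, return values, verification), so treat it as a picture of the trade-off rather than of either library's API.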

The author of JMock has written a JMock and EasyMock comparison that's worth reading. I also know of these open-source projects that use mock objects: Tapestry and Hivemind use EasyMock, AppFuse uses JMock, and ONess uses JMock. If you know of other open-source projects that use one library or the other please post a comment!

Lastly I'll leave a reference to Martin Fowler's article touching on JMock, EasyMock, and the differences between mock objects and stubs.

Posted in Java | Leave a comment

Java's verbose XML API

It feels like it takes me too many lines of code in Java to create a new XML document, transform it, and then validate the result. At every company I've worked at we've ended up with an XML utility class we wrote with various methods to make this process easier. Utility classes that I see regularly reimplemented to make an API more accessible always set off warning flags for me that the API needs improvement.

I understand there's a lot of flexibility with the Java XML API but I'd like a sensible defaults API to use when I don't need all that power.

I feel like it should take me 4 lines to do this basic task: Line 1 call factory to create document, Line 2 call some method I write to fill in the document with content (no gripe there), Line 3 call method to transform it passing in the XSLT to use, Line 4 call method to validate the output of the transformation against a DTD or XSD. Is that too much to ask or do I just not know about some easier way to work with XML in Java?
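For what it's worth, here is a rough sketch of the kind of utility class I mean, built only on the standard JAXP packages (javax.xml.parsers, javax.xml.transform, javax.xml.validation). The class name XmlQuick, its method names, and the tiny stylesheet and schema are all invented for this example, and it validates against an XSD only. With the helpers in place, main() gets down to the four lines described above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlQuick {

    /** Illustrative stylesheet: turns <item><name>..</name></item> into <feed><title>..</title></feed>. */
    public static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/item'>"
      + "<feed><title><xsl:value-of select='name'/></title></feed>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Illustrative schema for the transformed output. */
    public static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='feed'><xs:complexType><xs:sequence>"
      + "<xs:element name='title' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Line 1: create an empty DOM document with default settings. */
    public static Document newDocument() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }

    /** Line 2: application-specific content (the part you always write yourself). */
    public static void fill(Document doc, String name) {
        Element item = doc.createElement("item");
        Element nameElement = doc.createElement("name");
        nameElement.setTextContent(name);
        item.appendChild(nameElement);
        doc.appendChild(item);
    }

    /** Line 3: apply an XSLT stylesheet and return the result as text. */
    public static String transform(Document doc, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Line 4: validate against an XSD; throws org.xml.sax.SAXException if invalid. */
    public static void validate(String xml, String xsd) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        schema.newValidator().validate(new StreamSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = newDocument();
        fill(doc, "foo");
        String result = transform(doc, XSLT);
        validate(result, XSD);
        System.out.println(result);
    }
}
```

The helpers hide the factory plumbing and pick defaults (no namespace awareness, in-memory strings), which is exactly the flexibility I'd be happy to give up for the simple cases.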

Posted in Java | Leave a comment

Java XML data binding with Castor

In working on an XML data feed project at work I needed to generate the feed from our POJO domain model. I was pleasantly surprised to find that Castor XML made this task quite easy.

The Marshalling framework will marshal a bean to XML as follows:

MyBean myBean = new MyBean();
myBean.setId(new Integer(12));
myBean.setName("foo");
myBean.setDescription("bar");
Marshaller.marshal(myBean, doc.getDocumentElement());

which creates XML that looks something like this:


<myfeed>
    <mybean>
        <id>12</id>
        <name>foo</name>
        <description>bar</description>
    </mybean>
</myfeed>

You also have the option to create your own mapping file to give you some control over the format of the generated XML. I opted to use an XSLT to beautify the feed rather than using the Castor Mapping. I figured if other developers work on this project in the future chances of them knowing XSLT are good whereas chances of them knowing Castor's XML mapping file format would be slim.

The biggest drawback of this approach of auto generating XML from your domain model is that it is very brittle. One change to the domain model and the generated feed is different and then customers come complaining. I opted to use a DTD (or XSD) and then validate a mini version of the feed in a unit test. That way if a developer makes a change to the domain model causing the feed to change, the continuous integration system will let us know there was a problem when the unit test fails.
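Here is a minimal sketch of that kind of check, assuming an XSD rather than a DTD and using only the JDK's javax.xml.validation package. The class name FeedSchemaCheck, the schema, and the hard-coded feed strings are made up for illustration. In the real project the mini feed would come from the Castor marshalling step and the check would live in a JUnit test instead of main().

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class FeedSchemaCheck {

    /** Illustrative schema pinning down the feed format customers depend on. */
    public static final String FEED_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='myfeed'><xs:complexType><xs:sequence>"
      + "<xs:element name='mybean' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
      + "<xs:element name='id' type='xs:int'/>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='description' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML text is valid against the XSD text, false otherwise. */
    public static boolean isValid(String xml, String xsd) throws IOException {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed XML as well as for schema violations.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // The feed as customers expect it today.
        String current = "<myfeed><mybean><id>12</id><name>foo</name>"
                + "<description>bar</description></mybean></myfeed>";
        // The feed after someone renames the description property on the bean.
        String afterRename = "<myfeed><mybean><id>12</id><name>foo</name>"
                + "<desc>bar</desc></mybean></myfeed>";

        System.out.println(isValid(current, FEED_XSD));
        System.out.println(isValid(afterRename, FEED_XSD));
    }
}
```

The schema acts as the contract with customers: the domain model is free to change, but any change that leaks into the feed format fails the check instead of surprising someone downstream.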

Posted in Java | Leave a comment

Persistence layer unit testing best practices

I've been doing some searching on best practices for unit testing the persistence layer with DBUnit and I'm interested in people's feedback on my policies or pointers to policies others have created.

For example I've been thinking of making our company's policy as follows:

1. Unit tests can rely on certain rows with certain IDs being in the database in pristine format (loaded with DBUnit).
2. Unit tests can select DBUnit loaded row id X in table foo, reinsert it with a new id, and then change it or delete the newly added record, but should leave DBUnit loaded records pristine.
3. It's not OK for a unit test to modify DBUnit loaded rows because other unit tests expect those rows to be there.
4. A unit test cannot depend on a certain number of rows being in the database because it will likely contain the DBUnit loaded rows as well as some unknown number of rows that may be left over from other unit tests.
5. In general it's good policy for a unit test to clean up new data it added during the test but not absolutely required because when a unit test fails it may leave bad data behind.

Do these policies seem reasonable or are there other places I should be looking for these types of best practices?

Posted in Java | Leave a comment

Building one big fat jar file

I've been working on software that generates a very large XML feed for customers. My plan was to have it run by a cron job and regenerate the feed once a week. I wanted my project to produce one file (including dependencies such as Spring and Hibernate) that a release engineer could drop onto the production system rather than an archive that needed to be untarred and installed. Out of stubbornness I also wanted to be able to run it from cron without wrapping it with a shell script. Basically I wanted to be able to type “java -jar XmlFeed.jar” and just have it work.

I was starting to think it was going to take some significant effort to make this happen when I stumbled across Codehaus Classworlds Uberjaring. Fortunately there was a Maven Uberjar plugin so I added a maven.uberjar.main property to my project.properties file and within a couple of minutes of starting to look for a solution I had built a working jar file. Alas, I discovered the Classworlds classloader was really slow and my program ran about 15x slower than normal.

I started searching again and next stumbled across the Maven JavaApp plugin which does the same thing. I installed the plugin and within about a minute I again had one big jar, and with the Java App approach it ran at normal speed. Success!

I won't argue that Maven is the be-all and end-all of build systems, but the fact that within a few minutes I was able to use 2 different approaches to build a big jar demonstrates the power of Maven plugins and the approach of building software by defaults. I would be just as happy if there was a repository of Ant macros that did the same thing. Until that day comes, though, Maven has pleasantly surprised me numerous times with plugins that have gotten me up and running quickly.

Note: I'll also add that I discussed this with a friend later and he suggested as an alternate approach that I could have also built an EAR file and run it from the command line with a J2EE app client (not within an app server).

Posted in Java | 1 Comment

And the best MVC framework is….

Norm Deane is going to be pulling a Matt Raible by reimplementing the same software with all of the major MVC frameworks (Struts, Spring MVC, Webwork, Tapestry, and JSF). I'm excited to see his progress as he writes about it. It's a lofty goal but a worthy one IMHO!

I've been going through the same process in trying to evaluate the best web framework for our company going forward. It's a tough decision because there are more factors at play than simply good technology. I don't want to select the Betamax of MVC frameworks.

So far I've written sample apps in Struts, Spring MVC, and Tapestry. Next I'll be embarking down the JSF road. The sample app only gives you a taste which is why I like the fact that Norm is redoing a real app. For example I found the learning curve steepest with Tapestry relative to Struts and Spring MVC but the more I used it the more I liked it.

There's a lot to be said for working with something until you know enough about it to honestly critique it. I'm not convinced that my sample app approach is sufficient for me to say which framework I really prefer since I still feel like a novice in most of them. It's unlikely I'll ever have the luxury of time to really become proficient enough in even 3 major MVC frameworks to make a well informed assessment. My decision will be made based on a combination of my own experience writing these sample apps, opinions of colleagues/fellow bloggers I respect, perceived longevity/eventual popularity of the framework, and ability to hire people that know the framework.

As a business decision, I feel like Struts is the safest bet on the hiring front at the moment: I get more Struts resumes than I can count when I'm hiring Java web developers. On the longevity and developing-for-the-future front my money is on JSF, because there are big companies behind it as well as open source advocates that like it. On the cool technology front I think the award should go to Tapestry. Not only does it seem to me (a novice) to be a cool component/event-driven MVC framework, it also has a non-invasive templating language (reminiscent of Enhydra's XMLC) that I long for every time I work in JSP.

So which MVC framework will we be choosing at my company? I don't know but I've only got another 3 weeks or so to decide and will let you know what we end up with.

Posted in Java | Leave a comment