At my company we have 50+ MySQL databases on one server that have the same schema for the purpose of partitioning data (due to the size of each database). The decision goes back before my time and of course there is a tremendous amount of spaghetti infrastructure (web, batch processing, cron scripts, reporting, etc…) relying upon this arrangement. The data is state specific so for the purpose of this discussion we'll say all schemas are identical. If I need California data I can connect to that database and look it up the same way I would look it up in say the Maine database. One handy feature in MySQL is that I can get a connection to any database on the server and then access data in another database via that connection. For example I can connect to the Ohio database and then select * from florida.sometable;. You'll see why this is possibly handy in my discussion about iBatis below.
In moving our company from Perl to Java one of the main challenges for me has been sorting out our approach to the ORM layer. I'm using Hibernate on the unique databases that have unique tables. However, I'm struggling to find an elegant way to work with Hibernate to have one class map to tables in 50+ different databases and have come up with the following possibilities (that I still need to test) listed in order of my perceived level of simplicity:
- Use iBatis for these multi-database tables and have Spring maintain 1 database connection and use variable substitution as part of the SQL map statements: select * from #databasename#.table. This approach seems simple and promising on the surface. Downside is programmers will need to know two ORM packages and it takes a little longer to write a DAO/POJO combo in iBatis than Hibernate (at least for me).
- Use Hibernate and have Spring maintain the 50+ hibernate session factories and the 50+ datasources. I would then have Spring dependency inject a class that allows the DAO to fetch the appropriate hibernate session factory.
- Use JDBC and have Spring maintain 1 database connection. Then I would use select * from databasename.tablename and then manually do the ResultSet to POJO mapping.
- Put the company on hold and have engineering focus for a few months on refactoring the 50 databases down to one that support partitioning and refactor the legacy web, batch processing, and reporting systems. This one I feel is a little too costly and risky since I can't think of a way to take an iterative approach due to the number of systems involved.
If you have ideas please chime in!