Tatsat Banerjee


What's the next logical step in web development?



I’m a big fan of Ruby, the language.

I’m really impressed with the potential of the Rails framework (although to be honest there seems to be a lot of voodoo going on behind the scenes, which is probably just an indication that I have not yet done enough with this framework to be comfortable with it).

But there is no denying the reality that, for most of us, adopting Ruby and RoR is not a simple decision, because we already have code that has been developed atop a Java web stack.

If we are looking at a new project, and there is some ability to absorb the risk of going with a new technology stack, I would be the first to suggest that RoR be given serious consideration. On the other hand, if we are involved in the on-going development of a Java web application that has been going on for several years, does it make sense to try to integrate a new stack into the mix?

Let me set some parameters. You have a successful product that is continually being improved. Using “best practices” (I’ll come back to that term in another post), the application has been developed using an MVC framework — let’s say Struts, JSP and Hibernate as a concrete and all-too-common example.

Now, we want to make some changes as a result of customer feedback. Typically, these changes are either new features or modules, or else changes to the existing functionality.

Changes to the existing functionality are hard to make in any technology other than the one the code is already written in. Sure, if the changes are broad enough, it might make sense to redevelop one or more modules from scratch; in that case, of course, we can treat it the same way we would treat the development of a new feature.

So, for a new feature, what alternatives do we have?

We can certainly continue on with our existing technology stack. In many ways, this is the low-risk option, because we know the technology. We know the tool-sets, we have knowledge within our development team, and we have historical data about how quickly we can do things using that technology. And let’s not forget that the stack itself is well-proven, because we (and countless others) have deployed applications on that stack and we know how the stack scales, how it deploys, how it responds to machine and network resource allocations. There is a lot of value to maturity.

But is it really a low-risk option? What if the development velocity is not fast enough? Sure, we have good predictability, but if our predictions are that we can’t do it as quickly as we need to in order to satisfy the customer, or prevent the work being assigned off-shore, then surely persisting with that technology is in reality the high risk option.

And here’s the real issue. Java web development using the classic development stack is just not fast enough. I’ve heard the arguments – you need to do all the fancy plumbing and documentation / annotation so that the app scales and is flexible and maintainable when it gets hit by millions of users per minute. The reality, however, is that most applications never need that scalability, especially if they are not finished because application development is too slow.

So what do we do? Is there some way to incorporate some of the really rapid web application development techniques into an existing Java web application?

I think that we are at a difficult time in Java web development. We have a lot of systems that have been developed over a long time (“long” being relative to the rate of change of software development, of course). While these systems are often still in active enhancement, they are also in a very real sense legacy systems, built using tool sets, frameworks and architectures that, well, seemed like a good idea at the time. Looking back at what we have done, and also looking at all the shining new toys all the new kids are playing with that show just how much faster web application development can be, we can’t help but feel frustrated and itchy to do something, well, different.

At the same time, we have some interesting things “just around the corner” in the Java space. But we need something we can use right now.

I see JRuby is coming along in leaps and bounds. This is an implementation of the Ruby language for the Java Virtual Machine. It is almost at the point that it can run Rails applications. However, I don’t see a clean way to do new-feature development for an existing web application using Rails, even if it is on the JVM. Another very nice dynamic language, Python, has a JVM implementation in Jython, but this is languishing and seems to have been largely orphaned when its initial developer switched focus to IronPython, which is an implementation for the .NET platform.

Groovy is coming along nicely, but slowly. It is likely to be an “official” scripting language as a result of having a JSR. Also, it has a Java-like syntax, which means that there is a shorter pick-up time required for Java developers.

If I thought that language is the limiting factor, then I would look at Groovy because it has a lot of the syntactical conveniences of the popular scripting languages with full access to existing objects that have been coded in Java (including the Java libraries).

However, while I think Java can be too wordy, requiring lots of boilerplate code in some circumstances, I am not at all convinced that this is the major reason that web development in Java is too slow.

In reality, I think that the real reason web development in Java is too slow is that we are making it too complicated. The real reason that frameworks like RoR are so incredibly productive, in my opinion, has more to do with very simple ORM designs like ActiveRecord, and with the Convention over Configuration philosophy.

Sure, Hibernate is REALLY powerful. But it is not ideal for all sorts of database access, at least not when used naively. Sometimes, a simple SQL query, processed as JBOF (Just a Bunch Of Fields, and yes, I did just make that up) is totally appropriate.

Consider for example presenting a user with a filtered, paged list of widgets. In the prehistoric era of web development (that is, about 8 years ago, and using VB6 COM behind IIS/ASP) I designed a relatively simple, generic technique. I created an SQL statement by putting together the WHERE clause dynamically. I then did a SELECT statement, retrieving only the IDs that matched the criteria. IDs were just 32 bits each, so even a million of the suckers was just 4 MB – most lists were a few hundred to a (very) few thousand rows. I just stuck them in an array and stuffed them into the session. Then, paging was simple: just calculate the array indexes that correspond to the desired page, create an SQL statement that retrieves only the ID and the columns required for the list display (using an SQL WHERE ID IN … clause) and display the list. All this is totally generic, it scales REALLY well, and has not let us down after years of very heavy use in the field.
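For the Java crowd, the paging half of that technique might look something like the sketch below. This is not the original VB6 code, just a minimal illustration: the class name IdPager, its method names, and the widgets table in the usage note are all made up for the example, and the array of IDs is assumed to have been fetched already and stored in the session.

```java
import java.util.Arrays;
import java.util.Collections;

/** Hypothetical helper (not from any library): pages over a cached array of matching IDs. */
final class IdPager {
    private IdPager() {}

    /** Returns the IDs for a zero-based page, or an empty array if the page is past the end. */
    static int[] pageIds(int[] allIds, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize must be > 0");
        }
        long from = (long) page * pageSize; // long avoids int overflow on large page numbers
        if (from >= allIds.length) {
            return new int[0];
        }
        int to = (int) Math.min(allIds.length, from + pageSize);
        return Arrays.copyOfRange(allIds, (int) from, to);
    }

    /**
     * Builds a parameterized SELECT for one page of IDs.
     * Assumption: table and column names come from trusted code, never from user input.
     */
    static String buildPageQuery(String table, String columns, int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("idCount must be > 0");
        }
        // One "?" placeholder per ID, to be bound through a PreparedStatement.
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT id, " + columns + " FROM " + table
                + " WHERE id IN (" + placeholders + ")";
    }
}
```

With five cached IDs and a page size of two, IdPager.pageIds(ids, 1, 2) returns the third and fourth IDs, and IdPager.buildPageQuery("widgets", "name, price", 2) produces a statement with two placeholders to bind them to. One thing to watch: an IN clause does not guarantee row order, so the rows that come back need to be re-ordered to match the order of the ID array before display.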

More recently, and in the Java world, we end up retrieving lists of objects. We rely on Hibernate or the ORM du jour to do magic, multi-level caching and lazy object instantiation and hope that it all works. And then we dump the list into some magic JSP taglib that does sorting in memory. And when the list gets to a few hundred items, the list takes MINUTES to display, and customers are unhappy, and developers say “you didn’t specify performance criteria”, and analysts say “but of course it has to handle more than a dozen items in a list”, and you need to divert resources to do major investigation and refactoring or redevelopment, and you start to think that things are not meant to be this hard.

In business application development, the needs of the application for data access are not complex. We need to get filtered lists of items, then we need to get complete individual items. That’s pretty much it, and that’s what DATABASE servers do — we should let them do their job and not try to replicate that in the application. Updating is only a little more complex.

The other lesson that we can learn from RoR is that we seriously need to tame the configuration frenzy that Struts brings. I need more time to think about this, but I think that a good way to begin simplifying this in an existing product is to add a single Struts action that further parses the request URI and uses some convention to identify the class and method that should handle it. That class could be written in Java, or any of the new, JVM-hosted scripting languages. Do it well, and write a suitable class loader, and you could even hot-deploy a URI request handler class or JAR file.
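To make the idea a little more concrete, here is a rough sketch of the convention part in plain Java. It deliberately leaves Struts out: in a real application the dispatch call would sit inside that single catch-all action. The names ConventionDispatcher and WidgetsHandler, the /resource/action URI shape, and the rule that every handler method takes a Map of request parameters are all assumptions invented for this example.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.regex.Pattern;

/** Example handler: found purely by its name, with no configuration entry. */
final class WidgetsHandler {
    public String list(Map<String, String> params) {
        return "widgets page " + params.getOrDefault("page", "1");
    }
}

/** Hypothetical dispatcher: maps /widgets/list to WidgetsHandler.list(params). */
final class ConventionDispatcher {
    // Only plain identifiers are accepted, so a request cannot name arbitrary classes.
    private static final Pattern SEGMENT = Pattern.compile("[a-z][A-Za-z0-9]*");
    private final String classPrefix;

    /** Handlers are looked up alongside the given marker class (same package). */
    ConventionDispatcher(Class<?> marker) {
        String name = marker.getName();
        this.classPrefix = name.substring(0, name.length() - marker.getSimpleName().length());
    }

    /** Applies the naming convention: returns {handler class simple name, method name}. */
    static String[] resolve(String uri) {
        String[] parts = uri.replaceAll("^/+|/+$", "").split("/");
        if (parts.length != 2
                || !SEGMENT.matcher(parts[0]).matches()
                || !SEGMENT.matcher(parts[1]).matches()) {
            throw new IllegalArgumentException("Expected /<resource>/<action>, got: " + uri);
        }
        String className = Character.toUpperCase(parts[0].charAt(0))
                + parts[0].substring(1) + "Handler";
        return new String[] {className, parts[1]};
    }

    /** Loads the handler reflectively and invokes the action method on a fresh instance. */
    Object dispatch(String uri, Map<String, String> params) throws ReflectiveOperationException {
        String[] target = resolve(uri);
        Class<?> handlerClass = Class.forName(classPrefix + target[0]);
        Object handler = handlerClass.getDeclaredConstructor().newInstance();
        // Only public methods taking a single Map are reachable from a URI.
        Method action = handlerClass.getMethod(target[1], Map.class);
        return action.invoke(handler, params);
    }
}
```

Adding a new page then means adding a class with the right name and a public method, and nothing else. Insisting on the single Map parameter also keeps inherited methods such as hashCode or toString from being reachable through a URI. Hot deployment would need a custom class loader on top of this, which the sketch does not attempt.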

The reason that I am considering this is not because I don’t want to use an existing framework like Ruby on Rails (or for that matter TurboGears or Django). It’s that I need to be able to integrate whatever framework we use into the application as it exists so far, and everything I see (and my gut instinct) tells me that these frameworks are good for new projects but are likely to be a bitch to configure and integrate with a Java/Struts/JSP stack.

I have not yet clarified my own thinking about all this, but I wanted to post it to get some feedback. What do others think? Am I alone in thinking that we are making Java web development harder than it needs to be?