
I fear that most programmers (myself included) usually think of the word "legacy" as something bad, as in the following article:
Do your customers have legacy COBOL applications written around the time King Nebuchadnezzar II built the Hanging Gardens of Babylon?

Legacy doesn't have to be a bad word, as is seen in The American Heritage® Dictionary of the English Language:

leg•a•cy

Something handed down from an ancestor or a predecessor or from the past:
"a legacy of religious freedom." See Synonyms at heritage.


What are you doing today to ensure that the programmers of tomorrow don't curse that @#%!! legacy Java code that they're stuck with?

Based on my own astute powers of observation (and my gut feelings), I will hazard to publicly predict that programmers in the distant future (say, five years from now) will not all be using Java. I further predict that the many programmers who are still using Java will not be using the primitive dialect that we currently employ.

With this limited knowledge of the future, how can I increase the odds that my coding efforts will be appreciated rather than cursed? To answer this question, I took a look at my least favorite legacy "assets" and tried to put my finger on what annoys me most.

Many of my least-favorite legacy applications have poor documentation. If you want to make a future programmer happy, explain what your code does and how to use it in clear and concise terms.

Hard-to-generate inputs and hard-to-parse outputs are guaranteed to produce expletives rather than praise. I once wrote a "survey" application whose output was serialized C++ objects. A companion application, also in C++, reinstantiated the survey objects to populate a database. (The "surveys" were mailed to consumers on floppies; when the consumers mailed the floppies back, the results were retrieved.) This was a great idea until the company switched to Java. If I had added code to produce ASCII output (this was in pre-XML days), I would have been cursed less often. Efficiency does less to assure a good legacy than interoperability.
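To make the interoperability point concrete, here's a minimal sketch of the plain-text alternative I wish I'd written. The record type and field names are invented for illustration (and, being a sketch, it does no escaping of the delimiter), but the idea is that any language can read a line of delimited text, while serialized C++ objects are a dead end:

```java
// Hypothetical sketch: persist survey results as delimited text rather than
// language-specific serialized objects, so any future consumer can parse them.
public class SurveyRecord {
    final String respondentId;
    final int rating;
    final String comment;

    SurveyRecord(String respondentId, int rating, String comment) {
        this.respondentId = respondentId;
        this.rating = rating;
        this.comment = comment;
    }

    // One record per line, pipe-delimited; trivial for any language to parse.
    // (A real version would escape embedded delimiters.)
    String toLine() {
        return respondentId + "|" + rating + "|" + comment;
    }

    static SurveyRecord fromLine(String line) {
        String[] fields = line.split("\\|", 3);
        return new SurveyRecord(fields[0], Integer.parseInt(fields[1]), fields[2]);
    }
}
```

The round trip is symmetric, so the Java successor (or a Perl script, for that matter) could have consumed the floppies without reimplementing C++ deserialization.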

Several of my least-favorite legacy applications are bound to Graphical User Interfaces. GUI applications are notoriously difficult to integrate with anything else (see the blog by Rich Burridge for specifics). Contrast reusing GUI applications with reusing the CLI shell utilities that are so prevalent in Unix environments. I'm not suggesting that GUI interfaces aren't appropriate, just that life is easier when designers include command-line interfaces or programmatic APIs to access the core functionality.
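The shape of that advice can be sketched in a few lines: put the core functionality in a plain API, and let the command line (or a GUI) be a thin wrapper over it. The class and its task here are invented purely for illustration:

```java
// Sketch: keep core logic in a plain, callable API so a GUI, a CLI, or
// another program can all reuse it. WordCounter is a made-up example.
public class WordCounter {
    // Core functionality: no UI dependencies, reusable from anywhere.
    public static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Thin command-line wrapper around the same core method.
    public static void main(String[] args) {
        System.out.println(countWords(String.join(" ", args)));
    }
}
```

A GUI front end would call the same `countWords` method, so neither interface holds the logic hostage.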

Finally, my least-favorite legacy assets often do more than they need to. Sometimes the programs have bizarre side effects, but often the problem is that the original designer didn't distill the functionality to its essence. It's like having to pay for a reverse-osmosis purification unit when all you need is a water heater.

We can't guarantee that our heirs will value our code, but with the benefit of hindsight we can improve the odds.

All this leads me to think that embracing the Service Oriented Architecture paradigm will reduce the chances that I will be burned in effigy some day. An IBM article defines SOA as:

"an application architecture within which all functions are defined as independent services with well-defined invokable interfaces which can be called in defined sequences to form business processes."
The experiences that Calum Shaw-Mackay relates in his blog on The Great Theoretical Architecture resonate strongly with my own. His distilled design mantra is simple: 
"this is my service it does X, and when you tell it to do X with this data, it does something and gives you this back"
Perhaps if I design my applications with "service orientation" as my guiding principle, then the legacy of code that I pass on will be appreciated instead of endured.
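Calum's mantra translates almost directly into code: one narrow, well-defined interface that takes data in and gives a result back. The interface and the toy calculation below are invented for illustration, not taken from his post:

```java
// Sketch of "this is my service, it does X": a single well-defined,
// invokable interface. Names and the example calculation are made up.
import java.math.BigDecimal;

interface InterestService {
    // Given a principal and an annual rate, return one year's simple interest.
    BigDecimal yearlyInterest(BigDecimal principal, BigDecimal annualRate);
}

class SimpleInterestService implements InterestService {
    public BigDecimal yearlyInterest(BigDecimal principal, BigDecimal annualRate) {
        return principal.multiply(annualRate);
    }
}
```

Callers know only that the service "does X with this data and gives this back"; the implementation is free to change behind the interface.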
(Cross posted at The Thoughtful Programmer)  
One of the first tasks that I performed for my employer was to diagnose and resolve a CPU spike, several minutes long, on the database server for one of our J2EE applications. All of our servers were well monitored, and without much ado we were able to pin the spike on a specific use case. As it turns out, the culprit was a use case for exporting Loan Application records from our company to a client. To accomplish this task, several thousand Entity Beans were instantiated, data was extracted from the objects, and a comma-delimited output file was generated. Adding insult to injury, in addition to instantiating thousands of EJBs, the collection exceeded our cache size, resulting in the passivation and activation of beans (to and from the database) as we traversed the collection.
In retrospect it's hard to fathom how this implementation strategy got past a design review, but in the heat of battle all sorts of less-than-optimal solutions creep into most products.

The first tack that I took to resolve this issue was to pursue a JDBC rather than an Entity Bean approach (inspired by the Fast-Lane Reader pattern), and this resulted in a substantial performance gain (the use case executed in a third of the original time). Fortunately, my colleagues are way more SQL savvy than I am, and they suggested pursuing a stored procedure approach. The stored-procedure implementation of the use case executes in about 1/100th of the time required for the original EJB-centric solution.
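For readers unfamiliar with the Fast-Lane Reader idea, here is a rough sketch of what the JDBC version looked like in spirit: read-only rows pulled straight through a `ResultSet`, with no beans instantiated along the way. The table and column names are invented, and the formatting step is split out as a pure method so it can be exercised without a database:

```java
// Sketch of a Fast-Lane Reader: for a read-only bulk export, bypass Entity
// Beans and stream rows directly via JDBC. Schema names are illustrative.
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class LoanExporter {
    // Direct JDBC read: no bean instantiation, no cache activation/passivation.
    static List<String> export(Connection conn) throws SQLException {
        List<String> lines = new ArrayList<>();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT loan_id, borrower, amount FROM loans WHERE exported = 0")) {
            while (rs.next()) {
                lines.add(csvLine(rs.getString("loan_id"),
                                  rs.getString("borrower"),
                                  rs.getString("amount")));
            }
        }
        return lines;
    }

    // Pure formatting step, kept separate from the data access.
    static String csvLine(String... fields) {
        return String.join(",", fields);
    }
}
```

The stored-procedure version goes one step further, doing the join and filtering inside the database so only the finished rows cross the wire.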

This is one of those great "war stories" that can be used to make all sorts of points. It speaks to inadequate design reviews, the need for system monitoring, the misuse of Entity Beans, the value of teams with diverse skill sets, and numerous other "soap box issues" that I've been known to pontificate about (a former co-worker coined the term "johntification" to refer to my frequent monologues).

Today I would like to use my "Entity Bean Based Loan Export" war story to talk about optimizing data access, and how we really ought to code in a way that enables it.

The goal of our Loan Export use case was well defined:

Produce an output file that contains data from Loans that meet specified criteria.


Note that the use case concerns Loans: not Java objects, not database records. This is a key point to remember. Depending on the current state of the system, the data that constitutes the "Loan" could be on a hard disk, in the cache of the database system, or in the application's memory (real or virtual).

The best strategy for collecting data can vary wildly based on where the data currently resides. Using my war story as an example; if Entity Beans are already instantiated for all of the Loans to be exported, then producing an output file from the Entity Beans will generate no additional load on our database server and should be pretty zippy. If the Loan data is still exclusively on disk, then the stored procedure approach is the way to go (assuming that I'm using a single RDBMS).

I am not sure how to clearly express the point that I want to make, but it has something to do with optimizing for the present and planning for the future. One solution may be optimal if all objects can reside in memory, while another may be optimal if the number of objects exceeds some threshold. We need to code in a manner that allows an "optimized" solution to be injected without disrupting or confusing our intent.

These thoughts gel with the goals of SQL query optimization. In some database systems, the SQL that you submit is not the SQL that is executed. Behind-the-scenes query optimizations are applied by the database engine, resulting in better overall performance.

In the SQL research world, the goal runs along these lines:

The query that you specified is sufficient for the system to determine the records that you want to retrieve. The procedure by which those records are obtained is an implementation detail that you need not worry about.
Wouldn't it be delightful to write Java data access code along similar lines? Consider a "collection populator" service. Specify the type of objects the collection should hold, specify the criteria that the objects within the collection must meet, and let the service worry about the details of populating the collection.

Of course there's no such thing as a free lunch: You are going to have to write all of the methods of your "collection populator" service. The advantage will come later if your data sources change or you need to develop a more efficient implementation.
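A minimal sketch of such a "collection populator" might look like the following. The interface and both names are invented for illustration; the in-memory implementation is the naive starting point, with a JDBC or stored-procedure version swappable in later without disturbing callers:

```java
// Sketch of the "collection populator" idea: callers state what they want;
// how it is fetched is an implementation detail. All names are made up.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

interface CollectionPopulator<T> {
    // Return all objects of type T satisfying the criteria; whether they come
    // from memory, JDBC, or a stored procedure is hidden from the caller.
    List<T> populate(Predicate<T> criteria);
}

// Naive in-memory implementation; an optimized one can replace it later.
class InMemoryPopulator<T> implements CollectionPopulator<T> {
    private final List<T> source;

    InMemoryPopulator(List<T> source) { this.source = source; }

    public List<T> populate(Predicate<T> criteria) {
        List<T> result = new ArrayList<>();
        for (T item : source) {
            if (criteria.test(item)) result.add(item);
        }
        return result;
    }
}
```

Because callers depend only on the interface, the "optimized" solution can be injected later without disrupting or confusing the original intent.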

Update:

A Brief Introduction to IoC by Sam Newman -- provides a good example of using IOC to inject specific DAOs.


(Cross posted at The Thoughtful Programmer)  

Mention JSPs positively in a blog and you will undoubtedly get flamed. Encourage colleagues to use Entity Beans and you may never be taken seriously again. JDO was crippled for many by the lack of a standard for O/R mapping. EJB QL and JDOQL lacked the functionality that many legacy RDBMS schemas demanded.

These are just a few of the "standard" features of Java that are reviled by ardent Java supporters... but for some reason developers stay loyal.

Are we loyal because we fear domination by Microsoft? Judging by the paranoid reactions of many to the Microsoft-Sun cease fire there's probably some truth to that conjecture.

Are we loyal because we ignore "standards"? Ignore might be too mild a term; armed resistance might be more accurate. From the very beginning we've engaged in civil disobedience, rolling our own solutions and banding together to support projects that openly challenge the wisdom of the "standards". For every Jakarta project that implements a JCP standard, there's at least one that opposes the same standard.

So what should we make of all this?

I'm not sure myself, but I do think that the "major" players need to take notice and adjust their approaches. The user community does not respond to edicts from "on high", and doesn't care how much money has been spent to push a feature. In the egalitarian world where anyone can post an opinion, the best-laid plans of corporate marketeers aren't worth much. Bad solutions may linger, but they'll never prosper.

At this particular time, we're at a crossroads. The major players are working on tools to hide the ugly guts (BEA WebLogic Workshop's Controls), distracting us with shiny new window dressings (Sun Java Studio Creator's JavaServer Faces), or decoupling the whole model (JBoss's AOP/Hibernate). All agree that creating a J2EE application is difficult and error-prone, and all are scrambling for a fix.

Let's hope the result isn't another set of standards that developers love to hate.