Craftsmanship in the Face of Requirements
Pragmatic Programmers Andy Hunt and Dave Thomas talk with Bill Venners about software craftsmanship, the importance of fixing the small problems in your code (the "broken windows") so that they don't grow into large problems, and making design decisions that are reversible and adaptive. This article features some highlights from a sequence of conversations between Andy, Dave, and Bill that originally appeared on Artima.com.
Andy Hunt and Dave Thomas are the Pragmatic Programmers, recognized internationally as experts in the development of high-quality software. Their best-selling book of software best practices, The Pragmatic Programmer: From Journeyman to Master (Addison-Wesley, 1999), is filled with practical advice on a wide range of software development issues. They also authored Programming Ruby: A Pragmatic Programmer's Guide (Addison-Wesley, 2000), and helped to write the now famous Agile Manifesto.
Bill Venners: In the preface of your book, The Pragmatic Programmer, you quote a quarry worker's creed:
We who cut mere stones must always be envisioning cathedrals.
You then say "Within the overall structure of a project there is always room for individuality and craftsmanship." What do you mean by that?
Dave Thomas: In a very structured environment, people tend to abdicate responsibility. People say, "It's not my job anymore. My boss is telling me what to do. A big master plan is given to me. I just have to do this module, and this module, and this module." The analogy is with a stone mason who is a very small part of a very big whole. The reality is that the stone masons building the cathedrals were seriously high-quality craftsmen. They were always conscious of the fact that the work they were doing was going to be the face of a cathedral. What we're saying is, even if you feel you don't have the authority or responsibility to do it right, the reality is, you do. The quality of the work you are doing is important. It contributes to the overall impact or effect of the project.
Andy Hunt: The other facet is that this allows or encourages individual artistry. You can't go hog wild. You're building a cathedral. You have to carve your gargoyle to fit into the overall theme and coherent tone of the place. You can't start carving swans or something. But within the overall design constraints of the whole, you still have that individual artistic liberty to do your best work as it relates to that whole.
BV: Why is that important?
DT: Two reasons. First, programming is very difficult. To do it well requires a phenomenal amount of commitment. To motivate yourself and keep yourself committed, you need to have pride in what you're doing. If instead you consider yourself a mechanical assembly-line worker, whose only job is to take the spec and churn out bytes, then you're not going to have enough interest in what you're doing to do it well. So from the global perspective, it is very important. From a personal perspective, why should you be doing work you don't enjoy? It is important if you are going to commit this much to a job that you enjoy it.
AH: The idea of artistic freedom is important because it promotes quality. As an example, suppose you're carving a gargoyle up in the corner of this building. The original spec either says nothing or says you're making a straight-on gargoyle just like these others. But you notice something because you're right there on the ground. You realize, "Oh look, if I curve the gargoyle's mouth this way, the rain would come down here and go there. That would be better." You're better able to react locally to conditions the designers probably didn't know about, didn't foresee, had no knowledge of. If you're in charge of that gargoyle, you can do something about that, and make a better overall end product.
Fixing Broken Windows
BV: What is the broken window theory?
AH: Researchers studying urban decay wanted to find out why some neighborhoods escape the ravages of the inner city, and others right next door — with the same demographics and economic makeup — would become a hellhole where the cops were scared to go in. They wanted to figure out what made the difference.
The researchers did a test. They took a nice car, like a Jaguar, and parked it in the South Bronx in New York. They retreated back to a duck blind, and watched to see what would happen. They left the car parked there for something like four days, and nothing happened. It wasn't touched. So they went up and broke a little window on the side, and went back to the blind. In something like four hours, the car was turned upside down, torched, and stripped — the whole works.
They did more studies and developed a "broken window" theory. A window gets broken at an apartment building, but no one fixes it. It's left broken. Then something else gets broken. Maybe it's an accident, maybe not, but it isn't fixed either. Graffiti starts to appear. More and more damage accumulates. Very quickly, you get an exponential ramp. The whole building decays. Tenants move out. Crime moves in. And you've lost the game. It's all over.
We use the broken window theory as a metaphor for managing technical debt on a project.
BV: What is technical debt?
AH: That's a term from Ward's Wiki. (See Resources.) Every time you postpone a fix, you incur a debt. You may know something is broken, but you don't have time to fix it right now. Boom. That goes in the ledger. You're in debt. There's something you've got to fix. Like real debt, that may be fine, if you manage it. If you've got a couple of those — even a lot of those — if you're on top of it, that's fine. You do a release and get it out on time. Then you go back and patch a few things up. But just like real debt, it doesn't take much to get to the point where you can never pay it back, where you have so many problems you can never go back and address them.
DT: My current metaphor for that is my email inbox, because I have this habit every now and then of not answering email for a while. And then it gets to the point, around about the 250-message mark, where I suddenly realize, I'm never going to answer these messages. And it is the same with pending changes in software.
BV: How does technical debt relate to the broken window theory?
AH: You don't want to let technical debt get out of hand. You want to stop the small problems before they grow into big problems. Mayor Giuliani used this approach very successfully in New York City. By being very tough on minor quality-of-life infractions like jaywalking, graffiti, and panhandling (crimes you wouldn't think mattered), he cut the major crime rates of murder, burglary, and robbery by about half over four or five years.
In the realm of psychology, this actually works. If you do something to keep on top of the small problems, they don't grow and become big problems. They don't inflict collateral damage. Bad code can cause a tremendous amount of collateral damage, unrelated to its own function. It will start hurting other things in the system, if you're not on top of it. So you don't want to allow broken windows on your project.
As soon as something is broken — whether it is a bug in the code, a problem with your process, a bad requirement, bad documentation — something you know is just wrong, you really have to stop and address it right then and there. Just fix it. And if you just can't fix it, put up police tape around it. Nail plywood over it. Make sure everybody knows it is broken, that they shouldn't trust it, shouldn't go near it. It is as important to show you are on top of the situation as it is to actually fix the problem. As soon as something is broken and not fixed, it starts spreading a malaise across the team. "Well, that's broken. Oh I just broke that. Oh well."
Showing You Care, So Others Will Care
DT: It comes down to showing that you care. Take, for example, some code that is kind of shared among the team, but primarily is mine. There's some code in there that is obviously bad, but it doesn't look like I care about it. I'm just leaving it bad. Anybody else coming into that module might say, "Well, Dave doesn't care about it. It's his module. Why should I care about it?" In fact, if you come into my module and do something else that's bad, you can say, "Well, Dave doesn't care. Why should I care?" That kind of decay happens to modules as well as apartment buildings.
On the other hand, suppose I notice an edge condition that doesn't work in my code. I know it's a bug, but the bug is not critical to the application today and I don't have time to fix it. I could at least put a comment in there. Or, even better, I could put an assertion in there, so that if the edge condition ever hits, something's going to happen that shows I'm on top of it. By doing that, first of all, I make it easier to identify the problem. But I also show other people that I care about it enough that they will fix problems, too, when they encounter them.
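As a minimal sketch of the idea Dave describes, assume a hypothetical function `average_order_value` whose empty-input edge case is known to be unhandled. The function name and the message text are invented for illustration. The comment and the assertion together act as the plywood over the broken window:

```python
def average_order_value(order_totals):
    """Return the mean of a list of order totals.

    Hypothetical example: the empty-list case is a known, unfixed edge
    condition, so it is flagged loudly instead of being silently ignored.
    """
    # KNOWN BUG (not yet fixed): an empty list has no meaningful average.
    # The assertion makes the problem obvious the moment it is hit.
    assert len(order_totals) > 0, (
        "average_order_value: empty input is a known unhandled edge case"
    )
    return sum(order_totals) / len(order_totals)
```

Python drops `assert` statements when run with the `-O` flag, so a version meant to guard production code might raise an explicit exception instead.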
AH: If you walk into a project that's in shambles — with bugs all over, a build that doesn't quite work — you're not going to have incentive to do your best work. But if you go onto a project where everything is pristine, do you want to be the first one to make a bug?
BV: In your book, you tell a story about a tapestry fire.
AH: That is a true story. A former accountant of mine in Connecticut lived in a very upscale, wealthy section of town. This guy lived in a super mansion. He had a tapestry hanging on his wall a little too close to his fireplace, and one day it caught fire. The fire department rolled in. The fire was blazing. The house was about to go up in flames. But the fire department did not simply come charging in the front door. They opened the front door, and they rolled out a little carpet. Then they brought their filthy dirty hoses on their carpet and put the fire out. They rolled their carpet back up and said, thank you very much.
Even with the fire raging, the fire department took the care to put down the carpet and keep their hoses on it. They took extra special care not to mess up this guy's expensive mansion. It was a crisis, but they didn't panic. They maintained some level of cleanliness and orderliness while they took care of the problem. That's the kind of attitude you want to foster on a project, because crises do happen. Stuff bursts into flame and starts to burn up. You don't want to go running around crazy and causing more damage trying to fix it. Roll out the carpet. Do it right.
BV: In your book, The Pragmatic Programmer, you suggest keeping in mind that design decisions are not necessarily final. You recommend organizing the system so design decisions are reversible. How do you balance reversibility with other concerns, such as speed of development, performance, clarity, simplicity? For example, say I decide today that I'll use an Oracle database. Do I use an Oracle-specific API to talk to the database, or a generic database API? Perhaps the Oracle-specific API is clearer or faster, but makes it harder to change databases later. Either way I'm taking a risk.
AH: You're taking a risk either way, but I would say that approach is backwards. Instead of committing to the Oracle-specific API now and hoping it's faster, use the more general API first. If the general API is not fast enough, then make the conscious optimization decision to use the specific API for certain performance-critical parts. Everybody going back to the K&R book has warned against premature optimization, and I think that's an example of it.
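One way to picture this advice is the sketch below, which is only an illustration under stated assumptions. `CustomerStore`, `GenericStore`, `VendorTunedStore`, and `greeting` are hypothetical names, and the in-memory dictionary merely stands in for a real database. The application codes against the general interface, and a vendor-specific fast path is swapped in later only where measurement shows it is needed:

```python
from typing import Dict, Protocol


class CustomerStore(Protocol):
    """Hypothetical general interface the rest of the application codes against."""

    def find_name(self, customer_id: int) -> str: ...


class GenericStore:
    """Portable implementation; stands in for a vendor-neutral database API."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = rows

    def find_name(self, customer_id: int) -> str:
        return self._rows[customer_id]


class VendorTunedStore(GenericStore):
    """Stands in for a vendor-specific fast path, added only after measuring."""

    def __init__(self, rows: Dict[int, str]) -> None:
        super().__init__(rows)
        self._cache: Dict[int, str] = {}

    def find_name(self, customer_id: int) -> str:
        # Illustrative "optimization": remember earlier lookups.
        if customer_id not in self._cache:
            self._cache[customer_id] = super().find_name(customer_id)
        return self._cache[customer_id]


def greeting(store: CustomerStore, customer_id: int) -> str:
    # Application code depends only on the general interface, so the
    # choice of store stays reversible.
    return f"Hello, {store.find_name(customer_id)}"
```

Because `greeting` never names a concrete store, moving a performance-critical path to the tuned implementation, or back again, is a one-line change at the call site.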
DT: It also comes down to our old friend the cost of change curve. The cost of change curve basically says that the cost of making a change increases exponentially over time. There are various expressions of it. For example, the cost of fixing a bug after a system has been deployed is 1000 times more than fixing it when the system is being designed. But the general agreement is that the curve goes up non-linearly as time goes on.
The meter of the cost of change curve starts running when you make a decision. If you don't make a decision, then there's nothing to change, and the curve is still flat.
AH: The world can change its mind as many times as it wants. If you haven't made a decision or committed yet, your cost is zero.
DT: So rather than make a whole bunch of decisions up front and start the meter running, we try to defer each decision as long as we can. We end up with a lot of small cost-of-change curves, because each one hasn't had a chance to get up too high. Cumulatively, the effect of adding up those small curves is a lot less than having one curve that starts at zero that ramps up to infinity real quickly.
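The arithmetic behind this argument can be sketched with a toy model. Everything in it is an assumption made for illustration: the exponential form, the growth factor of 2, the twelve-period project, and the four decisions. None of these numbers come from the interview or from measured data.

```python
def change_cost(age: int, growth: float = 2.0) -> float:
    """Assumed model: cost of reversing a decision made `age` periods ago.

    The cost is zero at the moment of the decision and grows exponentially
    afterwards. The growth factor is illustrative, not measured.
    """
    return growth ** age - 1


def total_exposure(decision_times, now):
    """Sum the change cost of every decision already made by period `now`."""
    return sum(change_cost(now - t) for t in decision_times if t <= now)


PROJECT_END = 12
up_front = [0, 0, 0, 0]  # all four decisions made at the very start
deferred = [0, 3, 6, 9]  # the same four decisions, each made as late as possible

up_front_cost = total_exposure(up_front, PROJECT_END)
deferred_cost = total_exposure(deferred, PROJECT_END)
```

Under these assumptions the deferred schedule carries far less accumulated cost at the end of the project, because three of its four curves have had less time to climb.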
Dealing with Uncertainty in Requirements
BV: You say in your book, "Not sure how marketing wants to deploy the system? Think about it up front and you can support a standalone, client-server, or n-tier model just by changing a configuration file. We've written programs that do just that." What complexity does this add, and how much more time does it take? And if marketing doesn't know how it wants to deploy, why are you guessing? How good are you, in practice, at predicting?
AH: It's a multiple-choice question. It's not like we have no idea how to deploy; just pick something and make it up. Marketing says they might deploy the system like A, B, or C, but they don't know yet. They aren't going to know until it's far too late in the game; therefore, you have to be prepared for any one of A, B, or C.
DT: In fact, our previous client had exactly this happen to them. They knew they wanted a client-server application, but they weren't too sure how they wanted to present it to the user. For months they went back and forth between using an applet or application on the front end. They did this as they were writing the program. If, instead, they'd stopped to work out exactly what they were doing up front, they would have delivered the code four months later. Since they didn't know which they were going to do, they in effect decided to code both an applet and an application. Because of the way Java works, it turned out coding for both added very little overhead.
AH: There was basically just a thin wrapper for each of the environments, and from that point backwards everything was the same.
DT: They could actually choose between applet and application by changing a couple of configuration files.
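The shape of that arrangement might look something like the sketch below. It is not the client's actual code, which was written in Java and is not shown in the interview. The function names, the `front_end` configuration key, and the use of JSON are all hypothetical choices made for illustration:

```python
import json


def compute_report(data):
    """Shared core logic: identical no matter how the program is deployed."""
    return {"total": sum(data)}


def run_as_application(data):
    """Thin wrapper for a hypothetical standalone front end."""
    return f"[application] total={compute_report(data)['total']}"


def run_as_applet(data):
    """Thin wrapper for a hypothetical embedded front end."""
    return f"<applet> total={compute_report(data)['total']}"


# Map each configuration value to its wrapper.
FRONT_ENDS = {"application": run_as_application, "applet": run_as_applet}


def launch(config_text, data):
    """Pick the wrapper named in the configuration; the core code is untouched."""
    config = json.loads(config_text)
    return FRONT_ENDS[config["front_end"]](data)
```

Calling `launch('{"front_end": "applet"}', [1, 2, 3])` runs the applet wrapper, and changing the configuration value to `"application"` selects the other wrapper without altering any code.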
Fulfilling the Intent of the Requirements
BV: You say in your book, "We like to write adaptable, dynamic systems using metadata to allow the character of applications to change at runtime." To what extent are you making things adaptable beyond the requirements?
AH: That depends on what you mean by the requirements. If you're talking about the formal, written-down requirements document, yeah, we're probably going beyond that. But we're not going beyond what we've gotten from talking with the user and from seeing the environment. You would never go beyond the requirements just on a whim, or just because you thought it was cool. There must obviously be some need. Maybe it's expressed directly by the client, maybe not. Maybe it's implicit in the environment. The client may not be aware that something is going to be a problem directly, but there may be other ways to tell it is going to be a problem. You know from the situation. It would be irresponsible, bordering on malpractice, just to put in extra adaptability where it's not needed.
DT: I can give you an indirect example. We had a house built when we came to the States. We talked to the builder about what we wanted. Among other things, we told him we wanted a lot of storage space. The builder came up with some plans that looked wonderful, and started building. As they got to the sheetrock stage, he was walking around and noticed an unused alcove. He realized if he sheetrocked the inside of the framing of the alcove, he could actually form a new cupboard. He took some scrap sheetrock and put it on the inside of this alcove before sheetrocking over the alcove. The next time we saw him he said, "I've done sheetrocking inside, do you want me to cut a door through and make a cupboard out of this?" We said, "Absolutely." So we got one extra cupboard.
By sheetrocking the frame at that stage, he saved us the hassle of saying, "Hey we could use that space. Rip off all that sheetrock you just put on the outside so we can sheetrock the inside." He'd gone ahead. It wasn't a stated requirement. It wasn't on the plan. But he knew what we wanted, and based on that he made a decision.
AH: That's the key. He knew what they wanted. It's a question of intent, not necessarily a stated requirement. We know they want cupboard space. We know this is a big goal of theirs. We know this is their intent. We are going to try and meet that any way we can.
This interview includes excerpts from a ten-part interview with Andy Hunt and Dave Thomas originally published on Artima.com:
- "Don't Live with Broken Windows: A Conversation with Andy Hunt and Dave Thomas, Part I"
- "Orthogonality and the DRY Principle: A Conversation with Andy Hunt and Dave Thomas, Part II"
- "Good Enough Software: A Conversation with Andy Hunt and Dave Thomas, Part III"
- "Abstraction and Detail: A Conversation with Andy Hunt and Dave Thomas, Part IV"
- "Building Adaptable Systems: A Conversation with Andy Hunt and Dave Thomas, Part V"
- "Programming Close to the Domain: A Conversation with Andy Hunt and Dave Thomas, Part VI"
- "Programming is Gardening, not Engineering: A Conversation with Andy Hunt and Dave Thomas, Part VII"
- "Tracer Bullets and Prototypes: A Conversation with Andy Hunt and Dave Thomas, Part VIII"
- "Programming Defensively: A Conversation with Andy Hunt and Dave Thomas, Part IX"
- "Plain Text and XML: A Conversation with Andy Hunt and Dave Thomas, Part X"
Andy Hunt and Dave Thomas are authors of The Pragmatic Programmer, which is available on Amazon.com.
Andy Hunt and Dave Thomas are also authors of Pragmatic Unit Testing and Pragmatic Version Control, which are available at their very own Pragmatic Store.
Ward's Wiki, the first WikiWikiWeb, created by Ward Cunningham