Recovery Mode: Taking Control of an Out-of-Control Project
Every project goes into Crunch Mode at some point. Crunch Mode is when everybody (or almost everybody) is working lots of extra hours. If you’re in Crunch Mode more than about two weeks at a time, or you’re in Crunch Mode just to stay even with the schedule, your project is out-of-control. But there are ways to avoid an Out of Control Project (OCP).
We’ve all seen one, been a part of one, or at least heard about one -- a project that goes on forever (“it’s 90% done!”), or costs eight times as much as planned, or absorbs every resource a company has (and then some). It has built a life force of its own. It Will Not Die. It’s an Out of Control Project (OCP).
As with most problems, the first step in finding a solution is admitting that you have a problem. Because of the nature of our work (and the typical development model as promulgated by Electronic Arts), it’s easy for an OCP to remain “below radar” for a long time -- because of inexperienced or non-existent project management (PM), busy producers, unrealistic expectations (“we’ll make up this slip after Alpha”), or a host of other reasons (most of which are well-intentioned but some of which are almost criminal).
When was the last time YOU felt completely in control throughout your development effort, finished on or ahead of schedule, on or under budget, and had everybody working 40 hours/week or under? To some extent, at some time, I believe ALL projects are out-of-control. And we probably prefer that, since some of the best things come from chaos.
Every project goes into Crunch Mode at some point. Crunch Mode is when everybody (or almost everybody) is working lots of extra hours. If you’re in Crunch Mode more than about two weeks at a time, or you’re in Crunch Mode just to stay even with the schedule, your project is out-of-control.
There often comes a time in the OCP where it is consuming every resource available to it and still slipping. The usual response at this point is to keep working, head down, and pray that everything works out OK. Eventually either the project is finished and ships, or the company notices what’s going on and kills it.
Projects Slipping Repeatedly
At periodic (hopefully weekly) schedule meetings (EA calls them Product Status Meetings) it’s time for schedules to be adjusted if necessary. Watch carefully if a project is slipping continually, especially if it is slipping out each meeting by the time between meetings (slipping a week at each weekly meeting). Keep track of how milestone dates move on projects. When a project slips this way, it means it is standing still. But while standing still it is consuming all the resources assigned to it.
If the project is in the final stages of debugging, it may actually be the case that the project is standing still -- one class A defect means an unshippable project, and if everyone is looking for it and can’t find it, the project IS standing still. But during development it should be very unusual for a product to be standing still -- progress should be made on some aspects almost no matter what.
My heuristic (rule of thumb) for recognizing an OCP is simple -- a project is out-of-control the second time it slips. Poor PM techniques can make this heuristic useless -- if there is no schedule, you can’t tell if a project slips. If there are no well-defined milestones you can’t tell if a project slips. If the schedule is constantly being rebuilt, you can’t tell if a project slips.
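Both checks described above (a milestone that slips by the time between meetings, and the second-slip heuristic) only work if the history of each milestone's forecast date is kept. The following is a minimal sketch in Python, assuming a hypothetical list of (meeting date, forecast date) pairs recorded at each status meeting; the function names and the sample dates are invented for illustration.

```python
from datetime import date

def count_slips(history):
    """Count how many times the forecast date moved later.

    history: list of (meeting_date, forecast_date) pairs in meeting order.
    """
    slips = 0
    for (_, prev), (_, curr) in zip(history, history[1:]):
        if curr > prev:
            slips += 1
    return slips

def is_standing_still(history):
    """True if, at every meeting after the first, the forecast moved out
    by at least the time elapsed since the previous meeting."""
    if len(history) < 2:
        return False
    for (m0, f0), (m1, f1) in zip(history, history[1:]):
        if (f1 - f0) < (m1 - m0):
            return False
    return True

def is_out_of_control(history):
    """The heuristic from the text: out of control on the second slip."""
    return count_slips(history) >= 2

# Hypothetical milestone whose forecast slips one week at each weekly meeting.
alpha_history = [
    (date(2024, 3, 4), date(2024, 6, 3)),
    (date(2024, 3, 11), date(2024, 6, 10)),
    (date(2024, 3, 18), date(2024, 6, 17)),
]

print(count_slips(alpha_history))
print(is_standing_still(alpha_history))
print(is_out_of_control(alpha_history))
```

Note that none of this works if the history is thrown away each time the schedule is rebuilt, which is exactly the failure the heuristic warns about.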
It’s common for a project team to say “we were slow getting started because (insert reason here) but we’ll make the time up in the (insert future phase of project here) phase”. This does occasionally happen -- if a project gets a slow start to implementation because of lack of equipment or software or other easily correctable problem, it is conceivable that the second milestone will catch up some or all of the time lost. It is not very common, however.
It’s not unknown for an OCP to be deliberate, as in “I know I can get this project approved if I say it will cost $1,000,000 and be done in 9 months” while knowing that it will more likely take 18 months and cost $2,500,000. I am not talking about honest (if naïve) mistakes, but knowing, deliberate underestimating. This behavior is unprofessional and, in my opinion, constitutes malpractice and should be grounds for termination. The only protection against a deliberate OCP is careful examination of products at their early stages (Script Review and Technical Design Review).
Dealing with an OCP: Conventional Responses
The most common reaction to a slip is “we’ll work harder and make up the time”. An equivalent (but generally more extreme) reaction is to add more appropriate resource to the project (artists if art is slipping, engineers if implementation is slipping, etc.). A third almost equivalent reaction is to extend the schedule by the amount of the slip. Let’s take them one at a time...
Extend Schedule By Slip Amount
Pros: easy; looks like you’ve “solved the problem”
Cons: rarely succeeds; can’t easily be hidden from upper management
This is often the “easy way out” selected by inexperienced PMs or Producers. “We’ve slipped (insert length here), but we’re back on track and will stay on this new schedule”. Wrong! Unless the reason for the initial slip can be found, eliminated, and shown to have no further effect on the schedule, just slipping the schedule is unlikely to succeed.
Most schedule slips have longer-term, overarching effects on a full schedule. If the first milestone is late because half the tasks took more time than planned, you should assume that half the remaining tasks will take more time than planned too! If, on the other hand, the initial slip can be traced directly to one-time problems now solved (“we slipped two weeks because we didn’t have development systems”), merely moving the schedule out may do the trick.
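The "half the remaining tasks will be late too" reasoning can be turned into rough arithmetic. The sketch below is one possible way to do it, not a standard method: it assumes the fraction of late tasks and their average overrun will repeat in the remaining work, and it treats tasks finished early as merely on time. The function name and the numbers are hypothetical.

```python
def project_remaining_days(completed, remaining_planned):
    """Estimate remaining effort, assuming past overruns repeat.

    completed: list of (planned_days, actual_days) for finished tasks.
    remaining_planned: list of planned_days for tasks not yet done.
    """
    late = [(p, a) for p, a in completed if a > p]
    if not completed or not late:
        return sum(remaining_planned)
    late_fraction = len(late) / len(completed)
    # Average overrun factor among the late tasks (actual / planned).
    late_factor = sum(a for _, a in late) / sum(p for p, _ in late)
    # Expected multiplier for a remaining task if history repeats.
    multiplier = (1 - late_fraction) + late_fraction * late_factor
    return sum(remaining_planned) * multiplier

# Hypothetical first milestone: half the tasks took twice as long as planned.
completed = [(5, 5), (5, 10), (10, 10), (10, 20)]
remaining = [10, 10, 20]

print(project_remaining_days(completed, remaining))
```

With these sample numbers, 40 planned days of remaining work become 60 projected days, which is a far larger correction than simply moving the schedule out by the 15 days already lost.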
Work Harder (then Work Lots Harder)
Pros: no additional direct cost to the company; can be hidden from upper management
Cons: works in a limited set of circumstances; can have negative consequences (burn out, panic) to team
Extra hours can sometimes restore a project to a controlled state if:
* The team wasn’t already working lots of extra hours
* The slip is smaller (percentage-wise) than the extra hours added
* The extra hours don’t need to be worked for more than about two weeks
If any of these aren’t true, working harder won’t fix the problem. It may instill a sense of panic in a team, or an awareness that this project is FUBAR or a SNAFU.
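The three conditions can be written down as a simple check. This is a sketch under stated assumptions: the 40-hour baseline and the two-week limit are taken from the text, "already working lots of extra hours" is simplified to "more than the normal week", and extra hours are assumed to be as productive as normal ones (which is optimistic). The function and parameter names are hypothetical.

```python
def overtime_can_recover(current_hours, proposed_hours, slip_fraction,
                         weeks_needed, normal_hours=40, max_weeks=2):
    """Return True only if all three conditions from the list hold.

    slip_fraction: the slip as a fraction of the schedule (0.10 = 10%).
    """
    already_crunching = current_hours > normal_hours
    extra_fraction = (proposed_hours - current_hours) / current_hours
    return (not already_crunching
            and slip_fraction < extra_fraction
            and weeks_needed <= max_weeks)

# Going from 40 to 50 hours adds 25%, which covers a 10% slip for two weeks.
print(overtime_can_recover(40, 50, 0.10, 2))
# A team already at 50 hours has nothing left to give.
print(overtime_can_recover(50, 60, 0.10, 2))
```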
Add Resource
Pros: reasonable response most likely to succeed
Cons: can’t be easily hidden from upper management; adds directly to bottom line cost of project; non-proportional gain isn’t intuitively obvious
Adding resource is probably the best of the common responses. It does require an admission that a problem exists (often to upper management) and it does directly increase the bottom line cost of the project. It also has hidden costs. It requires that new resources be brought up to speed on an existing project (which is probably under-documented), it requires that tasks be re-partitioned (according to a new set of skills available), and in most cases it increases management and communication overhead. See Adding Resources, below, for notes on getting this right.
Dealing with an OCP: Good Responses
A critical precursor to successful project management (PM) is the Development Triangle (see Dynamics of Software Development by Jim McCarthy for additional discussion). Every project can be specified along three axes or legs: Time Required, Features Required, and Resources Available. These three axes are proportional in that:
* Time varies directly with Features (as Features increase/decrease, Time increases/decreases)
* Time varies inversely with Resources (as Resources increase/decrease, Time decreases/increases)
* Features vary directly with Resources (as Resources increase/decrease, Features increase/decrease)
Every tradeoff possible in development is encapsulated in this idea -- to reduce required time, you must reduce features or add resources. To increase features, you must increase resources or increase time. If you must reduce resources, you must increase time or reduce features. And so on.
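To make the tradeoffs concrete, here is a deliberately naive sketch of the triangle as a linear model. It assumes Time = Features / (Resources x rate), where `rate` is a hypothetical productivity constant; real projects do not scale this cleanly, since added resources bring ramp-up and communication costs. The function names and numbers are invented for illustration.

```python
def time_required(features, resources, rate=1.0):
    """Weeks needed, assuming each unit of resource completes
    `rate` feature-units per week."""
    return features / (resources * rate)

def features_possible(time, resources, rate=1.0):
    """Feature-units deliverable in `time` weeks with `resources`."""
    return time * resources * rate

def resources_needed(features, time, rate=1.0):
    """Resources needed to deliver `features` in `time` weeks."""
    return features / (time * rate)

# Hypothetical project: 120 feature-units, 4 people.
print(time_required(120, 4))        # weeks at the current plan
# To ship in 24 weeks instead, either cut features...
print(features_possible(24, 4))
# ...or add resources.
print(resources_needed(120, 24))
```

In this example the plan needs 30 weeks; hitting 24 weeks means cutting to 96 feature-units or growing the team to 5. The point is not the numbers but that moving one leg forces a change in at least one of the others.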
The Development Triangle is the essential tool that enables you to decide what to do with your product (be in control) by recognizing your available choices.
Update Meetings
The purpose of periodic update meetings is twofold: 1) gather required schedule information; and 2) increase team focus. As discussed briefly above in The Short Version, I consider the critical factor in getting through Recovery Mode to be team focus.
When a team knows what has to be done, by when, and why, and actually agrees with the decisions and the realism of the schedule (because they’ve been part of the process and are kept informed), they can focus on what they are doing. They don’t have to worry about what to do next because they already know. They know what’s important and what’s not.
During most of a project, weekly update meetings with the Project Manager are sufficient. During these meetings the PM should get updates from each team lead (or each team member). Every task completed should be noted. Any upcoming dependencies should be noted and checked to see if they are going to be a problem. Everyone should leave the meeting with a clear idea of what is most important in the next week.
During crisis periods (just before major milestones, when the project is slipping, etc.), daily update meetings may be necessary. These meetings should focus on the top ten things that need to get done -- the most important first. Everyone should leave the meeting with a clear picture of what they have to do in the immediate future.
If these meetings are happening after Alpha, they may also be considered to be Change Management meetings, in which each desired change is brought before a group who will decide whether the change should be implemented.
Adding Resources
Adding resources is neither anathema nor a panacea. It will neither solve all problems nor make all situations worse. Fred Brooks’ pithy saying “adding manpower to a late software project makes it later” is not a bad rule of thumb, but it has exceptions.
Adding resource to a project requires up-front and continuing investment: new people must be brought up to speed (which takes both their time and the time of the people on the project -- good documentation can help reduce both those), more management time and effort must be spent to coordinate the increased resources, and communication overhead within the project team is likely to increase (formal meetings will get longer, more time will be spent in informal hallway “meetings” and asking each other exactly what is going on...). This up-front and continuing cost must be carefully weighed against the expected (and likely) benefits. Adding resource (except testing resource) to a project after Alpha is unlikely to shorten the schedule.
Cutting Features/Content
Removing features or content from a project seems like an obvious solution when the schedule is too long. In my experience it is not implemented as often as one might expect.
It does no good to remove features or content already created! For this reason, you must assess and remove features or content relatively early in the project lifecycle. Of course, there are exceptions -- if a feature is associated with many defects, removing the feature may be more efficient than debugging it. If a collection of content already created is only used in association with content or features not yet created, removing it may do some good -- though it will probably have unpleasant morale consequences.
Generally, once a project reaches Alpha there is little point in removing features (since they are already implemented). There may be a point in removing the need for additional content which cannot be produced in the time available.
Putting It Out Of Its Misery
Sometimes you just need to shoot the project in the head. It can be a difficult decision to kill a project, but the usual response when someone hears about it is “why did it take so long?” Usually everyone knows that a project is in serious trouble -- it’s just that the development team (or certain members of it) can sometimes manage to convince themselves that everything is all right or that the problems are not serious.
When it comes time to kill a project, remember that it’s the project you are killing, not the team, not the individuals. Be sensitive -- this may come as a shock to a team which has convinced itself everything is all right -- and it is a stress not unlike losing a job (with which it may come hand-in-hand) or a close relative. Be aware that the team is likely to be some combination of angry, defensive, and scared -- be prepared to reassure them (but always tell them the truth) about their futures, and try to get a guarantee in advance from the company that the team will be taken care of. If they can’t be kept on at the company, at least make sure they will have a reasonable time (and the assistance of the company) to find new work.
Avoiding an OCP: Project Management
The underlying discipline of staying in control of a project is called Project Management or PM. PM is everything from maintaining to-do lists to performing the scheduling portions of a Technical Design Review to reviewing milestones for completeness. PM is normally associated with task lists and PERT or Gantt charts and people who come around and ask questions like “Task 341, ‘Create HUD’ was scheduled to run from 3/5 to 3/7 -- is it done yet? If it’s not, what percentage done is it?”
From a broader perspective, PM is all of the processes which keep a project under control. Here are a few which are especially valuable in controlling or avoiding the OCP...
Start Out Right
Good PM starts at the very beginning of a project. You begin with the realization that what you are about to do is difficult and deserves careful consideration. Typically the start of a project should include creating: a Script (Product Design); a Technical Design Document (Program Design); Task Lists; a baseline Schedule; an initial Budget; and a set of Milestones.
These documents are often slighted or scanted because “everybody knows that things will change”. Well that’s fine, but when you set out on a trip, you don’t ignore the maps because “we don’t know exactly where we’re going and where we’re going to stop for each meal”. In fact, if you’re going hiking and you just plan to wander around a certain area, you pay more attention to taking maps along so that you can see where you are, where you’ve been, and where you can go from here. These documents, especially the Product and Program Designs, are your maps when doing a project.
Ideally, all these documents should be “live” -- constantly updated during development. In practice, they will always lag somewhat behind the real “documentation” -- the code and art. That’s all right so long as you don’t let it be an excuse to do nothing. CASE tools are improving and some primitive round-trip engineering tools already exist which allow you to generate code (or at least headers) from Booch or UML diagrams and to generate diagrams from code.
Scheduling
Scheduling a software project is an ongoing process -- you begin with a schedule known to be in error, but you refine it as you go along. There are inevitably tasks which must be added to a schedule and often tasks which become irrelevant and have to be removed. There are always tasks which take a different amount of time than originally expected.
Schedules should be updated on a regular basis. Weekly is probably ideal for most of a project. Careful looks should be taken at the next section of a project at each Milestone review.
A common mistake is to completely rebuild the schedule periodically during development. Once the schedule has been rebuilt it is very difficult to extract original schedule information from it. A better solution is to use an automatic tool which includes baselines and alter the schedule file as time goes on (keeping it under source control if possible) so that the original (baseline) schedule is always available for examination.
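If no scheduling tool with baseline support is at hand, the same idea can be approximated with very little structure. The sketch below is an illustration only, assuming a hypothetical `Milestone` record that stores the baseline date separately from the current forecast and appends to a history on every change instead of overwriting it; the class, its fields, and the dates are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class Milestone:
    name: str
    baseline: date                    # original plan, never overwritten
    forecast: Optional[date] = None   # current best estimate
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Until the first reforecast, the forecast is the baseline.
        if self.forecast is None:
            self.forecast = self.baseline

    def reforecast(self, when, new_date):
        """Record the old forecast before replacing it."""
        self.history.append((when, self.forecast))
        self.forecast = new_date

    def total_slip_days(self):
        """Days the current forecast is later than the baseline."""
        return (self.forecast - self.baseline).days

# Hypothetical milestone that is moved out twice.
alpha = Milestone("Alpha", baseline=date(2024, 6, 3))
alpha.reforecast(date(2024, 4, 1), date(2024, 6, 17))
alpha.reforecast(date(2024, 5, 6), date(2024, 7, 1))

print(alpha.baseline, alpha.forecast, alpha.total_slip_days())
```

Keeping such a file under source control, as suggested above, gives a second, independent record of every schedule change.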
Learn To Do Better
Finally, make sure you schedule time during and between projects to learn, both from your experience (remember, experience is what you get when you don’t get what you want) and from others.
Hold postmortem meetings to discuss the most recently completed (or killed) project. Make sure everyone understands that this is a time for honest conversation about the things that went wrong and the things that went right. I would go so far as to arrange for anonymous input to some reasonably neutral party to filter and provide to the group. Identify weak and strong points and change your development process to emphasize the strong and eliminate the weak.
Make sure that everyone on your project is always learning. It’s a good rule of thumb that white-collar workers should spend 10% of their time (that’s .5 days per week) keeping up with the literature. Training classes and conferences should be part of the project budget, and down time between projects should be devoted to improving their skills. Ideally, team members should drive this process, but it’s up to managers to make sure it happens if the company doesn’t want to pay for it or the team is too apathetic to fight for it.
References
A quick look at my bookshelf suggests the following books, sorted by author:
Crunch Mode: Building Effective Systems on a Tight Schedule, John Boddie
Constantine on Peopleware, Larry Constantine
201 Principles of Software Development, Alan Davis
Peopleware, DeMarco & Lister
A Discipline for Software Engineering, Watts Humphrey
Assessment and Control of Software Risks, Capers Jones
Dynamics of Software Development, Jim McCarthy
Code Complete, Steve McConnell
Rapid Development, Steve McConnell