Estimating the Software Development Projects. Part 2
In the first part of this article, we looked at general considerations about the purpose, structure, and complexities of estimation. Now let's look at how to approach the definition of scope and requirements and how to actually obtain and present the cherished numbers: the estimated effort of the project. At the end, a bit of math awaits you under the hood.
Estimation Method
This section provides step-by-step recommendations for estimating the labor efforts of a project. Each point is backed by years of experience, both in successes and failures. The method proposed is primarily applicable to projects at the stage where requirements are refined to the level of user or functional specifications. Nevertheless, many of the tips and suggestions will be relevant to any development projects and engineering ventures in general.
Step 1. Prerequisites
1. Allocate or demand resources for estimation. Unfortunately, not all managers and clients comprehend the intricacy and labor intensity of a thorough estimation process. Insist that estimators have sufficient time and other resources for their work. The more time invested in estimation, the more accurate it tends to be. Even a few minutes of task analysis can significantly enhance accuracy compared to estimation "on the fly."
In addition to time, access to existing systems, their code and documentation, data, licenses, experts, and other resources may be necessary.
2. Know the project stakeholders and decision-makers. Often, different individuals (such as customer representatives) have different visions and interpretations of what needs to be done and how. Try to identify all worldviews or outline the main conflicts among them.
3. Carefully read the contract, if there is one, even if it's in draft form. You may discover a lot of new information regarding formal obligations related to the project, for example, requirements for work quality, delivery, deadlines, and documentation. This directly affects the scope of work and its estimation.
4. Establish communication with the sales department. Misalignment between salespeople and implementers is a very common cause of project problems. Typically, the main KPI for salespeople is sales volume, and they are much less concerned about project execution issues. Make sure you know what they are selling, and they know how you intend to deliver it.
Step 2. Describe the Project Requirements and Scope
Four Categories of Requirements by Degree of Familiarity
At the time of estimation, requirements can be conditionally divided into four parts based on their level of familiarity, clarity, and the amount of risk involved.
Known knowns: Explicitly stated, understandable, and sufficient for accurate estimation. What to do with them? Estimate.
Unknown knowns: Indirectly stated or not stated but easily accessible. Don't hesitate to read requirements and specifications thoroughly, follow hyperlinks in them, visit the client's website, ask SMEs. It doesn't require significant effort but allows for a much better understanding of requirements and identification of hidden risks. What to do with them? Find, read, ask, clarify, translate into Known knowns, estimate.
Known unknowns: Insufficient requirements, incomplete documentation, and the like. For example, the client knows there will be integration with other systems but doesn't know the protocols, formats, and data exchange volumes. What to do with them? First of all, decide whether we want to take on the risks of estimating and fulfilling these requirements. If yes, it would be wise to make assumptions about tasks we don't know and add a time buffer just in case. If not, explicitly state that these requirements are not estimated or are beyond the project scope.
Unknown unknowns: Almost every project experiences something that was not anticipated at the beginning. For example, bugs in external libraries, surprises from browser and operating system vendors, complexities in requirements or implementation, missed requirements. Changes in requirements do not directly fall into this category, but they are often covered by this buffer, as initiating change management procedures for every client request is quite costly. There is little that can be done about this except adding a buffer.
Actions to Prepare Requirements for Estimation
Categorize the requirements, and anything that might be treated as a requirement, as outlined above.
In addition to specifications and official documents, it's worth analyzing meeting minutes, letters of intent, and everything down to casual conversations, because future expectations often become requirements or influence them.
Document expectations, formal and informal, functional and non-functional requirements, references to other documents, specifications, and standards.
If possible, verify the relevance of the provided documentation, as there are cases of estimation based on outdated documents.
Attempt to confirm the list of requirements and assumptions with the client. This may not happen immediately. In the worst case, you can resort to the "default agreement" approach: send a letter requesting confirmation or comments on the scope, subtly hinting that lack of response implies tacit agreement.
Explicitly describe what is not included in the scope: what you are not estimating and don't intend to do. Any description, even the most detailed one, is subject to various interpretations, conscious or unconscious. A typical example: the client may expect infinite support for the software you are developing if the warranty support period is not explicitly stated.
If, for some reason, you have not estimated a task, it is necessary to explicitly state this:
Never assign a zero estimate if you haven't estimated a task. Sooner or later, someone might assume that the task requires no time. In my practice, such cases have occurred.
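One way to make this rule hard to break in a script or spreadsheet export is to store a missing estimate as an explicit "not estimated" marker instead of a number. The sketch below is only an illustration: the task names and hours are hypothetical, and `None` is an assumed convention for "not estimated".

```python
from typing import Optional

# Hypothetical task list: None means "not estimated", which is not the same as 0 hours.
estimates: dict[str, Optional[float]] = {
    "Login page": 24.0,
    "Order page": 60.0,
    "Integration with payment": None,  # not estimated yet
}


def total_hours(tasks: dict[str, Optional[float]]) -> tuple[float, list[str]]:
    """Return the sum of known estimates and the names of unestimated tasks."""
    known = sum(h for h in tasks.values() if h is not None)
    missing = [name for name, h in tasks.items() if h is None]
    return known, missing


known, missing = total_hours(estimates)
print(f"Estimated: {known} h; NOT ESTIMATED: {', '.join(missing) or 'none'}")
```

Reporting the unestimated tasks next to the total keeps the gap visible to whoever reads the number.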
Identify risky requirements. By risky, I mean requirements like "the website should work anywhere in the world and support all languages." They are often stated in one line but can increase your project's workload significantly. Typically, the client doesn't grasp the complexity of their implementation, and they don't need "all languages." Unfortunately, uncovering such requirements requires careful re-reading of all available documentation since these "gems" may be scattered across hundreds of pages of text.
Gather non-functional requirements. Few people think about them at the project's outset, so it's worth at least asking the client for rough estimates of data volume, users, system availability and including them in assumptions. Many of these parameters can critically affect your architectural decisions and hence, the project's workload.
Thoroughly analyze inherited artifacts (legacy). At the project's outset, all of them, unfortunately, could become yours. This applies not only to code but also to documentation, data, contracts, third-party libraries, and other project components. Often, it's easier to rewrite code than to try fixing it after previous "experts," and it's better to know this in advance. Besides technical debt, you may inherit other types of debt, such as dependencies on external components, contractual obligations, poor documentation, incomplete data, and so on.
Step 3. Create a Work-Breakdown Structure (WBS) Based on Requirements
Creating a hierarchical structure of work (WBS) is one of the most labor-intensive and critical operations in project estimation. Methods for creating a WBS from requirements are the subject of a separate article or book. However, I'll offer a few considerations important for our topic:
- A WBS based on decomposition by product (noun-oriented) is easier to estimate using this method than one based on other criteria.
- Level of WBS detail: lower-level tasks should not exceed 40 to a maximum of 80 hours.
- Contrary to common recommendations to cover 100% of the work, our WBS may not include certain types of work like meetings and management. The reason for this will become clear below.
For example, for a simple order processing system, the WBS might look like this:
- Login page (UI, Login, Logout)
- Order page (UI, Create order, Confirm order, Cancel order, Search and filtering)
- Integration with payment (Outbound, Inbound)
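The same WBS can be kept as a small nested structure, which makes the size guideline for lowest-level tasks easy to check automatically. This is a minimal sketch: the hour values are hypothetical placeholders, and `oversized_tasks` is an illustrative helper rather than part of the method.

```python
# Product-oriented WBS for the order processing example.
# The hour values are hypothetical placeholders, not figures from the article.
wbs: dict[str, dict[str, int]] = {
    "Login page": {"UI": 16, "Login": 8, "Logout": 4},
    "Order page": {
        "UI": 40,
        "Create order": 32,
        "Confirm order": 16,
        "Cancel order": 16,
        "Search and filtering": 96,
    },
    "Integration with payment": {"Outbound": 40, "Inbound": 40},
}

MAX_LEAF_HOURS = 80  # upper bound for a lowest-level task, per the guideline above


def oversized_tasks(
    tree: dict[str, dict[str, int]], limit: int = MAX_LEAF_HOURS
) -> list[tuple[str, int]]:
    """Return (path, hours) for every leaf task whose estimate exceeds the limit."""
    found = []
    for deliverable, tasks in tree.items():
        for task, hours in tasks.items():
            if hours > limit:
                found.append((f"{deliverable} / {task}", hours))
    return found


for path, hours in oversized_tasks(wbs):
    print(f"Split further: {path} ({hours} h)")
```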
Step 4. Document Assumptions
If something can be misunderstood, it will be misunderstood. Don't rely on having the same understanding of terminology, language, or common sense as your document readers. Even people from the same linguistic and cultural background, of the same age and education level, can interpret the same document completely differently. Not to mention conflict situations where the other party intentionally interprets documents in their favor.
For example, during project execution, the underlying framework may advance by several versions, and the client may demand an upgrade to the latest one upon project delivery unless the assumptions state otherwise. The executor, on the other hand, may find it logical that the client should pay for this.
Addressing potential differences in interpretations requires experience and effort. Sometimes you need to record practically every word at a meeting with the client or team, but such work significantly reduces future project risks.
Besides resolving differences in interpretations, assumptions also serve as implementation constraints you want to adhere to. This includes specifying under what conditions your estimates remain valid, such as:
- Definition of Done for tasks and the entire project. It outlines the common understanding of task or project completion criteria.
- "Expiry date" of the estimate. Everything is constantly changing: the market, business, people, technologies, and our understanding of them. If your client wants to revisit an estimate made six months ago, it's a good idea to review it for relevance.
- Versions. For which versions of languages, browsers, operating systems, components, hardware, frameworks, and other products are your estimates valid.
- Access to external resources. If you've worked with large corporations, you may be familiar with the struggles of gaining access to their servers, systems, and other resources that can affect the accuracy of your estimates.
- External components. When using external components, especially for the first time, it's prudent to state the assumption that they will work as described in the documentation and that their provider will respond promptly to your requests.
- Hardware and performance requirements. Specify on what hardware your system will achieve planned performance. Even the performance of development environments and the network connection between them, the database, and team members' workstations can significantly affect development time.
- If you know you're estimating a brownfield project without access to all old artifacts, write it in the assumptions: "Upon accessing the old code, the estimate may be revised."
- Licensing, other resources, and expenses. It's a good idea to agree on what you'll need and whether the client or a third party will provide it.
- Expected data volume. Large databases and files can significantly increase the time for any operations with them. Just copying a terabyte database to a local computer can take several days or weeks with a poor network connection.
- Any comments, doubts, contradictions. On one project, we were tasked with technically upgrading systems to a new framework version. Our task was to go through the standard upgrade steps, ensure that the code compiles and runs, and that data integrity is maintained. Meanwhile, the client expected a fully tested and functional system. This project was long and difficult and eventually closed with significant losses.
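The data-volume assumption above is easy to sanity-check with simple arithmetic. The sketch below estimates how long a copy takes at a given sustained link speed; the speeds and the `efficiency` parameter are illustrative assumptions. At a sustained 10 Mbit/s, one decimal terabyte takes a little over nine days.

```python
def transfer_days(size_bytes: float, link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_bytes over a link of the given nominal speed.

    efficiency is the assumed fraction of nominal bandwidth actually achieved.
    """
    bits = size_bytes * 8
    seconds = bits / (link_mbit_per_s * 1_000_000 * efficiency)
    return seconds / 86_400  # seconds per day


ONE_TB = 10**12  # decimal terabyte, in bytes

for speed in (10, 100, 1000):  # hypothetical link speeds in Mbit/s
    print(f"{speed:>5} Mbit/s: {transfer_days(ONE_TB, speed):.2f} days")
```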
Step 5. Estimate Each Task and the Project as a Whole
The proposed method allows for a fairly realistic estimation of project effort, knowing only the development costs and a subjective assessment of risks. If possible, also estimate other types of work besides development, as this can increase the accuracy of the estimate and the level of confidence in it.
The coefficients used in the calculations are derived from past ERP system development projects in which I have participated in estimation and execution. Coefficients can and should vary depending on the domain, technologies, requirements, team experience, and other factors. A reliable source of coefficients can be data from time tracking systems for similar projects, broken down by costs for different categories of work (business analysis and design, development, testing, meetings, management, etc.).
1. Agree on what constitutes development. Before directly estimating tasks, it's a good idea to agree on what constitutes development. For example, in the form of such a list:
The average developer is likely to estimate only the actions highlighted in red. However, it's essential to reach a common understanding among developers, managers, and clients about what constitutes the completion of a task, as seen in the Definition of Done. This helps avoid significant discrepancies in interpretations.
Next, actions 2-5 are performed for each line of the Work Breakdown Structure (WBS) if the corresponding type of work is necessary.
2. Estimate Development Time. For each line of the WBS, estimate the realistic development time in the chosen units of measurement. Considering the inherent optimism in most people, it's advisable to conduct a mental exercise of modeling worst-case scenarios after estimating. This helps identify hidden risks and adjust the realistic estimate upwards. The estimation should consider the amount of work, complexity, and risk.
If different types of development are estimated separately within one task, such as front-end and back-end, their estimates need to be combined.
Be more pessimistic in estimating large tasks and optimistic in estimating small ones. Underestimating large tasks is obviously riskier. Additionally, overestimating small tasks is more noticeable and may create an impression of unprofessionalism or significant overestimation of the entire project, undermining trust in you. However, underestimating small tasks carries almost no risk to the project.
If your team is formed and works well together, Planning Poker can provide accurate estimates, although it is a relatively expensive technique.
3. Add Time for Requirements and Design Work. Include time for clarifying, describing, agreeing on, and refining functional requirements and designs, as well as for architectural design and support: about 70% of development time when well-formulated business requirements are available.
4. Add Time for Quality Assurance. Add time for testing activities (writing test documentation and manual testing, including integration testing) — 50% of development time. If you plan to use automated testing, separately add time for it.
5. Add Time for Risks. Estimate the degree of risk (Contingency) for the task. This is a subjective indicator of how confident the developer is in the estimation and understanding of the task, on the following scale:
- +5% — fully confident, low risk;
- +15% — medium risk;
- +30% — high risk.
A risk above 30% is too high: the task requires additional investigation or breaking down into sub-tasks. In the early stages of a project, however, under conditions of high uncertainty, the risk coefficient may be higher.
The risk percentage is then applied to the task's total time for requirements and design (BA), development (DEV), and testing (TST), that is, to BA + DEV + TST.
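Actions 2-5 for a single WBS line can be sketched as a small function. The ratios are the ones quoted in the steps above. The names `task_estimate`, `BA_RATIO`, `QA_RATIO`, and `RISK_LEVELS` are illustrative, and the coefficients should be recalibrated for your own domain.

```python
BA_RATIO = 0.70   # requirements and design work, share of development time (step 3)
QA_RATIO = 0.50   # testing, share of development time (step 4)
RISK_LEVELS = {"low": 0.05, "medium": 0.15, "high": 0.30}  # contingency (step 5)


def task_estimate(dev_hours: float, risk: str) -> dict[str, float]:
    """Expand a development estimate into BA, DEV, TST, risk and total hours."""
    ba = dev_hours * BA_RATIO
    tst = dev_hours * QA_RATIO
    base = ba + dev_hours + tst               # BA + DEV + TST
    contingency = base * RISK_LEVELS[risk]    # risk is applied to the whole task
    return {
        "BA": ba,
        "DEV": dev_hours,
        "TST": tst,
        "risk": contingency,
        "total": base + contingency,
    }


# Hypothetical task: 40 hours of development, medium risk.
print(task_estimate(40, "medium"))
```

Automated testing is deliberately left out here, since the text treats it as a separate addition.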
6. Add Time for Project-wide Tasks. For the entire project, add:
- Time for meetings — 20% of development time. Agile methodologies may require even more time for meetings.
- Time for management — 15% of development time. This includes the team lead's time for managing the team.
- Time for infrastructure management (DevOps) — 5 – 10% of development time.
- Time for other project-related tasks (e.g., automated testing or business trips).
- Project buffer — 5 to 15% of the total time, depending on the risks, project size, complexity of the domain, and requirements, as well as the degree of interdependence among different tasks.
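The project-wide additions can be sketched in the same way. This is one possible reading, not a definitive formula: it assumes the project buffer is applied to the subtotal of all other work, and `project_estimate` and its parameter names are hypothetical.

```python
def project_estimate(
    dev_hours: float,
    task_totals: float,
    devops_ratio: float = 0.05,
    other_hours: float = 0.0,
    buffer_ratio: float = 0.10,
) -> float:
    """Add project-wide work and the project buffer to the sum of task totals.

    dev_hours    -- sum of pure development estimates over the WBS
    task_totals  -- sum of per-task totals (BA + DEV + TST + risk)
    devops_ratio -- 0.05 to 0.10 of development time
    other_hours  -- other project-related tasks, estimated separately
    buffer_ratio -- 0.05 to 0.15, applied here to the subtotal (assumption)
    """
    meetings = 0.20 * dev_hours
    management = 0.15 * dev_hours
    devops = devops_ratio * dev_hours
    subtotal = task_totals + meetings + management + devops + other_hours
    return subtotal * (1 + buffer_ratio)


# Hypothetical project: 100 h of development, 250 h once BA, TST and risk are added.
print(project_estimate(dev_hours=100, task_totals=250))
```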
7. Put It All Together. As a result, for our simple order processing system, the calculation might look something like this:
The red color indicates the numbers that must be obtained from the assessors. The rest are calculated automatically.
Round the estimate to whole hours, tens, hundreds, depending on the final result. The number 1574.56 may give the reader a false sense of accuracy, as opposed to 1580 or 1600.
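A small helper can do this rounding. Since both examples in the text round upward, the sketch assumes rounding up to the next multiple rather than to the nearest one; `round_up` is an illustrative name.

```python
import math


def round_up(hours: float, step: int) -> int:
    """Round an estimate up to the next multiple of step (10, 100, ...)."""
    return math.ceil(hours / step) * step


print(round_up(1574.56, 10))   # 1580
print(round_up(1574.56, 100))  # 1600
```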
Notice that the labor intensity of the entire project is roughly three times that of the development itself: 971 man-hours versus 312. Interestingly, this ratio coincides with the conclusion in F. Brooks' classic book on project management, The Mythical Man-Month, published in 1975. It can also serve as a rough check on the completeness of the estimate.
Step 6. Format the Estimate
A well-packaged estimate is just as important as its structure and numbers. Proper communication is a critical part of the estimation process. The document should look neat, professional, and be intuitively understandable.
Construct the document's structure so that the reader cannot overlook the definition of the project's boundaries and assumptions. Place them on the first page of the presentation or the first sheet of the Excel file. You and the document's recipient should have a very close, if not identical, understanding of what is estimated and under what conditions.
Keep in mind that any estimate communicated to your management or client can be perceived as a commitment. Assumptions and reservations will be forgotten, while the numbers or dates will be remembered.
When dealing with a wide range of scope or if the client is unwilling to fix it, it's a good idea to offer multiple options, much like marketers do in other fields (e.g., "Elite Package," "Business Package," "Economy Package"). Then the client can assess the cost and risks of poorly defined requirements and boundaries and make a more informed decision. The side marketing effect of this approach is that you're calculating various options and risks for the client and allowing them to choose, which is valuable in itself.
The level of detail in the estimate document depends on the specific project, client, and your relationship with them. It's difficult to give universal advice here. Some clients want to see all the details, while others, typically those who trust you more, are only interested in the final figure.
The higher the management level making the decision, the less interested they are in the details and more interested in the final sum. Therefore, when presenting estimates to top management, ensure that they include everything that can be reasonably justified. It's easier to reduce the project budget than to increase it.
Checklists
I strongly recommend creating, using, and promoting checklists. No matter how many projects you have estimated in your life, there is always a chance of missing something. An excellent book on this topic is "The Checklist Manifesto: How to Get Things Right," in which its author, Atul Gawande, describes how checklists save lives and improve outcomes in aviation and medicine. This inexpensive and simple tool may one day save your project too.
Here are examples of items I have used for self-checking to avoid missing deadlines:
- Planning, tracking work hours, budget control;
- Project, financial, and other reporting;
- Clarification and updating of requirements and designs;
- Translation from different languages;
- Onboarding and training new employees;
- Preparation of demo data and environment, conducting demos;
- Release preparation: packaging, rechecking, delivery, writing release documentation;
- Documentation update after code changes;
- The estimation itself (someone has to pay for it eventually);
- Post-release support and warranty. Re-read relevant sections of the contract;
- Code quality review (whether automated or manual) and bringing it to the desired level;
- Automated testing;
- Test documentation;
- Various types of testing besides functional testing;
- Audits and certifications of your solution;
- Support for development environments and the cost of possible downtime;
- Obtaining access, and downtime caused by the lack of it;
- Dealing with errors in inherited code or third-party applications, time for integration with them;
- Technical and other debts from previous phases or subprojects;
- Management of test and real data, their migration when changing the data structure;
- Security system development, security testing;
- Code review, refactoring, and retesting after it;
- Performance testing and tuning;
- Synchronization with other teams;
- Intermediate upgrades to new versions of frameworks and components;
- Risks of using new technologies, services, or components, work in new subject areas.
A Bit of Mathematics
Basically, the above is enough to apply this method to estimate your projects. What follows is a little math "under the hood" and my thoughts on project estimation and execution. I would be grateful to professional mathematicians for comments and corrections of inaccuracies.
Random Variables. The time to complete a project, as well as each of its tasks, are random variables. The local goal of project estimation is to find a confidence interval for the project's effort with a high level of confidence.
Empirically, we know that the probability density of task completion time can be modeled as shown in the figure below, with a long right "tail" (a kind of consequence of Parkinson's and Murphy's laws), since the number of negative risks, unlike positive ones, is essentially unlimited.
To begin with, we need to find the mathematical expectation (expected value) of the completion time for each task that makes up the project. Strictly speaking, we can only estimate it, since the true distribution is not available to us.
Critique of the Three-Point Method. The classic method for finding the estimate of the expected value is the three-point method. It suggests estimating an optimistic, realistic, and pessimistic time for a task:
Then, using formulas, find the estimate of the weighted average and standard deviation. The formulas for finding the mean depend on the assumed distribution, either PERT or triangular:
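For reference, the commonly cited textbook forms of these formulas are sketched below, with a as the optimistic, m as the most likely, and b as the pessimistic estimate. They are the standard forms rather than a reproduction of the original figures: for PERT, mean = (a + 4m + b) / 6 and standard deviation = (b - a) / 6; for the triangular distribution, mean = (a + m + b) / 3. The sample task is hypothetical.

```python
import math


def pert(a: float, m: float, b: float) -> tuple[float, float]:
    """Mean and standard deviation under the usual PERT approximation."""
    mean = (a + 4 * m + b) / 6
    std = (b - a) / 6
    return mean, std


def triangular(a: float, m: float, b: float) -> tuple[float, float]:
    """Mean and standard deviation of a triangular distribution on [a, b] with mode m."""
    mean = (a + m + b) / 3
    var = (a * a + m * m + b * b - a * m - a * b - m * b) / 18
    return mean, math.sqrt(var)


# Hypothetical task: 4 h optimistic, 6 h most likely, 14 h pessimistic.
print("PERT:      ", pert(4, 6, 14))
print("Triangular:", triangular(4, 6, 14))
```

Both means land above the most likely value of 6 hours, which reflects the long right tail of the distribution.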
Note that even commonly accepted descriptions of the method give no clear definition of, or distinction between, the "most likely" and the "realistic" estimate. Also, not every developer knows and understands what mathematical expectation is.
Let's imagine that we assigned the same task to 25 developers of approximately equal qualifications. Then we obtained the actual values of the time spent, as shown in the diagram below. In this case, the mathematical expectation would be 6.16, while the most likely completion time would be 5. So, with poor communication, the realistic estimation can be off by 20 – 25%.
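The gap between the two quantities is easy to reproduce. The sample below is invented solely so that its mean and mode match the figures quoted above (6.16 and 5). It is not the data behind the original diagram and not real project data.

```python
import statistics

# Invented sample of 25 completion times, built only so that its mean and
# mode match the figures quoted in the text; not real project data.
times = [3] + [4] * 4 + [5] * 7 + [6] * 5 + [7] * 3 + [8] * 2 + [9, 10, 14]

mean = statistics.mean(times)  # estimate of the mathematical expectation
mode = statistics.mode(times)  # most frequent ("most likely") value

print(f"n={len(times)}, mean={mean:.2f}, mode={mode}, gap={(mean - mode) / mode:.0%}")
```

A few long-running outliers pull the mean well above the most frequent value, which is exactly the effect of the long right tail.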
Giving credit to the rigor of the three-point method, I must note that, in my opinion, it does not provide significantly greater accuracy than the approach I described as "Realistic estimation + risk." At least for three reasons:
1. Coming up with three values for each task is quite labor-intensive. In practice, optimistic and pessimistic estimates can be somewhat arbitrary, especially if there are dozens or hundreds of tasks.
2. Inaccuracy in defining what a, m, b represent introduces inaccuracies into the estimation.
3. "Task risk," in my view, is a much more intuitive category than "three points."
However, the three-point method can be recommended for individual large and risky tasks. It can also help combat the excessive optimism of the estimator and adjust the realistic estimate.
Central Limit Theorem. Strictly speaking, the hypothesis that the estimate of the entire project is equal to the sum of the estimates of all its tasks requires proof. So, suppose we have obtained the expected labor costs for each task. Next, one of the most useful theorems in mathematics will help us, namely the Central Limit Theorem:
The sum of a sufficiently large number of weakly dependent random variables, having approximately the same scales (no one of the summands dominates or contributes significantly to the sum), has a distribution that is close to normal.
This means that under certain assumptions (a large number of variables, similar scales, weak dependencies), the sum of task estimates will be a good approximation of the project completion time, distributed almost normally.
In practice, it is considered that there should be at least 20 summands.
Thus, under the stated conditions, we can take the sum of the estimates of all tasks, considering risks, as the estimation of the project's total labor costs.
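A quick simulation shows the effect. It is a sketch under loud assumptions: task times are independent and log-normally distributed, and the parameters, the number of tasks, and the seed are arbitrary. Skewness is used as a rough measure of how far a distribution is from the symmetric normal shape.

```python
import random
import statistics


def skewness(xs: list[float]) -> float:
    """Sample skewness: third central moment divided by the cubed std deviation."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - mu) ** 3 for x in xs) / (len(xs) * sd**3)


rng = random.Random(42)  # fixed seed so the run is repeatable
N_TASKS = 30             # tasks per simulated project (assumed)
N_RUNS = 5_000           # number of simulated projects


def task_time() -> float:
    # Log-normal duration: right-skewed with a long tail (an assumed shape).
    return rng.lognormvariate(2.0, 0.5)


single = [task_time() for _ in range(N_RUNS)]
projects = [sum(task_time() for _ in range(N_TASKS)) for _ in range(N_RUNS)]

print(f"skewness of a single task:   {skewness(single):.2f}")
print(f"skewness of a {N_TASKS}-task project: {skewness(projects):.2f}")
```

The project total should come out far less skewed than a single task, which is what the theorem predicts. Strong dependencies between tasks, or one dominating task, would weaken the effect.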
Expanding the Confidence Interval. So, will the sum of the estimates of all tasks suffice as the final estimate? Only if we are satisfied with a 50% probability of staying within the budget... To be more confident in completing the project on time and within budget, we need to add some extra margin. When using the three-point method for all tasks, it is possible to calculate it quite strictly, for example, +3 sigmas for a 99.7% confidence interval.
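Under the normal approximation, the margin for a chosen confidence level can be computed directly. In this sketch the per-task means and standard deviations are hypothetical, the tasks are assumed independent so that variances add, and `budget_for` is an illustrative helper.

```python
import math
from statistics import NormalDist

# Hypothetical per-task (mean, standard deviation) pairs in hours,
# e.g. obtained from three-point estimates; repeated to get 20 tasks.
tasks = [(40, 8), (24, 5), (60, 15), (16, 3), (32, 6)] * 4

mean = sum(m for m, _ in tasks)
# Variances add only if the tasks are independent (a simplifying assumption).
sigma = math.sqrt(sum(s * s for _, s in tasks))

project = NormalDist(mean, sigma)  # normal approximation of total effort


def budget_for(confidence: float) -> float:
    """Budget in hours that the total effort stays within with the given probability."""
    return project.inv_cdf(confidence)


for p in (0.5, 0.9, 0.99):
    print(f"{p:.0%} confidence: {budget_for(p):.0f} h")

# Probability of staying within the plain sum of estimates plus a 10% buffer.
print(f"coverage of a 10% buffer: {project.cdf(mean * 1.10):.0%}")
```

The 50% line equals the plain sum of task estimates, which is why that sum alone gives only even odds of staying within budget.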
In our method, we add a project buffer ranging from 5% to 15%, the size of which is determined by identified risks, interdependencies between tasks, project size, and the client's tolerance for such precautions:
- 5% — the project is small, requirements are clear and stable, risks are low;
- 15% — the project is large, requirements and scope are not entirely defined, there are strong interdependencies between tasks, and there are risks such as using new technologies, dealing with inherited artifacts, dependency on external components, or contractors.
If there are justified concerns that the client will demand removing the project buffer to reduce the budget, you can proportionally increase the estimates of individual tasks in the final document.
Thus, our estimation method receives some informal mathematical justification.
Conclusion
In conclusion, I'll try to summarize the recommendations for estimating project effort into several points:
- Thoroughly describe the boundaries of the assessed project, assumptions, and risks.
- Find and train qualified estimators and combat the optimism of all estimation participants.
- Properly document the estimate and constantly communicate with all stakeholders.
- Check yourself using checklists and seek assistance from colleagues.
Quality estimation is a necessary but insufficient condition for success. Following best project management practices and adapting to changes at all stages are absolutely necessary. A well-executed and structured estimate can greatly assist in subsequent project activities.
I hope this material will improve the lives of many managers, projects, teams, and companies. Should you have any questions, don't hesitate to contact us!