Frequently Asked Questions about the Full-scale Model
Question: Should I average responses from several engineers?
Answer: If time does not allow the engineers to provide evidence for all of their answers, then averaging the responses is acceptable, provided you are certain that there are no global predispositions. For example, sometimes ALL of the engineers are overly optimistic or overly pessimistic. In that case, averaging will not help much. Note that organizations are just as likely to be pessimistic as optimistic.
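The point about global predispositions can be illustrated with a minimal sketch. All numbers below are hypothetical and chosen only for illustration; `average_error` is a helper invented for this example and is not part of Frestimate.

```python
from statistics import mean

def average_error(true_value, estimates):
    """Error of the averaged estimate relative to the true value."""
    return mean(estimates) - true_value

# Hypothetical "true" score that fully evidenced answers would give.
true_score = 50

# Case 1: individual errors point in both directions (-6, +3, +7, -4).
independent = [44, 53, 57, 46]

# Case 2: every engineer is optimistic by a similar amount (+9, +12, +11, +8).
shared_bias = [59, 62, 61, 58]

error_independent = average_error(true_score, independent)
error_shared = average_error(true_score, shared_bias)
print(error_independent, error_shared)
```

In the first case the individual errors cancel and the averaged error is zero. In the second case the averaged error is 10, because averaging cannot remove a bias that all respondents share.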
Question: Who should answer the questions?
Answer: The questions on development practices such as analysis, design, code and unit testing should be answered by at least one software engineer. Ideally, several people should answer them. The results should be supported by the evidence shown in the tables below. For example, anyone who answers yes to a question should be able to provide the evidence shown in the applicable table.
The system and regression testing questions should be answered by the software testers. The project management and organization questions should be answered by a lead software engineer, software project manager or software manager.
Question: How were the points defined for the Full-scale and shortcut models?
Answer:
Question: I don't agree with some of the points.
Answer: As indicated in step 3 above, the points are determined by the most accurate predictive model. The point system is based on fact, not opinion.
Question: How come the back end (system testing) of the life cycle process has so many points?
Answer: As discussed in the previous questions, the points are determined by fact and not opinion. Putting that aside, visualize what would happen on a project if the entire system testing phase were skipped and the testing activities were not otherwise performed earlier in the lifecycle (such as with the clean-room methodology). You probably visualized a real mess. Some organizations in this database chose to skip system testing, and the result was exactly what you would expect.
Your intuition is on target, however. The system testing parameters are what separate the very bad defect densities from the average defect densities, while the parameters at the front of the lifecycle determine the difference between the average defect densities and the very good defect densities. So, in summary, think of the system testing parameters as "penalty" measures. You won't get ahead of the average by doing them; you just avoid falling behind the average.
Question: How come software redundancy is not a parameter?
Answer: Redundancy is not "yet" a parameter because none of the samples in the database employed redundant software on a project. Please remember that redundant software is NOT the same software running on multiple hardware platforms. Redundant software is software developed by more than one company to perform the same function; it is supposedly unique because different companies developed it.
Question: How come a manager who does not code has so many points?
Answer: If your organization is very small (fewer than 4 software engineers in total), a software manager might be able to code AND manage the other software engineers. However, even small software systems generally have more than 4 software engineers. If the manager is coding, then the manager has less time to manage, or is not managing the other software engineers at the level of detail needed.
Question: How come code inspections have so few points?
Answer: As discussed above, the point system is based on fact and not opinion. Putting that aside, we believe that the code inspections had a low number of points because of any or all of the below:
The requirements and design reviews had more points than the code reviews, which supports arguments b and c from above.
Question: Do you plan to correlate specific brand name UML
design tools?
Answer: We correlate types of software tools to defect density. However, we never correlate brand name software tools. The reasons are:
Question: How come programmer skill level is not modeled?
Answer: This is a good question.
Question: How come the points aren't linearly related to the correlations?
Answer: The points measure both relationship and impact. Correlation measures only the relationship between a practice and defect density; it does not measure impact. For example, practice x may have a very high correlation to lower defect density, but the magnitude of that reduction may be small. On the other hand, there may be a practice with a weaker correlation but higher impact. This practice doesn't always produce fewer defects, but when it does, the difference is very measurable. Ideally, the practices you put in place would have both high impact and high correlation, so as to minimize the risk and maximize the return of implementing them.
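The difference between relationship and impact can be sketched with a small calculation. The data below is made up purely for illustration and does not come from the SoftRel database, and the helpers `pearson` and `impact` are assumptions of this example rather than the method used to derive the points.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def impact(used, densities):
    """Mean defect density without the practice minus mean density with it."""
    with_p = [d for u, d in zip(used, densities) if u]
    without = [d for u, d in zip(used, densities) if not u]
    return sum(without) / len(without) - sum(with_p) / len(with_p)

# Eight hypothetical projects: 1 = practice in place, 0 = not in place.
used = [1, 1, 1, 1, 0, 0, 0, 0]

# Practice x: consistently a little lower when the practice is used.
density_x = [0.95, 0.96, 0.94, 0.95, 1.00, 1.01, 0.99, 1.00]

# Practice y: sometimes no better, sometimes dramatically lower.
density_y = [0.20, 1.00, 0.30, 1.10, 1.00, 1.05, 0.95, 1.10]

r_x, r_y = pearson(used, density_x), pearson(used, density_y)
gain_x, gain_y = impact(used, density_x), impact(used, density_y)
print(f"practice x: r = {r_x:.2f}, reduction = {gain_x:.3f}")
print(f"practice y: r = {r_y:.2f}, reduction = {gain_y:.3f}")
```

With these made-up numbers, practice x has the stronger correlation (roughly -0.96 versus -0.55) but the smaller average reduction in defect density (about 0.05 versus 0.375). This is why a point system that weighs impact as well as relationship need not be linear in the correlation.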
Question: It seems to me that all I have to do is find the items with the highest points that have a "no" answer and implement them?
Answer: Not necessarily. The items with the highest points may also be expensive, take calendar time to implement, and have prerequisites that your organization cannot easily resolve. Sometimes implementing a few items with a moderate number of points is the fastest and cheapest way. Frestimate now has a cost model that lets you see the expense, difficulty and prerequisites involved in making improvements.
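The trade-off between points and cost can be sketched as follows. The practice names, points, costs and budget are all hypothetical, and the greedy selection is only an illustrative heuristic; it is not the Frestimate cost model and it ignores prerequisites and calendar time.

```python
# Hypothetical candidates: (name, points, cost in person-weeks).
candidates = [
    ("practice A", 10, 40),
    ("practice B", 4, 5),
    ("practice C", 4, 6),
    ("practice D", 3, 4),
]

def pick_within_budget(items, budget):
    """Greedy pick by points per unit of cost; a heuristic, not an optimal solver."""
    chosen, spent, points = [], 0, 0
    ranked = sorted(items, key=lambda item: item[1] / item[2], reverse=True)
    for name, item_points, cost in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
            points += item_points
    return chosen, spent, points

chosen, spent, points = pick_within_budget(candidates, budget=20)
print(chosen, spent, points)
```

Under these assumptions, the single highest-point item (practice A, 10 points) does not fit a 20 person-week budget, while practices B, D and C together cost 15 person-weeks and contribute 11 points.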
Question: Does SoftRel have plans to add more application types?
Answer: Yes, we do this continually. If you send us a completed Frestimate prediction with actual responses for the SoftRel Full-scale model, and include the actual observed defects, the size estimates and the application type, we will include that data in the model. You can also forward a non-disclosure agreement for signature prior to sending data. We keep the sources of all data confidential.
Question: How come some industries have higher defect densities than others?
Answer: The answer really boils down to how many hours the software must operate in the field without interruption or service. The longer this requirement, the smaller the average defect densities are likely to be. The criteria that determine how long the software must operate without interruption may include:
1. How easy is the software/platform to service? For example, the serviceability of the software in a dishwasher is different from that of a satellite. The dishwasher would require either a maintenance call or a phone line/remote communications to service the software, while the satellite requires remote communications for uploading/downloading.
2. How many units containing the software must be supported? There are many more dishwashers than satellites. One maintenance call for a dishwasher multiplied by the total number of dishwashers might result in a maintenance nightmare if the fielded defect density is not low enough.
3. How visible is the impact of a software defect? You can see that for the scientific software, the defect density was very small. This is because, for this type of application, the outputs simply cannot be incorrect: just one incorrect output would result in a user no longer having confidence in the software as a whole.