Analytically, this technique is a wonderful idea. In practice, however, it can be difficult to execute judiciously. First, there is always the question of sample size. For attribute data, relatively large samples are required to calculate percentages with reasonably narrow confidence intervals. If an expert examines 50 different error scenarios – twice – and the match rate is 96 percent (48 matches out of 50), the 95 percent confidence interval ranges from 86.29 to 99.51 percent. That is a fairly wide margin of error, especially given the effort of choosing the scenarios, examining each one in depth, assigning the master (correct) value, and then persuading the examiner to do the job – twice. If the number of scenarios is increased to 100, the 95 percent confidence interval for a 96 percent match rate narrows to a range of 90.1 to 98.9 percent (Figure 2).

Attribute agreement analysis was developed to assess the effects of repeatability and reproducibility on accuracy simultaneously. It allows the analyst to examine the responses of several reviewers as they look at multiple scenarios multiple times. It produces statistics that assess the ability of the evaluators to agree with themselves (repeatability), with each other (reproducibility), and with a known master or correct value (overall accuracy) – for each characteristic, over and over again.

Unlike a continuous measurement system, which can be imprecise and yet still accurate on average, any lack of precision in an attribute measurement system inevitably creates accuracy problems as well. If the person coding errors is unclear or undecided about how to code a defect, different codes will be assigned to several defects of the same type, making the database imprecise.
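To make the sample-size point concrete, here is a minimal sketch in Python with SciPy (the article itself names no tool) that reproduces the intervals quoted above, assuming the cited figures are exact (Clopper–Pearson) binomial confidence intervals, which they appear to be:

```python
from scipy.stats import beta

def exact_binomial_ci(matches: int, n: int, confidence: float = 0.95):
    """Clopper-Pearson exact confidence interval for a match rate."""
    alpha = 1.0 - confidence
    # The interval degenerates to 0 or 1 at the boundaries.
    lower = beta.ppf(alpha / 2, matches, n - matches + 1) if matches > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, matches + 1, n - matches) if matches < n else 1.0
    return lower, upper

for matches, n in [(48, 50), (96, 100)]:
    lo, hi = exact_binomial_ci(matches, n)
    print(f"{matches}/{n}: {lo:.2%} to {hi:.2%}")

# Expected output (matching the figures cited in the text):
# 48/50:  ~86.29% to ~99.51%
# 96/100: ~90.1%  to ~98.9%
```

Doubling the sample from 50 to 100 scenarios cuts the width of the interval roughly from 13 to 9 percentage points, which illustrates why attribute studies demand comparatively large samples.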
In fact, the vagueness of an attribute measurement system is an important driver of inaccuracy. ISO/TR 14468:2010 contains examples of attribute agreement analysis (AAA) and derives several results for assessing agreement, e.g., each evaluator's agreement with himself or herself (within appraisers), agreement between evaluators (between appraisers), each evaluator's agreement with respect to a standard, and all evaluators' agreement with respect to a standard.

First, the analyst should confirm that the data really are attribute data. One can argue that assigning a code – that is, sorting a defect into a category – is a decision that characterizes the error with an attribute: either the correct category is assigned to an error, or it is not. Similarly, the appropriate source location is either attributed to the defect or not. These are “yes” or “no” and “correct assignment” or “wrong assignment” answers.
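To make the four AAA results listed above concrete, the following sketch computes each of them for a tiny, hypothetical set of ratings; the appraiser names, scenarios, and defect codes are invented purely for illustration:

```python
# ratings[appraiser][scenario] -> codes assigned over repeated trials
ratings = {
    "A": {1: ["crack", "crack"], 2: ["scratch", "dent"], 3: ["dent", "dent"]},
    "B": {1: ["crack", "crack"], 2: ["scratch", "scratch"], 3: ["dent", "dent"]},
}
standard = {1: "crack", 2: "scratch", 3: "dent"}  # known correct codes

def pct(hits, total):
    return 100.0 * hits / total

# Within appraisers (repeatability): all trials of a scenario agree.
for name, scen in ratings.items():
    hits = sum(len(set(trials)) == 1 for trials in scen.values())
    print(f"within {name}: {pct(hits, len(scen)):.1f}%")

# Each appraiser vs. standard: every trial matches the correct code.
for name, scen in ratings.items():
    hits = sum(all(t == standard[s] for t in trials) for s, trials in scen.items())
    print(f"{name} vs standard: {pct(hits, len(scen)):.1f}%")

# Between appraisers (reproducibility): all appraisers agree on every trial.
hits = sum(
    len({t for name in ratings for t in ratings[name][s]}) == 1 for s in standard
)
print(f"between appraisers: {pct(hits, len(standard)):.1f}%")

# All appraisers vs. standard: everyone matches the correct code every time.
hits = sum(
    all(t == standard[s] for name in ratings for t in ratings[name][s])
    for s in standard
)
print(f"all vs standard: {pct(hits, len(standard)):.1f}%")
```

Note that every statistic reduces to a binary judgment per scenario – agree or disagree, match or mismatch – which is exactly why the data qualify as attribute data in the first place.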