## BCG Institute (BCGi)

## Impact Evaluator

The Impact Evaluator is a tool designed to instantly calculate and evaluate personnel decisions. The information below will help you interpret your results.

### Availability Comparison

An availability comparison is conducted when one group's representation (for example, the percentage of women currently in a given position and/or job group) is compared to that group's availability in the relevant labor market (using availability data from inside or outside the organization). This type of comparison is useful for determining whether a group is underutilized in a particular position, or group of positions.

This analysis should be differentiated from the "Selection Rate" comparison because (under most circumstances) a statistically significant underutilization does not automatically constitute a finding of adverse impact. The reason for this is straightforward: a selection rate comparison directly evaluates how two groups fared on a particular employment practice, so if one group significantly outperforms the other, it provides direct evidence of the impact of that practice on the group of interest. Attention can then shift toward evaluating that particular employment practice for job relatedness (i.e., validity).

By contrast, the Availability Comparison does not (necessarily) consider the impact of one employment practice. Because the comparison is an overall evaluation that considers one group's makeup in a given position compared to their availability, it does not consider all of the practices, procedures, or tests that may have been used to select or promote individuals for that position. Further, it does not take into consideration other factors such as "job interest" or qualification levels of the at-issue group. For example, if outside availability data shows that men are statistically significantly underutilized for a group of clerical jobs, the underutilization could possibly be explained by either lack of interest on the part of men to pursue these positions, or the fact that men performed poorly on the multitude of qualification screens required for entry into the position (or likely some combination of these two factors and others). For these reasons, the Availability Comparison should be considered a "threshold" or "initial inquiry test."

**Results:**

**Probability (Exact Binomial):** This test uses the (two-tail) exact Binomial Probability Test to assess whether the degree of underutilization is extreme enough to be considered "beyond chance." A mid-p adjustment is applied, which mitigates the conservatism of exact methods while continuing to use the exact probabilities from the distribution being analyzed. Values less than .05 (in orange) are "statistically significant." Because this test compares one group's representation against their availability (rather than comparing the selection rates of two groups, like the Statistical Test on the "Selection Rate Comparison" page), statistically significant findings (without other evidence) should not be considered direct evidence of discrimination, because both discriminatory and non-discriminatory reasons can account for the group's underutilization.
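As an illustration of the idea, here is a minimal Python sketch of a two-tail exact binomial test with a mid-p adjustment. It is not the tool's own code. The function names are hypothetical, and the two-tail rule used here (doubling the smaller one-tail mid-p value) is an assumption, since several two-tail conventions exist and this page does not say which one the tool applies.

```python
from math import comb

def binom_pmf(k, n, p):
    """Exact binomial probability of exactly k members out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_mid_p_two_tail(observed, n, availability):
    """Two-tail mid-p value for `observed` group members among `n`
    incumbents, given the group's availability (a proportion)."""
    point = binom_pmf(observed, n, availability)
    below = sum(binom_pmf(k, n, availability) for k in range(observed))
    # Mid-p: count only half of the probability of the observed outcome.
    lower_mid = below + 0.5 * point
    upper_mid = 1.0 - lower_mid
    # Assumed two-tail convention: double the smaller tail, capped at 1.
    return min(1.0, 2.0 * min(lower_mid, upper_mid))

# Hypothetical example: 10 women among 50 incumbents, 40% availability.
p_value = binomial_mid_p_two_tail(10, 50, 0.40)
print(f"p = {p_value:.4f}, significant = {p_value < 0.05}")
```

Counting only half of the observed outcome's probability is what makes the result less conservative than the unadjusted exact test.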

**Probability (Generalized Binomial):** This test uses an estimator technique for the Binomial Probability Test that usually produces values that are similar to the Exact Test (see note above). When the probability value output by this test approaches .05 or the sample sizes are small (<30), only the Exact Test should be used because the results from this test can overestimate probability values.

**Standard Deviation (Exact Binomial) and Standard Deviation (Generalized Binomial):** These outputs describe the degree of the Statistical Test findings. For example, if the output shows the likelihood of the statistical test value is "1 in 20," this means that the group's underutilization is so extreme that the odds of it occurring by chance are only 1 in 20, or about 5%. In other words, this result indicates that chance can be "ruled out" as a reason for this difference. The "Probability as Std. Deviations" describes the probability value (from the Statistical Test) in terms of standard deviation units, which are sometimes easier to interpret than small probability values. A standard deviation of 1.96 corresponds to a probability value of .05, and a likelihood of 1 chance in 20. Values greater than or equal to 1.96 are considered statistically significant and are highlighted in orange on the report.
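The correspondence between a two-tail probability value, its "1 in N" likelihood, and standard deviation units can be sketched with the standard normal distribution. The helper names below are hypothetical, and the conversion is assumed to be the usual two-tail normal quantile; the tool may round or compute it differently.

```python
from statistics import NormalDist

def p_to_std_devs(p_two_tail):
    """Convert a two-tail probability (strictly between 0 and 1) to
    standard deviation units using the standard normal quantile."""
    return NormalDist().inv_cdf(1.0 - p_two_tail / 2.0)

def p_to_one_in(p_two_tail):
    """Express a probability as the N in '1 chance in N'."""
    return 1.0 / p_two_tail

for p in (0.05, 0.01):
    print(f"p = {p}: {p_to_std_devs(p):.2f} std. deviations, "
          f"1 chance in {p_to_one_in(p):.0f}")
```

With this conversion, a probability of .05 maps to roughly 1.96 standard deviations, matching the threshold described above.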

**Eighty-Percent (80%) Test:** This test compares the representation of women and each race/ethnic group to their corresponding availability percentages. It is a rudimentary statistical test described in the Uniform Guidelines before more powerful computing capabilities allowed for more exact statistical tests (which are now accepted by courts and preferred). Adverse impact can exist with or without a violation of the 80% Test. Because the 80% Test is easily influenced by small numbers (when data sets are small) and it does not consider the probability distribution related to the data set, a greater consideration should be given to the Statistical Tests. Values less than 80% are highlighted in orange on the report.
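A minimal sketch of this comparison follows, assuming the ratio is the group's representation divided by its availability. The function name and the figures are hypothetical.

```python
def availability_ratio(group_count, total_count, availability):
    """Group representation as a fraction of its availability.
    `availability` is a proportion, e.g. 0.40 for 40%."""
    representation = group_count / total_count
    return representation / availability

# Hypothetical example: 10 women among 50 incumbents, 40% availability.
ratio = availability_ratio(10, 50, 0.40)
flagged = ratio < 0.80  # below 80% would be highlighted
print(f"ratio = {ratio:.0%}, flagged = {flagged}")
```

Because the ratio ignores sample size, the same 50% result would appear for 1 woman among 5 incumbents, which is why the probability-based tests deserve more weight.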

### Selection Rate Comparison

A selection rate comparison is conducted when the passing rate of one group (e.g., women, individual minority group, 40+ years of age) is compared to the passing rate of another group (e.g., men, whites, <40 years of age). This type of analysis can be regarded as the “most typical” type of adverse impact analysis, and is specifically explained in the Uniform Guidelines as a “rates comparison” (see Section 4D) that compares passing rates on a practice, procedure, or test. The default setting is to compare each group to “the group with the highest rate” so long as the group with the highest rate accounts for at least 2.0% of the total pool.
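The default reference-group rule can be sketched as below. The function name and data layout are hypothetical, and falling back to the next-highest qualifying group when the top group holds less than 2.0% of the pool is an assumption; this page does not state what the tool does in that case.

```python
def pick_reference_group(groups, min_share=0.02):
    """Return the name of the highest-rate group that makes up at least
    `min_share` of the total pool, or None if no group qualifies.
    `groups` maps a group name to (number passing, number in pool)."""
    total = sum(pool for _, pool in groups.values())
    rates = {
        name: passed / pool
        for name, (passed, pool) in groups.items()
        if pool > 0 and pool / total >= min_share
    }
    return max(rates, key=rates.get) if rates else None

# Hypothetical data. Group C has the highest rate (2 of 2) but is only
# about 1% of the 192-person pool, so it is not used as the reference.
example = {"Group A": (48, 100), "Group B": (30, 90), "Group C": (2, 2)}
print(pick_reference_group(example))
```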

Typically, an event-by-event analysis should be the primary comparison [the 1991 Civil Rights Act requires that a “particular employment practice” be identified as the source of adverse impact for a plaintiff to establish a disparate impact case, unless the results are not capable of separation for analysis; see Section 2000e-2(k)(1)(A)(i)]. However, this tool may also be used to compare group passing rates on an overall process (e.g., applied vs. hired, applicants for promotion vs. promoted).

**Results:**

**Probability (Fshr Exct, Mid-P):** This statistical test uses the (two-tail) Fisher Exact procedure to assess whether the difference in selection rates between two groups (e.g., men vs. women) is extreme enough to be considered "beyond chance." The mid-p correction (Lancaster, 1961, "Significance tests in discrete distributions," J. Amer. Statist. Assoc. 56, 223-234) has been included as a sensible compromise that mitigates the conservatism of exact methods while continuing to use the exact probabilities from the small-sample distribution being analyzed. Values less than .05 are “statistically significant” and are highlighted in orange on the report.
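For readers who want to see the mechanics, here is a minimal sketch of a two-tail Fisher exact test with a mid-p adjustment on a 2x2 pass/fail table. It is an illustration, not the tool's implementation: the function name is hypothetical, and the two-tail rule (doubling the smaller one-tail mid-p value) is an assumed convention.

```python
from math import comb

def fisher_mid_p_two_tail(pass_a, fail_a, pass_b, fail_b):
    """Two-tail mid-p value for a 2x2 table of pass/fail counts
    for group A and group B, with all margins held fixed."""
    n_a = pass_a + fail_a
    n_b = pass_b + fail_b
    total_pass = pass_a + pass_b
    denom = comb(n_a + n_b, total_pass)

    def pmf(x):
        # Hypergeometric probability that exactly x of the passers
        # come from group A.
        return comb(n_a, x) * comb(n_b, total_pass - x) / denom

    lowest = max(0, total_pass - n_b)  # smallest feasible count for A
    below = sum(pmf(x) for x in range(lowest, pass_a))
    # Mid-p: count only half of the observed table's probability.
    lower_mid = below + 0.5 * pmf(pass_a)
    upper_mid = 1.0 - lower_mid
    # Assumed two-tail convention: double the smaller tail, capped at 1.
    return min(1.0, 2.0 * min(lower_mid, upper_mid))

# Hypothetical example: 30 of 100 women pass, 50 of 100 men pass.
p_value = fisher_mid_p_two_tail(30, 70, 50, 50)
print(f"p = {p_value:.4f}, significant = {p_value < 0.05}")
```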

**Standard Deviation (Fshr exct, Mid-P) and Standard Deviation (Chi-Square):** These outputs describe the degree of the Statistical Test findings. For example, if the output shows the likelihood of the statistical test value is "1 in 20," this means that the difference in passing rates between groups (e.g., men vs. women) is so extreme that the odds of it occurring by chance are only 1 in 20, or about 5%. In other words, this result indicates that chance can be "ruled out" as a reason for this difference. The "Probability as Std. Deviations" describes the probability value (from the Statistical Test) in terms of standard deviation units, which are sometimes easier to interpret than small probability values. A standard deviation of 1.96 corresponds to a probability value of .05, and a likelihood of 1 chance in 20. Values greater than 1.96 are considered statistically significant and are highlighted in orange on the report.

**Eighty-Percent (80%) Test:** This test compares the passing rate of women (to men) and each race/ethnic group (to the group with the highest rate). It is a rudimentary statistical test described in the Uniform Guidelines before more powerful computing capabilities allowed for more exact statistical tests (which are now accepted by courts and preferred). Adverse impact can exist with or without a violation of the 80% Test. Because the 80% Test is easily influenced by small numbers (when data sets are small) and it does not consider the probability distribution related to the data set, a greater consideration should be given to the Statistical Tests. Values less than 80% are highlighted in orange on the report.
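The rate comparison behind this test can be sketched as the focal group's passing rate divided by the reference group's passing rate. The function name and counts below are hypothetical.

```python
def impact_ratio(focal_pass, focal_total, ref_pass, ref_total):
    """Passing rate of the focal group divided by the passing rate of
    the reference group. Assumes the reference rate is above zero."""
    focal_rate = focal_pass / focal_total
    ref_rate = ref_pass / ref_total
    return focal_rate / ref_rate

# Hypothetical example: 30 of 100 women pass, 50 of 100 men pass.
ratio = impact_ratio(30, 100, 50, 100)
flagged = ratio < 0.80  # below 80% would be highlighted
print(f"impact ratio = {ratio:.0%}, flagged = {flagged}")
```

The same 60% ratio would result from 3 of 10 versus 5 of 10, where a single additional selection changes the picture considerably. This sensitivity to small counts is the reason the probability-based tests are given more weight.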

### Termination (i.e., Retention) Analysis

For ease of interpretation, it is best to think of “Termination Analyses” as “Retention Analyses.” That way, the results can be interpreted in the same way as other positive outcomes (e.g., hires, promotions, etc.). Here, the retention analysis is conducted by comparing the retention rate of one group (e.g., women, individual minority group, 40+ years of age) to the retention rate of another group (e.g., men, whites, <40 years of age). This type of analysis can be regarded as the “most typical” type of adverse impact analysis, and is specifically explained in the Uniform Guidelines as a “rates comparison” (see Section 4D) that compares the passing rates between two groups (e.g., men and women) on a practice, procedure, or test. The default setting is to compare the retention rate of each group to “the retention rate of the group with the highest rate” so long as the group with the highest rate accounts for at least 2.0% of the total pool.
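To make the reframing concrete, the sketch below converts termination counts into retention rates and compares them. The function name and figures are hypothetical, and "headcount" is assumed to mean the number of employees who were at risk of termination during the period analyzed.

```python
def retention_rate(headcount, terminations):
    """Share of a group that was retained (not terminated)."""
    return (headcount - terminations) / headcount

# Hypothetical example: 20 of 100 terminated in group A,
# 10 of 100 terminated in group B.
rate_a = retention_rate(100, 20)
rate_b = retention_rate(100, 10)

retention_ratio = rate_a / rate_b               # compares positive outcomes
termination_ratio = (20 / 100) / (10 / 100)     # compares negative outcomes

print(f"retention ratio = {retention_ratio:.3f}")
print(f"termination ratio = {termination_ratio:.1f}")
```

The two ratios are not interchangeable: in this example the termination rates differ by a factor of two, while the retention-rate ratio stays above 80%. Working with retention keeps the results on the same footing as hires and promotions.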

**Results:**

**Probability (Fshr Exct, Mid-P):** This statistical test uses the (two-tail) Fisher Exact procedure to assess whether the difference in retention rates between two groups (e.g., men vs. women) is extreme enough to be considered "beyond chance." The mid-p correction (Lancaster, 1961, "Significance tests in discrete distributions," J. Amer. Statist. Assoc. 56, 223-234) has been included as a sensible compromise that mitigates the conservatism of exact methods while continuing to use the exact probabilities from the small-sample distribution being analyzed. Values less than .05 are “statistically significant” and are highlighted in orange on the report.

**Standard Deviation (Fshr exct, Mid-P) and Standard Deviation (Chi-Square):** These outputs describe the degree of the Statistical Test findings. For example, if the output shows the likelihood of the statistical test value is "1 in 20," this means that the difference in retention rates between groups (e.g., men vs. women) is so extreme that the odds of it occurring by chance are only 1 in 20, or about 5%. In other words, this result indicates that chance can be "ruled out" as a reason for this difference. The "Probability as Std. Deviations" describes the probability value (from the Statistical Test) in terms of standard deviation units, which are sometimes easier to interpret than small probability values. A standard deviation of 1.96 corresponds to a probability value of .05, and a likelihood of 1 chance in 20. Values greater than 1.96 are considered statistically significant and are highlighted in orange on the report.

**Eighty-Percent (80%) Test:** This test compares the retention rate of women (to men) and each race/ethnic group (to the group with the highest retention rate). It is a rudimentary statistical test described in the Uniform Guidelines before more powerful computing capabilities allowed for more exact statistical tests (which are now accepted by courts and preferred). Adverse impact can exist with or without a violation of the 80% Test. Because the 80% Test is easily influenced by small numbers (when data sets are small) and it does not consider the probability distribution related to the data set, a greater consideration should be given to the Statistical Tests. Values less than 80% are highlighted in orange on the report.