Problem and Setting
A large U.S. grocery store chain wanted a way to select top-quality employees. The chain’s management identified four behavioral characteristics as critical to the success of all store personnel: helping disposition, reliability, rules compliance, and dependability with respect to punctuality and attendance.
Based on these requirements, a test battery was developed to identify applicants likely to become successful, top-performing employees.
Reductions in turnover saved the grocery chain $750,000.
To document the test battery’s validity and to help set an appropriate cut score, a concurrent, criterion-referenced validation study was conducted. This method requires that the tests be administered to current employees and that performance data then be gathered on those same employees. If the tests are valid predictors of performance, one would expect a statistically significant correlation between test scores and the performance data collected. In other words, employees who score high on the tests should be the same employees who demonstrate high levels of performance, and employees who score low should be the ones who demonstrate poor performance.
The four Scales identified as relevant to the job (i.e., Helping Disposition, Reliability, Rules Compliance and Responsibility) were administered to 108 grocery store personnel. Study participants included cashiers, stock clerks, deli personnel and butchers. The supervisors of these employees were then asked to rate them on reliability, attendance and punctuality, helping disposition, policy compliance and trustworthiness.
Data Analysis and Results
Various statistical methods were used during the data analysis process. The following is a brief summary with associated results.
To determine whether the performance ratings for each study participant could be combined into an overall measure of job performance, the ratings were factor analyzed and an overall performance measure was created. This “overall performance measure” was then used as the performance criterion for this study.
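As a rough illustration of the kind of check described above, the sketch below estimates how much of the variance in a set of ratings is captured by a single common dimension. It is not the study's factor analysis: it substitutes a simpler technique (the first principal component of the correlation matrix, found by power iteration), and the ratings, the function names and the three dimensions shown are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_component_share(columns, iterations=200):
    """Share of total variance captured by the first principal component
    of the correlation matrix of `columns` (one list per rating dimension)."""
    k = len(columns)
    # Correlation matrix of the rating dimensions.
    R = [[1.0 if i == j else pearson(columns[i], columns[j])
          for j in range(k)] for i in range(k)]
    # Power iteration: repeatedly multiply by R and renormalize.
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient estimates the largest eigenvalue; the trace of R is k.
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return eigenvalue / k

# Hypothetical supervisor ratings (1-7 scale) for six employees.
ratings = [
    [5, 6, 4, 7, 3, 5],  # reliability
    [5, 7, 4, 6, 3, 4],  # attendance and punctuality
    [4, 6, 5, 7, 2, 5],  # helping disposition
]
share = first_component_share(ratings)
print(f"First component explains {share:.0%} of the rating variance")
```

A share close to 100% would suggest that the ratings move together and can reasonably be combined into one overall score; a low share would argue for keeping the dimensions separate.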
To determine the extent to which test scores were related to performance, correlation analyses were performed. The first set of correlation analyses was performed to determine the degree to which the chosen test battery was predictive of the overall job performance rating. The correlations for the individual scales were also calculated.
The results are presented in Table 1.
A review of Table 1 shows that the test battery had an uncorrected correlation of .35 with overall performance and that this correlation was statistically significant (p<.001). Among the individual scales, the Reliability Scale was the most predictive (r=.32, p<.001), followed by the Responsibility Scale (r=.27, p<.004), Rules Compliance (r=.26, p<.006) and Helping Disposition (r=.22, p<.023). These results indicate that the overall battery score, as well as each individual scale, was predictive of job performance and that all correlations met the legal standards for statistical significance.
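The significance levels reported above can be sanity-checked from the correlations and the sample size alone. The sketch below uses the Fisher z approximation, which is not necessarily the test used in the study, and it assumes that all 108 participants contributed to every correlation. Because the reported correlations are rounded, the approximate p-values will be close to, but not identical with, the published ones.

```python
import math

def approx_p_value(r, n):
    """Approximate two-tailed p-value for H0: rho = 0 (Fisher z method)."""
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-tailed normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Correlations as reported in the text; n = 108 is assumed for every scale.
reported = {
    "Test battery": 0.35,
    "Reliability": 0.32,
    "Responsibility": 0.27,
    "Rules Compliance": 0.26,
    "Helping Disposition": 0.22,
}
for scale, r in reported.items():
    print(f"{scale}: r = {r:.2f}, approximate p = {approx_p_value(r, 108):.4f}")
```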
Cut Score Analysis
One of the most frequently used measures of the utility of a selection procedure is how well the procedure screens out applicants who would become below-satisfactory performers once hired. To the extent that these false positives are reduced, the costs associated with them are also reduced. These include the cost of errors on the job, the cost of any additional training, the cost of overtime needed to compensate for poor performance, and the cost of terminating the employee and hiring a replacement. While the ideal selection procedure would screen out 100% of these individuals, this is very rarely (if ever) the case. A selection procedure is useful if it reduces the proportion of below-satisfactory hires at all, relative to random selection or the existing selection procedures. The actual usefulness (or utility) of the selection procedure depends on a number of factors, including the value of a satisfactory or above-satisfactory employee and the cost of selecting an applicant who becomes a below-satisfactory employee.
In order to evaluate how the test battery score could best be used, the effect of different “cut scores” on the number of unsatisfactory performers was examined, using the study sample. Job performance level was defined as follows: if the employee’s overall job performance rating was less than 4.0 (on a 1-7 performance rating scale), the employee was considered unsatisfactory. Using this definition for unsatisfactory performers, the percentages of the “selected” group considered unsatisfactory at various cut scores were calculated.
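The calculation described above can be sketched as follows. The function name, the sample data and the rule that a score equal to the cut counts as a pass are assumptions made for illustration; only the 4.0 threshold on the 1-7 rating scale comes from the study.

```python
def cut_score_summary(test_scores, ratings, cut, unsatisfactory_below=4.0):
    """Summarize the effect of a cut score on a validation sample.

    Assumes an employee "passes" when the test score is >= cut.
    """
    passed = [rating for score, rating in zip(test_scores, ratings)
              if score >= cut]

    def pct_unsatisfactory(group):
        # Percentage of the group rated below the satisfactory threshold.
        if not group:
            return float("nan")
        return 100.0 * sum(r < unsatisfactory_below for r in group) / len(group)

    n = len(test_scores)
    return {
        "pct_screened_out": 100.0 * (n - len(passed)) / n,
        "pct_unsatisfactory_all": pct_unsatisfactory(ratings),
        "pct_unsatisfactory_passed": pct_unsatisfactory(passed),
    }

# Hypothetical test scores and overall performance ratings for ten employees.
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
ratings = [3.5, 5.0, 3.0, 5.5, 6.0, 5.0, 6.5, 3.8, 6.0, 5.5]
summary = cut_score_summary(scores, ratings, cut=40)
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

Running the same function over a range of candidate cut scores would produce one row per cut score, which is the general shape of the comparison the study describes.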
Table 2 presents various cut scores. For each cut score, it shows the operational validity for the overall job performance ratings, the percentages of applicants who would be screened out and who would “pass” if the cut score were implemented, and the resulting percentages of unsatisfactory and satisfactory performers based on the study sample. The suggested cut score (35th percentile) is highlighted. This cut score offers high operational validity and screens out a high percentage of unsatisfactory performers.
Statistics for Cut Score Evaluation
Table 2 shows that implementing a cut score that screens out 35% of applicants would reduce the percentage of unsatisfactory performers from 9% to 3%, a relative reduction of about 67%.
Table 3 shows the expected average performance levels of the grocery store’s employees when using the 35th percentile cut score.
A review of Table 3 shows that the average performance rating for employees in the validation sample who scored below the 35th percentile was 4.8. This is roughly 0.8 points below the average performance rating for employees who scored above the 35th percentile (5.6).
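The group comparison above amounts to splitting the validation sample at the cut score and averaging the performance ratings on each side. A minimal sketch follows; the data and the function name are hypothetical, and it assumes that a score equal to the cut falls in the upper group.

```python
from statistics import mean

def mean_rating_by_cut(test_scores, ratings, cut):
    """Return (mean rating below the cut, mean rating at or above the cut).

    Both groups must be non-empty, otherwise statistics.mean raises an error.
    """
    pairs = list(zip(test_scores, ratings))
    below = [rating for score, rating in pairs if score < cut]
    at_or_above = [rating for score, rating in pairs if score >= cut]
    return mean(below), mean(at_or_above)

# Hypothetical test scores and overall performance ratings for five employees.
scores = [20, 30, 40, 50, 60]
ratings = [4.0, 5.0, 5.0, 6.0, 7.0]
low, high = mean_rating_by_cut(scores, ratings, cut=40)
print(f"Below cut: {low:.2f}, at or above cut: {high:.2f}, gap: {high - low:.2f}")
```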
From the results of the statistical analyses presented here, it can be concluded that the HRTL Scales evaluated in this validation study were highly predictive of overall job performance for the employees of a large U.S. grocery store.
In addition to the increased standardization, objectivity and validity of the chain’s selection process, the CEO reported a significant return on investment: the company saved $750,000 in turnover costs in the first year of using the solution.