SHL, in the SHL Customer Styles Contact manual, alleges that averaging percentiles is unacceptable psychometric practice, but provides no citation. Its argument seems arcane at best.

A review of several well-known psychometric textbooks found that very few even mention the issue. For example, Murphy and Davidshofer (1998), perhaps the most highly regarded psychometrics textbook in industrial and organizational psychology, does not mention it. Based on an analysis of other books that discuss the possible use of stens, the issue is so arcane that it is not widely accepted, or even discussed, among psychometricians. In addition, SHL's argument addresses scoring algorithm issues that are proprietary in nature. In short, SHL makes assumptions about matters it simply has no information about, and those assumptions are in fact false.
SHL and the 16PF use sten scores, which are unusual. Although textbooks discuss stanine scores, we could not locate one that explains sten scores, nor does SHL provide a citation justifying them. While we will not say they are never appropriate, we can explain the real disadvantage of their use: they are far less interpretable and have real limitations in practice.
Percentile scores are far more useful than sten scores. SHL acknowledges this: “However, percentiles also have the advantage that they are easily understood and can be very useful when giving feedback of results or discussing results with line managers etc.” (SHL, p. 24).
Major admissions tests report percentiles, including the SAT (college entrance exam), GRE (graduate school entrance exam), MCAT (medical school entrance exam), and LSAT (Law School Admission Test). Chally, like the majority of other psychometricians, prefers that test information be usable rather than focusing clients on arcane statistical procedures.
Sten scores have a real disadvantage in usability. Most hiring decisions are made near the cut-off or “passing” scores. It is not difficult to decide whether candidates with the very highest or very lowest scores should be hired. However, candidates near the cut-off scores or near the average can be difficult to decide on. These are the scores you are most concerned with.
The table (see below), based on data presented in the SHL manual, demonstrates how difficult it is to make distinctions near the average when using sten scores. If a person receives a sten score of 5, they are somewhere in the range of the 32nd to 50th percentile. Thus, you do not know whether they are exactly average or in the bottom third of scores. Managers want to know exactly where the person falls. Someone average on one scale can be a good employee if they are above average on other scales. With finer scoring categories (like those used by Chally and others), you have 18 levels of differentiation within that single band of scores.
Just as awkwardly for SHL, if someone receives a sten score of 6, they are somewhere between the 51st and 69th percentile, or anywhere from just above average to the top third of scores. We believe managers want more detail when making a hiring decision, and SHL simply cannot provide it. SHL’s tests make such coarse distinctions that their usefulness is limited. SHL says they want to guide “the user not to over-interpret small difference between scores.” (SHL, p. 26)
What they are admitting is that their measure cannot reliably differentiate a person in the bottom third from someone exactly average, or that average person from someone in the top third. Even more problematic, 37% of their test scores receive a sten score of either 5 or 6.
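The width of these bands can be checked with a short calculation. The following is a minimal sketch, not SHL's scoring procedure: it assumes the conventional definition of stens (mean 5.5, standard deviation 2, so each sten spans half a standard deviation) and a normally distributed trait. The function names `normal_cdf` and `sten_band` are illustrative. Because the figures quoted above come from the data in the SHL manual, the theoretical values will differ from them slightly.

```python
import math
from typing import Tuple


def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sten_band(sten: int) -> Tuple[float, float]:
    """Return the (lower, upper) percentile bounds covered by a sten score.

    Assumption: conventional stens, where sten s covers z-scores from
    (s - 6) / 2 to (s - 5) / 2, and stens 1 and 10 are open-ended.
    """
    if not 1 <= sten <= 10:
        raise ValueError("sten must be between 1 and 10")
    lower = 0.0 if sten == 1 else 100.0 * normal_cdf((sten - 6) / 2.0)
    upper = 100.0 if sten == 10 else 100.0 * normal_cdf((sten - 5) / 2.0)
    return lower, upper


def share_in_stens(stens) -> float:
    """Percentage of a normal population falling in the given sten scores."""
    return sum(sten_band(s)[1] - sten_band(s)[0] for s in stens)


if __name__ == "__main__":
    for s in range(1, 11):
        low, high = sten_band(s)
        print(f"sten {s:2d}: {low:5.1f} to {high:5.1f} percentile")
    print(f"share scoring 5 or 6: {share_in_stens([5, 6]):.1f}%")
```

Under these assumptions a sten of 5 spans roughly the 31st to 50th percentile and a sten of 6 roughly the 50th to 69th, and a little over 38% of a normal population lands in one of those two bands. These theoretical values are close to the empirical ranges reported from the SHL manual.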
They say they have 10 sten categories, but in fact they cannot even partition the scores into 10 categories, as “raw-score values were not found to correspond to every sten-score value.” They actually have only 8 categories of test scores.
One more point about the 16PF: it has separate norms for men and women. In many countries such “discriminatory” test results are illegal and forbidden for employment or work applications, including in the United States under the Civil Rights Act of 1991. Using the 16PF exposes an organization to legal action in the U.S.
References
• Conn, S., & Rieke, M. (1994). 16PF, 5th ed., Technical Manual. Institute for Personality and Ability Testing: Champaign, Illinois.
• Murphy, K. R., & Davidshofer, C. O. (1998). Psychological Testing: Principles and Applications, 4th ed. Prentice Hall: Upper Saddle River, New Jersey.
• SHL (1997). Customer Styles Contact Questionnaire. SHL Group plc: Princeton, NJ.