Reliability --
Consistency, trustworthiness, getting the same answer each time.
Major types:
1. Test-Retest --
The correlation between the scores from two administrations of the same test to the same group of subjects.
Example: see Shaw and Wright (1967), the Custodial Mental Illness Ideology Scale, pages 108-111.
2. Alternate Form --
The correlation between the two score distributions obtained by administering two tests that were constructed to have identical content.
Example: see Shaw and Wright (1967), the Attitudes Toward Feminism Belief Patterns Scale, pages 278-287.
3. Split-half Reliability --
From a single administration, a test is divided into two equal parts, and a correlation is calculated between the two halves.
A common method for scales used in social work practice.
Example: see Shaw and Wright (1967), the Opinionaire on Attitudes Toward Education, pages 80-83.
4. Interitem or Internal Consistency --
Assesses how well the different items measure the same issue or concept.
Equivalent to the mean of all possible split-half reliabilities.
Difficult to compute without a computer, but commonly used today.
Example: see Corcoran and Fischer, the Hypercompetitive Attitude Scale, page 353 in volume II.
5. Scorer Reliability --
The correlation between two sets of scores obtained by independent examiners.
Example: see DSM I, II, III, III-R, IV, etc.
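The coefficients described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the cited sourcebooks: the function names are hypothetical, the split-half version uses an odd-even split and applies the Spearman-Brown correction (which these notes do not mention), internal consistency is computed as Cronbach's alpha, and scorer agreement is shown as Cohen's kappa, a substitute for a plain correlation when examiners assign categories (such as diagnoses) rather than numeric scores.

```python
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Serves for test-retest (time 1 vs. time 2) and alternate-form
    (form A vs. form B) reliability alike.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(items):
    """Odd-even split-half reliability with Spearman-Brown correction.

    `items` holds one row per respondent; each row is a list of item scores.
    The correction (an assumption, not in the notes) estimates the
    reliability of the full-length test from the half-test correlation.
    """
    odd = [sum(row[0::2]) for row in items]
    even = [sum(row[1::2]) for row in items]
    r = pearson(odd, even)
    return 2 * r / (1 + r)


def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items score table."""
    k = len(items[0])
    item_vars = [pvariance([row[j] for row in items]) for j in range(k)]
    total_var = pvariance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two examiners' category labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    return (observed - expected) / (1 - expected)
```

For example, `pearson(time1_scores, time2_scores)` gives a test-retest coefficient for two hypothetical score lists, and `cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"])` returns 0.5: the examiners agree on 3 of 4 cases, while 2 of 4 agreements would be expected by chance.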
Validity --
What the test measures, how well it does so, and that it does not measure anything other than the measurement target.
1. Content Validity --
The systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured.
Types: face validity, item analysis
2. Criterion-Related Validity --
The effectiveness of a test in predicting an individual's behavior in specific situations.
Example: does passing the state exam mean that one is going to be a good social worker?
Types: concurrent, predictive, and synthetic validity
3. Construct Validity --
The extent to which a test may be said to measure a theoretical construct or trait.
Types: correlations with other tests, factor analysis, internal consistency, and convergent and discriminant validation
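Convergent and discriminant validation can be illustrated with a small sketch: a new scale should correlate highly with an established measure of the same construct and weakly with a measure of an unrelated construct. The function names and the scores below are hypothetical and made up purely for illustration; in practice such data would come from administering all three instruments to the same respondents.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def convergent_discriminant(new_scale, same_construct, other_construct):
    """Return (convergent r, discriminant r) for a new scale."""
    return (
        pearson(new_scale, same_construct),
        pearson(new_scale, other_construct),
    )


# Hypothetical scores for four respondents (illustration only).
new_scale = [1, 2, 3, 4]
same_construct = [1, 3, 2, 4]    # established measure of the same trait
other_construct = [2, 1, 1, 2]   # measure of an unrelated trait

r_conv, r_disc = convergent_discriminant(
    new_scale, same_construct, other_construct
)
```

With these made-up scores the convergent correlation is 0.8 and the discriminant correlation is 0.0, the pattern one would hope to see for a scale with good construct validity.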
Fischer, J., & Corcoran, K. (2007). Measures for Clinical Practice and Research: A Sourcebook. NY: Oxford University Press.
Gwet, K. (2001). Handbook of Inter-Rater Reliability: How to Estimate the Level of Agreement Between Two or Multiple Raters. Gaithersburg, MD: STATAXIS.
Shaw, M. E., & Wright, J. M. (1967). Scales for the Measurement of Attitudes. NY: McGraw-Hill.