Elvin 070518 - True Score Theory



Reliability has to do with the quality of measurement. In its everyday sense, reliability is the "consistency" or "repeatability" of measures. However, reliability cannot be calculated exactly; it can only be estimated. Because of this, there are different types of reliability, each with multiple ways of estimating it. In the end, it's important to integrate the idea of reliability with the other major criterion for the quality of measurement, validity, and to develop an understanding of the relationship between reliability and validity in measurement.

The True Score Theory

True Score Theory is a theory about measurement. It maintains that every measurement is an additive composite of two components: the true ability (or true level) of the respondent on that measure, and random error. Writing X for the observed score, T for the true score, and eX for the random error:

X = T + eX

This simple equation has a parallel at the level of the variance of a measure. That is, across a set of scores, it is assumed that:

var(X) = var(T) + var(eX)

This will have important implications when considering some of the more advanced models for adjusting for errors in measurement.

True score theory is the foundation of reliability theory. A measure that has no random error (i.e., is all true score) is perfectly reliable; a measure that has no true score (i.e., is all random error) has zero reliability. True score theory can also be used in computer simulations as the basis for generating "observed" scores with certain known properties.
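
As a rough illustration of such a simulation, the sketch below generates observed scores as X = T + eX and checks the variance equation numerically. The function name `simulate_scores`, the sample size, and the means and standard deviations are all assumptions made for the example. It also assumes that the error is drawn independently of the true score, and it takes reliability to be the share of observed variance that is true-score variance, var(T) / var(X), which is the usual definition in this framework.

```python
import random
import statistics


def simulate_scores(n, true_mean, true_sd, error_sd, seed=0):
    """Generate true scores, random errors, and observed scores X = T + eX."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    # Random error is drawn independently of the true score, with mean 0.
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


# Hypothetical parameters: var(T) = 100 and var(eX) = 25 by construction.
true_scores, errors, observed = simulate_scores(
    n=20000, true_mean=50.0, true_sd=10.0, error_sd=5.0
)

var_t = statistics.pvariance(true_scores)
var_e = statistics.pvariance(errors)
var_x = statistics.pvariance(observed)

# Share of observed variance that is true-score variance.
reliability = var_t / var_x

print(f"var(X) = {var_x:.1f}, var(T) + var(eX) = {var_t + var_e:.1f}")
print(f"reliability = {reliability:.2f}")
```

With these settings the two variance figures should agree closely, though not exactly, because a finite sample leaves a small chance correlation between T and eX. The reliability should land near 100 / 125 = 0.8. Setting `error_sd` to 0 gives a perfectly reliable measure.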

Measurement Error

True score theory is a good, simple model for measurement, but it may not always be an accurate reflection of reality. In particular, it assumes that any observation is composed of the true value plus some random error value. However, some errors may be systematic; that is, they hold across most or all of the members of a group. One way to deal with this is to revise the simple true score model by dividing the error component into two subcomponents, random error and systematic error.


Random Error

Random error is caused by any factors that randomly affect measurement of the variable across the sample. For instance, each person's mood can inflate or deflate their performance on any occasion. The important thing about random error is that it does not have any consistent effect across the entire sample. Instead, it pushes observed scores up or down randomly. This means that if we could see all of the random errors in a distribution, they would have to sum to 0. As a result, random error adds variability to the data but does not affect average performance for the group.
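
The effect described above can be checked with a small simulation. This is only a sketch: the score distribution, the error spread, and the seed are made-up values, and in a finite sample the errors average close to zero rather than cancelling exactly.

```python
import random
import statistics

rng = random.Random(1)  # seeded for repeatability
n = 10000

# Hypothetical true scores and zero-mean random errors (e.g. mood effects).
true_scores = [rng.gauss(70.0, 8.0) for _ in range(n)]
random_errors = [rng.gauss(0.0, 4.0) for _ in range(n)]
observed = [t + e for t, e in zip(true_scores, random_errors)]

# Average error across the sample: close to zero, so the group mean barely moves.
mean_error = statistics.fmean(random_errors)
mean_shift = statistics.fmean(observed) - statistics.fmean(true_scores)

# Spread of the scores: larger once random error is added.
sd_true = statistics.pstdev(true_scores)
sd_observed = statistics.pstdev(observed)

print(f"mean error = {mean_error:.3f}, shift in group mean = {mean_shift:.3f}")
print(f"SD of true scores = {sd_true:.2f}, SD of observed scores = {sd_observed:.2f}")
```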


Systematic Error

Systematic error is caused by any factors that systematically affect measurement of the variable across the sample. For instance, if there is loud traffic going by just outside a room where people are answering the questionnaire, this noise is liable to affect everyone's scores. Unlike random error, systematic errors tend to be consistently either positive or negative. Because of this, systematic error is sometimes considered to be bias in measurement.
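
For contrast with random error, the sketch below applies the same error to every respondent. Treating systematic error as a single constant (here a hypothetical 3-point drop, standing in for the traffic noise) is a simplifying assumption; real systematic errors need not be identical for everyone. The helper name `add_systematic_error` is likewise invented for the example.

```python
import random
import statistics


def add_systematic_error(scores, bias):
    """Shift every score by the same amount (a constant bias)."""
    return [s + bias for s in scores]


rng = random.Random(2)  # seeded for repeatability
true_scores = [rng.gauss(70.0, 8.0) for _ in range(1000)]

# Hypothetical bias: the noise lowers everyone's score by 3 points.
biased_scores = add_systematic_error(true_scores, bias=-3.0)

# The group mean moves by the bias, but the spread stays the same.
mean_shift = statistics.fmean(biased_scores) - statistics.fmean(true_scores)
sd_true = statistics.pstdev(true_scores)
sd_biased = statistics.pstdev(biased_scores)

print(f"shift in group mean = {mean_shift:.2f}")
print(f"SD before = {sd_true:.2f}, SD after = {sd_biased:.2f}")
```

Under this assumption the pattern is the reverse of random error: the average moves while the variability does not.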


Reducing Measurement Error

So, how can we reduce measurement errors, random or systematic?

  1. Pilot test the instruments (the feedback survey together with the statistical model used to evaluate the data), getting feedback from your respondents on how easy or hard the measure was and on how the testing environment affected their performance (whether there were too many questions, the room was too cold, etc.).
  2. If you are gathering measures using people to collect the data (as interviewers or observers) you should make sure you train them thoroughly so that they aren't inadvertently introducing error. If it is in the form of a questionnaire, make sure that the questions are easily understood and not ambiguous.
  3. When you collect the data for your study you should double-check the data thoroughly. All data entry for computer analysis should be "double-punched" and verified. This means that you enter the data twice, the second time having your data entry machine check that you are typing the exact same data you did the first time.
  4. Use multiple measures of the same construct. Especially if the different measures don't share the same systematic errors, you will be able to triangulate across the multiple measures and get a more accurate sense of what's going on. This is one of the best ways to deal with measurement errors, especially systematic errors. For example, in a questionnaire, use multiple questions phrased differently to ask about the fun factor of a product (i.e., ML).
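
The last point in the list above, triangulating across multiple measures, can be sketched as follows. Four hypothetical questionnaire items measure the same construct, each with its own random error and its own systematic bias, and they are combined by simple averaging. The bias values, the error spread, and the helper names `measure` and `rmse` are assumptions chosen for illustration. The biases are deliberately picked to differ in sign so that they partly cancel; if every item shared the same systematic error, averaging would not remove it.

```python
import math
import random
import statistics


def measure(true_scores, item_bias, error_sd, rng):
    """One item: true score + item-specific bias + random error."""
    return [t + item_bias + rng.gauss(0.0, error_sd) for t in true_scores]


def rmse(estimates, truth):
    """Root mean squared difference between estimates and true scores."""
    return math.sqrt(statistics.fmean((e - t) ** 2 for e, t in zip(estimates, truth)))


rng = random.Random(3)  # seeded for repeatability
true_scores = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Hypothetical systematic errors, different for each item.
item_biases = [0.5, -0.4, 0.1, -0.2]
items = [measure(true_scores, b, error_sd=1.0, rng=rng) for b in item_biases]

# Composite score: the average of the four items for each respondent.
composite = [statistics.fmean(values) for values in zip(*items)]

single_rmse = rmse(items[0], true_scores)
composite_rmse = rmse(composite, true_scores)

print(f"error using one item:       {single_rmse:.2f}")
print(f"error using four-item mean: {composite_rmse:.2f}")
```

Averaging is only one way to combine measures; it is used here because it makes the reduction in error easy to see.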

Research Methods Knowledge Base, www.socialresearchmethods.net/kb/reltypes.php
