What is the difference between ordinal, interval and ratio variables? Why should I care?
Many statistics books begin by defining the different kinds of variables you might want to analyze. This scheme was developed by Stevens and published in 1946.
- A categorical variable, also called a nominal variable, is for mutually exclusive, but not ordered, categories. For example, your study might compare five different genotypes. You can code the five genotypes with numbers if you want, but the order is arbitrary and any calculations (for example, computing an average) would be meaningless.
- An ordinal variable is one where the order matters but not the difference between values. For example, you might ask patients to express the amount of pain they are feeling on a scale of 1 to 10. A score of 7 means more pain than a score of 5, and that is more than a score of 3. But the difference between the 7 and the 5 may not be the same as that between the 5 and the 3. The values simply express an order. Another example would be movie ratings, from * to *****.
- An interval variable is a measurement where the difference between two values is meaningful. The difference between a temperature of 100 degrees and 90 degrees is the same difference as between 90 degrees and 80 degrees.
- A ratio variable has all the properties of an interval variable, and also has a clear definition of 0.0. When the variable equals 0.0, there is none of that variable. Variables like height, weight, and enzyme activity are ratio variables. Temperature, expressed in degrees F or C, is not a ratio variable. A temperature of 0.0 on either of those scales does not mean 'no heat'. However, temperature in Kelvin is a ratio variable, as 0.0 Kelvin really does mean 'no heat'. Another counterexample is pH. It is not a ratio variable, as pH=0 just means 1 molar of H+, and the definition of molar is fairly arbitrary. A pH of 0.0 does not mean 'no acidity' (quite the opposite!). When working with ratio variables, but not interval variables, you can look at the ratio of two measurements. A weight of 4 grams is twice a weight of 2 grams, because weight is a ratio variable. A temperature of 100 degrees C is not twice as hot as 50 degrees C, because temperature in degrees C is not a ratio variable. A pH of 3 is not twice as acidic as a pH of 6, because pH is not a ratio variable.
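The difference between interval and ratio variables can be checked with a little arithmetic. The sketch below is only an illustration: the helper names `celsius_to_kelvin` and `ph_to_molar` are made up for this example, and the pH conversion assumes the usual textbook definition, pH = -log10 of the H+ concentration in molar.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees C to Kelvin (hypothetical helper for this example)."""
    return celsius + 273.15


def ph_to_molar(ph):
    """Convert pH to H+ concentration in molar, assuming pH = -log10([H+])."""
    return 10.0 ** (-ph)


# Differences are meaningful on an interval scale:
# 100 - 90 is the same difference as 90 - 80.
same_difference = (100.0 - 90.0) == (90.0 - 80.0)

# The ratio of two Celsius values can be computed, but it is not meaningful,
# because 0 degrees C does not mean 'no heat'.
ratio_celsius = 100.0 / 50.0

# The same two temperatures in Kelvin (a ratio scale) give a very different,
# and meaningful, ratio.
ratio_kelvin = celsius_to_kelvin(100.0) / celsius_to_kelvin(50.0)

# pH is not a ratio variable, but the H+ concentration behind it is.
ratio_h_plus = ph_to_molar(3.0) / ph_to_molar(6.0)

print(f"Same difference: {same_difference}")
print(f"Celsius ratio (not meaningful): {ratio_celsius:.3f}")
print(f"Kelvin ratio: {ratio_kelvin:.3f}")
print(f"H+ concentration ratio, pH 3 vs pH 6: {ratio_h_plus:.0f}")
```

On the Kelvin scale, 100 degrees C is only about 1.15 times 50 degrees C, not twice. Likewise, a solution at pH 3 has about 1000 times the H+ concentration of one at pH 6, which is why the ratio of the pH values themselves (0.5 or 2) says nothing useful.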
| OK to compute | Nominal | Ordinal | Interval | Ratio |
|---|---|---|---|---|
| Median and percentiles | No | Yes | Yes | Yes |
| Add or subtract | No | No | Yes | Yes |
| Mean, standard deviation, standard error of the mean | No | No | Yes | Yes |
| Ratio, or coefficient of variation | No | No | No | Yes |
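One way to read the table above is that each calculation needs a minimum level of measurement, and every level above that minimum also qualifies. The following is a minimal sketch of that reading. The names `LEVELS`, `MINIMUM_LEVEL`, and `ok_to_compute` are invented for this illustration and do not come from any statistics package, and the pain scores are made-up numbers.

```python
import statistics

# Levels of measurement, ordered from least to most structure.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Lowest level at which each calculation is OK, transcribed from the table.
MINIMUM_LEVEL = {
    "median": "ordinal",
    "percentiles": "ordinal",
    "add or subtract": "interval",
    "mean": "interval",
    "standard deviation": "interval",
    "standard error of the mean": "interval",
    "ratio": "ratio",
    "coefficient of variation": "ratio",
}


def ok_to_compute(calculation, level):
    """Return True if the table allows this calculation at this level."""
    return LEVELS.index(level) >= LEVELS.index(MINIMUM_LEVEL[calculation])


# Pain scores (1 to 10) are ordinal: the median is OK, the mean is not.
pain_scores = [3, 5, 7, 7, 9]  # hypothetical data

if ok_to_compute("median", "ordinal"):
    print("Median pain score:", statistics.median(pain_scores))

if not ok_to_compute("mean", "ordinal"):
    print("The table says not to average ordinal pain scores.")
```

Nothing stops software from computing the mean of the pain scores. The lookup only records what the table says is sensible to compute.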
Does it matter for data analysis? The concepts are mostly pretty obvious, but putting names on different kinds of variables can help prevent mistakes like taking the average of a group of zip (postal) codes, or taking the ratio of two pH values. Beyond that, I don't see how putting labels on the different kinds of variables really helps you plan your analyses or interpret the results.
Note that the categories are not as clear-cut as they sound. What kind of variable is color? In a psychological study of perception, different colors would be regarded as nominal. In a physics study, color is quantified by wavelength, so color would be considered a ratio variable. What about counts? If your dependent variable is the number of cells in a certain volume, what kind of variable is that? It has all the properties of a ratio variable, except it must be an integer. Is that a ratio variable or not? These questions just point out that the classification scheme appears to be more comprehensive than it is.
Variables like pH and the logEC50 don't really fall into any of these categories.
Keywords: levels of measurement