January 20, 2010
50% of what? How exactly are IC50 and EC50 defined?

The definition of EC50 and IC50

The concepts of IC50 and EC50 are fundamental to pharmacology. The EC50 is the concentration of a drug that gives half-maximal response. The IC50 is the concentration of an inhibitor where the response (or binding) is reduced by half.

Seems simple enough. But when you actually go to fit data to determine these values, there are several complexities and ambiguities.

The rest of this article is about IC50 (I for inhibition, for downward sloping dose-response curves). All the ideas can be applied to stimulatory curves and EC50 (E for effective) as well. Just stand on your head when you view the figures.

The ideal situation

This figure shows an ideal situation:

The green symbols show measurements made with controls. The ones on the left (Blank) have no inhibitor, so define "100%". The ones on the right (NS) are in the presence of a maximal concentration of a standard inhibitor, so define "0%". The data of the experimental dose-response curve (red dots) extend all the way between the two control values.

When fitting this curve, you need to decide how to fit the top plateau of the curve. You have three choices:

  • Fit the data only, ignoring the Blank control values.
  • Average the Blank control values, and set the parameter Top to be a constant value equal to the mean of the blanks.
  • Enter the blank values as if they were part of the dose-response curve. Simply enter a low dose, perhaps 10^-10 or 10^-11. You can't enter zero, because zero is not defined on a log scale.

The results will be very similar with any of these methods, because the data form a complete dose-response curve with a clear top plateau that is indistinguishable from the blank. I prefer the third method, as it analyzes all the data, but that is not a strong preference.

Similarly, there are three ways to deal with the bottom plateau: fit the data only, set Bottom to be a constant equal to the average of the NS controls, or enter the NS controls into the fit as if they were a very high concentration of inhibitor.

That is the ideal situation. There is no ambiguity about what IC50 means. 

A situation where IC50 can be defined in two ways

This figure shows an unusual situation where the inhibition curve plateaus well above the control values (NS) defined by a high concentration of a standard drug. This leads to alternative definitions of IC50.

Clearly, a single value cannot summarize such a curve. You'd need at least two values, one to quantify the middle of the curve (the drug's potency) and one to quantify how low it gets (the drug's maximum effect).

The graph above shows two definitions of the IC50.

The relative IC50 is by far the most common definition, and the adjective relative is usually omitted. It is the concentration required to bring the curve down to a point halfway between the top and bottom plateaus of the curve. The NS values are totally ignored with this definition of IC50. This definition is the one upon which classical pharmacological analysis of agonist and antagonist interactions is based. With appropriate consideration of the biological system and concentrations of interacting ligands, estimated Kd values can often be derived from the IC50 value defined this way (not so for the so-called "absolute IC50" described below).

The concentration that provokes a response halfway between the Blank and the NS value is sometimes called the absolute IC50. The horizontal dotted lines show how 100% and 0% are defined, which then defines 50%. This term is not very standard, and is a bit misleading, as there is nothing absolute about an "absolute IC50". Since this value does not quantify the potency of a drug, I think it is more misleading than helpful. Authors of the International Union of Pharmacology Committee on Receptor Nomenclature (1) agree that the concept of absolute IC50 (and that term) is not useful (R. Neubig, personal communication).

If you really want to use the absolute IC50, here are instructions for fitting a curve to find it.
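To make the difference between the two definitions concrete, here is a minimal Python sketch. It assumes the curve has already been fit to the usual four-parameter model, Y = Bottom + (Top-Bottom)/(1+10^((LogIC50-X)*HillSlope)), with X as log(concentration). The function name and the parameter values are hypothetical, chosen only for illustration; this is not Prism's own procedure.

```python
import math

def absolute_log_ic50(top, bottom, log_ic50, hill_slope, blank_mean, ns_mean):
    """Log concentration where the fitted curve crosses the level halfway
    between the Blank and NS control means.

    Returns None when the fitted curve never reaches that level
    (for example, when it plateaus above the 50% line).
    """
    y50 = (blank_mean + ns_mean) / 2.0
    # The fitted curve only takes values strictly between Bottom and Top.
    if not (min(top, bottom) < y50 < max(top, bottom)):
        return None
    # Solve y50 = Bottom + (Top-Bottom)/(1 + 10^((LogIC50-X)*HillSlope)) for X.
    ratio = (top - bottom) / (y50 - bottom) - 1.0
    return log_ic50 - math.log10(ratio) / hill_slope

# Hypothetical fit: the curve plateaus at 40, well above the NS controls at 0.
# HillSlope is negative because the curve slopes downward.
relative = -6.0  # the relative IC50 is simply the fitted LogIC50
absolute = absolute_log_ic50(top=100.0, bottom=40.0, log_ic50=relative,
                             hill_slope=-1.0, blank_mean=100.0, ns_mean=0.0)
print(relative, absolute)
```

With these made-up numbers the absolute value falls at a higher concentration than the relative one, because the curve has to drop further to reach the 50% line defined by the controls. If the bottom plateau sits above that line, no absolute IC50 exists at all.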

 

Incomplete dose-response curves

Any attempt to determine an IC50 by fitting a curve to the data in the graph above will be useless.  A curve fitting program might, or might not, be able to fit a dose-response curve to the data. But if the curve fits, the value of the IC50 is likely to be meaningless and have a very wide confidence interval. The data simply don't form a top plateau (which would define 100) or a bottom plateau (which would define 0). If data haven't defined 100 or 0, then 50 is undefined too, as is the IC50. 

If you also have control values that define 100 and 0, then the curve can be easily fit. The curve below was created by fitting a dose-response curve, but constraining the Top plateau to be a constant value equal to the mean of the Blank values, and the Bottom plateau to be a constant value equal to the mean of the NS values.

The value of the IC50 fit this way only makes sense if you assume that higher concentrations of the inhibitor would eventually inhibit down to the NS values. That is an assumption that can't be tested with the data at hand.

The distinction between relative and absolute IC50 doesn't really apply to these data. Because the data don't define a bottom plateau, the IC50 must be defined relative to the NS control values. 

Fitting normalized data 

As you can see from all the examples above, it is not necessary to normalize the data to run from 100% down to 0%. You can fit curves using data in their natural units. A common mistake is to assume that fitting dose-response curves requires that data first be normalized.

If you choose to normalize your data, it is essential that you think through carefully (and document in methods sections of papers) how 100% and 0% are defined. There are three strategies you can use:

  • From external controls (Blank and NS in the figures above). Since these values are so important, consider measuring these controls with more replicates than used for the rest of the experiment.
  • From very low and very high concentrations of the test substance.
  • From the plateaus of nonlinear regression. Fit the curve first, and then use the best-fit values of the Top and Bottom plateau to normalize the data. 
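Whichever strategy you pick, the arithmetic is the same; only the source of the two reference values changes. A minimal Python sketch, using the first strategy (external controls) and hypothetical raw readings:

```python
from statistics import mean

def normalize(values, zero_value, hundred_value):
    """Rescale raw responses so zero_value maps to 0% and hundred_value to 100%."""
    span = hundred_value - zero_value
    if span == 0:
        raise ValueError("the 0% and 100% reference values must differ")
    return [100.0 * (v - zero_value) / span for v in values]

# Hypothetical control replicates and raw responses (arbitrary units).
blank_controls = [1980.0, 2020.0]   # no inhibitor: defines 100%
ns_controls = [190.0, 210.0]        # maximal standard inhibitor: defines 0%
raw = [2000.0, 1100.0, 200.0]

percent = normalize(raw, zero_value=mean(ns_controls),
                    hundred_value=mean(blank_controls))
print(percent)  # [100.0, 50.0, 0.0]
```

For the other two strategies, pass in the means of the responses at the lowest and highest test concentrations, or the best-fit Bottom and Top values, in place of the control means.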

If you fit normalized data, you probably want Prism to force the curve to go from 100 down to 0. It won't know to do this, unless you tell it. Don't make the common mistake of normalizing your data, but not constraining the curve to go from 100 down to 0.  You can constrain the curve in two ways:

  • Choose to fit to a normalized model. The normalized models built in to Prism always go between 0 and 100.
  • Use a more general model, but go to the Constrain tab, and set Bottom to a constant value of 0.0 and Top to a constant value of 100.0.

Summary

The concept of IC50 (or EC50) is a bit ambiguous unless you clearly specify which values define 100% and 0%. 

Reference

1. R. R. Neubig et al. International Union of Pharmacology Committee on Receptor Nomenclature and Drug Classification. XXXVIII. Update on terms and symbols in quantitative pharmacology. Pharmacol Rev (2003) vol. 55 (4) pp. 597-606

 

Download

Download the Prism file used to create all the graphs in this article. 

 

 

January 15, 2010
Pooled SD in ANOVA

 ANOVA (one- and two-way) assumes that all the groups are sampled from populations that follow a Gaussian distribution, and that all these populations have the same standard deviation, even if the means differ. Based on this assumption, ANOVA computes a pooled standard deviation. This value is used in post tests.

The ANOVA results in Prism (and most programs) don't report this pooled standard deviation. But it is easy to calculate. As part of the ANOVA table, Prism reports several Mean Square values. One of these is the residual Mean Square (some programs use the term error rather than residual). The mean square values are essentially variances. The square root of the residual Mean Square is the pooled SD. 

How is this a pooled SD?

First, review how a SD of one group is computed: Calculate the difference between each value and the group mean, square those differences, add them up, and divide by the number of degrees of freedom (df), which equals n-1. That value is the variance. Its square root is the SD.

To compute the pooled SD from several groups, calculate the difference between each value and its group mean, square those differences, add them all up (for all groups), and divide by the number of df, which equals the total sample size minus the number of groups. That value is the residual mean square of ANOVA. Its square root is the pooled SD.
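The recipe above translates directly into a few lines of Python. This is only a sketch of the calculation as described here, with hypothetical data; the function name is an assumption, not something Prism exposes.

```python
import math
from statistics import mean

def pooled_sd(groups):
    """Pooled SD: square root of the residual mean square of one-way ANOVA."""
    sum_of_squares = 0.0
    n_total = 0
    for group in groups:
        group_mean = mean(group)
        # Squared differences between each value and its own group mean.
        sum_of_squares += sum((value - group_mean) ** 2 for value in group)
        n_total += len(group)
    df = n_total - len(groups)  # total sample size minus number of groups
    if df <= 0:
        raise ValueError("need more values than groups")
    residual_mean_square = sum_of_squares / df
    return math.sqrt(residual_mean_square)

# Hypothetical data: two groups of three values each.
print(pooled_sd([[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]))
```

With a single group, df reduces to n-1 and the result is the ordinary sample SD, which is a quick way to check the function.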

This case study uses the concept of pooled SD. 
 

November 22, 2009
Why does a normality test of residuals from nonlinear regression give different results than a norma

 Prism offers normality tests in two places:

  • As part of the Column Statistics analysis. This tests the normality of a stack of data. This analysis is intended to be used for Column data tables. If you entered data onto a Grouped or XY data table with subcolumns, these are averaged, and the calculations are performed only on the set of averages.
  • As part of the Nonlinear regression analysis. This tests the normality of the residuals. A residual is the distance of a value from the best-fit curve. If you entered replicate values into subcolumns, and chose the default option in nonlinear regression to fit each value individually, then the normality test is based on each individual value. 

If you run both normality tests on the same data, they ask different questions and so give different answers. 

As an example, create a new XY data table and choose the Michaelis-Menten enzyme kinetics example. There are 10 rows of data in triplicate with two missing values, so 28 Y values in all.

The graphs below show both analyses. The bottom left shows a normality test as part of nonlinear regression (a choice in the Diagnostics tab), testing the null hypothesis that the 28 residuals from the best-fit curve are sampled from a Gaussian distribution. The bottom right shows the results of a normality test chosen in the Column statistics analysis. Prism first averaged the triplicates to compute ten means (one for each row). It then tested the null hypothesis that those ten means are sampled from a Gaussian distribution. 

The two analyses give different results. If it makes sense to fit a curve (as it does here), then the normality test performed as part of nonlinear regression is helpful, because nonlinear regression is based on the assumption that the residuals are Gaussian. The P value is high, so you conclude that the data are consistent with the assumption that the residuals are Gaussian. In contrast, the normality test that is part of Column statistics really is not helpful. It tests whether the means of the triplicates are Gaussian. The low P value leads you to reject the assumption that the means of the triplicates are Gaussian. But this is really not a relevant question, so the answer is not useful.
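The difference comes down to which set of numbers is handed to the normality test. The Python sketch below builds both sets from the same replicate table. The data, the Vmax and Km values, and the function names are all hypothetical, and the sketch stops short of running a normality test; it only shows what each analysis would test.

```python
from statistics import mean

def michaelis_menten(x, vmax, km):
    """Michaelis-Menten model: velocity as a function of substrate concentration."""
    return vmax * x / (km + x)

def residuals_and_row_means(x_values, replicate_rows, vmax, km):
    """Return (residuals of every individual value, mean of each row).

    Missing replicates are given as None and skipped.
    """
    residuals = []
    row_means = []
    for x, replicates in zip(x_values, replicate_rows):
        present = [y for y in replicates if y is not None]
        if not present:
            continue
        predicted = michaelis_menten(x, vmax, km)
        residuals.extend(y - predicted for y in present)
        row_means.append(mean(present))
    return residuals, row_means

# Hypothetical triplicate data with one missing value.
x_values = [1.0, 2.0, 4.0]
rows = [[10.0, 11.0, None], [18.0, 17.0, 19.0], [25.0, 26.0, 24.0]]
residuals, row_means = residuals_and_row_means(x_values, rows, vmax=40.0, km=3.0)
print(len(residuals), len(row_means))  # 8 3
```

The residuals scatter around zero if the model fits, whereas the row means follow the shape of the curve itself, so there is no reason to expect the two tests to agree.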

Download the Prism file. 

 

November 18, 2009
Fitting dose-response curve when X is dose, rather than log(dose).

 The dose-response equations built-in to Prism all assume that the X values are log(dose). You can either enter the data with X values as logarithms of doses, or use the Transform analysis to create a results table with the data arranged that way which can then be graphed and fit. 

It is possible to fit data where X values are concentrations, rather than log(concentrations). It is necessary to adjust the equation accordingly. 

Here is the equation built-in to Prism for fitting a variable slope (four-parameter) log(dose) response curve:

   Y=Bottom + (Top-Bottom)/(1+10^((LogEC50-X)*HillSlope))

Here is the equation modified to expect X values to be concentrations, not logarithms, so X no longer needs to be antilogged (ten raised to the power X) inside the equation:

   Y=Bottom + (Top-Bottom)/(1+(10^LogEC50/X)^HillSlope)

The equation still fits the logEC50, rather than the EC50. Why? Because the confidence intervals computed by Prism are always symmetrical around the parameter value. But the true uncertainty is only symmetrical on a logEC50 scale.  
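A short Python sketch shows what this means in practice. The best-fit value and the interval half-width are hypothetical, and the function name is an assumption used only for illustration.

```python
def ec50_interval(log_ec50, half_width):
    """Convert a symmetrical interval on the log scale to the concentration scale.

    Returns (ec50, lower, upper), all as concentrations.
    """
    ec50 = 10 ** log_ec50
    lower = 10 ** (log_ec50 - half_width)
    upper = 10 ** (log_ec50 + half_width)
    return ec50, lower, upper

# Hypothetical fit: logEC50 = -6 with an interval of +/- 0.3 log units.
ec50, lower, upper = ec50_interval(-6.0, 0.3)
print(ec50 - lower, upper - ec50)  # the upper arm is longer than the lower arm
```

An interval that extends equally far on both sides of the logEC50 becomes lopsided once converted back to concentrations, which is why a symmetrical interval around the EC50 itself would misstate the uncertainty.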

Download this Prism file to see how it works. The same data are fit  and graphed twice.

  • In one version the X values are transformed to logarithms, and then fit to the equation built-in to Prism. Here the graph has a linear X axis, but the numbering is converted to powers-of-ten to show that the X values represent logarithms.
  • In the other version, the data are fit with the X values remaining as concentrations and fit to the equation shown above. Here the X axis is stretched to a logarithmic scale (top right of Format Graph dialog). 

The two graphs look identical. The results of the two fits are identical. 
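The fits agree because the two equations are algebraically the same curve: (10^LogEC50/X)^HillSlope equals 10^((LogEC50-log(X))*HillSlope). A minimal Python check, with hypothetical parameter values and function names:

```python
import math

def response_from_log_dose(log_x, bottom, top, log_ec50, hill_slope):
    """Built-in form: X is log10(concentration)."""
    return bottom + (top - bottom) / (1 + 10 ** ((log_ec50 - log_x) * hill_slope))

def response_from_dose(x, bottom, top, log_ec50, hill_slope):
    """Modified form: X is the concentration itself (must be greater than zero)."""
    return bottom + (top - bottom) / (1 + (10 ** log_ec50 / x) ** hill_slope)

# Hypothetical parameters for a stimulatory curve.
params = dict(bottom=5.0, top=95.0, log_ec50=-6.0, hill_slope=1.3)

for conc in (1e-9, 1e-7, 1e-6, 3e-6, 1e-4):
    y_log = response_from_log_dose(math.log10(conc), **params)
    y_lin = response_from_dose(conc, **params)
    # The two forms agree to within floating-point rounding.
    assert math.isclose(y_log, y_lin, rel_tol=1e-9)
```

Both forms return the midpoint between Bottom and Top when the concentration equals 10^LogEC50.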

Note that the second graph will only look good in Prism 5, which is smart about plotting curves on axes stretched to a logarithmic scale. Prism 4 was not smart about this, and the resulting curve looks very choppy.

 

Why is the HillSlope applied to the EC50 as well as the X values?

Why doesn't Prism report the standard error of the EC50?