
# Enter data for multiple regression

## 1. Create a data table

From the Welcome or New Table dialog, choose to create a multiple variable data table.

If you are just getting started, choose the sample data for Multiple linear regression (text variables). Alternatively, the Multiple linear regression (dummy coding) sample data shows how categorical variables can be entered directly as numbers.

## 2. Enter data

Each row represents a different observation (an individual, an animal, an experiment, and so on).

Each column represents a different variable. At its simplest, a variable is a trait, quality, or quantity that you can measure (for example, the weight, height, or age of an individual or animal). Prism 9 allows for three different types of variables:

Categorical: Qualitative information with a finite number of groups or categories. Sex (Female or Male) and Education (High School, College, Post-graduate) are examples.

Continuous: Quantitative information with an infinite number of possible values (numbers). Height (68.3 in, 72.4 in, 61.25343863... in) and Time (8.23 ns, 1.90 min, 5.3924 hrs, 6.5 × 10⁹ yrs) are examples. Integers can often be treated as categorical information (like group number), but are more often continuous (like number of children or number of questions answered correctly on a test).

Label: Qualitative information used only to identify observations. Experimental IDs, Names, Social Security Numbers, etc. are examples. In most cases, labels are unique for each row.

When you enter data into a variable, Prism automatically detects the variable type and displays a gray icon at the top of the variable to indicate it. If Prism doesn’t assign the variable type that you wanted, click the icon to change the type manually (the icon will then become orange).

Categorical variables can be entered either directly with group names (as text) or as coded variables. For text entries, Prism automatically encodes categorical variables for use in appropriate analyses using dummy coding (also called indicator coding or reference coding). Alternative codings, such as effects coding, can be used when entering coded variables manually. One good source to learn about these coding methods is Glantz and Slinker, cited below.
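To make dummy coding concrete, here is a minimal Python sketch of the general technique. It is not Prism's internal implementation; the `dummy_code` helper and the example data are hypothetical and exist only for illustration. One level is chosen as the reference, and every other level gets its own 0/1 indicator column.

```python
def dummy_code(values, reference):
    """Dummy-code a categorical variable (hypothetical helper, illustration only).

    Returns (levels, rows): `levels` lists the non-reference levels in order of
    first appearance, and `rows` holds one list of 0/1 indicators per observation.
    """
    if reference not in values:
        raise ValueError(f"reference level {reference!r} not found in data")
    # dict.fromkeys keeps unique values in order of first appearance.
    levels = [lv for lv in dict.fromkeys(values) if lv != reference]
    # The reference level is coded as all zeros.
    rows = [[1 if v == lv else 0 for lv in levels] for v in values]
    return levels, rows


# Made-up observations for the Education example above.
education = ["High School", "College", "Post-graduate", "College"]
levels, rows = dummy_code(education, reference="High School")

print(levels)  # ['College', 'Post-graduate']
for value, indicators in zip(education, rows):
    print(value, indicators)
```

A variable with k levels produces k - 1 indicator columns, because the reference level is represented by zeros in all of them.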

Note that there is no need to code interactions manually. Prism lets you add interactions automatically in the parameters dialog.
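For reference, in the standard regression formulation a two-way interaction term is the element-wise product of the two predictor columns. The sketch below shows what coding one by hand would involve; the `interaction_column` helper and the data are hypothetical, and this is not a description of how Prism builds the term internally.

```python
def interaction_column(x1, x2):
    """Return the element-wise product of two predictor columns (illustration only)."""
    if len(x1) != len(x2):
        raise ValueError("columns must have the same number of observations")
    return [a * b for a, b in zip(x1, x2)]


# Made-up data: a continuous predictor and a dummy-coded categorical predictor.
age = [30.0, 45.0, 52.0, 38.0]
is_male = [0, 1, 1, 0]

age_by_sex = interaction_column(age, is_male)
print(age_by_sex)  # [0.0, 45.0, 52.0, 0.0]
```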

## 3. Run multiple regression

Click Analyze, choose multiple linear regression from the list of analyses for multiple variable tables, and click OK. The multiple regression dialog has seven tabs:

Model. Choose which variable is the dependent variable and which other variables to include as independent variables. Also choose any interactions or transforms you wish to include in the model.

Reference level. Set a reference level for any categorical variable in the specified model. The reference level generally indicates a “baseline” or “usual” level of the categorical variable and is important for results interpretation.

Interpolation. Use the model built by Prism to predict values for the outcome variable based on values for the predictor variables.

Compare. Choose a second model and specify how the fit of the two models should be compared.

Weighting. Usually all data are weighted equally, but you can specify another weighting scheme.

Diagnostics. Specify which extra results Prism should report.

Residuals. Plot the residuals (the difference between actual and predicted Y values) in several different ways.
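The ideas behind the Interpolation and Residuals tabs can be sketched in a few lines of Python. The coefficients and data below are invented for illustration and are not output from Prism. A predicted value is the intercept plus the sum of each coefficient times its predictor, and a residual is the actual Y value minus the predicted one.

```python
def predict(intercept, coefs, row):
    """Predicted Y for one observation under a linear model (illustration only)."""
    return intercept + sum(b * x for b, x in zip(coefs, row))


# Hypothetical fitted model: Y = 50 + 0.5 * age + 10 * is_male
intercept = 50.0
coefs = [0.5, 10.0]

# Made-up observations: [age, is_male] and the measured Y values.
rows = [[30.0, 0], [40.0, 1], [50.0, 1]]
actual = [66.0, 79.0, 85.5]

predicted = [predict(intercept, coefs, row) for row in rows]
residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]

print(predicted)  # [65.0, 80.0, 85.0]
print(residuals)  # [1.0, -1.0, 0.5]
```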

## Reference

Glantz and Slinker, Primer of Applied Regression and Analysis of Variance, 3rd edition, chapter “Using linear regression to do one-way analysis of variance with any number of treatments”, p. 391.