Data Analysis and Application

Source: Warner, Rebecca M. (2013). Applied Statistics: From Bivariate Through Multivariate Techniques, p. 179.

To complete Questions 9 and 10, use the bpstudy.sav file in the Resources. Select three variables from this dataset.

Two of the variables should be good candidates for a correlation, and the third should be a poor candidate. Good candidates are variables that meet the assumptions (such as an approximately normal distribution, reliable measurement, and an interval or ratio level of measurement). Poor candidates are variables that do not meet the assumptions or that have clear problems (such as restricted range, extreme outliers, or a grossly non-normal distribution shape).

Use the FREQUENCIES procedure to obtain a histogram and all univariate descriptive statistics for each of the three variables.
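For readers who want to cross-check the SPSS output by hand, the following is a minimal sketch of the same univariate screening in Python using only the standard library. The variable name `systolic` and its values are made up for illustration and are not taken from bpstudy.sav. The skewness and kurtosis here use simple moment formulas, so they will differ slightly from the bias-adjusted values SPSS reports.

```python
import statistics

def describe(values):
    """Return basic univariate descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    # Central moments (population form, dividing by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "sd": statistics.stdev(values),      # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
        "skewness": m3 / m2 ** 1.5,          # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }

def histogram_counts(values, n_bins=5):
    """Count values in n_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:                            # all values identical
        return [len(values)] + [0] * (n_bins - 1)
    counts = [0] * n_bins
    for x in values:
        # The maximum value is placed in the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical scores, NOT values from bpstudy.sav.
systolic = [110, 115, 118, 120, 122, 125, 128, 130, 135, 160]

stats = describe(systolic)
for name, value in stats.items():
    print(f"{name:>16}: {value:.3f}")

# Crude text histogram: one '*' per case in each bin.
for i, count in enumerate(histogram_counts(systolic)):
    print(f"bin {i + 1}: {'*' * count}")
```

A clearly positive skewness together with an isolated bar at the high end of the histogram, as in this made-up example, is the kind of pattern that would flag a variable as a possible poor candidate.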

Create a scatter plot for the two “good candidate” variables.

Create a scatter plot that pairs the “poor candidate” variable with one of the two good variables.

Properly embed SPSS output where appropriate in your answer to Question 9 below. Explain which variables are good and poor candidates for correlation analysis and give your rationale. Comment on the empirical results from your data screening (both the histograms and the scatter plots) as evidence that these variables meet or do not meet the basic assumptions necessary for correlation to be meaningful and honest. What other information would you want to have about the variables in order to make better-informed judgments?
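To see why extreme outliers make a variable a poor candidate, it can help to compute the correlation with and without a single extreme case. The sketch below does this in Python with a hand-written Pearson r. The data and the names `x`, `y`, `x_out`, and `y_out` are hypothetical and are not drawn from bpstudy.sav.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores with essentially no linear relationship.
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 2, 4, 1]

# The same data plus one extreme bivariate outlier at (20, 20).
x_out = x + [20]
y_out = y + [20]

r_without = pearson_r(x, y)
r_with = pearson_r(x_out, y_out)

print(f"r without outlier: {r_without:.3f}")
print(f"r with outlier:    {r_with:.3f}")
```

In this made-up example one added case moves r from near zero to a strong positive value, which is why a scatter plot should be inspected before a correlation is interpreted.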

Is there anything that could be done (for instance, data transformations or eliminating outliers) to make your poor candidate variable better?

If so, what would you recommend?
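As one illustration of a data transformation, the sketch below applies a base-10 logarithm to a positively skewed variable and compares the skewness before and after. This is only one possible remedy and assumes all scores are positive. The values and the names `raw` and `logged` are hypothetical, not taken from bpstudy.sav, and the skewness uses a simple moment formula rather than the adjusted statistic SPSS reports.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment over SD cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical positively skewed scores (all greater than zero).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

# A log transformation pulls in the long upper tail.
logged = [math.log10(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)

print(f"skewness before log transform: {skew_raw:.3f}")
print(f"skewness after log transform:  {skew_logged:.3f}")
```

Whether a transformation or the removal of outliers is appropriate depends on the variable and should be justified in the write-up rather than applied automatically.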