Due Oct. 18 at midnight.

First, obtain a copy of the RMarkdown file for this exam.

For the first part of the exam, you can use either surveys_complete.csv or your own data. If you use your own data, it must include variables that you think can serve as a numerical predictor and a numerical response. If you use surveys_complete, you can use weight and hindfoot_length.

  1. Load in your data. Which variable will you be using as a predictor, and which as a response? (5 pts)
# read in data here
# Answer: which column is the predictor and which is the response?
  1. Plot the two against each other with a scatter plot. Do the data appear to be related linearly? (5 pts)
# Plot here
# Answer here
  1. Fit the linear model. View the summary. (5 pts)
# Code here
  1. Does the summary make sense? Does the model have good predictive power? Evaluate the residual standard error, intercept, and R-squared in particular. (10 pts)
# Answer here
  1. Plot the model on the graph. Increase the size of the text so it is comfortably readable from 5 feet away. (5 pts)
# Code here
  1. Check the normality of the residuals. Do they look OK? Are we violating assumptions? (5 pts)
# Code here
# Answer here
  1. If you are using surveys_complete: Is there interspecific variation in the linear model? Hint: look at our prior work with facet plots. (15 pts)

If you are not using surveys_complete: Do you think there are groupings within your data that may each have a separate linear model? Try grouping the data by that variable and refitting the linear model. (15 pts)

# Code here

Part Two

In this portion, you are free to use your own data if you have a categorical predictor variable and a response variable. Otherwise, you may use the columns sex and weight in surveys_complete.

  1. First, plot the data grouped by sex, or by a relevant variable in your data. (5 pts)
# Plot code here
  1. Run an ANOVA on these data. (5 pts)
# ANOVA code here
  1. Now fit a linear model to the same data. What information does this give you that an ANOVA can’t? (5 pts)
# Code here
# Answer here
  1. Plot the lm with the data, with points colored by sex or your relevant variable. (10 pts)
# Plot code here

Part Three

  1. Add and commit this document. (5 pts)
# Commands here
  1. Push your changes to GitHub. (10 pts)
# Commands here
  1. Make a pull request on my repo so I’m notified. Do this last, when you are done and ready for me to see it. (10 pts)

MS students

My expectation is that you’ll do this with your own data. If any part of this doesn’t make sense with your data, please get in touch ASAP so we can work it out.

In addition, please take one of the statistical tests we tried and write it as a function in the R/ folder of your last-name directory. Write appropriate documentation for it. Add, commit, and push it to GitHub. I’ll view it there.