This is a lesson on tidying data, specifically what to do when a conceptual variable is spread out over two or more variables in a data frame.
Data used: words spoken by characters of different races and genders in the Lord of the Rings movie trilogy.
Function used: gather() from the tidyr package. Includes references, resources, and exercises.

Learner-facing dependencies:
- tidy-data sub-directory of the Data Carpentry data directory
- tidyr package (only true dependency)
- ggplot2 is used for illustration but is not mission critical
- dplyr and reshape2 are used in the bonus content

Instructor dependencies:
- curl if you execute the code to grab the Lord of the Rings data used in examples from GitHub. Note that the files are also included in the datacarpentry/data/tidy-data directory, so data download is avoidable.
- rmarkdown, knitr, and xtable if you want to compile the Rmd to md and html
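
As a minimal sketch of the kind of tidying this lesson covers, the R snippet below uses gather() from tidyr to collect two columns that both hold word counts into a single variable. The data frame, its column names (Film, Race, Female, Male, Gender, Words), and all the numbers are made-up placeholders for illustration. They are not the lesson's actual data.

```r
library(tidyr)

# Hypothetical untidy data: the conceptual variable "words spoken" is
# spread over two columns, one per gender. All values are placeholders.
untidy <- data.frame(
  Film   = c("Film A", "Film A", "Film B", "Film B"),
  Race   = c("Elf", "Hobbit", "Elf", "Hobbit"),
  Female = c(10, 20, 30, 40),
  Male   = c(50, 60, 70, 80),
  stringsAsFactors = FALSE
)

# Gather the Female and Male columns into key/value pairs:
# Gender holds the old column name, Words holds the count.
tidy <- gather(untidy, key = "Gender", value = "Words", Female, Male)

print(tidy)
```

In the gathered result, each row is one film, race, and gender combination, and every word count lives in the single Words column.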