Know how to read from a CSV file saved in my Public directory and store it as an object in R
Use the str function and explain its output for a given data frame
Use the c function, the seq function, and the : operator
Explain what the factor function does in R and why it is frequently used to improve the layout of plots
Use the [ ] operators
Explain what the - operator does inside of the [ ] operators
Use the ? function to obtain help on a function or data set in a package in R
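The base R objectives above can be sketched in one short script. This is only an illustration under stated assumptions: the grades data and the temporary file path are hypothetical stand-ins for whatever CSV file is saved in the Public directory.

```r
# Write a small hypothetical CSV to a temporary file, then read it back.
# (The temp file stands in for a CSV saved in the Public directory.)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(student = c("Ana", "Ben", "Cal"),
             level   = c("low", "high", "medium"),
             score   = c(71, 93, 85)),
  csv_path, row.names = FALSE
)
grades <- read.csv(csv_path)   # store the file contents as a data frame object

# str() lists each column with its type and first few values
str(grades)

# Three ways to build a vector
a <- c(2, 4, 6)                        # combine individual values
b <- seq(from = 0, to = 1, by = 0.25)  # regular sequence with a chosen step
d <- 1:5                               # consecutive integers

# [ ] selects elements by position; a negative index drops them instead
d[2]          # second element
d[c(1, 3)]    # first and third elements
d[-1]         # everything except the first element

# factor() stores a categorical variable with an explicit level order,
# which plots then use to order axes, panels, and legends
grades$level <- factor(grades$level, levels = c("low", "medium", "high"))
levels(grades$level)

# ?seq   # in an interactive session, opens the help page for seq
```

Without the explicit levels argument, factor would order the levels alphabetically (high, low, medium), which is usually not the order wanted on a plot axis.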
Give the definition of tidy data
Identify whether a data set follows the tidy guidelines and explain why or why not using the definition.
Use the install.packages, library, data, and View functions
Identify the observation unit, variables, and variable types (continuous, discrete, categorical) from a problem statement / data set
Explain how normal forms of data can be used to assist with data analysis
Be able to identify which plot(s) are most appropriate from a problem statement / variable types in a data set.
Explain the differences between the Five Named Graphs
Create a histogram in R using ggplot2 from an appropriate variable in a data set
Explain why a faceted histogram is useful in comparing the distribution of a numeric variable across the groups of a categorical variable.
Compare and contrast the usefulness and differences between faceted histograms and boxplots.
Describe the components of a boxplot
Use a boxplot to compare two distributions
Create a histogram in R using ggplot2 from an appropriate variable in a data set
Describe why pie charts should be replaced with barplots from a visual perception perspective
Explain the differences between stacked, side-by-side, and faceted barplots
Create a barplot in R using ggplot2 from appropriate variable(s) in a data set
Explain how jitter and/or alpha can be applied to a plot to help the reader understand the relationship between two variables
Create a scatter-plot in R using ggplot2 from appropriate variable(s) in a data set
Give a scenario where a line-graph is more appropriate than a scatter-plot
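The plot types named above can each be sketched in a line or two of ggplot2. The choice of the mpg and economics data sets (both shipped with ggplot2) and of the particular variables is an assumption made for illustration, not part of the objectives.

```r
library(ggplot2)

# Histogram: distribution of one numeric variable
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2, color = "white")

# Faceted histogram: one panel per group of a categorical variable
p_hist_facet <- p_hist + facet_wrap(~ drv, ncol = 1)

# Boxplot: compare the same distributions side by side
p_box <- ggplot(mpg, aes(x = drv, y = hwy)) +
  geom_boxplot()

# Barplots of two categorical variables: stacked (the default),
# side-by-side (position = "dodge"), and faceted
p_bar_stacked <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar()
p_bar_dodged  <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")
p_bar_facet   <- ggplot(mpg, aes(x = class)) + geom_bar() + facet_wrap(~ drv)

# Scatter-plot: jitter nudges overlapping points apart,
# alpha makes them semi-transparent so dense regions look darker
p_scatter <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_jitter(width = 0.3, height = 0.3, alpha = 0.4)

# Line-graph: suited to an x variable with a natural order, such as time
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()
```

Printing any of these objects (for example, typing p_box at the console) draws the plot.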
Clarify what makes up a statistical graphic
Describe what aes, geom_, facet, and position correspond to in a ggplot function call
Clarify why R code produces an error and fix the code to produce the correct result.
Describe how you could use color to look at the relationship between three continuous variables.
Describe how one can use color, fill, size, etc. to show other multivariate (more than two variables) relationships
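As a rough sketch of how the pieces of a ggplot call fit together, the example below labels each component in a comment. The mpg data set and the variables mapped to each aesthetic are assumptions chosen for illustration.

```r
library(ggplot2)

# Three continuous variables: two on the axes, the third mapped to color
p_three <- ggplot(data = mpg,
                  mapping = aes(x = displ, y = hwy, color = cty)) +  # aes: variables -> visual properties
  geom_point(alpha = 0.6, position = "jitter") +  # geom_: type of mark; position: placement adjustment
  facet_wrap(~ drv)                               # facet: one panel per group

# color and size together carry a third and fourth variable
p_four <- ggplot(mpg, aes(x = displ, y = hwy, color = drv, size = cyl)) +
  geom_point(alpha = 0.5)

# fill colors the interior of area-based geoms such as bars
p_fill <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")
```

Arguments placed inside aes vary with the data, while arguments placed outside it (such as alpha = 0.6 here) apply one fixed value to every point.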