A subsection of a data frame is called a slice. We can take slices of character vectors as well:
animal <- c("m", "o", "n", "k", "e", "y")
# first three characters
animal[1:3]
## [1] "m" "o" "n"
# last three characters
animal[4:6]
## [1] "k" "e" "y"
If the first four characters are selected using the slice animal[1:4], how can we obtain the first four characters in reverse order?
What is animal[-1]? What is animal[-4]? Given those answers, explain what animal[-1:-4] does.
Use a slice of animal to create a new character vector that spells the word “eon”, i.e. c("e", "o", "n").
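As a hedged illustration of the indexing rules these exercises rely on, the sketch below uses a made-up vector named word (not part of the lesson data), so the answers for animal are left to work out. It shows a descending range, negative indices, and indexing by an arbitrary vector of positions.

```r
# Hypothetical example vector, deliberately different from `animal`
word <- c("s", "t", "o", "n", "e")

# A descending range returns the elements in reverse order
word[3:1]
## [1] "o" "t" "s"

# A negative index drops that position instead of selecting it
word[-1]
## [1] "t" "o" "n" "e"

# -1:-2 expands to c(-1, -2), so both positions are dropped
word[-1:-2]
## [1] "o" "n" "e"

# Any vector of positions works, not only contiguous ranges
word[c(1, 3, 4)]
## [1] "s" "o" "n"
```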
Create a plot showing the standard deviation of the inflammation data for each day across all patients.
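One possible approach, sketched on a small made-up data frame rather than the real inflammation file: apply() with margin 2 calls a function once per column, and sd computes the standard deviation. The name toy and its values are assumptions for illustration only; each row stands for a patient and each column for a day.

```r
# Hypothetical stand-in for the inflammation data:
# rows are patients, columns are days
toy <- data.frame(V1 = c(0, 1, 2), V2 = c(1, 3, 5), V3 = c(2, 2, 2))

# MARGIN = 2 applies sd() to each column (day) across all rows (patients)
sd_per_day <- apply(toy, 2, sd)
sd_per_day
## V1 V2 V3 
##  1  2  0 

# One point per day
plot(sd_per_day)
```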
Write a function called analyze that takes a filename as an argument and displays the three graphs produced in the previous lesson (average, min and max inflammation over time). analyze("data/inflammation-01.csv") should produce the graphs already shown, while analyze("data/inflammation-02.csv") should produce corresponding graphs for the second data set. Be sure to document your function with comments.
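A minimal sketch of one way such a function might look, assuming the previous lesson computed the per-day summaries with apply() and drew them with plot(). The intermediate variable names are illustrative choices, not taken from the lesson.

```r
analyze <- function(filename) {
  # Plots the average, min, and max inflammation over time.
  # Input: a character string naming a CSV file without a header row.
  dat <- read.csv(file = filename, header = FALSE)

  # MARGIN = 2 summarises each column (day) across all patients
  avg_day_inflammation <- apply(dat, 2, mean)
  plot(avg_day_inflammation)

  max_day_inflammation <- apply(dat, 2, max)
  plot(max_day_inflammation)

  min_day_inflammation <- apply(dat, 2, min)
  plot(min_day_inflammation)
}

# Usage, assuming the lesson's data directory is present:
# analyze("data/inflammation-01.csv")
```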
R has a built-in function called seq that creates a vector of numbers:
seq(3)
## [1] 1 2 3
Using seq, write a function called print_N that prints the first N natural numbers, one per line:
print_N(3)
## [1] 1
## [1] 2
## [1] 3
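One possible sketch loops over the output of seq and prints each value. It assumes N is a positive whole number; seq(0) returns 1 0 rather than an empty vector, so a more defensive version might use seq_len(N) instead.

```r
print_N <- function(N) {
  # Prints the first N natural numbers, one per line.
  # Assumes N >= 1; seq(0) would count down from 1 to 0.
  for (i in seq(N)) {
    print(i)
  }
}

print_N(3)
## [1] 1
## [1] 2
## [1] 3
```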
Write a function called total that calculates the sum of the values in a vector. (R has a built-in function called sum that does this for you. Please don’t use it for this exercise.)
ex_vec <- c(4, 8, 15, 16, 23, 42)
total(ex_vec)
## [1] 108
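A sketch of one way to do this is to keep a running total and add each element in turn. The accumulator name vec_sum is an illustrative choice.

```r
total <- function(vec) {
  # Calculates the sum of the values in a vector without using sum().
  vec_sum <- 0
  for (num in vec) {
    vec_sum <- vec_sum + num
  }
  vec_sum
}

ex_vec <- c(4, 8, 15, 16, 23, 42)
total(ex_vec)
## [1] 108
```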
Exponentiation is built into R:
2^4
## [1] 16
Write a function called expo that uses a loop to calculate the same result.
expo(2, 4)
## [1] 16
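One hedged sketch multiplies a running result by the base once per step. It assumes the exponent is a non-negative whole number; the argument names base and power are illustrative, and negative or fractional exponents are not handled.

```r
expo <- function(base, power) {
  # Raises `base` to `power` by repeated multiplication.
  # Assumes `power` is a non-negative whole number.
  result <- 1
  # seq_len(0) is empty, so expo(x, 0) returns 1 without looping
  for (i in seq_len(power)) {
    result <- result * base
  }
  result
}

expo(2, 4)
## [1] 16
```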
Write a function called analyze_all that takes a filename pattern as its sole argument and runs analyze for each file whose name matches the pattern.
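A possible sketch uses list.files to find the matching files. It assumes the files live in a directory named data, as in the paths used above, and that a function named analyze has already been defined. Note that list.files treats pattern as a regular expression, not a shell-style wildcard.

```r
analyze_all <- function(pattern) {
  # Runs analyze() on every file in "data" whose name matches `pattern`.
  # Assumes analyze() is already defined and the directory "data" exists.
  filenames <- list.files(path = "data", pattern = pattern, full.names = TRUE)
  for (f in filenames) {
    print(f)      # show which file is being processed
    analyze(f)
  }
  invisible(filenames)
}

# Usage, assuming the lesson's data directory is present:
# analyze_all("inflammation.*csv")
```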
Write a function plot_dist that plots a boxplot if the length of the vector is greater than a specified threshold and a stripchart otherwise. To do this you’ll use the R functions boxplot and stripchart.
dat <- read.csv("data/inflammation-01.csv", header = FALSE)
plot_dist(dat[, 10], threshold = 10) # day (column) 10
plot_dist(dat[1:5, 10], threshold = 10) # samples (rows) 1-5 on day (column) 10
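One way this might be sketched is a single if/else on the vector length. The argument name x is an illustrative choice; threshold matches the calls above.

```r
plot_dist <- function(x, threshold) {
  # Draws a boxplot for vectors longer than `threshold`,
  # and a stripchart for shorter ones.
  if (length(x) > threshold) {
    boxplot(x)
  } else {
    stripchart(x)
  }
}

# Usage, assuming `dat` has been read in as shown above:
# plot_dist(dat[, 10], threshold = 10)     # 60 values, so a boxplot
# plot_dist(dat[1:5, 10], threshold = 10)  # 5 values, so a stripchart
```

Because the comparison is strict, a vector whose length equals the threshold gets a stripchart.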
One of your collaborators asks if you can recreate the figures with lines instead of points. Find the relevant argument to plot by reading the documentation (?plot), update analyze, and then recreate all the figures with analyze_all.