Introduction to Simple Random Sampling in R
Simple random sampling (SRS) is the most basic method of drawing a probability sample. A sample of $n$ units is selected from a population of $N$ units in such a way that each of the $\binom{N}{n}$ possible samples has the same chance of being selected. The specific sample can be chosen using a random number generator on a computer. In this post we will learn how to select the elements of a simple random sample in R.
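To make the equal-probability statement concrete, the following sketch enumerates every possible sample for a very small population using base R's choose() and combn(). The values N = 5 and n = 2 are illustrative assumptions, not taken from the post.

```r
# Hypothetical small population, chosen only for illustration
N <- 5
n <- 2

# Number of distinct samples of size n from N units
n_samples <- choose(N, n)

# Each column of the matrix is one possible sample (unit labels 1..N)
all_samples <- combn(N, n)

# Under SRS every one of these samples has the same selection probability
p_each <- 1 / n_samples

n_samples          # 10
ncol(all_samples)  # 10
p_each             # 0.1
```

For realistic values of $N$ and $n$ the number of possible samples is far too large to enumerate, which is why the sample is drawn with a random number generator instead.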
The following commands will generate random permutations of $n$ integers or random samples from a population of numbers.
Random permutation of integers $1$ to $n$
The call sample(n) returns a random permutation of the integers $1$ to $n$.
sample(10)
## Output
## [1] 5 8 9 4 3 2 1 6 10 7
Random permutation of elements in a vector $x$
A random permutation of the elements of a vector $x$ can be generated using sample(x).
x <- c(20, 25, 19, -15, 4, 21, -1, 0, 23)
sample(x)
## Output
## [1] 21 25 0 4 20 -15 -1 19 23
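Because sample() draws from R's pseudo-random number generator, the outputs shown in this post will differ from run to run. A minimal sketch of how a draw can be made reproducible with set.seed() follows; the seed value 123 is arbitrary.

```r
x <- c(20, 25, 19, -15, 4, 21, -1, 0, 23)

set.seed(123)              # arbitrary seed, fixes the generator state
first_draw <- sample(x)

set.seed(123)              # resetting the same seed reproduces the draw
second_draw <- sample(x)

identical(first_draw, second_draw)   # TRUE

# A permutation only reorders: every element of x is still present
all(sort(first_draw) == sort(x))     # TRUE
```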
Random sample of $n$ items from $x$ without replacement
A random selection of $n$ elements from a vector $x$ without replacement can be made using sample(x, n).
x <- c(20, 25, 19, -15, 4, 21, -1, 0, 23)
sample(x, 5)
## Output
## [1] -1 19 21 23 0
Random sample of $n$ items from $x$ with replacement
A random sample of $n$ items from a vector $x$ can be selected with replacement using sample(x, n, replace = TRUE).
x <- c(20, 25, 19, -15, 4, 21, -1, 0, 23)
sample(x, 5, replace = TRUE)
## Output
## [1] 0 -1 4 19 -1
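The practical difference between the two modes can be sketched as follows. The example samples positions rather than values to check for repeats, and the sample size of 20 is an arbitrary choice that simply exceeds the length of $x$.

```r
x <- c(20, 25, 19, -15, 4, 21, -1, 0, 23)

# Without replacement: no position of x is drawn twice
idx_wor <- sample(length(x), 5)
anyDuplicated(idx_wor)        # 0

# With replacement the sample may be larger than the population,
# in which case some values necessarily repeat
s_wr <- sample(x, 20, replace = TRUE)
length(s_wr)                  # 20

# Without replacement, asking for more items than x contains is an error
res <- try(sample(x, 20), silent = TRUE)
inherits(res, "try-error")    # TRUE
```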
Random Sample with Probabilities
A random sample of $n$ items can be drawn from $x$ with the elements of $x$ having differing probabilities of selection. A vector of probabilities, one for each element of $x$, is passed as the prob argument. The probabilities should sum to one; if they do not, sample() treats them as weights and rescales them so that they do.
x <- c(23, 45, 69, -1, .9, 4, 25, 19)
p <- c(.1, .1, 0, 0, .2, .3, .1, .2)
sum(p)
sample(x, 5, replace = TRUE, prob = p)
## Output
## [1] 4 19 19 19 45
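One way to check that the probability vector behaves as intended is to draw a large sample and compare the observed relative frequencies with $p$. This is only a sketch: the sample size of 10000 and the seed are arbitrary choices.

```r
x <- c(23, 45, 69, -1, .9, 4, 25, 19)
p <- c(.1, .1, 0, 0, .2, .3, .1, .2)

set.seed(1)   # arbitrary seed, for reproducibility only
draws <- sample(x, 10000, replace = TRUE, prob = p)

# Relative frequency of each value, kept in the order of x
freq <- table(factor(draws, levels = x)) / length(draws)

# Side-by-side comparison of the specified and observed probabilities
round(cbind(value = x, prob = p, observed = as.numeric(freq)), 3)
```

Elements with probability zero (69 and -1 here) never appear in the sample, and the remaining frequencies should lie close to the corresponding entries of $p$.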
Random Selection of Integers without Replacement
The random selection of $n$ integers from the integers $1$ to $N$ without replacement can be done using sample(N, n).
sample(1000, 10)
## Output
## [1] 138 147 911 523 586 163 915 966 951 245
From a simple random sample one can estimate the population mean $\mu$ and the variance of the estimator $\hat{\mu}$.
Let $y_1, y_2, \cdots, y_n$ be the measurements obtained from a simple random sample of $n$ units from the population. The estimator of the population mean $\mu$ is
$$\hat{\mu} = \frac{1}{n} \sum\limits_{i=1}^n y_i$$
with estimated variance of $\hat{\mu}$ given by
$$\widehat{\mathrm{var}}(\hat{\mu}) = \frac{s^2}{n} \left( \frac{N-n}{N}\right)$$
where $s^2 = \frac{1}{n-1} \sum\limits_{i=1}^n (y_i - \overline{y})^2$ is the sample variance and $\overline{y} = \hat{\mu}$ is the sample mean.
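The two estimators can be computed in a few lines of R. The sketch below assumes a simulated population of $N = 1000$ normally distributed measurements and a sample of $n = 50$; these values, the seed, and the variable names are illustrative assumptions rather than part of the original post.

```r
# Hypothetical population of N = 1000 measurements (simulated for illustration)
set.seed(42)                          # arbitrary seed
N <- 1000
population <- rnorm(N, mean = 50, sd = 10)

# Simple random sample of n = 50 units without replacement
n <- 50
y <- population[sample(N, n)]

mu_hat  <- mean(y)                    # estimator of the population mean
s2      <- var(y)                     # sample variance, divisor n - 1
var_hat <- (s2 / n) * ((N - n) / N)   # estimated variance with the factor (N - n) / N
se_hat  <- sqrt(var_hat)              # estimated standard error of mu_hat

c(mu_hat = mu_hat, var_hat = var_hat, se_hat = se_hat)
```

The factor $(N-n)/N$ is the finite population correction: it shrinks the estimated variance as the sample covers a larger share of the population and equals zero when $n = N$.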