# p-value

A p-value is the probability of observing data at least as favorable to $H_A$ as our current data set, if in fact $H_0$ were true.

If the p-value is below the chosen significance level (usually 5%), we reject $H_0$ in favor of $H_A$; otherwise we fail to reject $H_0$.

### Calculating a p-value

For a sample mean, the p-value can be calculated as a tail area of the normal distribution determined by $\bar{x}$, $\mu$, and $\sigma$:

$$P(\bar{x}>9.7 \mid \mu=8,\ \sigma=0.5)=0.0003$$

We can also represent this in terms of Z:

$$P(Z>3.4)=0.0003$$
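
The Z value comes from standardizing the sample mean, assuming $\sigma$ here is the standard deviation of the sampling distribution of $\bar{x}$:

$$Z=\frac{\bar{x}-\mu}{\sigma}=\frac{9.7-8}{0.5}=3.4$$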

We can also compute this in Julia/R, where `pnorm(q, mean, sd)` (R's normal CDF) returns the lower-tail probability:

    # Note the '1 -' to get the upper tail (>) rather than the lower tail (<).
    1 - pnorm(9.7, 8, 0.5)

0.0003369292656768552
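
`pnorm` is an R function, so a native Julia version needs a distributions library. The sketch below assumes the Distributions.jl package is installed (it does not appear in the original); its `ccdf` returns the upper-tail probability directly, so no `1 -` is needed. The variable names are illustrative.

```julia
using Distributions  # assumption: Distributions.jl is installed (not in Julia's standard library)

xbar, mu, sigma = 9.7, 8.0, 0.5

# Standardize the sample mean.
z = (xbar - mu) / sigma

# Upper-tail probability P(x̄ > 9.7) under Normal(mu, sigma); ccdf(d, x) equals 1 - cdf(d, x).
p_value = ccdf(Normal(mu, sigma), xbar)

# The same tail area on the Z scale, using the standard normal.
p_value_z = ccdf(Normal(), z)

println("z = ", round(z, digits=2), ", p-value = ", round(p_value, digits=6))
```

Both `p_value` and `p_value_z` should agree with the value shown above, up to floating-point error.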


### Simulating for a p-value

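We can also estimate a p-value by simulation. The code below randomly splits 35 successes (`'A'`) and 13 failures (`'B'`) between two groups, records the difference in success proportions, and repeats this 10,000 times. The p-value is the fraction of simulated differences at least as extreme as 0.3 in either direction: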

    using Plots  # provides gr(), histogram, and plot!

    # Randomly assign each success ('A') and each failure ('B') to one of two groups.
    function simulate(success_count, fail_count)
        g1 = Char[]
        g2 = Char[]

        for _ in 1:success_count
            if rand(1:2) == 1
                push!(g1, 'A')
            else
                push!(g2, 'A')
            end
        end
        for _ in 1:fail_count
            if rand(1:2) == 1
                push!(g1, 'B')
            else
                push!(g2, 'B')
            end
        end

        return (g1, g2)
    end

    # Difference in success proportions between the two groups, one per simulation.
    differences = Float64[]
    simulation_count = 10000
    for _ in 1:simulation_count
        g1, g2 = simulate(35, 13)
        push!(differences, count(==('A'), g1) / length(g1) - count(==('A'), g2) / length(g2))
    end

    gr()
    histogram(differences, bins=:scott, label="difference")
    plot!(title = "Frequency of Difference over $(simulation_count) simulations")


    # Calculate the p-value: share of simulated differences at least as extreme as 0.3
    count(x -> (x >= 0.3) || (x <= -0.3), differences) / length(differences)

0.0229
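
The final counting step can be wrapped in a small function so the cutoff is not hard-coded. This is only a sketch: `two_sided_pvalue` is a hypothetical helper, not part of the original code, and the example vector is made up so the result can be checked by hand.

```julia
# Hypothetical helper (not in the original): fraction of simulated differences
# at least as extreme, in absolute value, as the observed difference.
function two_sided_pvalue(differences, observed)
    return count(d -> abs(d) >= abs(observed), differences) / length(differences)
end

# Hand-checkable example: 3 of the 8 values (-0.35, 0.3, 0.31) satisfy |d| >= 0.3.
example = [-0.35, -0.1, 0.0, 0.05, 0.2, 0.3, 0.31, -0.29]
println(two_sided_pvalue(example, 0.3))  # prints 0.375
```

Applied to the simulation above, `two_sided_pvalue(differences, 0.3)` performs the same count as the last line of that code. Because the simulation is random, the estimate changes slightly from run to run.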