this post was submitted on 11 Oct 2023
The R Project for Statistical Computing
@Jey_snow @rstats @phdstudents @datascience @socialscience @org_studies
P-values can be generated from various statistical tests, so a P-value gives no indication of whether the appropriate test was used to analyze the data.
Here are a couple of papers on P-values:
https://pubmed.ncbi.nlm.nih.gov/29566133
https://pubmed.ncbi.nlm.nih.gov/26545564
@MarcusMuench @rstats @phdstudents @datascience @socialscience @org_studies
According to the second article:
"...A p value should be interpreted in terms of what would happen if you repeated the measurement multiple times with different samples..."
If I have a census, I would expect zero difference for repeating measurements due to random sampling. Therefore p values are irrelevant for census data.
Thanks for the references!
Careful there. If you had a census of ALL the people in your population, then you would not expect any variation, as you wrote. The reason is not random sampling, though: the next time you sample, you would simply sample the exact same people (your whole population). And since the sample stays the same, so do the numbers.
If, however, you truly took a random sample of the population, then the next time you take a new random sample you would ask different people and would therefore get slightly different numbers. That is where p-values are useful, because they are based on exactly this question: "Well, what if I took another random sample, and then another, and then another, and so on?"
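To make the census vs. random sample point concrete, here is a rough simulation sketch. It is in Python with only the standard library, but the same idea carries over to R with `sample()` and `mean()`. Everything in it is an assumption for illustration: the made-up population of 10,000 values, the sample size of 100, the seeds, and the helper names `census_mean` and `sample_means`.

```python
import random
import statistics


def census_mean(population):
    """Mean over the whole population: nothing is drawn at random."""
    return statistics.mean(population)


def sample_means(population, sample_size, n_repeats, seed=0):
    """Mean of a fresh random sample (without replacement), repeated n_repeats times."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_repeats)
    ]


# Hypothetical population: 10,000 made-up measurements (assumption for illustration).
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

# "Repeating" the census always asks the same 10,000 people.
census_results = [census_mean(population) for _ in range(5)]

# Repeating a random sample asks 100 different people each time.
random_results = sample_means(population, sample_size=100, n_repeats=5)

census_spread = max(census_results) - min(census_results)
sample_spread = max(random_results) - min(random_results)

print("census means:", census_results)
print("sample means:", random_results)
print("census spread:", census_spread)
print("sample spread:", sample_spread)
```

The census spread comes out as zero because the same values go in every time, while the sample means wobble from draw to draw. That wobble is the sample-to-sample variation a p-value is reasoning about.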