p-value - wint

p-value

en: p-value

ja: p値、P値

aka. significant probability

ja: 有意確率

統計の謎概念

非常に誤解しやすい。

謎

これはなんだ？wint.icon

注意

有意性と effect size は別なコトに注意せよ。

端的な定義

帰無仮説を棄却できる最小の有意水準

ref. 黒木学 (2020)『数理統計学』p. 197

https://x.com/toshizumi1225/status/1803321269284479289

帰無仮説が成り立つと仮定した確率モデルにおいて実際にデータから計算された統計量よりも極端な統計量が観測される確率

解釈

ざっくり、（モデルから見た）外れ値が出る確率、で良い様だ。

信頼区間や統計的仮説検定と深い関係がある様だ。

分類

p-value

⊂ test statistic（検定統計量）

⊂ sample statistic（標本統計量）= statistic（統計量）

操作的定義

形式的に定式化する。

ref. probability - Two definitions of p-value - Mathematics Stack Exchange (answer)

ℙ, 𝕀 が謎

3種類の定義がある。

The probability of observing under H₀ a sample outcome at least as extreme as the one observed is called the P-value. The smaller the P-value, the more extreme the outcome and the stronger the evidence against H₀.

ref. Rohatgi (2001)

p-value. In general, the p-value is the smallest level α₀ such that we would reject the null-hypothesis at level α₀ with the observed data.

ref. Schervish (2012)

形式的な定義に一番近そう。

p値の満すべき性質（制約）

A p-value $ p(X) is a test statistic satisfying $ 0 \le p(X) \le 1 for every sample point $ x. Small values of $ p(X) give evidence that $ H_1 is true. A p-value is valid if, for every $ \theta \in \Theta_0 and every $ 0 \le \alpha \le 1: $ P_{\theta}(p(X) \le \alpha) \le \alpha.

ref. Casella, Berger (2002) Statistical Inference, ISBN: 9780534243128

def. 8.3.26 (§8.3.4, p. 397)

設定

モデルのパラメーター空間（の部分空間） Θ

population parameter θ

これを2分割する。

帰無仮説 H₀

Θ の2分割の片方

対立仮説 H₁

Θ の2分割のもう片方

標本 s

そのドメインは S？

標本集合 Ω

標本点 ω

事象空間 F

標本のあつまり S？

検定統計量 T

確率測度 P, ℙ: F→I

power function β

検出力函数という関数らしい

cf. 検出力

sample size に比例する？

サイズ = 水準: α ∈ I

def. 偽陽性の sup

size

ref. (Casella; Berger, 2002, def. 8.3.5, p. 385)

level

ref. (Casella; Berger, 2002, def. 8.3.6, pp. 382–383)

同義語

有意水準 α*

自分で決められる。

モデル外部からのパラメーターと言えるか。

2元空間 2 = {0,1}

単位区間 I

単位開区間 $ I_\text{open} = (0,1)

仮説検定関数 φ

a collection of hypothesis tests $ φ_α

指示関数 $ φ_α(x) は α について単調

$ P_\theta(\bold{X}) : P of type 1 error OR 1 - P of type 2 error

ref. (Casella; Berger, 2002, §8.3.1, pp. 382–383)

これで power function β を定義する。

ref. (Casella; Berger, 2002, def. 8.3.1, p. 383)

要約

p値とは、標本ごとに水準を返す関数 p

標本は標本点でなく事象か？ #TODO

定義

$ p: S → I_\text{open} = λs. \inf\{α∈I_\text{open} \mid φ_α(s) = 1\}

prop.

自分の理解と手順

帰無仮説と対立仮説を立ててモデル化する。

ここで有意水準を決められる。

それを下回るなら検定の trade-off として甘受して検定する。

帰無仮説を棄却する一様最強検出力検定を実施したときの水準αのこと？

有意水準なので、あってそう

注意

検定理論も古典統計なので、プロセスの正しさしか示せない。モデルの正しさは別途たしかめる必要がある。

p-値が小さいことからモデルの正しさを結論するのは、後件肯定の誤謬である。

ref. 黒木学 (2020)『数理統計学』p. 197–198

統計的仮説検定の具体例 #TODO

t検定

標本統計量の分布を利用する。

つまり parametric test

Wilcoxonの順位和検定

nonparametric test

ref.

p-value - Wikipedia

Test statistic - Wikipedia

p値の正しい解釈方法 How to Correctly Interpret P Values

https://www.youtube.com/watch?v=vz9cZnB1d1c

Thread by @genkuroki on Thread Reader App – Thread Reader App

https://threadreaderapp.com/images/screenshots/thread/1345346374640967683.jpg

https://twitter.com/genkuroki/thread/1345346374640967683

P値と信頼区間は表裏一体で別の話題にはならない。

https://twitter.com/genkuroki/status/1345346377316933633

P値の定義は大雑把に「データの数値以上の偏りがモデル内で生じる確率」です。

https://twitter.com/genkuroki/status/1334520319193726977

信頼区間=検定で棄却されないモデルのパラメータの範囲

https://x.com/genkuroki/status/1804178038105739275

P値はモデルの確率分布でデータの数値以上に極端な値が生成される確率またはその近似値

https://ncatlab.org/nlab/show/statistical+significance

Size (statistics) - Wikipedia

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations | SpringerLink

https://doi.org/10.1007/s10654-016-0149-3

Biau, David Jean MD1, a; Jolles, Brigitte M. MD Msc, MD2; Porcher, Raphaël PhD1. P Value and the Theory of Hypothesis Testing: An Explanation for New Researchers. Clinical Orthopaedics and Related Research 468(3):p 885-892, March 2010. | DOI: 10.1007/s11999-009-1164-4

https://journals.lww.com/clinorthop/fulltext/2010/03000/p_value_and_the_theory_of_hypothesis_testing__an.32.aspx

https://doi.org/10.1007%2Fs11999-009-1164-4

#statistics #検定理論