
# Risky p-value!

In the Stat 101 course at UQÀM, students are taught that a p-value is "the smallest risk one incurs in rejecting $H_0$".

What does that mean? The professor claims the idea is to contract the sentence "the p-value is the smallest threshold below which we reject $H_0$". One should really say that "the p-value is the smallest threshold below which we *would have* rejected $H_0$", since the test requires the threshold to be specified before the data are observed, but let's move on... To carry out the contraction, one must then identify the threshold with the type I error risk. The problem: since this threshold now depends on the data, it no longer has anything to do with the risk of anything.
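To make the "smallest threshold" reading concrete, here is a minimal sketch (my own illustration, not from the post) for a two-sided z-test: the test rejects at level $\alpha$ exactly when $\alpha$ is at least the p-value.

```python
# A two-sided z-test: the p-value is the smallest level alpha at which
# H0 would have been rejected, once the data (here, z_obs) are in hand.
from math import erf, sqrt


def p_value_two_sided_z(z):
    """Two-sided p-value for an observed z-statistic under N(0, 1)."""
    phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))  # standard normal CDF
    return 2.0 * (1.0 - phi(abs(z)))


def rejects(z, alpha):
    """Reject H0 at level alpha iff the p-value does not exceed alpha."""
    return p_value_two_sided_z(z) <= alpha


z_obs = 1.96
p = p_value_two_sided_z(z_obs)  # roughly 0.05
# The test rejects exactly for levels alpha >= p.
assert rejects(z_obs, p + 1e-9) and not rejects(z_obs, p - 1e-9)
```

The assertion at the end is the point of the post's quibble: the quantity $p$ is computed *from* the data, so reading it as a pre-specified "risk" conflates two different objects.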

# Two Sampling Algorithms for Trigonometric Densities

Trigonometric densities (or non-negative trigonometric sums) are probability density functions of circular random variables (i.e. $2\pi$-periodic densities) which take the form

$$f(u) = a_0 + \sum_{k=1}^n \big(a_k \sin(ku) + b_k\cos(ku)\big) \tag{1}$$

for some real coefficients $a_k, b_k \in \mathbb{R}$ such that $f(u) \geq 0$, with $a_0 = \frac{1}{2\pi} \int f(u)\,du = (2\pi)^{-1}$. These provide flexible models of circular distributions. Circular density modelling arises in studies of the mechanisms of animal orientation, and also in bioinformatics in connection with the protein structure prediction problem (the secondary structure of a protein, i.e. the way its backbone folds, is determined by a sequence of angles).

Here I discuss two simple sampling algorithms for such trigonometric densities: the first is the rejection sampling algorithm proposed in Fernández-Durán et al. (2014), and the second uses negative mixture sampling.
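As a point of reference, here is a generic rejection sampler for densities of form (1), a minimal sketch of my own rather than the exact algorithm of Fernández-Durán et al. (2014). It uses a uniform proposal on $[0, 2\pi)$ together with the constant envelope $f(u) \leq a_0 + \sum_k \sqrt{a_k^2 + b_k^2}$ (each harmonic $a_k \sin(ku) + b_k \cos(ku)$ is bounded by its amplitude $\sqrt{a_k^2 + b_k^2}$).

```python
# Generic rejection sampling from a trigonometric density
# f(u) = 1/(2*pi) + sum_k (a_k sin(k u) + b_k cos(k u)),
# assuming the coefficients make f non-negative.
import math
import random


def trig_density(u, a, b):
    """Evaluate f(u) for coefficient lists a = [a_1, ...], b = [b_1, ...]."""
    f = 1.0 / (2.0 * math.pi)
    for k, (ak, bk) in enumerate(zip(a, b), start=1):
        f += ak * math.sin(k * u) + bk * math.cos(k * u)
    return f


def sample_trig(a, b, n, seed=0):
    """Rejection sampling with a constant envelope over [0, 2*pi)."""
    rng = random.Random(seed)
    # Envelope: f(u) <= a0 + sum_k sqrt(a_k^2 + b_k^2).
    bound = 1.0 / (2.0 * math.pi) + sum(
        math.hypot(ak, bk) for ak, bk in zip(a, b)
    )
    out = []
    while len(out) < n:
        u = rng.uniform(0.0, 2.0 * math.pi)
        if rng.uniform(0.0, bound) <= trig_density(u, a, b):
            out.append(u)
    return out


# Example: one harmonic; amplitude sqrt(0.05^2 + 0.1^2) < 1/(2*pi),
# so f is non-negative everywhere.
samples = sample_trig(a=[0.05], b=[0.1], n=1000)
```

The acceptance rate is $1/(2\pi \cdot \text{bound})$, so this naive envelope degrades as the number of harmonics grows, which is one motivation for more tailored schemes.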

# The Significance of the Adjusted R Squared Coefficient

My friend Anthony Coache and I have been curious about uses and misuses of the adjusted $R^2$ coefficient, which comes up in linear regression for model comparison and as a measure of "goodness of fit". We were underwhelmed by the depth of the literature arguing for its use, and wanted to show exactly how it behaves under certain sets of assumptions. Investigating the issue brought us to re-interpret the adjusted $R^2$ and to highlight a new distribution-free perspective on nested model comparison which is equivalent, under Gaussian assumptions, to Fisher's classical $F$-test. This generalizes to the comparison of nested GLMs and provides exact comparison tests that are not based on asymptotic approximations. We still have many questions to answer, but here's some of what we've done.