In UQÀM's Stat 101 course, students are taught that a p-value is "the smallest risk one must incur to reject $H_0$."

What does that mean? The professor claims that the idea is to contract the sentence "the p-value is the smallest threshold below which we reject $H_0$." One should really say "the p-value is the smallest threshold below which we *would have* rejected $H_0$," since the test requires the threshold to be specified before the data are observed, but let's move on... To complete the contraction, one must then identify this threshold with the type I error rate. The problem: since the threshold now depends on the data, it no longer has anything to do with the risk of anything.
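To make the point concrete, here is a small simulation of my own (using a two-sided z-test as the example): a threshold $\alpha$ fixed in advance does control the type I error rate, but the rule obtained by plugging the observed p-value in as the threshold is vacuous, since it rejects every single time.

```python
import math

import numpy as np

rng = np.random.default_rng(0)

def pvalue(z):
    # Two-sided p-value of a z-test: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Under H0 the test statistic is N(0, 1), and a threshold alpha fixed
# *before* seeing the data controls the type I error rate:
alpha = 0.05
p = np.array([pvalue(z) for z in rng.normal(size=100_000)])
print(round((p <= alpha).mean(), 2))  # ≈ 0.05: rejecting when p <= alpha

# But once the threshold is the observed p-value itself, the "test" is
# the rule p <= p, which rejects with probability 1 under H0: the
# data-dependent threshold measures no risk at all.
print((p <= p).mean())  # 1.0
```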

Trigonometric densities (or non-negative trigonometric sums) are probability density functions of circular random variables (i.e. $2\pi$-periodic densities) which take the form

$$f(u) = a_0 + \sum_{k=1}^{n} \left( a_k \cos(ku) + b_k \sin(ku) \right)$$

for some real coefficients $a_k, b_k \in \mathbb{R}$ such that $f(u) \geq 0$ and $a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(u)\,du = (2\pi)^{-1}$. These provide flexible models of circular distributions. Circular density modelling comes up in studies of the mechanisms of animal orientation, and also in bioinformatics in connection with the protein structure prediction problem (the secondary structure of a protein, the way its backbone folds, is determined by a sequence of angles).
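As a quick sanity check on the definition, here is a minimal sketch (with illustrative order-1 coefficients I picked myself) that evaluates a density of this form and verifies numerically that it is non-negative and integrates to one over a period:

```python
import numpy as np

# A minimal order-1 example (coefficients chosen for illustration):
# f(u) = a0 + a1*cos(u) + b1*sin(u), with a0 = 1/(2*pi) forced by
# normalization; f >= 0 holds whenever sqrt(a1**2 + b1**2) <= a0.
a0 = 1.0 / (2.0 * np.pi)
a1, b1 = 0.6 * a0, 0.3 * a0  # sqrt(0.36 + 0.09) * a0 < a0

def f(u):
    return a0 + a1 * np.cos(u) + b1 * np.sin(u)

# Riemann sum over one period as a numerical check of normalization.
u = np.linspace(0.0, 2.0 * np.pi, 200_000, endpoint=False)
mass = f(u).mean() * 2.0 * np.pi
print(f(u).min() >= 0.0, round(mass, 6))  # True 1.0
```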

Here I discuss two simple sampling algorithms for such trigonometric densities. The first is the rejection sampling algorithm proposed in Fernández-Durán et al. (2014), and the second uses negative mixture sampling.
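As a preview, here is a generic uniform-envelope rejection sampler for this family (a sketch of my own, not necessarily the exact scheme of Fernández-Durán et al. (2014), with illustrative coefficients): since $f(u) \leq a_0 + \sum_k \sqrt{a_k^2 + b_k^2} =: M$ pointwise, one can propose $U \sim \mathrm{Unif}[0, 2\pi)$ and accept with probability $f(U)/M$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative order-2 coefficients (my choice, not from the paper);
# a0 = 1/(2*pi) is forced by normalization, and the a_k, b_k below
# satisfy sum_k sqrt(a_k**2 + b_k**2) < a0, which guarantees f >= 0.
a0 = 1.0 / (2.0 * np.pi)
a = np.array([0.5, 0.2]) * a0
b = np.array([0.1, 0.1]) * a0

def f(u):
    k = np.arange(1, len(a) + 1)
    u = np.atleast_1d(u)[:, None]
    return a0 + (a * np.cos(k * u) + b * np.sin(k * u)).sum(axis=1)

# Uniform-envelope rejection sampling: f(u) <= M for all u, so proposals
# U ~ Unif[0, 2*pi) are accepted with probability f(U) / M.
M = a0 + np.sqrt(a**2 + b**2).sum()

def sample(n):
    out = np.empty(0)
    while len(out) < n:
        u = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
        v = rng.uniform(0.0, M, size=2 * n)
        out = np.concatenate([out, u[v <= f(u)]])
    return out[:n]

x = sample(10_000)
```

The overall acceptance rate is $\int f / (2\pi M) = 1/(2\pi M)$, so the envelope is efficient precisely when the oscillating part of the density is small relative to $a_0$.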

My friend Anthony Coache and I have been curious about uses and misuses of the adjusted $R^2$ coefficient, which comes up in linear regression for model comparison and as a measure of "goodness of fit". We were underwhelmed by the depth of the literature arguing for its use, and wanted to show exactly how it behaves under certain sets of assumptions. Investigating the issue brought us to re-interpret the adjusted $R^2$ and to highlight a new distribution-free perspective on nested model comparison which is equivalent, under Gaussian assumptions, to Fisher's classical $F$-test. This generalizes to the comparison of nested GLMs and provides exact comparison tests that are not based on asymptotic approximations. We still have many questions to answer, but here's some of what we've done.
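One algebraic fact worth keeping in mind here (a standard result, which the quick simulation of my own below checks numerically): for two nested linear models, the larger one has the higher adjusted $R^2$ exactly when Fisher's $F$ statistic exceeds 1, a far weaker requirement than comparing $F$ to a usual critical value.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_rss(X, y):
    # Residual sum of squares of an OLS fit with an intercept.
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    r = y - Xd @ beta
    return float(r @ r)

def adj_r2(rss, tss, n, p):
    # p counts the predictors, excluding the intercept.
    return 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))

n, p1, q = 60, 2, 3  # small model: p1 predictors; big model adds q more
agree = []
for _ in range(200):
    X = rng.normal(size=(n, p1 + q))
    y = X[:, 0] + rng.normal(size=n)
    tss = float(((y - y.mean()) ** 2).sum())
    rss1 = fit_rss(X[:, :p1], y)   # nested (small) model
    rss2 = fit_rss(X, y)           # full (big) model
    F = ((rss1 - rss2) / q) / (rss2 / (n - p1 - q - 1))
    gain = adj_r2(rss2, tss, n, p1 + q) > adj_r2(rss1, tss, n, p1)
    agree.append(gain == (F > 1.0))
print(all(agree))  # True: adjusted R^2 prefers the big model iff F > 1
```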

My blog Math Stat Notes has been retired and I will continue to post on this new platform. It will be much easier to write mathematics on here, and I will now also be able to post code!

This is also the occasion for me to rethink the purpose of my blog. Up until now, I have used MathStatNotes as a way to showcase some of my side projects and to explore different ideas. I would now like my blog to be much more research and content-focused, with a slightly refined reading experience.

Hence the “open notebook” idea: you will find on here some of the projects that I’m working on, as well as my ideas and concerns. Note that it’s a raw lab notebook and I’m not trying to make it an easy read.

There is no point in hiding the process of research, in my opinion. Did you find something here that could be useful to you? Do get in touch! I will tell you more about it.