Proofs from the notebook Olivier's open lab book.


Honors

Over the past year, I was fortunate to be accepted at Stanford, Duke and the University of Toronto for my PhD in statistics. I also received Canada's Alexander Graham Bell scholarship ($105,000 over three years), ranked first among all applications to NSERC (CRSNG) in the mathematical sciences!

It's a nice reward for what I believe was a lot of effort, and I can't wait to continue at Duke next fall. :^)

How to make a blog like this one

Here’s my new math blogging workflow:

  • I write my posts in Typora, which is a WYSIWYG Markdown editor with LaTeX math support. (It’s really nice and you should check it out!)
  • I then push my project folder to GitHub and the post is published within a few seconds.

It’s very easy to write math this way. No need to struggle with formatting and rendering errors on WordPress. You have an editor with instant preview and complete control over the appearance of the site.

Read more

Risky p-value!

In the Stat 101 course at UQÀM, students are taught that a p-value is the “smallest risk to incur in order to reject $H_0$”.

What does that mean? The professor says the idea is to contract the sentence “the p-value is the smallest level at which we reject $H_0$”. It would be more accurate to say that “the p-value is the smallest level at which we would have rejected $H_0$”, since the test requires the level to be specified before the data are observed, but let's move on… To carry out the contraction, one must then identify the level with the type I error risk. The problem: since this level now depends on the data, it no longer has anything to do with the risk of anything.
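To make the “smallest level at which we would have rejected” reading concrete, here is a minimal sketch for a one-sided z-test. The test, the observed statistic `z_obs = 1.8` and the grid of levels are assumptions chosen for illustration; they do not come from the course.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal reference distribution


def rejects(z_obs, alpha):
    """One-sided z-test at level alpha: reject H_0 when z_obs reaches the critical value."""
    return z_obs >= STD_NORMAL.inv_cdf(1 - alpha)


def p_value(z_obs):
    """Probability, under H_0, of a statistic at least as large as z_obs."""
    return 1 - STD_NORMAL.cdf(z_obs)


z_obs = 1.8  # hypothetical observed statistic
levels = [k / 1000 for k in range(1, 1000)]  # candidate levels 0.001, ..., 0.999

# Smallest level on the grid at which the test would have rejected H_0.
smallest_rejecting_level = min(a for a in levels if rejects(z_obs, a))

print(p_value(z_obs), smallest_rejecting_level)
```

Up to the grid resolution, the two printed numbers agree. Note that `smallest_rejecting_level` is computed from `z_obs`, so it is a function of the data and not a level fixed in advance.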

Read more

Two Sampling Algorithms for Trigonometric Densities

Trigonometric densities (or non-negative trigonometric sums) are probability density functions of circular random variables (i.e. $2\pi$-periodic densities) which take the form

$$f(u) = a_0 + \sum_{k=1}^{n} \left( a_k \cos(ku) + b_k \sin(ku) \right)$$

for some real coefficients $a_k, b_k \in \mathbb{R}$ which are such that $f(u) \geq 0$ and $a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(u)\,du = (2\pi)^{-1}$. These provide flexible models of circular distributions. Circular density modelling comes up in studies of the mechanisms of animal orientation, and also in bioinformatics in connection with the protein structure prediction problem (the secondary structure of a protein, that is, the way its backbone folds, is determined by a sequence of angles).

Here I discuss two simple sampling algorithms for such trigonometric densities. The first is the rejection sampling algorithm proposed in Fernández-Durán et al. (2014) and the second uses negative mixture sampling.
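As a rough illustration of the rejection idea, here is a minimal sketch of a generic rejection sampler with a uniform proposal on $[0, 2\pi]$. It is not necessarily the algorithm of Fernández-Durán et al. (2014). It assumes the density is written as $f(u) = (2\pi)^{-1} + \sum_{k=1}^{n} (a_k \cos(ku) + b_k \sin(ku))$, and the function names and the example coefficients are made up for this sketch.

```python
import math
import random

A0 = 1 / (2 * math.pi)  # constant term forced by normalization


def trig_density(u, a, b):
    """Evaluate f(u) = A0 + sum_k (a[k-1] cos(k u) + b[k-1] sin(k u))."""
    total = A0
    for k, (ak, bk) in enumerate(zip(a, b), start=1):
        total += ak * math.cos(k * u) + bk * math.sin(k * u)
    return total


def sample_trig_density(a, b, rng=random):
    """Draw one angle in [0, 2*pi] from f by rejection from a uniform proposal."""
    # Each term a_k cos(ku) + b_k sin(ku) is bounded by sqrt(a_k^2 + b_k^2),
    # so this constant dominates f everywhere (a crude but valid envelope).
    bound = A0 + sum(math.hypot(ak, bk) for ak, bk in zip(a, b))
    while True:
        u = rng.uniform(0.0, 2 * math.pi)
        if rng.random() * bound <= trig_density(u, a, b):
            return u


# Hypothetical example: f(u) = (1 + cos u) / (2*pi), which is non-negative.
a_example = [1 / (2 * math.pi)]
b_example = [0.0]
draws = [sample_trig_density(a_example, b_example) for _ in range(5)]
```

The expected acceptance rate of this sketch is $1 / (2\pi \cdot \texttt{bound})$, so it degrades when the coefficients are large relative to $a_0$; a tighter envelope would accept more often.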

Read more

The Significance of the Adjusted R Squared Coefficient

My friend Anthony Coache and I have been curious about uses and misuses of the adjusted $R^2$ coefficient, which comes up in linear regression for model comparison and as a measure of “goodness of fit”. We were underwhelmed by the depth of the literature arguing for its use, and wanted to show exactly how it behaves under certain sets of assumptions. Investigating the issue led us to re-interpret the adjusted $R^2$ and to highlight a new distribution-free perspective on nested model comparison which is equivalent, under Gaussian assumptions, to Fisher’s classical $F$-test. This generalizes to the comparison of nested GLMs and provides exact comparison tests that are not based on asymptotic approximations. We still have many questions to answer, but here’s some of what we’ve done.
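For reference, here is the textbook definition of the adjusted $R^2$ and a standard link with the $F$ statistic. This is a well-known identity and not necessarily the perspective developed in the post; the notation ($n$ observations, a small model with $p$ predictors plus an intercept, a nested larger model with $q$ additional predictors, residual sums of squares $\mathrm{RSS}_0$ and $\mathrm{RSS}_1$) is assumed for this sketch.

```latex
% Adjusted R^2 of a model with p predictors and an intercept
\bar{R}^2 = 1 - \frac{\mathrm{RSS} / (n - p - 1)}{\mathrm{TSS} / (n - 1)}

% F statistic comparing the small model (RSS_0) to the larger nested one (RSS_1)
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1) / q}{\mathrm{RSS}_1 / (n - p - q - 1)}

% The adjusted R^2 increases when moving to the larger model iff
\frac{\mathrm{RSS}_1}{n - p - q - 1} < \frac{\mathrm{RSS}_0}{n - p - 1}
\quad \Longleftrightarrow \quad F > 1
```

In other words, choosing the model with the larger adjusted $R^2$ amounts to an $F$-test with the critical value fixed at 1, whatever the sample size.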

Read more