Title: Entropy-Based Goodness-of-Fit Tests. Application to DNA Replication
Authors: Girardin, Valerie; Lequesne, Justine
Affiliation (Girardin): LMNO - Laboratoire de Mathématiques Nicolas Oresme - UNICAEN - Université de Caen Normandie - NU - Normandie Université - CNRS - Centre National de la Recherche Scientifique
Affiliations (Lequesne): CLCC Henri Becquerel - Centre de Lutte Contre le Cancer Henri Becquerel Normandie Rouen; LMNO - Laboratoire de Mathématiques Nicolas Oresme - UNICAEN - Université de Caen Normandie - NU - Normandie Université - CNRS - Centre National de la Recherche Scientifique
Source: HAL CCSD, 2019
URL: https://hal-normandie-univ.archives-ouvertes.fr/hal-02299561
DOI: 10.1080/03610926.2017.1401084
Document type: Journal articles
Language: en
Keywords: DNA replication; Kullback–Leibler divergence; Relative entropy; Goodness-of-fit tests; Shannon entropy
Domain: [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]
HAL record: Girardin, Valerie; 2019-09-27 18:17:08; 2021-11-03 06:19:01; 2019-09-27 18:17:08

Abstract:
This paper mainly aims to unify, within a single goodness-of-fit procedure, the tests based on Shannon entropy (called S-tests), introduced by Vasicek in 1976, and the tests based on relative entropy, or Kullback-Leibler divergence (called KL-tests), introduced by Song in 2002. While Vasicek's procedure is widely used in the literature, Song's has remained less well known. Both tests are known to have good power properties and to lead to straightforward computations. However, some asymptotic properties of the S-tests have never been checked, and the link between the two procedures has never been highlighted. A mathematical justification of both tests is detailed here, showing that they are equivalent for testing any parametric composite null hypothesis of maximum entropy distributions. For testing any other distribution, the KL-tests remain reliable goodness-of-fit tests, whereas the S-tests become tests of entropy level. Moreover, for a simple null hypothesis, only the KL-tests can be considered.
The methodology is applied to a real dataset from a DNA replication process, arising from a collaboration with biologists. The objective is to validate an experimental protocol for detecting chicken cell lines in which the spatiotemporal program of DNA replication is not correctly executed. We propose a two-step approach through entropy-based tests. First, a Fisher distribution with non-integer parameters is retained as the reference distribution; then the experimental protocol is validated.
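
For readers unfamiliar with the S-tests mentioned in the abstract, here is a minimal sketch of the classical ingredient they rest on, Vasicek's m-spacing estimator of Shannon entropy, together with the ratio used in Vasicek's normality test (exponential of the entropy estimate divided by the sample standard deviation). This is an illustration only, not the authors' code: the function names `vasicek_entropy` and `normality_statistic`, the window `m = 10`, and the simulated samples are all assumptions made for the example, and the sketch does not reproduce the unified procedure or the Fisher-distribution analysis of the paper.

```python
import math
import random


def vasicek_entropy(sample, m):
    """Vasicek's m-spacing estimate of Shannon (differential) entropy.

    Assumes a continuous sample: tied values spanning a whole window
    would give a zero spacing and make the logarithm undefined.
    """
    x = sorted(sample)
    n = len(x)
    if not 1 <= m < n / 2:
        raise ValueError("window m must satisfy 1 <= m < n/2")
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]      # X_(i-m), replaced by the minimum near the left edge
        hi = x[min(i + m, n - 1)]  # X_(i+m), replaced by the maximum near the right edge
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n


def normality_statistic(sample, m):
    """exp(entropy estimate) / standard deviation.

    The normal law maximizes entropy for a given variance, so
    small values of this ratio speak against normality.
    """
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / n)
    return math.exp(vasicek_entropy(sample, m)) / sd


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, illustrative data only
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(500)]
    exponential = [rng.expovariate(1.0) for _ in range(500)]
    print("normal sample:     ", normality_statistic(gaussian, 10))
    print("exponential sample:", normality_statistic(exponential, 10))
```

The ratio is invariant under shifts and rescalings of the data, which is why it can serve for a composite null hypothesis (unknown mean and variance). In practice, critical values depend on the sample size and on the window m and are usually obtained by simulation.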