, Chi of E. coli can be considered over-represented from 99 occurrences onward at a significance degree s of 0.0001. Because the Chen-Stein bound equals 0.067726, the Chen-Stein method does not allow a conclusion at significance degrees of 0.01 and 0.001. Moreover, Chi is well known to be a highly relevant word in this bacterium. We therefore expect a very small
