Abstract:
DNA sequences are sequences over the quaternary alphabet $\{A,C,G,T\}$. An important property of such sequences is that they are directed and can form duplexes through hybridization, i.e., the coalescing of two oppositely directed sequences. Biological experiments that exploit this property require an ensemble of such sequences (a DNA code) consisting of pairs of DNA sequences referred to as Watson–Crick duplexes. Furthermore, for any two words of the DNA code that do not form a Watson–Crick duplex, the hybridization energy, a measure of the stability of a potential DNA duplex, must be upper bounded by a constant specified by the conditions of the experiment. This problem can be naturally interpreted in terms of coding theory. Continuing our previous work, we consider a nonadditive similarity function for two DNA sequences, which models their hybridization energy most adequately. For the maximum cardinality of DNA codes based on this similarity function, we establish a Singleton upper bound and present an example of an optimal construction. Using ensembles of DNA codes with special constraints on codewords, which we call Fibonacci ensembles, we obtain a random-coding lower bound on the maximum cardinality of DNA codes under this similarity function.