Publication detail

Set of rules for genomic signal downsampling

SEDLÁŘ, K. ŠKUTKOVÁ, H. VÍTEK, M. PROVAZNÍK, I.

Original title

Set of rules for genomic signal downsampling

Czech title

Soubor pravidel pro podvzorkování genomických signálů

Type

journal article

Language

en

Original abstract

Comparison and classification of organisms based on molecular data is an important task of computational biology, since at least parts of DNA sequences for many organisms are available. Unfortunately, methods for comparison are computationally very demanding, suitable only for short sequences. In this paper, we focus on the redundancy of genetic information stored in DNA sequences. We proposed rules for downsampling of DNA signals of cumulated phase. According to the length of an original sequence, we are able to significantly reduce the amount of data with only slight loss of original information. Dyadic wavelet transform was chosen for fast downsampling with minimum influence on signal shape carrying the biological information. We proved the usability of such new short signals by measuring percentage deviation of pairs of original and downsampled signals while maintaining spectral power of signals. Minimal loss of biological information was proved by measuring the Robinson-Foulds distance between pairs of phylogenetic trees reconstructed from the original and downsampled signals. The preservation of inter-species and intra-species information makes these signals suitable for fast sequence identification as well as for more detailed phylogeny reconstruction.
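The pipeline named in the abstract (a cumulated-phase signal followed by dyadic downsampling) can be illustrated with a minimal sketch. Several details here are assumptions rather than facts from this record: the nucleotide-to-phase mapping in `PHASE`, the use of a Haar-style pairwise mean as the approximation filter, and the function names `cumulated_phase` and `dyadic_downsample`. The article's actual wavelet and its rules for choosing the decomposition level are not given in the abstract.

```python
import math

# Assumed nucleotide phases in radians. This mapping is an illustrative
# assumption; the abstract does not state which mapping the authors use.
PHASE = {
    "A": math.pi / 4,
    "G": 3 * math.pi / 4,
    "C": -3 * math.pi / 4,
    "T": -math.pi / 4,
}


def cumulated_phase(seq):
    """Return the running sum of per-nucleotide phases for a DNA string."""
    total = 0.0
    signal = []
    for base in seq.upper():
        total += PHASE[base]  # raises KeyError on ambiguous symbols such as N
        signal.append(total)
    return signal


def dyadic_downsample(signal, levels):
    """Halve the signal `levels` times using pairwise means.

    Each pass is the Haar approximation branch rescaled by 1/sqrt(2),
    so the output stays on the same amplitude scale as the input.
    """
    out = list(signal)
    for _ in range(levels):
        if len(out) < 2:
            break
        if len(out) % 2:  # pad odd lengths by repeating the last sample
            out.append(out[-1])
        out = [(out[i] + out[i + 1]) / 2 for i in range(0, len(out), 2)]
    return out


if __name__ == "__main__":
    sig = cumulated_phase("ACGTACGTAAGGCCTT")
    short = dyadic_downsample(sig, levels=2)
    print(len(sig), len(short))
```

Each level halves the number of samples, so the reduction factor after `levels` passes is 2 to the power of `levels`. How many levels a sequence of a given length tolerates is what the article's rules address, and that choice is not reproduced here.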

Czech abstract

Comparison and classification of organisms based on molecular data is an important part of computational biology. In this paper, we focus on the redundancy of genetic information in DNA sequences.

RIV year

2015

Published

4 June 2015

Publisher

Elsevier

Place

USA

Pages from

1

Pages to

7

Page count

7

URL

Full text in the Digital Library

BibTeX


@article{BUT115093,
  author="Karel {Sedlář} and Helena {Škutková} and Martin {Vítek} and Ivo {Provazník}",
  title="Set of rules for genomic signal downsampling",
  annote="Comparison and classification of organisms based on molecular data is an important task of computational biology, since at least parts of DNA sequences for many organisms are available. Unfortunately, methods for comparison are computationally very demanding, suitable only for short sequences. In this paper, we focus on the redundancy of genetic information stored in DNA sequences. We proposed rules for downsampling of DNA signals of cumulated phase. According to the length of an original sequence, we are able to significantly reduce the amount of data with only slight loss of original information. Dyadic wavelet transform was chosen for fast downsampling with minimum influence on signal shape carrying the biological information. We proved the usability of such new short signals by measuring percentage deviation of pairs of original and downsampled signals while maintaining spectral power of signals. Minimal loss of biological information was proved by measuring the Robinson-Foulds distance between pairs of phylogenetic trees reconstructed from the original and downsampled signals. The preservation of inter-species and intra-species information makes these signals suitable for fast sequence identification as well as for more detailed phylogeny reconstruction.",
  address="USA",
  chapter="115093",
  doi="10.1016/j.compbiomed.2015.05.022",
  howpublished="online",
  institution="Elsevier",
  number="p1",
  volume="64",
  year="2015",
  month="june",
  pages="1--7",
  publisher="Elsevier",
  type="journal article"
}