SUPPLEMENT
Spectral Analysis of Sequence Variability in Basic-Helix-loop-helix (bHLH) Protein Domains: Implication of the Helix Structure
Zhi Wang and William R. Atchley
Graduate Program in Biomathematics and Bioinformatics
North Carolina State University, Raleigh, NC 27695-7614, USA
Sequence Database
Multiple alignments of the bHLH domains of 196 protein sequences, listed in
bhlh_database.xls, were used in previous literature and in this study. After
removing the loop region between residue 32 and residue 46, 49 columns remain
in the multiple alignment, which is stored in bhlh_multiple_alignment.txt.
Justification of the removal of the loop region: Because an alpha-helix generally has a periodicity of 3.40 - 3.91 aa per turn (Kyte, 1995), this paper focuses mainly on the short-range periodic components (i.e., the high-frequency components). The entropy values of the loop region are low (see the entropy profile of the bHLH domain without the loop removal). We compared the spectral density plots of the entropy profiles of the bHLH domain with and without the loop region. Removing the loop region has almost no effect on the short-range periodic components; it only eliminates some long-range periodic components that arise from the low entropy values of the loop region. Therefore, the results in this paper are not affected by the removal of the loop region.
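The loop removal described above can be sketched in Python as follows. This is only an illustration and not the procedure used to produce bhlh_multiple_alignment.txt: the function names, the toy sequences, and the reading of a residue range as 1-based and inclusive are assumptions made here. The second helper converts the 3.40 - 3.91 aa per turn periodicity into the corresponding frequency band in cycles per residue.

```python
def remove_columns(alignment, first, last):
    """Drop alignment columns first..last (assumed 1-based and inclusive)."""
    return [seq[:first - 1] + seq[last:] for seq in alignment]


def helix_frequency_band(min_period=3.40, max_period=3.91):
    """Convert a periodicity range (aa per turn) to cycles per residue."""
    return 1.0 / max_period, 1.0 / min_period


# Toy alignment (hypothetical sequences, 10 columns each).
toy = ["ABCDEFGHIJ", "KLMNOPQRST"]
trimmed = remove_columns(toy, 4, 6)  # drops columns 4, 5 and 6
low, high = helix_frequency_band()
print(trimmed)
print(round(low, 3), round(high, 3))
```

For the real data the same call would take the loop boundaries (32 and 46) once it is known whether the boundary residues themselves belong to the removed region.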
Entropy and Factor Means/Variances Profiles
The data can be found in whole_data.xls, and the plots are shown in Fig. 1
of the paper.
Perl Script for Calculating the Entropy Profile and 1000 Bootstrap Entropy Samples
ActivePerl (available as a free download) must be installed to run the Perl script.
The Perl script used in this paper is in entropy_bootstrap_pl.txt.
The output entropy profile is stored in bhlh_entropy.txt, and the 1000 bootstrap
entropy samples are stored in entropy_bootstrap.txt.
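For readers without Perl, the two outputs can be approximated by the Python sketch below. It is not a port of entropy_bootstrap_pl.txt. The use of Shannon entropy with log base 2, the treatment of every character (including gaps) as an ordinary symbol, and the bootstrap scheme of resampling sequences (rows) with replacement are all assumptions, and the toy alignment is hypothetical.

```python
import math
import random
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_profile(alignment):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [column_entropy(col) for col in zip(*alignment)]


def bootstrap_profiles(alignment, n_samples=1000, seed=0):
    """Resample sequences with replacement and recompute the entropy profile."""
    rng = random.Random(seed)
    n = len(alignment)
    return [
        entropy_profile([alignment[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_samples)
    ]


# Toy alignment (hypothetical): 4 sequences, 4 columns.
toy = ["AKRL", "AKQL", "ARRL", "AKRM"]
profile = entropy_profile(toy)
samples = bootstrap_profiles(toy, n_samples=5)
print([round(h, 3) for h in profile])
print(len(samples), "bootstrap profiles of length", len(samples[0]))
```

Each bootstrap profile has one entropy value per alignment column, so 1000 samples of the 49-column alignment would form a 1000 x 49 table.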
Spectral Analysis
(I) Finite Fourier transformation method. The spectral analysis of the entropy
and Factor means/variances profiles by finite Fourier transformation, the white
noise tests, and the regression analysis are conducted by the SAS program
spectral_fft_tests.sas. The spectral density plots are shown in Fig. 2 of the
paper. The results of Fisher's Kappa test and Bartlett's Kolmogorov-Smirnov (BKS)
white noise test are summarized in white_noise_tests_bHLH.doc. Although white
noise tests can provide useful information, statistical significance should be
equated with biological significance only with caution.
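The quantities behind the two white noise tests can be illustrated with a short Python sketch. It is a stand-in for explanation only and does not reproduce spectral_fft_tests.sas. It assumes a mean-centered profile, periodogram ordinates at the Fourier frequencies k/n for k = 1..floor((n-1)/2), Fisher's Kappa taken as the largest ordinate divided by the mean ordinate, and the BKS statistic taken as the largest absolute gap between the normalized cumulative periodogram and a uniform CDF. No p-values are computed, and the test profile is synthetic.

```python
import cmath
import math


def periodogram(x):
    """Periodogram of the centered series at frequencies k/n (cycles per residue)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, (n - 1) // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(coef) ** 2 / n)
    return freqs, power


def fisher_kappa(power):
    """Largest periodogram ordinate divided by the mean ordinate."""
    return max(power) / (sum(power) / len(power))


def bks_statistic(power):
    """Largest gap between the normalized cumulative periodogram and a uniform CDF."""
    m = len(power)
    total = sum(power)
    running, worst = 0.0, 0.0
    for j in range(1, m):
        running += power[j - 1]
        worst = max(worst, abs(running / total - j / (m - 1)))
    return worst


# Synthetic 49-point profile with a period of 3.5 residues (hypothetical data).
profile = [math.cos(2 * math.pi * t / 3.5) for t in range(49)]
freqs, power = periodogram(profile)
peak = freqs[power.index(max(power))]
kappa = fisher_kappa(power)
bks = bks_statistic(power)
print("peak period (aa):", round(1.0 / peak, 2))
print("kappa:", round(kappa, 2), "BKS:", round(bks, 3))
```

A large Kappa points to a single dominant periodic component, while a large BKS value indicates that the spectrum as a whole departs from the flat spectrum expected of white noise.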
(II) Burg method. The spectral analysis of the entropy and Factor means/variances
profiles by the Burg method is conducted by the Matlab script Spectral_Burg.m.
The spectral density plots are similar to Fig. 2 in the paper and are therefore
provided in Spectral_burg_plots.pdf in the supplementary material.
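As a rough illustration of what a Burg spectral estimate involves, the Python sketch below fits an autoregressive model with Burg's recursion and evaluates the resulting spectral density on a frequency grid. It is not a translation of Spectral_Burg.m. The model order of 2, the frequency grid, and the noisy synthetic series are hypothetical choices made only for this example.

```python
import cmath
import math
import random


def burg_ar(x, order):
    """Fit AR coefficients [1, a1, ..., ap] and residual variance by Burg's recursion."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]
    ef, eb = x[1:], x[:-1]            # forward and backward prediction errors
    a = [1.0]
    err = sum(v * v for v in x) / n
    for _ in range(order):
        num = -2.0 * sum(f * b for f, b in zip(ef, eb))
        den = sum(f * f for f in ef) + sum(b * b for b in eb)
        k = num / den                 # reflection coefficient
        padded = a + [0.0]
        a = [padded[i] + k * padded[-1 - i] for i in range(len(padded))]
        err *= 1.0 - k * k
        ef, eb = ([f + k * b for f, b in zip(ef[1:], eb[1:])],
                  [b + k * f for f, b in zip(ef[:-1], eb[:-1])])
    return a, err


def burg_psd(a, err, freqs):
    """AR spectral density err / |A(f)|^2 at frequencies in cycles per residue."""
    return [
        err / abs(sum(c * cmath.exp(-2j * math.pi * f * i)
                      for i, c in enumerate(a))) ** 2
        for f in freqs
    ]


# Synthetic 49-point series: period 3.6 residues plus noise (hypothetical data).
rng = random.Random(1)
signal = [math.cos(2 * math.pi * t / 3.6) + rng.gauss(0.0, 0.2) for t in range(49)]
coeffs, noise_var = burg_ar(signal, order=2)
grid = [i / 1000 for i in range(1, 501)]
psd = burg_psd(coeffs, noise_var, grid)
peak_freq = grid[psd.index(max(psd))]
print("peak period (aa):", round(1.0 / peak_freq, 2))
```

Unlike the periodogram, the autoregressive spectrum is not restricted to the Fourier frequencies k/n, which can help when a series is as short as 49 positions.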
Harmonic Analysis
The harmonic analysis, which finds the best period estimate and its 95% confidence
interval from the 1000 bootstrap samples, is conducted by the Matlab script
entropy_bootstrap_harmonic.m.
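One way to picture this step is the Python sketch below, which is not derived from entropy_bootstrap_harmonic.m. It assumes that the best period of a profile is the trial period whose sinusoid captures the most power of the centered profile (used here as a proxy for a least-squares harmonic fit) and that the 95% interval is a simple percentile interval over the bootstrap estimates. The trial grid and the synthetic bootstrap samples are hypothetical.

```python
import cmath
import math
import random


def best_period(profile, periods):
    """Trial period whose sinusoid captures the most power of the centered profile."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]

    def power(p):
        return abs(sum(v * cmath.exp(-2j * math.pi * t / p)
                       for t, v in enumerate(centered))) ** 2

    return max(periods, key=power)


def percentile_interval(values, level=0.95):
    """Percentile interval over the estimates (an assumed CI method)."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[math.floor(tail * last)], ordered[math.ceil((1.0 - tail) * last)]


# Trial periods from 3.00 to 4.50 aa in steps of 0.01 (hypothetical grid).
trial = [3.0 + 0.01 * i for i in range(151)]

# Synthetic stand-ins for bootstrap entropy profiles (period 3.6 plus noise).
rng = random.Random(0)
samples = [
    [math.cos(2 * math.pi * t / 3.6) + rng.gauss(0.0, 0.3) for t in range(49)]
    for _ in range(100)
]
estimates = [best_period(s, trial) for s in samples]
low, high = percentile_interval(estimates)
print("95% interval for the period (aa):", round(low, 2), "-", round(high, 2))
```

With the real data, each row of the 1000 bootstrap entropy samples would take the place of one synthetic profile.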
REFERENCE
Atchley, W.R., Zhao, J., Fernandes, A.D. & Drüke, T. (2005) Solving the protein
sequence "metric" problem. Proc. Natl. Acad. Sci. USA 102, 6395-6400.