
UvA-DARE (Digital Academic Repository)

An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes

van Passel, M.W.J.; Bart, A.; Waaijer, R.J.A.; Luyf, A.C.M.; van Kampen, A.H.C.; van der Ende, A.

Publication date: 2004

Published in: Nucleic Acids Research

Citation for published version (APA):

van Passel, M. W. J., Bart, A., Waaijer, R. J. A., Luyf, A. C. M., van Kampen, A. H. C., & van der Ende, A. (2004). An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes. Nucleic Acids Research, 32(14), e114.


An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes

M. W. J. van Passel, A. Bart, R. J. A. Waaijer¹, A. C. M. Luyf¹, A. H. C. van Kampen¹ and A. van der Ende*

Department of Medical Microbiology and ¹Bioinformatics Laboratory, Academic Medical Center, Amsterdam, the Netherlands

Received May 19, 2004; Revised and Accepted July 28, 2004

ABSTRACT

In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be recognized, among other features, by atypical clustering of dinucleotides. We hypothesized that atypical clustering of hexameric endonuclease recognition sites in aDNA allows the specific isolation of anomalous sequences in vitro. Clustering of endonuclease recognition sites in aDNA regions of eight published prokaryotic genome sequences was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome, using four selected endonucleases, revealed that of the 27 small fragments predicted (<5 kb), 21 were located in known genomic islands. Of the 24 calculated fragments (>300 bp and <5 kb), 22 met our criteria for aDNA, i.e. a high dinucleotide dissimilarity and/or aberrant GC content. The four enzymes also allowed the identification of aDNA fragments from the related Z2491 strain. Similarly, the sequenced genomes of three strains of Escherichia coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments of anomalous composition. In vitro applicability of the method was demonstrated by using adaptor-linked PCR, yielding the predicted fragments from the N.meningitidis MC58 genome. In conclusion, this strategy allows the selective isolation of aDNA from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme.

INTRODUCTION

Horizontal gene transfer (HGT) was identified as early as 1944, in the same experiment that demonstrated the transformation of non-virulent to virulent Streptococcus pneumoniae (1). The extent of HGT as an evolutionary phenomenon had not been addressed quantitatively on a genomic scale until Lawrence and Ochman (2) calculated that 18% of the genome of Escherichia coli MG1655 had been horizontally transferred since its divergence from the Salmonella lineage 100 million years ago. This identified HGT as a major factor in prokaryotic genome evolution. Recently, an extensive database of horizontally transferred genes based on complete bacterial and archaeal genomes has been made available (3).

The rationale behind the computational identification of horizontally transferred DNA is the genome hypothesis, which proposes that for a given prokaryotic genus, genomic DNA is relatively constant in codon usage and GC content (4,5). In contrast, horizontally acquired anomalous DNA differs in codon usage and/or GC composition from the recipient genome and can therefore be identified when substantial sequence information is available.

An additional parameter in lateral genomics is based on oligonucleotide compositional extremes: the dinucleotide relative abundance values or genome signature r* (6,7). The genome signature is constant among members of a genus, but deviates substantially between members of different genera (8). When used for intragenomic comparisons, r* makes an excellent parameter for the identification of anomalous DNA regions. Aberrant dinucleotide frequencies in aDNA are then expressed as the genome dissimilarity d*, being the average dinucleotide relative abundance difference between the aDNA region and the whole genome (6–8). Although the genome signature is capable of identifying clusters of alien genes and acquired pathogenicity-associated islands (PAI) with an atypical nucleotide composition, highly expressed regions such as ribosomal clusters can also display aberrant dinucleotide frequencies (8,9).

To date, to our knowledge, no method exists that uses one or more of these parameters to enable the selective isolation of anomalous DNA sequences from a microbial genome in vitro. In order to develop such a technique, we investigated a special group of oligonucleotide composition extremes: the local overrepresentation in a genome of palindromic hexanucleotide sequences, specifically restriction endonuclease recognition sites, in aDNA regions. Like the genomic dinucleotide and tetranucleotide frequencies (10,11), frequencies of restriction sites vary between the genomes of different microbial species (12). Avoidance of cognate recognition sequences is probably the operating mechanism (13,14). An HGT event between different organisms may introduce clusters of certain restriction sites in the recipient's genome. Therefore, digestion of the chromosomal DNA with such a restriction endonuclease can produce a limited number of small restriction fragments, comprising potential anomalous DNA, which can be selectively amplified by adaptor-linked PCR [ALP (15)]. The resulting amplicons can be subsequently subcloned and identified by sequence analysis.

*To whom correspondence should be addressed. Tel: +31 20 5664862; Fax: +31 20 6979271; Email: a.vanderende@amc.uva.nl

Nucleic Acids Research, 2004, Vol. 32, No. 14 e114 doi:10.1093/nar/gnh115 © Oxford University Press 2004; all rights reserved

Clustering of restriction endonuclease recognition sites in diverse aDNA regions in prokaryotic genomes was illustrated by the in silico assessment of eight genome sequences of five different species. The restriction enzymes for which the hexameric recognition sites are underrepresented were identified for each genome, and restriction fragments between clustered sites, being <5 kb, were analysed for nucleotide composition concerning GC percentage and genomic dissimilarity.

Next, the restriction fragments of Neisseria meningitidis MC58 between 300 bp and 5 kb were analysed in silico for both GC content and genome signature compared to the genomic values. Also, the restriction fragments obtained with the selected restriction endonucleases from N.meningitidis MC58 and Z2491 strains were compared.

Finally, in order to demonstrate the applicability of this technique in vitro, ALP was performed on chromosomal DNA from strain MC58 digested by each of the selected restriction endonucleases. The resulting amplicons were sequenced to verify the predicted sequence composition.

MATERIALS AND METHODS

Bacterial strain and growth conditions

N.meningitidis MC58 is a serogroup B:15:P1.7,16 strain isolated from a case of invasive infection in the UK (16). This wild-type MC58 strain lacks the erythromycin resistance cassette insertion in the capsule gene locus, in contrast to the sequenced strain MC58 (17). Neisseriae were grown on heated blood (chocolate) agar plates or in liquid Tryptic Soy Broth (DIFCO) medium at 37°C in a humidified atmosphere of 5% CO2.

Chromosomal DNA preparation and digestion

Chromosomal DNA was isolated with the Puregene DNA isolation kit (Biozym). Restriction digests and subsequent heat inactivation were carried out according to the manufacturer's instructions (Roche).

Adaptor-linked PCR and DNA sequencing

Adaptor-linked PCR was performed as described previously (18). The adaptor and linker sets are MP19 (5′-ACG TCG ACT ATC CAT GAA CAG ATC-3′) and MP23 (5′-GAT CTG TTC ATG-3′) for the ScaI-digested genomic template, MP24 (5′-ACC GAC GTC GAC TAT CCA TGA ACA-3′) and MP20 (5′-CTA GTG TTC ATG-3′) for both the NheI- and SpeI-digested chromosomal DNA, and MP24 and MP23 for the BglII-digested genomic template. PCR amplicons were purified by agarose gel extraction (Qiagen) and subcloned into a pCR2.1 vector (Invitrogen) according to the manufacturer's instructions. E.coli DH5α was transformed by standard heat shock procedure. The constructed plasmids were isolated with the Wizard Kit (Promega). Inserts were sequenced using standard M13 primers or primer walking on vector or genomic DNA according to the manufacturer's instruction (ABI). Sequences were analysed using the Staden Package (http://www.mrc-lmb.cam.ac.uk/pubseq/).
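Read as written, each linker oligonucleotide appears to anneal to the 3′ end of its adaptor, leaving either a blunt end or a 4-nt 5′ overhang that matches the ends left by the corresponding enzyme. The following minimal Python sketch checks this pairing. The function name `linker_overhang`, the `min_paired` threshold and the table of enzyme overhangs are illustrative assumptions and are not part of the published protocol.

```python
# Sketch: which single-stranded 5' overhang does each adaptor/linker pair expose?
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def linker_overhang(adaptor: str, linker: str, min_paired: int = 6):
    """Return the unpaired 5' part of the linker when its 3' part is
    base-paired to the 3' end of the adaptor, or None if fewer than
    min_paired bases can pair (min_paired is an arbitrary choice)."""
    for overhang_len in range(len(linker) - min_paired + 1):
        paired = linker[overhang_len:]
        if adaptor.endswith(revcomp(paired)):
            return linker[:overhang_len]
    return None


# Oligonucleotides as listed in the text (spaces removed).
MP19 = "ACGTCGACTATCCATGAACAGATC"
MP23 = "GATCTGTTCATG"
MP24 = "ACCGACGTCGACTATCCATGAACA"
MP20 = "CTAGTGTTCATG"

# 5' overhangs left by each enzyme (ScaI cuts blunt); standard enzyme data,
# listed here only for comparison.
ENZYME_OVERHANG = {"ScaI": "", "NheI": "CTAG", "SpeI": "CTAG", "BglII": "GATC"}

PAIRS = {"ScaI": (MP19, MP23), "NheI": (MP24, MP20),
         "SpeI": (MP24, MP20), "BglII": (MP24, MP23)}

for enzyme, (adaptor, linker) in PAIRS.items():
    found = linker_overhang(adaptor, linker)
    print(enzyme, repr(found), found == ENZYME_OVERHANG[enzyme])
```

Under this reading, the same two adaptors serve four enzymes because only the short linker determines the cohesive end.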

Software

The restriction site frequency tables from the various genomes were obtained from http://tools.neb.com/~posfai/FINISHED. The in silico digestions of the various sequenced genomes (for accession numbers see Table 1) were performed using the Restriction Digest tool from The Institute for Genomic Research (TIGR) (http://www.tigr.org). In silico retrieval and identification of the restriction fragments was performed with the Position Search/Segment Retrieval tool from TIGR (http://www.tigr.org). The different genomes of N.meningitidis were compared using the Artemis Comparison Tool (ACT) (http://www.sanger.ac.uk).
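For readers without access to the TIGR tools, the in silico digestion step can be approximated with a few lines of Python. This is a minimal sketch and not the tool used in the study: the function names are hypothetical, the genome is treated as circular by default, and fragments are measured between the starts of consecutive recognition sites, which gives the same lengths as measuring between cut positions because each enzyme cuts at a fixed offset within its site. Sites spanning the origin of a circular sequence are not detected.

```python
import re

# Recognition sequences (5'->3') of the four enzymes used for N. meningitidis.
SITES = {"BglII": "AGATCT", "NheI": "GCTAGC", "ScaI": "AGTACT", "SpeI": "ACTAGT"}


def site_positions(genome: str, site: str) -> list[int]:
    """0-based start positions of all occurrences of the site."""
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() for m in re.finditer(f"(?={site})", genome.upper())]


def fragments(genome: str, site: str, circular: bool = True) -> list[tuple[int, int]]:
    """(start, length) of every restriction fragment."""
    pos = site_positions(genome, site)
    if not pos:
        return [(0, len(genome))]
    frags = [(a, b - a) for a, b in zip(pos, pos[1:])]
    if circular:
        # The last fragment wraps around the origin to the first site.
        frags.append((pos[-1], len(genome) - pos[-1] + pos[0]))
    else:
        frags = [(0, pos[0])] + frags + [(pos[-1], len(genome) - pos[-1])]
    return frags


def small_fragments(genome: str, site: str, max_len: int = 5000,
                    circular: bool = True) -> list[tuple[int, int]]:
    """Fragments shorter than max_len, the candidates for clustered sites."""
    return [(start, n) for start, n in fragments(genome, site, circular) if n < max_len]


# Toy example: two BglII sites 16 bp apart in an otherwise site-free sequence.
toy = "AGATCT" + "A" * 10 + "AGATCT" + "C" * 7000
print(small_fragments(toy, SITES["BglII"]))
```

Applied to a real genome sequence, `small_fragments` would return the candidate fragments that the study then examined for GC content and d*.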

Data analysis

Fragments were designated anomalous in GC composition if the GC content of the fragment was below the fifth or above the 95th percentile of the genomic GC content distribution, calculated with a window and step size identical to the fragment length (http://www.tigr.org).
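The GC criterion can be illustrated with a short Python sketch. It assumes non-overlapping genomic windows of the fragment's length and a nearest-rank percentile. Both are assumptions made for illustration, since the text does not specify how the TIGR tool computes percentiles, and all function names are hypothetical.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of G+C among the unambiguous bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def window_gc(genome: str, size: int) -> list[float]:
    """GC fraction of consecutive windows; step size equals window size."""
    return [gc_fraction(genome[i:i + size])
            for i in range(0, len(genome) - size + 1, size)]


def percentile(values: list[float], q: float) -> float:
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def gc_is_anomalous(fragment: str, genome: str,
                    low: float = 5, high: float = 95) -> bool:
    """True if the fragment's GC content falls outside the low/high
    percentiles of the genomic distribution for windows of equal length."""
    background = window_gc(genome, len(fragment))
    gc = gc_fraction(fragment)
    return gc < percentile(background, low) or gc > percentile(background, high)


# Toy genome with 50% GC in every window; an AT-only fragment stands out.
toy_genome = "ACGT" * 500
print(gc_is_anomalous("A" * 8, toy_genome))
print(gc_is_anomalous("ACGTACGT", toy_genome))
```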

Table 1. Clustering of restriction enzyme recognition sites in anomalous DNA regions in sequenced genomes of various prokaryotes

Organism                                      Accession number (reference)  Enzyme  Total number of fragments (a)  aDNA (b)
Haemophilus influenzae Rd20                   NC000907 (29)                 XmaIII  7                              7/7
                                                                            ApaI    7                              7/7
E.coli O157:H7 VT2                            NC002695 (30)                 AvrII   5                              5/5
                                                                            XbaI    3                              2/3
E.coli K-12                                   NC000913 (31)                 XbaI    2                              1/1
E.coli CFT073                                 NC004431 (32)                 XbaI    7                              4/5
Salmonella enterica serovar Typhi CT18        NC003198 (33)                 XbaI    7                              6/6
Methanobacterium thermoautotrophicum delta H  NC000916 (34)                 SpeI    11                             7/8
N.meningitidis MC58                           NC003112 (17)                 BglII   9                              6/7
                                                                            ScaI    11                             9/10
                                                                            SpeI    3                              3/3
                                                                            NheI    4                              4/4
N.meningitidis Z2491                          NC003116 (35)                 BglII   4                              1/2
                                                                            ScaI    4                              3/4
                                                                            NheI    2                              2/2
Total                                                                                                              67/74 (91%)

(a) All restriction fragments up to 5 kb are considered in this column.
(b) For aDNA composition calculations concerning GC percentage and genome dissimilarity, only the fragments between 300 bp and 5 kb were considered.

The d* value for each restriction fragment was calculated as described earlier by Karlin and colleagues (7). In brief, the dinucleotide relative abundance values $r^*_{XY}$ are defined as the frequency of the dinucleotide XY divided by the product of the background frequencies of the individual nucleotides, all computed over the sequence together with its reverse complement: $r^*_{XY} = f_{XY}/(f_X f_Y)$. d* is the average absolute dinucleotide relative abundance difference, given by $d^*(f, g) = \frac{1}{16} \sum_{XY} |r^*_{XY}(f) - r^*_{XY}(g)|$, where $r^*_{XY}(f)$ denotes the abundance values calculated for fragment f and $r^*_{XY}(g)$ the abundance values calculated for the genome g. The d* of each fragment was compared to a distribution of d* values which we constructed for consecutive fragments of identical size obtained from the respective genome sequence. A fragment was scored positive for anomalous DNA composition if this d* value was above the 90th percentile of the d* distribution. For determination of genome signature dissimilarities, only sequences between 300 bp and 5 kb were considered. Five kilobase pairs is the median size of imported DNA in Neisseria (19), and it also represents a conservative size limit for technical convenience in amplification procedures. For restriction fragments <300 bp, computations of composition were not performed; this limit represents a conservative lower size limit previously used in studies identifying aDNA by codon usage (3,5).
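The d* calculation described above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: counts are pooled over the sequence and its reverse complement, ambiguous bases are skipped, background windows are non-overlapping, and the percentile uses the nearest-rank definition. All function names are hypothetical.

```python
import math
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def revcomp(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def relative_abundance(seq: str) -> dict[str, float]:
    """r*_XY = f_XY / (f_X * f_Y), with frequencies pooled over both strands."""
    seq = seq.upper()
    mono = dict.fromkeys("ACGT", 0)
    di = dict.fromkeys(DINUCLEOTIDES, 0)
    for strand in (seq, revcomp(seq)):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:  # skips pairs containing ambiguous bases
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    result = {}
    for pair in DINUCLEOTIDES:
        expected = (mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono) if n_mono else 0.0
        result[pair] = (di[pair] / n_di) / expected if expected and n_di else 0.0
    return result


def dissimilarity(rf: dict[str, float], rg: dict[str, float]) -> float:
    """d* = (1/16) * sum over the 16 dinucleotides of |r*_XY(f) - r*_XY(g)|."""
    return sum(abs(rf[p] - rg[p]) for p in DINUCLEOTIDES) / 16


def d_star(f: str, g: str) -> float:
    return dissimilarity(relative_abundance(f), relative_abundance(g))


def percentile(values: list[float], q: float) -> float:
    """Nearest-rank percentile."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(q / 100 * len(ordered))) - 1]


def d_star_is_anomalous(fragment: str, genome: str, q: float = 90) -> bool:
    """Compare the fragment's d* with the d* distribution of consecutive
    genomic windows of the same length."""
    size = len(fragment)
    genome_r = relative_abundance(genome)  # computed once, reused per window
    background = [dissimilarity(relative_abundance(genome[i:i + size]), genome_r)
                  for i in range(0, len(genome) - size + 1, size)]
    return dissimilarity(relative_abundance(fragment), genome_r) > percentile(background, q)


toy_genome = "ACGT" * 200
print(round(d_star("A" * 40, toy_genome), 3))
print(d_star_is_anomalous("A" * 40, toy_genome))
```

Pooling both strands makes the measure independent of which strand happens to be recorded in the sequence file.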

RESULTS

Local overrepresentation of hexameric restriction enzyme recognition sites in anomalous DNA regions of sequenced prokaryotic genomes

In order to identify clustered restriction endonuclease recognition sites in the sequenced prokaryotic genomes used in this study, we tested restriction enzymes of which the hexapalindromic recognition sites are underrepresented in the genome sequences (http://tools.neb.com/~posfai/FINISHED). The tendency of the recognition sites to cluster in aDNA regions was assessed by analysing the sequence composition of the restriction fragments obtained in silico (Table 1 and supplementary Tables 1–8). Fragments <300 bp were not considered for their genome dissimilarity and GC composition values, because d* values of these small fragments are unreliable; this conservative minimal length is also used by Lawrence and Ochman and by Garcia-Vallvee and co-workers (3,5). Nevertheless, many of these restriction fragments <300 bp are adjacent to the other fragments <5 kb in their respective genomes (as an example, see N.meningitidis MC58 in Table 2 and Figure 1). The results showed that the eight analysed genome sequences did contain clusters of endonuclease restriction sites in aDNA regions (Table 1). The aggregated data showed 74 fragments (lengths between 300 bp and 5 kb) of which 67 (91%) were of anomalous composition.
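Whether a recognition site is underrepresented in a genome was read from published frequency tables in this study. As a rough stand-in, the Python sketch below compares the observed site count with the count expected from single-nucleotide frequencies alone. This zero-order expectation is an assumption chosen for simplicity and is not necessarily the model behind those tables. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(site: str) -> bool:
    """True if the site equals its own reverse complement."""
    return site == site.translate(COMPLEMENT)[::-1]


def observed_expected(genome: str, site: str) -> tuple[int, float]:
    """Observed occurrences of site on the given strand and the number
    expected if bases were independent with the genome's base frequencies."""
    genome = genome.upper()
    n, k = len(genome), len(site)
    freq = {b: genome.count(b) / n for b in "ACGT"}
    observed = sum(1 for i in range(n - k + 1) if genome[i:i + k] == site)
    p_site = 1.0
    for base in site:
        p_site *= freq[base]
    return observed, p_site * (n - k + 1)


# Toy sequence containing exactly one BglII site.
toy = "AGATCT" + "ACGT" * 100
obs, exp = observed_expected(toy, "AGATCT")
print(is_palindromic("AGATCT"), obs, round(exp, 3))
```

A ratio of observed to expected well below 1 would flag an enzyme as a candidate for this strategy. For a palindromic site, counting one strand is sufficient because every occurrence is present on both strands at the same location.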

Clustering of hexameric restriction enzyme recognition sites in the genome of N.meningitidis MC58 in different aDNA regions

Assessment of the occurrence of low-frequency restriction sites in the genome sequence of N.meningitidis MC58 revealed that many of the recognition sites of BglII, NheI, ScaI and SpeI clustered in the four regions known to contain large stretches of aDNA. These are also annotated as islands of horizontal transfer (IHT), thereby supporting the notion that in MC58 these recognition sites are relatively overrepresented in regions originating from horizontal transfer (3,8,17) (Figure 1, Table 2).

Of the 27 restriction fragments <5 kb, 21 were located within one of the clusters of anomalous genes described in previous studies (3,8,17) (Figure 1, Table 2). Various ScaI fragments as well as BglII fragments were adjacent to each other in these aDNA regions, confirming the local overrepresentation of recognition sites in these regions.

Calculation of GC composition and d* values of the 24 restriction fragments between 300 bp and 5 kb obtained by in silico digestion with BglII, NheI, ScaI or SpeI confirmed their anomalous nature; 22 out of 24 of the restriction fragments met our criteria for anomalous DNA (Table 2). Of the 24 fragments, 21 had d* values above that of the 90th percentile of the genomic d* value distribution and 11 had a GC percentage lower than that of the fifth percentile of the genomic GC content distribution.

Table 2. Restriction fragment numbers, lengths, d* and GC composition of fragments obtained after in silico digestion of the genome of N.meningitidis MC58 by four selected restriction enzymes

        Fragments (a)        GC content              Genomic dissimilarity
Enzyme  No.  Length (bp)     GC%  <10th percentile   d* (× 10^3)  >90th percentile
BglII   1    2996 (b)        47   -                  136          +
        2    2889 (b)        49   -                  117          +
        3    2889 (c)        49   -                  117          +
        4    2654 (c)        46   -                  123          +
        5    2461            51   -                  136          +
        6    1194 (c)        55   -                  125          -
        7    477             34   +                  218          +
        8    75 (b)          ND   ND                 ND           ND
        9    21 (d)          ND   ND                 ND           ND
NheI    1    4723 (e)        43   +                  99           +
        2    4392 (f)        40   +                  111          +
        3    787 (e)         35   +                  123          -
        4    670             29   +                  282          +
ScaI    1    4824 (g)        43   +                  121          +
        2    4452 (g)        48   -                  142          +
        3    2496            48   -                  132          +
        4    2179            57   -                  85           -
        5    865 (h)         38   +                  171          +
        6    699 (h)         38   +                  218          +
        7    600             50   -                  184          +
        8    600 (h)         50   -                  184          +
        9    600 (h)         50   -                  182          +
        10   533 (h)         51   -                  192          +
        11   67 (h)          ND   ND                 ND           ND
SpeI    1    1672            36   +                  132          +
        2    579 (i)         35   +                  212          +
        3    470             24   +                  274          +

For fragments <300 bp the GC percentage and d* were not determined (ND).

(a) Out of 25 fragments, 17 were located within one of the anomalous gene clusters A, B or C described by Karlin (8).
(b) Adjacent in anomalous gene cluster A.
(c) Adjacent in anomalous gene cluster C.
(d) Present in anomalous gene cluster B.
(e) Adjacent in anomalous gene cluster A.
(f) Present in anomalous gene cluster C.
(g) Present in anomalous gene cluster A.
(h) Adjacent in anomalous gene cluster C.
(i)
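The counts quoted in the text (21 fragments with high d*, 11 with low GC, 22 meeting at least one criterion) can be checked against the rows of Table 2 with a short Python tally. The flags below are transcribed from the table for the 24 fragments of 300 bp or longer. Entries without a "+" are read as negative, which is an assumption about the table but one that reproduces the totals stated in the text. The variable and function names are hypothetical.

```python
# (enzyme, length in bp, GC flag, d* flag) for fragments between 300 bp and 5 kb.
TABLE2 = [
    ("BglII", 2996, False, True), ("BglII", 2889, False, True),
    ("BglII", 2889, False, True), ("BglII", 2654, False, True),
    ("BglII", 2461, False, True), ("BglII", 1194, False, False),
    ("BglII", 477, True, True),
    ("NheI", 4723, True, True), ("NheI", 4392, True, True),
    ("NheI", 787, True, False), ("NheI", 670, True, True),
    ("ScaI", 4824, True, True), ("ScaI", 4452, False, True),
    ("ScaI", 2496, False, True), ("ScaI", 2179, False, False),
    ("ScaI", 865, True, True), ("ScaI", 699, True, True),
    ("ScaI", 600, False, True), ("ScaI", 600, False, True),
    ("ScaI", 600, False, True), ("ScaI", 533, False, True),
    ("SpeI", 1672, True, True), ("SpeI", 579, True, True),
    ("SpeI", 470, True, True),
]


def tally(rows):
    """Count fragments meeting each criterion and at least one of them."""
    return {
        "fragments": len(rows),
        "gc": sum(1 for _, _, gc, _ in rows if gc),
        "d_star": sum(1 for _, _, _, d in rows if d),
        "either": sum(1 for _, _, gc, d in rows if gc or d),
    }


summary = tally(TABLE2)
print(summary)
```

The two fragments that meet neither criterion in this transcription are BglII-1194 and ScaI-2179.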


Comparing the different restriction fragment patterns of the sequenced N.meningitidis strains Z2491 and MC58 in silico

The restriction fragment patterns obtained in silico from the two different N.meningitidis strains showed remarkable differences (Table 3). Various restriction fragments located in the annotated anomalous gene clusters or IHTs in N.meningitidis MC58 were not identified in N.meningitidis Z2491, consistent with the notion that these IHTs are absent in strain Z2491 (17). In addition, two anomalous restriction fragments from MC58 (MC58-ScaI-2179 and MC58-SpeI-470), which were not part of one of the previously mentioned IHTs, were located in aDNA regions only present in MC58. The MC58-ScaI-2179 fragment harboured ORF NMB1829, encoding a TonB-dependent receptor, and MC58-SpeI-470 contained a cluster of six open reading frames (ORFs). The latter showed a number of features typical of a PAI (20), such as an atypical GC content compared to the genome sequence and association with a transfer RNA (tRNA) gene (NMB1595) and an insertion sequence IS1106 (ORF NMB1601) at its boundaries. As the functions of these ORFs and their distribution in other pathogenic and non-pathogenic strains are unknown, this region does not formally qualify as a PAI, although a heterologous origin is suspected.

Two fragments identified in silico in Z2491 were absent in the genome of MC58. Z2491-BglII-97 harboured a part of NMA0604, which encodes a hypothetical protein. The Z2491-ScaI-4101 fragment contained ORFs NMA0785 and NMA0786, encoding hypothetical proteins. Both NMA0785 and NMA0786 display an atypical GC composition and dinucleotide composition, and are described as putatively horizontally transferred by Garcia-Vallvee and colleagues (3).

Thus, the same set of four enzymes which was used to isolate anomalous sequences from the MC58 strain in silico identified aDNA fragments from the related strain Z2491. Similarly, the sequenced genomes of three strains of E.coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments with anomalous composition (supplementary Tables 2–4). Unfortunately, due to sequence ambiguities in the E.coli EDL933 genome sequence, the d* values of the XbaI restriction fragments from this strain could not be readily calculated, although the low GC percentage of these fragments compared to the E.coli genomic GC composition values suggests an anomalous nucleotide composition (supplementary Table 5).

Selective isolation of aDNA in vitro from N.meningitidis MC58 by adaptor-linked PCR

In order to validate that this strategy could be converted into an in vitro strategy with possible applications to unsequenced genomes, chromosomal DNA of strain MC58 was digested in vitro with BglII, NheI, ScaI or SpeI. The fragments obtained from each of the four digests were amplified by ALP. The amplicon pattern was very similar to the expected in silico restriction fragment pattern (Figure 2), apart from minor differences. These can be explained by the possibly inefficient amplification of large fragments (>4 kb) in the presence of smaller fragments. The resulting amplicons were subcloned and sequenced, verifying the sequences predicted by the in silico analysis (data not shown). This demonstrated the applicability of this method in vitro.

DISCUSSION

A new parameter based on dinucleotide composition extremes has been introduced to identify genomic islands in complete genomes (6,7). The potential of this and other in silico methods to identify genomic islands is obviously limited to sequenced genomes. To our knowledge, no in vitro method exists which allows the selective isolation of aDNA from unsequenced genomes, except for subtractive hybridization strategies in which usually two related but different strains are compared (21–23). In order to develop an in vitro tool for the selective isolation of anomalous sequences from unsequenced genomes, we investigated whether clustering of restriction enzyme recognition sites could lead to the preferential isolation of aDNA from various sequenced genomes.

Figure 1. Clustering of restriction fragments <5 kb in the aDNA regions of the genome of N.meningitidis MC58 compared with the genome signature distribution. Blocks A, B and C represent the large genomic islands as described by Karlin et al. (8), whereas block D is a large ribosomal protein gene cluster. Block X is a large putative region of horizontal gene transfer identified by Garcia-Vallvee and co-workers (3).

We demonstrated clustering of genomically underrepresented restriction enzyme recognition sites in eight sequenced genomes of five prokaryotic species in silico. We found that clustering of these recognition sites occurred predominantly in aDNA regions, including ribosomal loci, but also and more interestingly, in putative horizontally transferred loci which were described by Garcia-Vallvee and co-workers (3). However, some discrepancies between our data and their database exist, as the HGT database ignores non-coding sequences. In the genome of N.meningitidis MC58, the clustering of the four selected endonuclease recognition sites occurred in the three known IHTs (17), a recently described aDNA region (3), and also in smaller anomalous loci. Comparative analysis of the calculated restriction fragments from N.meningitidis MC58 and Z2491 showed that similar putative horizontally acquired anomalous sequences could be isolated from both strains. Furthermore, aDNA confined to either one of these strains could be identified, suggesting that differences between strains can be identified and isolated. The strategy was validated by the in vitro amplification of the predicted restriction fragments from the genome of N.meningitidis MC58.

In this study, only a limited number of sequenced genomes was analysed to illustrate atypical clustering of endonuclease recognition sites in their respective aDNA regions. Theoretically, any prokaryotic genome may contain atypical clustering of endonuclease recognition sites. On the other hand, aDNA which is acquired via horizontal gene transfer is thought to adjust to the host's nucleotide composition over time in a process called amelioration, the same mutational process that affects the entire genome (5). This implies that only aDNA resulting from evolutionarily recent transfer events, in which the nucleotide content of the acquired DNA still differs substantially from the sequence composition of the host genome, can be adequately identified and isolated. Furthermore, in genomes of bacteria, such as Helicobacter pylori, with an extreme plasticity due to high recombination rates, regions of aDNA may be rapidly obscured over time (24). Another potential limitation of our technique is that the restriction enzyme recognition sites may be methylated by restriction-modification (RM) systems. For example, H.pylori contains many RM enzymes, rendering the genome resistant to their activity (25).

Table 3. Restriction fragments and their coordinates in the two tested N.meningitidis strains, indicating the absence or presence of each fragment in the other strain (GI refers to the genomic islands depicted in Figure 1; "-" indicates no entry)

Enzyme  Size (bp)      Coordinates                           GI in strain MC58  Presence in the other strain
        MC58   Z2491   MC58              Z2491
BglII   -      4213    -                 1179114–1183327     -                  Dispersed in MC58, with inversions and loss of restriction fragment
        2996   -       525422–528418     -                   A                  Absent in Z2491
        2889   -       1863020–1865909   -                   C                  Largely present in Z2491
        2889   -       522533–525422     -                   A                  Largely present in Z2491
        2654   -       1860366–1863020   -                   C                  Largely absent in Z2491
        2461   2449    726967–729428     875412–877861       -                  Similar sequences
        1194   -       1859172–1860366   -                   C                  Largely present in Z2491
        477    -       614379–614856     -                   -                  Absent in Z2491
        -      97      -                 578316–578413       -                  Absent in MC58
        75     75      542107–542182     688499–688574       A                  Similar sequences
        21     -       1444731–1444752   -                   B                  Absent in Z2491
NheI    4723   -       505027–509750     -                   A                  Largely absent in Z2491
        4392   -       1834407–1838799   -                   C                  Absent in Z2491
        787    776     2231113–2231900   299657–300433       X                  Similar sequences
        670    670     543262–543932     689470–690140       A                  Similar sequences
ScaI    4824   -       526521–531345     -                   A                  Largely absent in Z2491
        4452   -       511598–516050     -                   A                  Absent in Z2491
        -      4101    -                 769714–773815       -                  Partial similarity to MC58-ScaI-533, MC58-ScaI-600abc, partially absent in MC58
        -      3181    -                 1928315–1931496     -                  Similar sequences, but polymorphism at the recognition site
        2496   -       1007047–1009543   -                   -                  Largely present in Z2491
        2179   -       1925725–1927904   -                   -                  Largely absent in Z2491
        865    865     1815665–1816530   1927450–1928315     C                  Similar sequences
        699    699     1814966–1815665   1926751–1927450     C                  Similar sequences
        600    -       1447767–1448367   -                   B                  Similar to Z2491-ScaI-4101
        600    -       1447167–1447767   -                   B                  Similar to Z2491-ScaI-4101
        600    -       616671–617271     -                   -                  Similar to Z2491-ScaI-4101
        533    -       1446567–1447100   -                   B                  Similar to Z2491-ScaI-4101
        67     -       1447100–1447167   -                   B                  Similar to Z2491-ScaI-4101
SpeI    1672   -       2223993–2225665   -                   X                  Dispersed in MC58, with inversions and loss of restriction fragment
        579    -       1443931–1444510   -                   B                  Absent in Z2491
        470    -       1659795–1660265   -                   -                  Absent in Z2491

Only hexapalindromic recognition sites of endonucleases have been tested; we did not examine other recognition sites (such as non-palindromic recognition sites). The cores of the restriction sites of the selected restriction enzymes consistently consist of the genomically underrepresented tetranucleotides previously described by Karlin and co-workers (10). Genomic underrepresentation of tetrapalindromes may be due to structural defects caused by these sequences or special functional roles associated with these sequences (10). Whether genomic aDNA regions, with these tetranucleotides overrepresented, predominantly originate from donor organisms in which these tetranucleotides are less associated with structural defects or special functional roles, remains unclear.

The restriction enzymes for which the recognition sites were often found to cluster atypically in the genomes assessed in this study, such as SpeI, XbaI, AvrII and NheI, are also commonly used for genotyping by pulsed-field gel electrophoresis (PFGE) (26). For example, to identify an E.coli O157 outbreak cluster, Tsuji and co-workers (27) performed PFGE with the XbaI enzyme. A higher prevalence of the recognition sites of these enzymes in aDNA regions, such as horizontally transferred genes, may partly explain the high differentiating capacity of PFGE when performed with these enzymes. Insertion of horizontally acquired DNA harbouring these sites at a higher frequency than the recipient genome will result in the introduction of novel small fragments, which are usually not visualized by PFGE. However, the large fragment in which the region of aDNA is inserted will disappear from the PFGE pattern.

As the genome signature is conserved between closely related species (28), this technique may enable the selective isolation of aDNA from novel outbreak strains in the population of a pathogenic species of which a representative complete genome sequence is available, illustrated by the identification of the different anomalous fragments in the two Neisseria strains. It would be of interest to test different neisserial genoclusters for anomalous sequences with this novel strategy. In conclusion, the strategy presented in this study allows the selective isolation of anomalous sequences from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme. This simple technique can have major practical applications in studying horizontal gene transfer.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at NAR Online.

ACKNOWLEDGEMENTS

We would like to thank Drs Mark Achtman and Christina Vandenbroucke-Grauls for critically reading the manuscript.

REFERENCES

1. Avery,O.T., MacLeod,C.M. and McCarty,M. (1944) Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Inductions of transformation by a

desoxyribonucleic acid fraction isolated from pneumococcus type III. J. Exp. Med., 79, 137–158.

2. Lawrence,J.G. and Ochman,H. (1998) Molecular archaeology of the Escherichia coli genome. Proc. Natl Acad. Sci. USA, 95, 9413–9417. 3. Garcia-Vallvee,S., Guzman,E., Montero,M.A. and Romeu,A. (2003)

HGT-DB: a database of putative horizontally transferred genes in prokaryotic complete genomes.Nucleic Acids Res., 31, 187–189. Figure 2. Comparison of the restriction fragment length polymorphism (RFLP) pattern and the ALP pattern of N.meningitidis MC58 digested with each of the selected endonucleasesin silico (via www.tigr.org) and the resulting amplification pattern in vitro (endonucleases are depicted above each lane). Lane X depicts the marker X (Roche) with the sizes in base pairs on the left. The fragments NheI-4723 and NheI-4392 could not be readily amplified, probably due to the preferential amplification of smaller fragments.

(8)

4. Grantham,R., Gautier,C., Gouy,M., Mercier,R. and Pave,A. (1980) Codon catalog usage and the genome hypothesis.Nucleic Acids Res., 8, r49–r62.

5. Lawrence,J.G. and Ochman,H. (1997) Amelioration of bacterial genomes: rates of change and exchange.J. Mol. Evol., 44, 383–397. 6. Burge,C., Campbell,A.M. and Karlin,S. (1992) Over- and

under-representation of short oligonucleotides in DNA sequences. Proc. Natl Acad. Sci. USA, 89, 1358–1362.

7. Karlin,S., Ladunga,I. and Blaisdell,B.E. (1994) Heterogeneity of genomes: measures and values.Proc. Natl Acad. Sci. USA, 91, 12837–12841.

8. Karlin,S. (2001) Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes.Trends Microbiol., 9, 335–343. 9. Karlin,S., Campbell,A.M. and Mrazek,J. (1998) Comparative DNA

analysis across diverse genomes.Annu. Rev. Genet., 32, 185–225. 10. Karlin,S., Mrazek,J. and Campbell,A.M. (1997) Compositional biases of

bacterial genomes and evolutionary implications.J. Bacteriol., 179, 3899–3913.

11. Pride,D.T., Meinersmann,R.J., Wassenaar,T.M. and Blaser,M.J. (2003) Evolutionary implications of microbial genome tetranucleotide frequency biases.Genome Res., 13, 145–158.

12. Roberts,R.J., Vincze,T., Posfai,J. and Macelis,D. (2003) REBASE: restriction enzymes and methyltransferases.Nucleic Acids Res., 31, 418–420.

13. Gelfand,M.S. and Koonin,E.V. (1997) Avoidance of palindromic words in bacterial and archaeal genomes: a close connection with restriction enzymes.Nucleic Acids Res., 25, 2430–2439.

14. Karlin,S., Burge,C. and Campbell,A.M. (1992) Statistical analyses of counts and distributions of restriction sites in DNA sequences. Nucleic Acids Res., 20, 1363–1370.

15. Saunders,R.D., Glover,D.M., Ashburner,M., Siden-Kiamos,I., Louis,C., Monastirioti,M., Savakis,C. and Kafatos,F. (1989) PCR amplification of DNA microdissected from a single polytene chromosome band: a comparison with conventional microcloning. Nucleic Acids Res., 17, 9027–9037.

16. McGuinness,B.T., Clarke,I.N., Lambden,P.R., Barlow,A.K., Poolman,J.T., Jones,D.M. and Heckels,J.E. (1991) Point mutation in meningococcal por A gene associated with increased endemic disease. Lancet, 337, 514–517.

17. Tettelin,H., Saunders,N.J., Heidelberg,J., Jeffries,A.C., Nelson,K.E., Eisen,J.A., Ketchum,K.A., Hood,D.W., Peden,J.F., Dodson,R.J. et al. (2000) Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science, 287, 1809–1815.

18. Bowler,L., Bart,A. and Van der Ende,A. (2001) Meningococcal Disease: Methods and Protocols. Humana Press, Totowa, NJ.

19. Linz,B., Schenker,M., Zhu,P. and Achtman,M. (2000) Frequent interspecific genetic exchange between commensal Neisseriae and Neisseria meningitidis. Mol. Microbiol., 36, 1049–1058.

20. Hacker,J., Blum-Oehler,G., Muhldorfer,I. and Tschape,H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol. Microbiol., 23, 1089–1097.

21. Lisitsyn,N., Lisitsyn,N. and Wigler,M. (1993) Cloning the differences between two complex genomes. Science, 259, 946–951.

22. Bart,A., Dankert,J. and van der Ende,A. (2000) Representational difference analysis of Neisseria meningitidis identifies sequences that are specific for the hyper-virulent lineage III clone. FEMS Microbiol. Lett., 188, 111–114.

23. Malloff,C.A., Fernandez,R.C. and Lam,W.L. (2001) Bacterial comparative genomic hybridization: a method for directly identifying lateral gene transfer. J. Mol. Biol., 312, 1–5.

24. Suerbaum,S., Smith,J.M., Bapumia,K., Morelli,G., Smith,N.H., Kunstmann,E., Dyrek,I. and Achtman,M. (1998) Free recombination within Helicobacter pylori. Proc. Natl Acad. Sci. USA, 95, 12619–12624.

25. Kong,H., Lin,L.F., Porter,N., Stickel,S., Byrd,D., Posfai,J. and Roberts,R.J. (2000) Functional analysis of putative restriction-modification system genes in the Helicobacter pylori J99 genome. Nucleic Acids Res., 28, 3216–3223.

26. McClelland,M., Jones,R., Patel,Y. and Nelson,M. (1987) Restriction endonucleases for pulsed field mapping of bacterial genomes. Nucleic Acids Res., 15, 5985–6005.

27. Tsuji,H., Hamada,K., Kawanishi,S., Nakayama,A. and Nakajima,H. (2002) An outbreak of enterohemorrhagic Escherichia coli O157 caused by ingestion of contaminated beef at grilled meat-restaurant chain stores in the Kinki District in Japan: epidemiological analysis by pulsed-field gel electrophoresis. Jpn. J. Infect. Dis., 55, 91–92.

28. Karlin,S. and Burge,C. (1995) Dinucleotide relative abundance extremes: a genomic signature. Trends Genet., 11, 283–290.

29. Fleischmann,R.D., Adams,M.D., White,O., Clayton,R.A., Kirkness,E.F., Kerlavage,A.R., Bult,C.J., Tomb,J.F., Dougherty,B.A., Merrick,J.M. et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 269, 496–512.

30. Hayashi,T., Makino,K., Ohnishi,M., Kurokawa,K., Ishii,K., Yokoyama,K., Han,C.G., Ohtsubo,E., Nakayama,K., Murata,T. et al. (2001) Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res., 8, 11–22.

31. Blattner,F.R., Plunkett,G.,III, Bloch,C.A., Perna,N.T., Burland,V., Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F. et al. (1997) The complete genome sequence of Escherichia coli K-12. Science, 277, 1453–1474.

32. Welch,R.A., Burland,V., Plunkett,G.,III, Redford,P., Roesch,P., Rasko,D., Buckles,E.L., Liou,S.R., Boutin,A., Hackett,J. et al. (2002) Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc. Natl Acad. Sci. USA, 99, 17020–17024.

33. Parkhill,J., Dougan,G., James,K.D., Thomson,N.R., Pickard,D., Wain,J., Churcher,C., Mungall,K.L., Bentley,S.D., Holden,M.T. et al. (2001) Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature, 413, 848–852.

34. Smith,D.R., Doucette-Stamm,L.A., Deloughery,C., Lee,H., Dubois,J., Aldredge,T., Bashirzadeh,R., Blakely,D., Cook,R., Gilbert,K. et al. (1997) Complete genome sequence of Methanobacterium thermoautotrophicum deltaH: functional analysis and comparative genomics. J. Bacteriol., 179, 7135–7155.

35. Parkhill,J., Achtman,M., James,K.D., Bentley,S.D., Churcher,C., Klee,S.R., Morelli,G., Basham,D., Brown,D., Chillingworth,T. et al. (2000) Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature, 404, 502–506.
