
Sidewinder can also assemble defined diversities across a large number of positions along the entire length of a DNA sequence to construct combinatorial libraries. In a combinatorial library, each variable position is diversified and joined with the other diversified positions into a single synthetic sequence through DNA assembly. These libraries are then sorted, selected or screened for desired functions12,31,32. This approach is particularly useful in protein engineering, in which specific codons are varied at known or predicted residues to achieve a modified or improved protein function. Current methods for constructing combinatorial libraries with existing DNA assembly technologies can be limited in several respects, such as the theoretical library size, coverage, number of positions diversified simultaneously and accuracy of assembly during construction33,34,35,36,37,38,39.
We used Sidewinder to generate a combinatorial library by dividing the gene for the fluorescent protein eGFP into a ten-piece Sidewinder assembly in which predefined codon variations were combinatorially diversified at 17 positions spanning the entire gene, yielding a theoretical library size of 442,368 possible mutation profiles (Fig. 5a,b). The Sidewinder library assembly produced a single strong target band (Fig. 5c), which was then cloned into a plasmid and transformed into E. coli cells. Fractions of the library both before and after cloning were analysed using PacBio high-fidelity, single-molecule long-read sequencing40.
Fig. 5: Sidewinder assembles large combinatorial libraries with high coverage.
a, Sidewinder library fragments generated by annealing a barcode oligo to an arbitrary number of coding oligos containing predefined mutations (coloured diamonds). b, Schematic of the ten-piece assembly for the fluorescent protein library (positions not to scale). c, DNA agarose gel showing the PCR product of the library assembly with a single strong target band. d, PacBio sequencing analysis of the pre-clonal Sidewinder library. The pie chart shows the proportion of Sidewinder products (pale orange), partially aligned products (subset of fragments 1–10 in the correct order) and PCR and barcode artifacts (grey). e, Junction analysis of the library PacBio sequencing depicting ligations at the 3WJ. f, Violin plot (n = 646 positions across 3,079,525 molecules) and box and whisker plot (min = 0.9947649, max = 0.9999540, quartile 1 (Q1) = 0.9979082, median = 0.9988624, Q3 = 0.9993220, lower bound = 0.9947649, upper bound = 0.9998737) showing the distribution of per-base accuracies for the oligos used in the assembly, excluding intended library mutation positions and the flanking bases. g, The mutation diversity at the codon level showing the pre-cloning experimental distribution (saturated, left) and the corresponding theoretical codon distribution (desaturated, right). h, Mutation diversity at the gene level showing the proportion of PacBio reads assigned to each of the possible mutation combinations pre-cloning (pale orange) and post-cloning (orange). i, The sequence space of all possible mutation combinations (grey) and the mutation combinations represented in the pre-clonal sequencing (pale orange), the post-clonal sequencing (orange) and both (dark orange). j, The proportion of all observed variants in the pre-clonal and post-clonal sequencing plotted relative to one another. k, The percentage of diversity achieved considering every combination of N mutation positions across the 17 diversity positions of the Sidewinder library. l, Fluorescence area versus height plots, showing populations positive for blue, green, yellow and red fluorescence. The proportion of hits identified over the threshold is labelled for each colour.
For the pre-clonal Sidewinder assembly, 98.88% of reads (3,832,803) were correct ten-piece assemblies, 0.41% were partially assembled with the correct connection of a subset of the ten pieces, and 0.71% were composed of PCR and barcode artifacts (Fig. 5d). Reassuringly, the high-fidelity PacBio data are consistent with previous Nanopore data, further supporting the robustness of the Sidewinder assembly in all demonstrated circumstances. Searching the PacBio dataset for all instances of misligated junctions revealed only 37 misligated junctions out of 35,542,842 total observed junctions (Fig. 5e). This corresponds to a misconnection rate at the 3WJ of just 1 in 960,617 (Extended Data Fig. 8).
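The misconnection rate follows directly from the two junction counts quoted above. The short Python sketch below only reproduces that arithmetic; the variable names are illustrative and are not taken from the authors' analysis code.

```python
# Junction counts quoted in the text for the pre-clonal PacBio dataset.
misligated_junctions = 37
total_junctions = 35_542_842

# Express the rate both as "1 in N" and as a per-junction probability.
junctions_per_misligation = total_junctions / misligated_junctions
misligation_rate = misligated_junctions / total_junctions

print(f"1 misligation per {junctions_per_misligation:,.0f} junctions")
print(f"per-junction misligation rate: {misligation_rate:.2e}")
```

Rounded to the nearest whole junction, this gives the 1 in 960,617 figure reported in the text.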
The median error rate for the oligos used for the assembly was calculated to be 10^−2.943 (1 error in 877 bases, or a 99.886% chance of a base being correct) (Fig. 5f). For these oligos, the per-base accuracy decreases with increasing oligo length but, owing to the required ligation at the 3WJ, accuracy increases across assembly junctions (Extended Data Fig. 9a). These observations suggest that Sidewinder does not introduce additional errors during the assembly and may subtly improve oligo fidelity. Owing to Sidewinder’s high fidelity for multifragment assemblies, shorter DNA oligos composing a higher number of Sidewinder fragments may provide an advantage in synthesizing nucleotide-perfect genes. On the basis of the observed per-base error rate, an estimated 44.14% of the eGFP variants constructed are expected to be nucleotide-perfect genes. This estimate compares with an observed value of 40.88% nucleotide-perfect post-clonal genes in the PacBio sequencing data. By contrast, just 8.2% nucleotide-perfect clones were reported for a library of a 1 kb gene constructed using PCA33.
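The relationship between the per-base error rate and the expected fraction of nucleotide-perfect genes can be sketched in a few lines of Python. This is a simplified model, not the authors' calculation: it assumes that errors occur independently at every base and at the median rate, and the function names are hypothetical. Because the text does not state the exact number of bases used for the 44.14% estimate, the sketch solves for the length that the estimate implies rather than assuming one.

```python
import math

# Median per-base error rate quoted in the text: 10^-2.943.
error_rate = 10 ** -2.943
bases_per_error = 1 / error_rate      # roughly 877 bases per error
per_base_accuracy = 1 - error_rate    # roughly 0.99886


def perfect_fraction(n_bases: int, err: float = error_rate) -> float:
    """Fraction of error-free molecules, assuming independent per-base errors."""
    return (1 - err) ** n_bases


def implied_length(fraction: float, err: float = error_rate) -> float:
    """Number of bases at which perfect_fraction() would equal `fraction`."""
    return math.log(fraction) / math.log(1 - err)


# Length implied by the 44.14% estimate under the independence assumption.
n_implied = implied_length(0.4414)

print(f"{bases_per_error:.0f} bases per error")
print(f"44.14% nucleotide-perfect implies about {n_implied:.0f} bases")
```

Under these assumptions the 44.14% estimate corresponds to a little over 700 synthesized bases, which is in line with the size of an eGFP coding sequence. The same function shows why fidelity falls quickly with length: at this error rate, a 1 kb gene would be expected to be error-free only about a third of the time.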
The diversity of the combinatorial library can be assessed by analysing the mutation profiles (the identity of the deliberately encoded mutations) at the codon level, fragment level and gene level, comparing the theoretical and experimental distributions of mutations at each level of the library. At the codon level, every codon mutation profile is represented in the library, with an average absolute deviation of just 8.23 percentage points from the theoretical proportion of occurrence for that codon (Fig. 5g). At the fragment level, all 82 fragment mutation profiles are represented in the final library. The distribution of mutation profiles generally shows higher variance for fragments with a larger number of possible mutation profiles, such as fragment 4 (n = 36), than for fragments with fewer possible profiles, such as fragment 2 (n = 2) (Extended Data Fig. 9b). Furthermore, there does not appear to be a decrease in the likelihood of incorporation of a coding oligo when it has more mismatches to the fragment’s barcode oligo (Extended Data Fig. 9c), except when those diversity positions lie close to the junction, as with diversity position 15 (Fig. 5g).
At the gene level, out of the 442,368 possible mutation profiles, we observed a nearly identical distribution of occurrences in the mutation profiles of the pre- and post-clonal sequencing and achieved a library coverage of 326,733 and 386,978 variants, respectively, for a combined total of 405,778 variants (307,933 observed in both) (Fig. 5h,i). Plotting the proportion of occurrences of the mutation profiles that are represented in both the pre- and post-clonal sequencing shows a general trend in which the variants that are more highly represented before cloning remain highly represented after cloning for this gene (Fig. 5j). The 405,778 variants observed correspond to a total library coverage of >91.7% of the 442,368 possible combinations of the 17 mutation positions. Within these mutation profiles, we observe nearly every possible combination of as many as 15 mutation positions (>99.4%) in the library, with continued high representation all the way through every possible combination of 17 positions (Fig. 5k), which is an improvement over a recent comparable construction with Golden Gate34.
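The combined coverage figure can be checked with a simple inclusion-exclusion count over the pre- and post-clonal variant sets. The Python sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# Variant counts quoted in the text.
theoretical_profiles = 442_368   # all possible mutation profiles
pre_clonal = 326_733             # distinct variants seen before cloning
post_clonal = 386_978            # distinct variants seen after cloning
seen_in_both = 307_933           # variants present in both datasets

# Inclusion-exclusion: the union counts the shared variants only once.
combined = pre_clonal + post_clonal - seen_in_both
coverage = combined / theoretical_profiles

print(f"combined variants: {combined:,}")
print(f"library coverage: {coverage:.1%}")
```

The union comes to 405,778 distinct variants, which is just over 91.7% of the theoretical sequence space.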
We predict that Sidewinder may be suitable for libraries of exceedingly large sizes, primarily limited by the fidelity of the oligos used and the ability to select, screen and sort post-clonal products. This library was designed by combining mutations that produce known phenotypes in fluorescent proteins with diverse excitation and emission spectra41,42,43. We amplified the fluorescence signal for each member of the library by growing fluorescent protein-expressing clonal populations in hydrogel microparticles, which were then screened by fluorescence-activated cell sorting (FACS)44. This approach enabled the rapid visualization and identification of distinct fluorescent phenotypes expressed within the diverse library. Approximately 5,000,000 clones from the starting library were encapsulated into individual hydrogel microparticles and, of those, 500,000 individual clones were screened using FACS and sorted to isolate mutations that resulted in different fluorescence emission characteristics from 400 nm to 700 nm when excited with 405 nm, 488 nm, 561 nm and 638 nm lasers (Extended Data Fig. 10a,b). Among the 500,000 screened clones, we observed variants with fluorescence signals corresponding to blue (0.06%), green (4.35%), yellow (0.73%) and red (0.01%) fluorescent proteins (Fig. 5l).
We further analysed a subset of these sorted clones and saw a diversity of excitation and emission peaks (Extended Data Fig. 10c), which was confirmed by fluorescence microscopy (Extended Data Fig. 10d). This demonstrates that Sidewinder not only enables the assembly of functional DNA fragments but also enables the simultaneous introduction of combinatorial mutations, resulting in highly diverse and functional molecular libraries.
Robinson, N.E., Zhang, W., Ghosh, R. et al. Construction of complex and diverse DNA sequences using DNA three-way junctions.
Nature (2026). https://doi.org/10.1038/s41586-025-10006-0