Hi all,
I’ve been setting up some Golden Gate Assembly (BsaI/BsmBI based) and was wondering if anyone has run into specific overhangs that systematically fail or give poor assembly efficiency. I know there are rules (e.g. avoid repeats, extreme GC/AT imbalance, palindromes), but are there particular 4-bp sequences people have found to be unreliable?
As an example, I had trouble when using these overhangs:
GCCATG … agtgcttgg
CCATT … ACCTTGAAAATAAA
GCTT … ccaggcatcaaataaaacg
ATGG … atgtatatctccttcttaaagtt
In some cases the assembly failed, even though the design was clean. Has anyone else seen certain overhangs consistently fail in practice?
I think CCAT and ATGG are reverse complements, so that could be a reason.
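A quick way to confirm the reverse-complement suspicion is to screen the overhang list programmatically. The sketch below is only illustrative: the function names are made up for this example, it assumes every overhang is written 5'→3' as a 4-nt string, and it checks only for palindromes and reverse-complement pairs, not ligase fidelity. The input uses the 4-nt overhangs mentioned above (CCAT, GCTT, ATGG).

```python
# Minimal sketch: flag palindromic overhangs and reverse-complement pairs.
# Assumes overhangs are plain 4-nt strings written 5'->3' (A/C/G/T only).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def screen_overhangs(overhangs):
    """Return (palindromes, rc_pairs) for a list of overhangs.

    palindromes: overhangs equal to their own reverse complement.
    rc_pairs:    pairs of distinct overhangs that are reverse
                 complements of each other.
    """
    seqs = [o.upper() for o in overhangs]
    palindromes = [s for s in seqs if s == reverse_complement(s)]
    rc_pairs = []
    for i, a in enumerate(seqs):
        for b in seqs[i + 1:]:
            if a != b and b == reverse_complement(a):
                rc_pairs.append((a, b))
    return palindromes, rc_pairs


palindromes, rc_pairs = screen_overhangs(["CCAT", "GCTT", "ATGG"])
print(palindromes)  # []
print(rc_pairs)     # [('CCAT', 'ATGG')]
```

A reported pair is only a problem if the two overhangs belong to different junctions. If they are simply the two strands of the same junction, the pairing is expected.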
Hi Aish,
Ligase fidelity is the key factor here. T4 DNA Ligase doesn’t handle certain overhangs very well. By avoiding those overhangs you can perform complex assemblies. The original NEB papers are linked below, but NEB also has a set of web tools based on that research that are worth exploring. If you have a list of overhangs that you would like to test, use the NEBridge Ligase Fidelity Viewer.
The first paper came out as I was preparing the syntax for Open Yeast. My original version had a couple of overhangs that would have been very problematic, but the paper and tool allowed me to change them, so I now have near 100% fidelity in assembly. Note that the standard MoClo syntax (A-F), designed before this research, has some minor fidelity issues, and Open Yeast inherits those for the transcription unit.
(For others on the forum: the background here is that some assemblies were failing during an MPhil project in my lab, and we were worried that some of the Reclone syntax overhangs were not working efficiently. If that is the case, we might need to reclone a bunch of the collections.)
Thanks so much for opening this discussion! Could you reference the Reclone syntax letters that you were trying to use again?
Agreed with everything @osn_scott said! In this case, Chiara and @FernanFederici’s lab used the NEB tools and ran all the checks to design the syntax, and all of our overhangs looked good in silico.
Now that you’ve listed sequences and not only the junction letters (for which I double-checked the Reclone overhangs in the NEB tool when we first discussed the problem), I’ve realised I can’t actually see ATGG in the syntax, other than as the reverse complement of the N1 overhang CCAT. Both of those obviously have to be present on the parts being joined, or they wouldn’t stick. Were CCAT/ATGG being used at a second junction in your assembly as well as at N1? If so, that was probably your issue, and why chunks were getting missed out of the assembly…
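To make the "second junction" question concrete, here is a rough sketch of how one might check whether any overhang is used at more than one junction of a single assembly, counting an overhang and its reverse complement as the same junction. Only N1 = CCAT comes from the syntax discussed here. The junction names "J2" and "J3", their placement, and the function names are hypothetical and purely for illustration.

```python
# Minimal sketch: detect an overhang reused at more than one junction,
# treating an overhang and its reverse complement as the same sticky end.
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(overhang: str) -> str:
    """Strand-independent key: the alphabetically smaller of the
    overhang and its reverse complement."""
    s = overhang.upper()
    rc = s.translate(_COMPLEMENT)[::-1]
    return min(s, rc)


def reused_junctions(junctions):
    """junctions maps a junction name to its overhang (5'->3').

    Returns {canonical_overhang: [junction names]} for every overhang
    that appears at more than one junction.
    """
    seen = defaultdict(list)
    for name, overhang in junctions.items():
        seen[canonical(overhang)].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}


# Hypothetical assembly: N1 is from the syntax; J2 and J3 are made up.
assembly = {"N1": "CCAT", "J2": "GCTT", "J3": "ATGG"}
print(reused_junctions(assembly))  # {'ATGG': ['N1', 'J3']}
```

An empty result means every junction has a unique sticky end. Any entry in the result points at junctions that can ligate to each other and skip the parts in between.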
I might have completely misinterpreted - if so, perhaps you could quickly sketch the overhangs as they should have looked joining together, and what you actually saw in terms of the final assemblies?
The bigger picture is that if you weren’t using the Reclone syntax as it stands in the figure below, then we don’t need to worry in terms of the DNA collections. That would be a great outcome.
If anyone else has had assembly issues using the syntax above, do please shout because we are at a pivotal moment for fixing things that might be broken… it will only get more expensive from here on in. We are planning to convert all of the original Open Enzyme and Open Reporter collections into compatible “CD” parts and more of the reporters into tags.