DNA sequence of 40 bases

GGCCGTGGTGCCCATTGTTCGTCGATCGGGTGATTGCGCT 

Minimum free energy secondary structure of the sequence above, as predicted by the algorithm at a temperature of 37.0 °C

Picture1.png

The deep neural network reached a validation accuracy of 99.6% for secondary structure prediction after 1600 epochs of training

Accuracy.png

DNA STRUCTURE PREDICTION

Context

Uniquely programmable DNA strands that can be identified in parallel without crosstalk are at the core of technologies that rely on amplification, such as high-throughput drug screening, diagnostics, DNA data storage, and nanostructure fabrication. However, as the sequence length increases, designing a strand with the desired properties becomes dramatically more complex because the number of possible interactions grows so quickly. Current state-of-the-art dynamic programming algorithms such as M-fold, V-fold, and NUPACK scale poorly: their computing time and design cost grow as O(n^3) with sequence length n, and using them to design concurrently against crosstalk has proved challenging. There is an opportunity to apply advances in Machine Learning to create a new tool that enables faster, accurate secondary structure prediction and, therefore, facile design of DNA complexes.
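
To illustrate where the O(n^3) cost comes from, here is a minimal sketch of a Nussinov-style dynamic program that maximizes the number of Watson-Crick base pairs. This is a simplified stand-in chosen for illustration, not the method used by M-fold, V-fold, or NUPACK, which minimize free energy under much richer thermodynamic models. The function names and the `min_loop` parameter are assumptions made for this example. The three nested loops over span, start position, and pairing partner are what make the running time cubic.

```python
def complementary(a, b):
    """Watson-Crick pairing only (A-T, G-C); an illustrative simplification."""
    return (a, b) in {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style recursion).

    min_loop is a hypothetical parameter: the minimum number of unpaired
    bases that must be enclosed by any pair (hairpin loop size).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # O(n) subsequence lengths
        for i in range(n - span):                # O(n) start positions
            j = i + span
            best = dp[i][j - 1]                  # case: base j is unpaired
            for k in range(i, j - min_loop):     # O(n) partners for base j
                if complementary(seq[k], seq[j]):
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


seq = "GGCCGTGGTGCCCATTGTTCGTCGATCGGGTGATTGCGCT"
print(max_base_pairs(seq))
```

Doubling the sequence length multiplies the work of this recursion by roughly eight, which is the scaling behaviour described above.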

Aim

Efficient and accurate prediction of DNA secondary structures using Machine Learning

Results

I developed an algorithm that predicts the secondary structure and energy of DNA sequences with an accuracy of >99.6% and a computing time three orders of magnitude shorter than NUPACK's, for sequences of 20, 30, and 40 bases with G-C contents of 50%, 60%, and 70%.
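
As a rough illustration of the test grid described above (three lengths by three G-C contents), the sketch below builds one random sequence per combination with an exact G-C count. How the sequences in this study were actually generated is not stated here, so the sampling scheme and the helper names `random_sequence` and `gc_content` are assumptions for this example only.

```python
import random


def random_sequence(length, gc_fraction, rng):
    """Random DNA sequence with an exact G-C count (hypothetical helper)."""
    n_gc = round(length * gc_fraction)
    bases = [rng.choice("GC") for _ in range(n_gc)]
    bases += [rng.choice("AT") for _ in range(length - n_gc)]
    rng.shuffle(bases)  # mix G/C and A/T positions
    return "".join(bases)


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
grid = {
    (length, gc): random_sequence(length, gc, rng)
    for length in (20, 30, 40)
    for gc in (0.5, 0.6, 0.7)
}

for (length, gc), seq in grid.items():
    print(length, gc, seq, round(gc_content(seq), 3))
```

Fixing the G-C count exactly, rather than sampling each base independently, keeps every sequence in a grid cell at the stated composition.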

CONTRIBUTIONS

Machine Learning

Programming

Design

Data analysis

Approach

This research study is still in progress, and more advances are coming soon. If you would like to know more, please contact me.

This research is being conducted under the supervision of my advisor, Dr. Ashwin Gopinath.

Presented as a talk at DNA26, the International Conference on DNA Computing and Molecular Programming, September 2020.
