owlposting.com
roughly 1 day ago

This is neat! I’m not fully through it yet, but just wanted to emphasize this:

> And understanding molecular motion is key for everything in biology, everything in biology is vibrating molecules underneath the surface!

Coming into bio as a programmer, this is the absolute sine qua non rule you need to internalize: there are no boundaries between systems, because everything is jiggling atoms. DNA encodes genes, except the transcription process is heavily mediated by the physical environment and physical constraints of accessing the DNA; RNA is translated into amino acid strings, except it’s also a molecule, and so sometimes it folds into a structure and just does shit itself; proteins have a function, except sometimes they have many functions, because the “lock and key” metaphor isn’t wrong, except when you’ve got a billion locks and your key’s kinda floppy, it’ll probably fit more than one. Nature plays with physical systems and will repurpose anything to do anything else - the informatics only take you so far, all the real action is vibrating molecules.

holodro 1 day ago

> Coming into bio as a programmer, this is the absolute sine qua non rule you need to internalize: there are no boundaries between systems, because everything is jiggling atoms.

(Similar background as you.) Another sine qua non rule is that evolution created biology: it wasn't engineered like software and it doesn't decompose like software. Evolution creates hairballs that don't respect traditional engineering boundaries and abstraction hierarchies.

From that, along with probabilistic molecular jiggling, we get biological systems that are quite difficult to understand, predict, and control.

kurthr 23 hours ago

It's a good start to realize that what underlies all the understanding of science are simplified predictive models, and usually only statistical models at that.

What this means is that running an experiment in many fields is so difficult that replication is a real challenge. There are so MANY ways you can screw up, or you could just have a statistical fluke that screws you over. Just a tiny contamination or seemingly irrelevant missed step will cause a failure. That's why the idea of having journals composed of failed experiments just doesn't work. Unstated experimental process assumptions are legion. Sometimes an expert can look at the result and see what you've done wrong (like bad contacts in "Electron Band Structure In Germanium, My Ass") and often not even that. Sometimes there's something interesting in the failure, but 99% of the time it's just that your pitch is so bad you can't hit the strike zone. Do better!

The things that are easy to replicate (and usually they've been specifically designed that way, like Starbucks' over-roasted beans) have actually been reduced to engineering. They're not on the edge where scientists can get published. That way perverse incentive madness lies.

Enjoy the controllability of inputs, the repeatability of bugs, the near perfection of compilers and memory allocation, the complete independence of variables while you can. Unless, that is, you like Rowhammer and voltage glitch attacks.

seamossfet 1 day ago

Great write up, we're working on a drug discovery CAD tool and MD has been one of our focal points. Extremely challenging and fun problem to work on!

What complicates things is that the experimental data we get back from labs to validate MD behavior is extremely tricky to work with. Most of what we're working with is NMR data, which shows flexibility in areas of the proteins, but even then we're left with these mathematical models to attempt to "make sense" of the flexibility and infer dynamics from that. Sometimes it feels like an art and a science trying to get meaningful insights from lab data like this.

It's extremely difficult to experimentally verify any MD model since, as mentioned in the article, most of the data we're working with are static mugshots in the form of crystal structures.
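To make "flexibility in areas of the protein" concrete on the simulation side: one standard per-atom measure computed from an MD trajectory is the root-mean-square fluctuation (RMSF). The sketch below is only an illustration, not the tool or method described in this comment. The function name `rmsf` and the plain list-of-frames input are assumptions, and the frames are assumed to be already superimposed on a common reference.

```python
import math


def rmsf(frames):
    """Per-atom root-mean-square fluctuation over a trajectory.

    frames: list of frames, each frame a list of (x, y, z) tuples, one per atom.
    Assumption: frames were already aligned to a common reference, so the
    fluctuation reflects internal motion rather than overall tumbling.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for a in range(n_atoms):
        # Time-averaged position of atom a
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position
        msd = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

High-RMSF regions are the floppy parts of the simulated structure. Relating those numbers to what an NMR experiment actually reports still needs a model of the measurement, which is the hard part described above.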

forgotpwagain 1 day ago

Very cool. There are also methods that allow you to extract some notion of motion from variability in CryoEM data, e.g. CryoDRGN-ET [1].

I'm curious if you've worked with any of those models and how they relate to NMR data and MD simulations.

[1] https://www.nature.com/articles/s41592-024-02340-4

abhishaike 23 hours ago

+1 to this!

I've also written a potentially helpful coverage piece on extracting conformations from cryo-EM data: https://www.owlposting.com/p/a-primer-on-ml-in-cryo-electron...

colingauvin 23 hours ago

There are also techniques that combine both. In my experience (as an experimental structural biologist working in drug design), they frequently disagree.

the__alchemist 1 day ago

That's so cool! What's the software like, compared to say, PyMol? Is it like PyMol, integrated with docking? Are you using MD to position the drugs instead of trying different combos, like Vina does?

edwardbernays 22 hours ago

hello, I have an undergrad degree in computer science and I'm trying to teach myself informatics to get into this field. do you have any tips, or perhaps an internship available?

if you can reach out at all, you can find me at [masterfully dot blundered] on the normal g-domain. I briefly skimmed your profile for contact info but could not find any.

max_ 1 day ago

There is a brilliant video by the hedge fund manager D. E. Shaw about molecular dynamics simulation.

It's very accessible and I found it very interesting: https://youtu.be/PGqCeSjNuTY?feature=shared

GubbinEel 21 hours ago

MD is a great entry point for anyone interested in scientific computing. A naive simulation is super easy to implement but you quickly learn hard lessons regarding performance scaling. I wrote an MD engine as a demo project for learning the basics of CUDA C.
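For a sense of what "naive" means here, a minimal sketch in plain Python: Lennard-Jones particles integrated with velocity Verlet. This is not the CUDA engine mentioned above. The function names, reduced units (`eps`, `sigma`, `mass` all 1), and the lack of a cutoff, neighbor list, or periodic box are all simplifying assumptions.

```python
def lj_forces(pos, eps=1.0, sigma=1.0):
    """Naive all-pairs Lennard-Jones forces and potential energy.

    The double loop visits every pair, so cost grows as O(N^2).
    """
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            s6 = (sigma * sigma / r2) ** 3          # (sigma/r)^6
            energy += 4.0 * eps * (s6 * s6 - s6)
            # Force on i is f * d; force on j is equal and opposite
            f = 24.0 * eps * (2.0 * s6 * s6 - s6) / r2
            for k in range(3):
                forces[i][k] += f * d[k]
                forces[j][k] -= f * d[k]
    return forces, energy


def velocity_verlet(pos, vel, dt, steps, mass=1.0):
    """Advance pos and vel in place; return the final total energy."""
    forces, pot = lj_forces(pos)
    for _ in range(steps):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass  # half kick
                pos[i][k] += dt * vel[i][k]                  # drift
        forces, pot = lj_forces(pos)
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass  # half kick
    kin = 0.5 * mass * sum(c * c for v in vel for c in v)
    return pot + kin


if __name__ == "__main__":
    pos = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
    vel = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(velocity_verlet(pos, vel, dt=0.001, steps=1000))
```

The pair loop is where the performance lessons come from: doubling the particle count roughly quadruples the work per step, which is why production engines rely on cutoffs, neighbor lists, and GPUs.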

For anyone with further interest in MD, two of the popular engines, Amber and Gromacs, have excellent documentation for learning (1, 2). MDAnalysis is a popular analysis package. Their docs give a great rundown of what type of information you can glean from MD (3). If you’re strictly interested in eye candy, there’s a fabulous Blender plugin for visualizing MD simulations and protein structures (4). I also wrote a little Python program for setting up simulation systems; you can do some fun stuff with it (5).

(1) https://ambermd.org/Manuals.php

(2) https://manual.gromacs.org/current/index.html

(3) https://www.mdanalysis.org/pages/documentation/

(4) https://bradyajohnston.github.io/MolecularNodes/

(5) https://github.com/AppleIntusion/MMAEVe

viapivov 1 day ago

MD has always fascinated me, but I have always been skeptical about it. What understanding could we get from watching a complex system like a peptide evolve in time? A dissociation constant, which is easier to measure than the molecule is to crystallize? Can we ever be sure that we have accounted for all the necessary quantum effects, such as interactions between pi orbitals?

siver_john 3 hours ago

Biased: Was a researcher in this field...

For MD, specifically the type talked about here, we aren't taking in all the quantum effects, and that is known. Crystallizing molecules, especially large ones, whether dynamic proteins or ones embedded in lipids, is hard. Crystallizing during transitory states is orders of magnitude more difficult. MD allows us to visualize those transitory states and was used, for example, to observe the unfolding of the spike protein in SARS-CoV-2 to assist in designing mRNA vaccines, because the important amino acids could be observed.

There are a lot of times where it is good enough, or outpaces current experimental techniques, so that it is the tool for the job. But it is not perfect and very rarely can stand completely on its own, in say drug discovery or other fields.

siver_john 1 day ago

Amazing article on molecular dynamics. Among the infinite number of things they could add is a small segment on coarse graining. Though I'm biased (and have been thinking about writing one myself).

Granted, I wish this had been around when I started my journey instead of having to delve into things like the Amber manual... (which I will grant is wonderful for its information but the organization isn't as convenient).

abhishaike 1 day ago

Author here, I wish I added a section on coarse graining as well :) hope you write a post about it!

fentonc 1 day ago

Fun article! I was one of the architects on Anton 2 and Anton 3 at DESRES.

max_ 1 day ago

Hi,

Do you have any resources that you recommend on coarse graining?

I am really interested in the topic.

siver_john 5 hours ago

Depends on how deep you want to go. The pioneering work in this field is the "The multiscale coarse-graining method" series of papers kicked off by Noid, who is still a person to pay attention to in this field.

I also believe Frank Noe's work is worth watching in the ML potential space for proteins.

I'm also always happy to talk about CG.
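For readers new to the term, the first step of most coarse-graining schemes is a mapping from groups of atoms to single "beads". Below is a minimal sketch of a center-of-mass mapping. It shows only the mapping step, not the multiscale coarse-graining method itself, which is about deriving the forces between beads. The function name `cg_map` and the `groups` input format are assumptions for illustration.

```python
def cg_map(coords, masses, groups):
    """Map atoms onto coarse-grained beads at each group's center of mass.

    coords: list of (x, y, z) atom positions
    masses: list of atom masses, same order as coords
    groups: list of lists of atom indices, one list per bead (hypothetical format)
    """
    beads = []
    for group in groups:
        m_tot = sum(masses[i] for i in group)
        beads.append(tuple(
            sum(masses[i] * coords[i][k] for i in group) / m_tot
            for k in range(3)
        ))
    return beads
```

With far fewer particles per frame, the all-pairs cost drops sharply. The difficulty then moves to finding bead-bead potentials that reproduce the behavior of the atomistic system.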

max_ 4 hours ago

How do I get in touch?

My email OTOH is in my bio.

siver_john 3 hours ago

Sent an email =)

frgoe 23 hours ago

I am currently working on CG potentials. Can really recommend the basics from Gregory A. Voth.

hoseja 7 hours ago

When I used many of these same methods at uni, all the software was an utterly unoptimized "scientist-programmed" spaghetti mess and I strongly suspect it still is.

And there is not a 100x hero hacker that could clean this Augean stable.