This is one of the most interesting (and hilarious) stories I’ve heard this year.
A competitor in a major RNA folding competition lacked GPU access. This “GPU poor” competitor had to innovate, and they ended up beating everyone in a field of 2,400 participants.
How? Pure engineering and ingenuity. Instead of tackling the problem with a very large AI model, they were forced to be smarter. They built a complex data pipeline that just… achieved better results. The focus was on data quality and better algorithms. The method was a template-based modeling (TBM) data pipeline (1990s tech…).
Now, the officially winning solution was a hybrid. But the real story is that a heavy, data-centric approach can still out-innovate a pure AI one.
This was RNA folding (not protein folding), a problem with a much smaller dataset, and the “classic” method won. The author even mentions in the comments that the original pipeline had no AI at all and scored better. They technically won despite AI.
There are so many lessons here, but the main ones are:
AI is not always the solution.
Necessity is the mother of invention, as you may have heard.
My main takeaway, though? If you are a researcher in a low-resource setting, know that you can compete. You can win by being more resourceful.
The solution, and a must-read: Stanford RNA 3D Folding competition solution write-up
On the computational biology side, allow me to also plug some important recent updates from Google:
This week, Google Research and partners (including UC Santa Cruz) released DeepSomatic, an AI tool that identifies cancer-related mutations in a tumor’s genetic sequence to help pinpoint what’s driving the cancer.
DeepSomatic
The AlphaFold Database has been updated with new data and functionality, continuing the partnership between Google DeepMind and EMBL-EBI.
AlphaFold DB
EMBL-EBI also has a new, free course on how to navigate and use the AlphaFold Database.
Navigating the AlphaFold database