Join us for our weekly series of short talks: nf-core/bytesize.
Just 15 minutes + questions, we focus on topics about using and developing nf-core
pipelines. These are recorded and made available at https://nf-co.re,
helping to build an archive of training material. Got an idea for a talk? Let us know on the #bytesize
Slack channel!
This week, Athanasios Baltzis (@athbaltzis) will talk about the newest developments in the nf-core/proteinfold pipeline.
Video transcription
Note
The content has been edited to make it reader-friendly
(Maxime) Thank you very much for the amazing talk. I will allow everyone to unmute themselves if anyone has any questions. Please, go ahead.
(Question) Otherwise, I have one question. At the moment you only have AlphaFold2, and you are planning to add more tools, not in this first release but in the coming one, right? I assume the main issue with having more tools is that you need a lot of databases as input.
(Answer) Exactly, that’s true. Each tool uses its own databases, so you need a lot of storage to be able to test everything, or even to compare between tools.
(Question) May I ask along this line: do you basically retrain the model every time you run the pipeline, or at least does every institution retrain its model from scratch, or do you use pre-trained models?
(Answer) We use pre-trained models. We just download the models already provided by AlphaFold.
(Question) And it still takes these huge databases?
(Answer) Yes, because this is separate from the training process. These databases are needed to create the input multiple sequence alignments, that is, to gather all the homologous sequences so that the model can find all the correlations, the interesting new correlations, and form the final model.
(Maxime) I think we are good with the number of questions. Thanks again, that was an amazing talk, and I’m super happy to have learned more about it. I’m really looking forward to seeing this release come out.