Join us for our weekly series of short talks: nf-core/bytesize.
Just 15 minutes + questions, we focus on topics about using and developing nf-core pipelines. These are recorded and made available at https://nf-co.re, helping to build an archive of training material. Got an idea for a talk? Let us know on the #bytesize Slack channel!
This week, Maxime Garcia (@maxulysse) will share with us all there is to know (well, some of it) about the brand-new nf-core subworkflows!
Video transcription
Note
The content has been edited to make it reader-friendly
subworkflows
that the infrastructure team, so Matthias, Julia, and everyone else, I don’t know who else is involved in the infrastructure team, but they did a pretty good job with all that. And I always like what they do. It looks so fancy, what they’re doing.
No command subworkflow? Oh, yes, maybe just without the “S”, subworkflow, no? subworkflow. And yes, the S was there. Don’t misspell stuff! For the pipeline, we have info, install, list, remove, and update to develop new subworkflows that will be very similar to the same command that we have for the modules. I will not show that, but I will show all of that.
Let’s try nf-core subworkflows info. I want to get some info about the new subworkflow I want to install. Is the subworkflow locally installed? No, because I want to install it. Please select a subworkflow. I want to select “vcf_annotate_snpeff”, and I get some nice information about all that. It performs annotation with snpEff, and then bgzips and tabix-indexes the resulting VCF file. That’s perfect. We need a meta map with a VCF, a version of the snpEff database, and an optional path to the root cache folder for snpEff, and then we have the output: compressed VCF file plus tabix index, HTML report, and of course the version.
What were the other commands that we could see? Install, list, remove, update. Let’s check list. I want to list. List local. No local, that’s logical. And list remote, that’s the same one. Then let’s go for install. Wait, before I actually install, let’s remove the one that we had. Git remove subworkflow local vcf_annotate_snpeff. I removed my local version of this subworkflow. I will now install the new version. That’s subworkflows install. And I want “vcf annotated”. I could copy from there, but I want to try out what’s happening if I don’t do anything. This is so fancy! I love that! I am a big fan of auto-completion and stuff that is doing that. That was super fast, so it’s done.
Use the following statement to include the subworkflow. I will grab that copy. I launch my code. I have a subworkflow, I have that local. This is my meta subworkflow where I do everything. I’m just copy pasting this new command here. Let’s just align everything well. That looks good… Annotate all… Here we’ll use this new subworkflow which is located there. That’s nice. I don’t have my “vcf_annotate_snpeff” anymore because it’s now there. That’s wonderful. That’s what we expect.
Where are we with Git? So we deleted some files, we added some new files, and we modified some files. Let’s add the new files. We move to file. The meta.yml is different enough that it doesn’t register as a renaming. The script itself for the subworkflow is exactly the same, which makes sense because I created it yesterday and I basically copy-pasted everything. What is happening in the modules.json file? Get this module. This is new. Installed by module, installed by module. That’s interesting. I’m just looking, I like to check everything. I think it’s important. So let’s add this new file. Let’s commit everything. Let’s push. Let’s create the pull request. We want to do that in dev. Let’s replace subworkflow. Create the pull request. That looks good.
I’m thinking there is just one last thing that I need to do, but this is very specific to Sarek. Yes, I need to change the path to the file here. We are doing pytest with tags, and we are watching whether some of the files are being changed or not from one PR to another. And then we are triggering the tests just on that. Because the path is not the same anymore, I just update the path. This is done. Let’s commit that as well. Let’s push. I’m hoping that we are done with this pull request. Yes, we can see that it was failing before. I’m pretty sure that was because of the tests that were failing. Now everything is triggered. I can see the pytest workflow is being triggered at the moment. I’m guessing once it is triggered, it will figure out which tests it has to run or not, but that’s something else. I think that’s good for that.
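The path-based triggering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Sarek's actual CI configuration: the tag names, the path patterns, and the `tags_to_run` helper are all hypothetical. It shows why moving a subworkflow from a local folder to the nf-core folder means the watched path has to be updated too.

```python
from fnmatch import fnmatch

# Hypothetical mapping from a test tag to the path patterns that trigger it.
# These names and paths are illustrative, not taken from the Sarek repository.
TAG_PATTERNS = {
    "annotate_snpeff": ["subworkflows/nf-core/vcf_annotate_snpeff/*"],
    "alignment": ["modules/nf-core/bwa/*"],
}


def tags_to_run(changed_files, tag_patterns=TAG_PATTERNS):
    """Return the sorted tags whose patterns match at least one changed file."""
    selected = set()
    for tag, patterns in tag_patterns.items():
        if any(fnmatch(path, pattern)
               for path in changed_files
               for pattern in patterns):
            selected.add(tag)
    return sorted(selected)


# A change under the watched path selects the matching tag.
print(tags_to_run(["subworkflows/nf-core/vcf_annotate_snpeff/main.nf", "README.md"]))
# The same file under the old local path matches nothing until the pattern is updated.
print(tags_to_run(["subworkflows/local/vcf_annotate_snpeff/main.nf"]))
```

With this kind of setup, a stale pattern fails silently: the tests for the moved subworkflow simply never get selected, which is why the path update belongs in the same pull request as the move.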
Let’s go back to here. I paste the rest of the history here in my slides, and I will share my slides after this talk. I think now it’s time to thank everyone and to go for the questions. These are the institutes that are participating in nf-core. I really need to update that slide because I think it’s already one year old and I’m pretty sure we have more people now. Same with the contributors, but I really want to thank everyone that is contributing to nf-core because it’s a community and that’s a community effort, and without everyone else we wouldn’t do anything. If you have any questions, please ask them, because that was mainly just a demo and that was fairly simple. I’m pretty sure people have more questions.
(host) Thank you very much. You’re now able to unmute yourself. If you have questions, either put them in the chat or ask them straight away. I think I saw some questions. It’s not a question, it was a comment. Someone is being very happy that there are already 24 subworkflows.
(speaker) Yes, because we started the subworkflows at the hackathon properly. That was when? Last month or two months ago?
(host) Last month.
(speaker) Yeah, so 24 just in a month. That’s good. I’m pretty sure we’ll have more and more coming. And I know that Matthias is working on adding the command line help for nf-core tools soon. I’m guessing we’re waiting for a release of tools for that. John, do you have a question?
(question) Yes. Can you hear me? Thanks. Very interesting talk. I’m quite new to this, but I use Nextflow and I am also a little bit used to nf-core. But this thing about subworkflows, is this specific to nf-core or is it something that can apply to other Nextflow pipelines?
(answer) This is something that can be applied to any pipeline. We developed that first with nf-core in mind, like the module, but then every module, like in Nextflow, everything can be a module. Every process can be a module. Every chain of process can be a module. Even the workflow itself can be a module. You can import whatever you want, however you want. Definitely what we are creating here with nf-core, like this subworkflow stuff, it can be used in the broader Nextflow community without any issue.
(question continued) Okay. Thanks.
(question) I also have a question or maybe comment also. That was a great presentation. However, I was going to ask, maybe my first comment is similar to what John just said. The presentation sounded more like subworkflows were nf-core things instead of a Nextflow thing. I think that’s why he was asking that question about whether subworkflows were nf-core or Nextflow. My other question is that, what’s the naming convention for subworkflows in nf-core? Is it like the first word is a verb followed by the names for the tools that you are chaining together in that subworkflow? Because I noticed some pattern like that, but maybe I’m wrong.
(answer) Yes, we have this convention. It’s definitely like an nf-core thing only. I’m guessing like other people that develop stuff might want to follow the convention as well. I’m happy to talk more about that. But I think we have this convention. I think it’s the input file type, which is the first. Then it should be a verb and then the list of the tools that are used. For example, like in that case, what we were doing with this subworkflow that I just added, it was vcf underscore annotate underscore snpeff.
(question continued) Yeah, thanks.
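The naming convention described in that answer (input file type first, then a verb, then the list of tools) can be illustrated with a small Python sketch. The `split_subworkflow_name` helper is hypothetical and is not part of nf-core tools. It assumes the parts are separated by underscores, as in the example from the talk, and it does not check that the second part is really a verb.

```python
def split_subworkflow_name(name):
    """Split '<input type>_<verb>_<tool>[_<tool>...]' into its parts."""
    parts = name.lower().split("_")
    # At least an input type, a verb, and one tool are expected.
    if len(parts) < 3 or not all(parts):
        raise ValueError(
            f"expected at least three '_'-separated parts, got {name!r}"
        )
    input_type, verb, *tools = parts
    return {"input_type": input_type, "verb": verb, "tools": tools}


print(split_subworkflow_name("vcf_annotate_snpeff"))
# {'input_type': 'vcf', 'verb': 'annotate', 'tools': ['snpeff']}
```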
(host) Thank you very much. Are there any more questions? It doesn’t seem so. If you have more questions, as usual, you can go to Slack, either in the bytesize channel or there’s actually a channel also for subworkflows?
(speaker) Yes, there is a channel for subworkflows. A channel for tools as well, obviously.
(host) Obviously. Or you can directly ask Maxime. Otherwise, I would like to thank Maxime for the talk and, as usual, the Chan Zuckerberg Initiative for funding. And you all for listening. Thank you very much.