Workflows
Article-GADES
This repository contains the code for generating and benchmarking the results of the GADES package for distance matrix calculation.
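For readers unfamiliar with the term, the following minimal sketch shows what a distance matrix calculation produces: one distance per pair of input rows, arranged in a symmetric matrix. It does not use the GADES API. The function name `pairwise_distances` and the choice of the Euclidean metric are assumptions made purely for illustration.

```python
import math


def pairwise_distances(rows):
    """Return the symmetric matrix of Euclidean distances between rows.

    Illustrative only: the name and the metric are assumptions,
    not part of the GADES package.
    """
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the result is mirrored.
        for j in range(i + 1, n):
            d = math.dist(rows[i], rows[j])
            dist[i][j] = d
            dist[j][i] = d
    return dist


if __name__ == "__main__":
    points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    for row in pairwise_distances(points):
        print(row)
```

The double loop performs n(n-1)/2 distance evaluations, which is why this kind of computation is commonly benchmarked and offloaded to a GPU for large inputs.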
Installation
git lfs install
git clone https://github.com/lab-medvedeva/Article-GADES.git
cd Article-GADES
Put the real datasets, in MEX format, into the folder Datasets/Real.
Running benchmark using Docker Deployment
docker run --gpus all \
-v $PWD/Datasets:/workspace/Article-GADES/Datasets
...
Name: PhysioNet CascadeCSVM Kfold
Contact Person: support-compss@bsc.es
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: MareNostrum5
K-fold cross-validation to evaluate CascadeCSVM accuracy on the PhysioNet dataset (https://b2drop.bsc.es/index.php/s/8Q8MefXX2rrzaWs). This application used dislib-0.9.0.
Name: PhysioNet RF Kfold
Contact Person: support-compss@bsc.es
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: MareNostrum5
K-fold cross-validation to evaluate RandomForest accuracy on the PhysioNet dataset (https://b2drop.bsc.es/index.php/s/8Q8MefXX2rrzaWs). This application used dislib-0.9.0.
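Both entries above evaluate accuracy with K-fold cross-validation. The sketch below shows the idea in plain Python: the sample indices are split into k folds, each fold is held out once as the test set, and the per-fold accuracies are averaged. It does not use the dislib or COMPSs APIs. The names `kfold_indices`, `cross_val_accuracy`, `fit_majority`, and `predict_majority` are hypothetical, and the majority-class classifier is only a stand-in for a real model.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumes 1 <= k <= n_samples so that no fold is empty.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_accuracy(fit, predict, X, y, k):
    """Average test accuracy over k folds."""
    scores = []
    for train, test in kfold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Toy stand-in classifier: always predicts the most common training label.
def fit_majority(X, y):
    return max(set(y), key=y.count)


def predict_majority(model, x):
    return model


if __name__ == "__main__":
    X = [[i] for i in range(10)]
    y = [1] * 10
    print(cross_val_accuracy(fit_majority, predict_majority, X, y, k=5))
```

Any classifier can be plugged in by passing its own `fit` and `predict` callables in place of the toy ones.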
Genome assembly workflow for nanopore reads, for TSI
Input:
- Nanopore reads (can be in format: fastq, fastq.gz, fastqsanger, or fastqsanger.gz)
Optional settings to specify when the workflow is run:
- [1] number of files to split the original input into (to speed up the workflow). Default = 0. Example: set to 2000 to split a 60 GB read file into 2000 files of ~30 MB.
- [2] filtering: minimum average read quality score. Default = 10
- [3] filtering: minimum read length. Default = 200
- [4] ...
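Settings [2] and [3] can be pictured with the following sketch, which keeps a FASTQ read only if it is long enough and its average quality score meets the threshold. This is not the workflow's actual tool or implementation. The function names, the Phred+33 quality encoding, and the plain arithmetic mean of per-base scores are all assumptions; some filtering tools average error probabilities instead, which gives different results.

```python
def mean_quality(quality_string, offset=33):
    """Arithmetic mean of per-base Phred scores (Phred+33 assumed)."""
    if not quality_string:
        return 0.0
    return sum(ord(c) - offset for c in quality_string) / len(quality_string)


def parse_fastq(lines):
    """Parse plain 4-line FASTQ records into (header, sequence, quality)."""
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_reads(records, min_quality=10, min_length=200):
    """Keep reads meeting both thresholds (defaults mirror [2] and [3])."""
    for header, seq, qual in records:
        if len(seq) >= min_length and mean_quality(qual) >= min_quality:
            yield header, seq, qual


if __name__ == "__main__":
    demo = [
        "@long_good", "A" * 250, "+", "5" * 250,   # Q20, length 250: kept
        "@too_short", "A" * 100, "+", "5" * 100,   # length 100: dropped
        "@low_qual", "A" * 250, "+", "#" * 250,    # Q2: dropped
    ]
    for header, seq, qual in filter_reads(parse_fastq(demo)):
        print(header, len(seq))
```

The file split in setting [1] is plain division: 60 GB spread over 2000 files comes to roughly 30 MB per file.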
Process Argo data with the Pangeo ecosystem and visualise them with Ocean Data View (ODV).
An R workflow for proteomics data analysis is reported. This pipeline is based on protein expression projects stored in the PRIDE database and reported on the COVID-19 Data Portal. It analyzes protein expression data from lung cell lines infected by SARS-CoV-2 variants B.1, Delta, and Omicron BA.1 (Mezler et al. 2023) https://www.ebi.ac.uk/pride/archive/projects/PXD037265. This pipeline can obtain DEPs for each variant, starting from normalized protein expression ...
GraphRBF is a state-of-the-art protein-protein/nucleic acid interaction site prediction model built from enhanced graph neural networks and prioritized radial basis function neural networks. This project lets users apply our software directly to predict protein binding sites or train our model on a new database. Identification of protein-protein and protein-nucleic acid binding sites provides insights into biological processes related to protein functions and technical guidance for disease ...
Swedish Earth Biogenome Project - Genome Assembly Workflow
The primary genome assembly workflow for the Earth Biogenome Project at NBIS.
Workflow overview
General aim:
flowchart LR
hifi[/ HiFi reads /] --> data_inspection
ont[/ ONT reads /] --> data_inspection
hic[/ Hi-C reads /] --> data_inspection
data_inspection[[ Data inspection ]] --> preprocessing
preprocessing[[ Preprocessing ]] --> assemble
assemble[[ Assemble ]] --> validation
validation[[ Assembly
...
Secondary metabolite biosynthetic gene cluster (SMBGC) Annotation using Neural Networks Trained on Interpro Signatures