“Workflow” refers in general to the modelling and IT management of all the tasks and actors that make up a business process. Scientific workflows can be used by researchers to formalise and structure complex scientific experiments in order to enable and accelerate scientific discoveries. Research communities have developed different workflow systems and a large number of workflows to run their experiments (Deelman et al., 2009). These systems differ in their workflow description languages and workflow management systems. A workflow management system is a software service that provides and controls all, or only part, of the runtime of a workflow instance. Its main components and functionalities are: workflow definition, workflow specification, grid resources, information services, and a workflow enactment engine capable of scheduling, data transfers and fault tolerance.
A workflow language provides a way to describe a workflow and to make its execution possible through a workflow engine. It is similar to a programming language, and each workflow management system has its own language (commonly XML based).
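As a rough illustration of what an XML-based workflow description could look like, and how an engine might read it, here is a minimal Python sketch. The element and attribute names (`workflow`, `task`, `depends`) and the task commands are hypothetical and do not correspond to the language of any system listed on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML workflow description; not the format of any real system.
WORKFLOW_XML = """
<workflow name="calibrate-and-image">
  <task id="fetch" command="fetch_data"/>
  <task id="calibrate" command="calibrate" depends="fetch"/>
  <task id="image" command="make_image" depends="calibrate"/>
</workflow>
"""

def parse_workflow(xml_text):
    """Return (workflow name, {task id: {"command": ..., "depends": [...]}})."""
    root = ET.fromstring(xml_text)
    tasks = {}
    for task in root.findall("task"):
        # "depends" is an optional comma-separated list of task ids.
        depends = task.get("depends", "")
        tasks[task.get("id")] = {
            "command": task.get("command"),
            "depends": [d for d in depends.split(",") if d],
        }
    return root.get("name"), tasks

name, tasks = parse_workflow(WORKFLOW_XML)
print(name)
for task_id, info in tasks.items():
    print(task_id, info["command"], info["depends"])
```

A real workflow language also describes data ports, resources and error handling; the sketch only shows the basic idea of tasks and the dependencies between them.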
There are more than 50 different workflow management systems, e.g.: BioPipe, BizTalk, BPWS4J, DAGMan, GridAnt, Grid Job Handler, GRMS, GWFE, GWES, IT Innovation Enactment Engine, JIGSA, JOpera, Kepler, Karajan, OSWorkflow, Pegasus (uses DAGMan), Platform Process Manager, ScyFLOW, SDSC Matrix, SHOP2, Taverna, Triana, WebAndFlo, WFEE, wftk, YAWL Engine, Unicore…
The workflow management systems (WMS) commonly used in astronomy are described below.
The Taverna suite of tools brings together a range of features that make it easier for users to find, design and execute complex workflows and to share them with other people. Initially developed by the European Bioinformatics Institute and the University of Manchester, it is now used by many other communities.
Taverna is open source software written in Java (LGPL licensed; GPL for the Astronomy versions).
Taverna Workbench 2.5 for Astronomy enables you to graphically create, edit and run workflows on your computer. It is available at http://www.taverna.org.uk/download/workbench/2-5/astronomy/.
The Taverna Workbench Astronomy 2.5 package is taverna-workbench-astronomy-2.5.0-standalone.zip.
Taverna Server 2.5.4 is the remote workflow execution service: it enables you to set up a dedicated server for executing workflows remotely. It is available at http://www.taverna.org.uk/download/server/2-5/.
The Taverna Server 2.5.4 WAR is tavernaserver-2.5.4.war.zip.
Kepler is a generic science-oriented workflow system (astronomy, ecology, bioinformatics, geology…). Kepler is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet, and is an effective environment for integrating disparate software components, such as merging “R” scripts with compiled “C” code, or facilitating remote, distributed execution of models.
Kepler is open source software written in Java (BSD).
The current version of Kepler is 2.5; it is available from the Kepler site: https://code.kepler-project.org/code/kepler/releases/installers/2.5/
The Kepler 2.5 installer for Linux can be downloaded here.
Triana is a workflow system originally built to provide a tool for rapid analysis of data from gravitational waves. Initially, the procedures were modelled and executed locally or remotely using RMI. More recently, Triana has been extended to incorporate components that are distributed, grid computing-oriented or Web Services-oriented. Triana comes with a wide variety of built-in tools, including an extensive signal-analysis toolkit, an image-manipulation toolkit, a desktop publishing toolkit, and many more.
Triana is an open source project developed mainly by Cardiff University. Code contributions can be made through GitHub.
Triana version 4.0 can be downloaded here.
Pegasus is a highly fault tolerant workflow management system that runs workflow applications in many different environments including desktops, campus clusters, grids, and now clouds. Pegasus enables scientists to construct workflows in abstract terms without worrying about the details of the underlying execution environment or the particulars of the low-level specifications required by the middleware (Condor, Globus, or Amazon EC2). Pegasus also bridges the current cyberinfrastructure by effectively coordinating multiple distributed resources.
Pegasus is open source software (Apache License).
The version of Pegasus available at the time this document was prepared is 4.6.0; it can be obtained from the Pegasus site: http://download.pegasus.isi.edu/pegasus/4.6.0/
The Pegasus 4.6.0 source code can be downloaded here.
To be filled in (TJD, Astron, 7 June 2016)
WS-PGRADE/gUSE is a collaborative and community-oriented application development environment that allows developers and end-users to develop and share workflows, workflow graphs, workflow templates, and ready-to-run workflow applications. It is based on WS-PGRADE, a workflow engine that offers generic services to handle the distribution, monitoring and execution of workflow modules. It is able to run workflows written in different workflow languages that require different workflow management systems (non-native workflows). A meta-workflow is a workflow that involves both native and non-native workflows as its constituent parts. The ability to design and execute meta-workflows is a distinctive feature of the SHIWA Simulation Platform, which uses WS-PGRADE/gUSE.
WS-PGRADE/gUSE is an open source project hosted on SourceForge: https://sourceforge.net/projects/guse/
WS-PGRADE/gUSE version 3.7 can be downloaded here.
Yabi is an open source software system designed to provide transparent access to high performance computing. It is ‘a workflow engine that solves the problem of workflow deployment across disparate legacy HPC resources’. Designed to provide a more intuitive workflow environment, Yabi is a web-based user service aimed at an audience that may not have specialized programming skills.
Yabi is open source software (GPL), freely available on Bitbucket: https://bitbucket.org/ccgmurdoch/yabi/downloads.
Yabi source code can be downloaded here.
HTCondor DAGMan is a manager for directed acyclic graphs (DAGs). A DAG can be used to represent a set of programs where the input, output, or execution of one or more programs is dependent on one or more other programs. The programs are nodes (vertices) in the graph, and the edges (arcs) identify the dependencies. HTCondor finds machines for the execution of programs, but it does not schedule programs (jobs) based on dependencies. DAGMan submits jobs to HTCondor in an order represented by a DAG and processes the results. An input file defined prior to submission describes the DAG, and an HTCondor submit description file for each program in the DAG is used by HTCondor.
DAGMan and HTCondor are open source software (Apache License); the version available as of April 2016 is 8.4.4.
DAGMan is part of the HTCondor software, which can be downloaded here.
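The idea of submitting jobs in an order that respects a DAG can be sketched in a few lines of Python. This is only an illustration of dependency ordering (Kahn's topological sort), not DAGMan's input file format or its actual implementation; the function name `submission_order` and the job names are hypothetical. Unlike a real scheduler, which can run independent jobs in parallel, the sketch returns a single serial order.

```python
from collections import deque

def submission_order(dependencies):
    """Return a job order in which every job follows all of its parents.

    `dependencies` maps each job name to the list of jobs it depends on.
    Raises ValueError if the graph contains a cycle (i.e. is not a DAG).
    """
    # Collect every job, including ones that only appear as parents.
    jobs = set(dependencies)
    for parents in dependencies.values():
        jobs.update(parents)

    # remaining[job] holds the parents that have not been submitted yet.
    remaining = {job: set(dependencies.get(job, ())) for job in jobs}
    children = {job: [] for job in jobs}
    for job, parents in remaining.items():
        for parent in parents:
            children[parent].append(job)

    # Jobs with no unfinished parents are ready to be submitted.
    ready = deque(sorted(job for job, parents in remaining.items() if not parents))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in sorted(children[job]):
            remaining[child].discard(job)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(jobs):
        raise ValueError("dependency graph contains a cycle")
    return order

# Hypothetical diamond-shaped DAG: A runs first, B and C need A, D needs both.
diamond = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(submission_order(diamond))
```

In the diamond example, job D is only released once both B and C have been handled, which mirrors the dependency rule described above.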
Moteur is a workflow engine for the flexible and efficient deployment of data-intensive applications. It is open source software.
Apache Airavata is a software framework for executing and managing computational jobs and workflows on distributed computing resources including local clusters.
Galaxy is a scientific workflow, data integration, and data and analysis persistence and publishing platform.
Swift is a parallel scripting language with many of the capabilities of scientific workflow systems built in.
Ergatis is a web-based utility that is used to create, run, and monitor reusable computational analysis pipelines (mainly for bioinformatics).
Tavaxy is a pattern-based workflow system for the bioinformatics domain, focusing on genome comparison and sequence analysis.
Deelman, E., Gannon, D., Shields, M., Taylor, I., 2009. Workflows and e-science: An overview of workflow system features and capabilities. Future Generation Computer Systems 25, 528–540.
Barker, A., Van Hemert, J., 2008. Scientific Workflow: A Survey and Research Directions. Lecture Notes in Computer Science 4967, Gdansk, Poland: Springer Berlin / Heidelberg, 746–753.
Olabarriaga, S., et al., 2014. Scientific Workflow Management – For Whom? 2014 IEEE 10th International Conference on e-Science, Sao Paulo, 298–305.
* TopCAT: astronomical catalogue handling, including remote fetching of catalogues using SQL and VO protocols. Source code: topcat_src.zip. The project is being developed on GitHub. License: GPL
* TapSH: tapsh is a simple shell to query TAP servers, built on GAVO's tapquery library. Packages: tapsh-latest.tar.gz, gavodachs-latest.tar.gz, gavostc-latest.tar.gz, gavoutils-latest.tar.gz, gavovot-latest.tar.gz. License: GPL
Not open source, and therefore cannot be used for Obelics:
* VoPlot AstroStat: non-open source
* AstroGrid: source no longer available on the web