Patterns in XPFlow

XPFlow offers a large set of constructs for building arbitrary workflows. This section summarizes most of them. For more information on workflow patterns, see the seminal work of van der Aalst et al.

Concepts

Process

Describes a workflow using a domain-specific language.

process :main do 
    log "Hello"
end

Activity

Describes an activity that can be called from a process. Unlike processes, activities contain arbitrary Ruby code.

activity :say_hello do
    puts "hello"
end
process :main do 
    say_hello
end

Basic patterns

Sequence

This pattern is the most common one and is often implicit (for example in process definitions). It executes activities in a given order.

sequence do
    log "First activity"
    log "Second activity"
end

Parallel

The parallel pattern executes a given list of activities in parallel and waits for all of them to finish. The amount of parallelism is limited by the runtime, but may be implicitly limited even further.

parallel do
    log "First activity"
    sequence do
        log "Second activity"
        log "Third activity"
    end
end

The execution order of the first activity relative to the remaining ones is unspecified. However, the second activity will always run before the third one.

Run

The run pattern runs activities or other processes as part of workflow execution.

sequence do
    run :subprocess
    run :subactivity
end

Foreach

The foreach pattern executes a workflow block for each element in a list sequentially.

foreach list do |element|
    log "Processing #{element}"
end

Forall

This pattern executes a workflow block for each element in a list in parallel and waits for all branches to finish.

forall list do |element|
    log "Processing #{element}"
end

In the example above, the order in which the elements of the list are processed may differ from the list order. However, the result of the forall pattern is ordered like the original list.
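The ordering guarantee can be illustrated with a minimal plain-Ruby sketch. It uses threads and is not XPFlow's implementation; forall_sketch is a hypothetical helper introduced only for this illustration.

```ruby
# Plain-Ruby sketch of the forall ordering guarantee.
# forall_sketch is a hypothetical helper, not part of XPFlow.
def forall_sketch(list, &block)
    # Start one thread per element; the completion order is unspecified.
    threads = list.map { |element| Thread.new { block.call(element) } }
    # Thread#value joins the thread and returns the block's result.
    # Collecting in creation order keeps the results in list order.
    threads.map(&:value)
end

squares = forall_sketch([3, 1, 2]) do |n|
    sleep(n * 0.01)   # smaller elements tend to finish first
    n * n
end
p squares   # => [9, 1, 4]
```

Even though the branch for 1 is likely to finish before the branch for 3, the results come back in the order of the input list.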

On

The on pattern is the equivalent of the if statement in programming languages.

on (x == :ready) do
    log "x is ready"
    otherwise
    log "x is not ready"
end

The otherwise clause is optional.

Switch

Similarly, the switch pattern is the equivalent of the statement of the same name in programming languages.

switch val do
    on(3) { log "value is 3" }
    on(5) { log "value is 5" }
    default do
        log "value is different"
    end
end

Unbound switch

A variant of the switch pattern runs a subworkflow depending on arbitrary conditions.

switch do
    on(val == 3) { log "value is 3" }
    on(val2 == 5) { log "value2 is 5" }
    default do
        log "no case matched"
    end
end

Note, however, that only the first matching case is executed, even if several cases match.

Multi

The multi pattern behaves like the unbound switch, except that every matching case is executed, and the matches run in parallel.

multi do
    on(val == 3) { log "value is 3" }
    on(val2 == 5) { log "value2 is 5" }
    default do
        log "no case matched"
    end
end

In the example above, if val is 3 and val2 is 5, both workflow blocks are executed in parallel. If no condition matches, the default block is executed.
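The difference between the unbound switch (first match only) and multi (all matches) can be sketched in plain Ruby. The helpers switch_sketch and multi_sketch are hypothetical and only model the selection rule described above; they are not XPFlow's implementation.

```ruby
# Each case is a [condition, action] pair; both helpers are hypothetical.

# Unbound switch: run only the first matching case, else the default.
def switch_sketch(cases, default)
    match = cases.find { |condition, _action| condition }
    match ? [match[1].call] : [default.call]
end

# Multi: run every matching case in parallel, else the default.
def multi_sketch(cases, default)
    matches = cases.select { |condition, _action| condition }
    return [default.call] if matches.empty?
    matches.map { |_condition, action| Thread.new { action.call } }.map(&:value)
end

val, val2 = 3, 5
cases = [
    [val == 3,  -> { "value is 3" }],
    [val2 == 5, -> { "value2 is 5" }]
]
default = -> { "no case matched" }

p switch_sketch(cases, default)   # => ["value is 3"]
p multi_sketch(cases, default)    # => ["value is 3", "value2 is 5"]
```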

Info

The result of a workflow block inside the info pattern carries additional information on its execution, in particular its execution time.

sequence do
    r = info { sleep 0.05 }
    log "Execution time is #{r.time}"
end
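As a rough plain-Ruby analogy, the idea of wrapping a result together with its execution time might look as follows. InfoResult and info_sketch are hypothetical names; the only thing taken from the text above is that the wrapped result exposes a time.

```ruby
# Hypothetical wrapper pairing a block's value with its wall-clock duration.
InfoResult = Struct.new(:value, :time)

def info_sketch
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    value = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    InfoResult.new(value, elapsed)
end

r = info_sketch { sleep 0.05; :done }
puts "Execution time is #{r.time}"
```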

Checkpoint

XPFlow can take checkpoints (or snapshots) during workflow execution. At these well-defined points, the state of the workflow is saved to disk and can be used to restart execution later. Currently, only the state of the workflow is saved; the platform state is ignored.

process :main do
    log "Before checkpoint"
    checkpoint :name_of_checkpoint
    log "After checkpoint"
end

If you run this code snippet twice, you will notice that the first log line does not show up during the second run. By default, XPFlow restarts experiment execution from the latest checkpoint. You can ignore checkpoints completely with the -I switch.

Advanced patterns

Many

The many pattern differs from parallel in that it only waits for a subset of its parallel subworkflows to finish.

many(2) do
    log "First activity"
    log "Second activity"
    log "Third activity"
end

In the example above, the execution of the many workflow block finishes as soon as any two of the three subworkflows have finished.
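Waiting for only a subset of parallel branches can be sketched in plain Ruby with threads and a queue. The helper many_sketch is hypothetical and not XPFlow's implementation. In particular, the sketch simply leaves the remaining branches running, which is an assumption made for brevity and says nothing about what XPFlow does with unfinished subworkflows.

```ruby
# Hypothetical helper: run all tasks in parallel, return after `count` finish.
def many_sketch(count, tasks)
    done = Queue.new
    # Each branch pushes its result onto the queue when it finishes.
    tasks.each { |task| Thread.new { done << task.call } }
    # Queue#pop blocks until a result is available, so this line
    # returns as soon as `count` branches have completed.
    Array.new(count) { done.pop }
end

tasks = [
    -> { sleep 0.01; "First activity" },
    -> { sleep 0.5;  "Second activity" },
    -> { sleep 0.02; "Third activity" }
]

finished = many_sketch(2, tasks)
p finished.size   # => 2
```

Calling many_sketch(1, tasks) models the any pattern described in the next section.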

Any

This pattern is a special case of many where the first parameter is equal to one: the pattern finishes as soon as one of the parallel subworkflows finishes.

any do
    log "First activity"
    log "Second activity"
    log "Third activity"
end

Formany

The formany pattern is a loop version of many and is similar to forall. It executes a subworkflow in parallel for all elements in a given list and finishes once a given number of them, passed as the first parameter, have finished.

formany 10, list do |element|
    log "Processing #{element}"
end

Forany

The forany pattern is a special case of formany and finishes when any one of the parallel iterations finishes.

forany list do |element|
    log "Processing #{element}"
end

The example above is equivalent to formany with the first parameter equal to 1.

Seqtry

The subworkflows in the seqtry pattern are executed sequentially for as long as each one fails. The pattern finishes with the first successful subworkflow, or fails if all subworkflows fail.

seqtry do
    code { 1/0 }
    log "Success"
    log "Another success"
end

In the example above, the first subworkflow (the code activity) will be executed and will fail (due to the division by zero). Execution then continues with the second subworkflow, which succeeds and thereby finishes the seqtry pattern. Therefore, the last subworkflow will not be executed.
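The same control flow can be sketched in plain Ruby. The helper seqtry_sketch is hypothetical; it models subworkflows as lambdas and failures as exceptions, and is not XPFlow's implementation.

```ruby
# Hypothetical helper: try each step in order until one succeeds.
def seqtry_sketch(steps)
    last_error = nil
    steps.each do |step|
        begin
            return step.call        # the first success ends the pattern
        rescue StandardError => e
            last_error = e          # remember the failure, try the next step
        end
    end
    # Every step failed (or there were none): the pattern itself fails.
    raise last_error || RuntimeError.new("no steps given")
end

executed = []
result = seqtry_sketch([
    -> { executed << :division; 1 / 0 },
    -> { executed << :success;  "Success" },
    -> { executed << :another;  "Another success" }
])

p result     # => "Success"
p executed   # => [:division, :success]
```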

Times

The times pattern executes a given workflow block a number of times sequentially.

times do |iteration|
    log "Iteration #{iteration}"
end

Try

The try block executes a subworkflow, retrying it if it fails. After the given number of retries, it propagates the error to the parent workflow.

try :retry => 5 do
    log "Failing operation"
end
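A plain-Ruby sketch of this retry behavior is shown below. The helper try_sketch is hypothetical, and it assumes that the retry count excludes the initial attempt (so 5 retries mean up to 6 attempts); the text above does not say how XPFlow counts.

```ruby
# Hypothetical helper: run the block, retrying up to `retries` times on failure.
# Assumption: `retries` does not include the initial attempt.
def try_sketch(retries)
    attempts = 0
    begin
        attempts += 1
        yield attempts
    rescue StandardError
        retry if attempts <= retries   # rerun the begin block
        raise                          # retries exhausted: propagate the error
    end
end

value = try_sketch(5) do |attempt|
    raise "Failing operation" if attempt < 3
    "succeeded on attempt #{attempt}"
end
p value   # => "succeeded on attempt 3"
```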

Result

The result pattern stores the result of a subworkflow inside a YAML file. If the file already exists, the subworkflow is not executed.

result "file.yaml" do
    log "Computing results"
    value([ 1, 2 ])
end

In the example above, the array of two numbers will be stored in a file. If the file exists already, the subworkflow does not execute and the previous result is used instead.
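The caching behavior can be sketched in plain Ruby with the standard yaml library. The helper result_sketch is hypothetical and not XPFlow's implementation; the sketch writes to a temporary directory so that it leaves no files behind.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical helper: reuse the YAML file if present, else compute and store.
def result_sketch(path)
    return YAML.load_file(path) if File.exist?(path)
    value = yield
    File.write(path, YAML.dump(value))
    value
end

runs = 0
first = second = nil
Dir.mktmpdir do |dir|
    path = File.join(dir, "file.yaml")
    first  = result_sketch(path) { runs += 1; [1, 2] }   # computes and stores
    second = result_sketch(path) { runs += 1; [1, 2] }   # loads from the file
end

p first    # => [1, 2]
p second   # => [1, 2]
p runs     # => 1
```

The block runs only once: the second call finds the file and returns the stored array without executing the block.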