<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="chapter" title="Chapter 2. Tutorial">
<div class="titlepage"><div><div><h2 class="title">
<a name="tutorial"></a>Chapter 2. Tutorial</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="tutorial.php#idp8550784">2.1. Introduction</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7549680">2.2. Getting Started</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp6757456">2.3. Generating the Workflow</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp9218688">2.4. Information Catalogs</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7707680">2.5. Configuring Pegasus</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7712304">2.6. Planning the Workflow</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7740880">2.7. Submitting the Workflow</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7746144">2.8. Monitoring the Workflow</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7761008">2.9. Debugging the Workflow</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7772528">2.10. Collecting Statistics</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7789952">2.11. Workflow Dashboard</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7838832">2.12. Conclusion</a></span></dt>
</dl></div>
<div class="section" title="2.1. Introduction">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp8550784"></a>2.1. Introduction</h2></div></div></div>
<p>This tutorial will take you through the steps of creating and
    running a simple workflow using Pegasus. It is intended for new
    users who want a quick overview of Pegasus concepts and usage. The
    tutorial covers creating, planning, submitting, monitoring, and debugging
    a simple diamond-shaped workflow, and generating statistics for it. More
    information about the topics covered in this tutorial can be found in
    later chapters of this user's guide.</p>
<p>All of the steps in this tutorial are performed on the command-line.
    The convention we will use for command-line input and output is to put
    things that you should type in bold, monospace font, and to put the output
    you should get in a normal weight, monospace font, like this:</p>
<pre class="programlisting">[user@host dir]$ <span class="bold"><strong>you type this</strong></span>
you get this</pre>
<p>Here <code class="literal">[user@host dir]$</code> is the terminal prompt,
    the text you should type is “<code class="literal">you type this</code>”, and the
    output you should get is “<code class="literal">you get this</code>”. In the rest of
    this tutorial the terminal prompt is abbreviated as <code class="literal">$</code>. Because some of the
    outputs are long, we don’t always include everything. Where the output is
    truncated, an ellipsis (<code class="literal">...</code>) marks the omitted
    part.</p>
<p><span class="bold"><strong>If you are having trouble with this tutorial,
    or anything else related to Pegasus, you can contact the Pegasus Users
    mailing list at <code class="email">&lt;<a class="email" href="mailto:pegasus-users@isi.edu">pegasus-users@isi.edu</a>&gt;</code> to get
    help.</strong></span></p>
</div>
<div class="section" title="2.2. Getting Started">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7549680"></a>2.2. Getting Started</h2></div></div></div>
<p>In order to reduce the amount of work required to get started we
    have provided several virtual machines that contain all of the software
    required for this tutorial. Virtual machine images are provided for <a class="link" href="tutorial_vm.php#vm_virtualbox" title="A.2. VirtualBox">VirtualBox</a>, <a class="link" href="tutorial_vm.php#vm_amazon" title="A.3. Amazon EC2">Amazon EC2</a> and <a class="link" href="tutorial_vm.php#vm_futuregrid" title="A.4. FutureGrid">FutureGrid</a>. Information about deploying the
    tutorial VM on these platforms is in <a class="link" href="tutorial_vm.php" title="Appendix A. Tutorial VM">the
    appendix</a>. Please go to the appendix for the platform you are using
    and follow the instructions for starting the VM found there before
    continuing with this tutorial.</p>
<p><span class="bold"><strong>Advanced Users:</strong></span> If
    you want to install Pegasus and Condor and go through the tutorial on your
    own machine instead of using one of the virtual machines, the tutorial
    files are available in the <code class="filename">doc/tutorial</code> directory of
    the Pegasus source distribution. These files will need to be modified in
    several places to fix the paths to the user's home directory (which is
    assumed to be <code class="filename">/home/tutorial</code>). The files also assume that
    Pegasus was installed from the RPM, so the path to the Pegasus install is
    assumed to be <code class="filename">/usr</code>. Condor should be installed in the
    “Personal Condor” configuration. You will also need a passwordless ssh key
    to enable SCP file transfers to/from localhost. Getting everything set up
    correctly can be tricky, so we recommend getting started with one of the
    VMs if you are not familiar with Condor and UNIX.</p>
<p>The remainder of this tutorial will assume that you have a terminal
    open to the directory where the tutorial files are installed. If you are
    using one of the tutorial VMs these files are located in the tutorial
    user's home directory <code class="filename">/home/tutorial</code>.</p>
</div>
<div class="section" title="2.3. Generating the Workflow">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp6757456"></a>2.3. Generating the Workflow</h2></div></div></div>
<p>We will be creating and running a simple diamond-shaped workflow
    that looks like this:</p>
<div class="figure">
<a name="idp6758720"></a><p class="title"><b>Figure 2.1. Diamond Workflow</b></p>
<div class="figure-contents"><div class="mediaobject"><img src="images/concepts-diamond.jpg" alt="Diamond Workflow"></div></div>
</div>
<br class="figure-break"><p>In this diagram, the ovals represent computational jobs, the
    dog-eared squares are files, and the arrows are dependencies.</p>
<p>Pegasus reads workflow descriptions from DAX files. The term “DAX”
    is short for “Directed Acyclic Graph in XML”. DAX is an XML file format
    that has syntax for expressing jobs, arguments, files, and
    dependencies.</p>
<p>In order to create a DAX it is necessary to write code for a DAX
    generator. Pegasus comes with Perl, Java, and Python libraries for writing
    DAX generators. In this tutorial we will show how to use the Python
    library.</p>
<p>The DAX generator for the diamond workflow is in the file
    <code class="filename">generate_dax.py</code>. Look at the file by typing:</p>
<pre class="programlisting">$ <span class="bold"><strong>more generate_dax.py</strong></span>
...</pre>
<div class="tip" title="Tip" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Tip</h3>
<p>We will be using the <code class="literal">more</code> command to inspect
      several files in this tutorial. <code class="literal">more</code> is a pager
      application, meaning that it splits text files into pages and displays
      the pages one at a time. You can view the next page of a file by
      pressing the spacebar. Type 'h' to get help on using
      <code class="literal">more</code>. When you are done, you can type 'q' to close
      the file.</p>
</div>
<p>The code has 5 sections:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>A few system libraries and the Pegasus.DAX3 library are
        imported. The search path is modified to include the directory with
        the Pegasus Python library.</p></li>
<li class="listitem"><p>The name for the DAX output file is retrieved from the
        arguments.</p></li>
<li class="listitem"><p>A new ADAG object is created. This is the main object to which
        jobs and dependencies are added.</p></li>
<li class="listitem"><p>Jobs and files are added. The 4 jobs in the diagram above are
        added and the 6 files are referenced. Arguments are defined using
        strings and File objects. The input and output files are defined for
        each job. This is an important step, as it allows Pegasus to track the
        files, and stage the data if necessary. Workflow outputs are tagged
        with “transfer=true”.</p></li>
<li class="listitem"><p>Dependencies are added. These are shown as arrows in the diagram
        above. They define the parent/child relationships between the jobs.
        When the workflow is executing, the order in which the jobs will be
        run is determined by the dependencies between them.</p></li>
</ol></div>
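<p>The role of the dependencies in step 5 can be illustrated without the
    Pegasus library. The following is a minimal sketch in plain Python, not the
    contents of <code class="filename">generate_dax.py</code>: the job labels and the
    <code class="literal">execution_order</code> function are names assumed for
    illustration. It lists the four parent/child pairs of the diamond and
    computes one order in which every parent runs before its children.</p>

```python
from collections import deque

# Parent/child pairs for the diamond workflow (job labels are illustrative).
DEPENDENCIES = [
    ("preprocess", "findrange_left"),
    ("preprocess", "findrange_right"),
    ("findrange_left", "analyze"),
    ("findrange_right", "analyze"),
]


def execution_order(dependencies):
    """Return one job order in which every parent precedes its children."""
    children = {}
    pending_parents = {}
    for parent, child in dependencies:
        children.setdefault(parent, []).append(child)
        children.setdefault(child, [])
        pending_parents.setdefault(parent, 0)
        pending_parents[child] = pending_parents.get(child, 0) + 1

    # Jobs with no unfinished parents are ready to run.
    ready = deque(sorted(job for job, n in pending_parents.items() if n == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in children[job]:
            pending_parents[child] -= 1
            if pending_parents[child] == 0:
                ready.append(child)

    if len(order) != len(pending_parents):
        raise ValueError("dependencies contain a cycle")
    return order


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
```

<p>For the diamond, preprocess comes first and analyze comes last. The two
    findrange jobs do not depend on each other, so a scheduler is free to run
    them at the same time.</p>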
<p>Generate a DAX file named <code class="filename">diamond.dax</code> by
    typing:</p>
<pre class="programlisting">$ <span class="bold"><strong>./generate_dax.py diamond.dax</strong></span>
Creating ADAG...
Adding preprocess job...
Adding left Findrange job...
Adding right Findrange job...
Adding Analyze job...
Adding control flow dependencies...
Writing diamond.dax</pre>
<p>The <code class="filename">diamond.dax</code> file should contain an XML
    representation of the diamond workflow. You can inspect it by
    typing:</p>
<pre class="programlisting">$ <span class="bold"><strong>more diamond.dax</strong></span>
...</pre>
</div>
<div class="section" title="2.4. Information Catalogs">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp9218688"></a>2.4. Information Catalogs</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="tutorial.php#tut_site_catalog">2.4.1. The Site Catalog</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp9232416">2.4.2. The Transformation Catalog</a></span></dt>
<dt><span class="section"><a href="tutorial.php#idp7701216">2.4.3. The Replica Catalog</a></span></dt>
</dl></div>
<p>There are three information catalogs that Pegasus uses when planning
    the workflow. These are the <a class="link" href="tutorial.php#tut_site_catalog" title="2.4.1. The Site Catalog">Site
    Catalog</a>, <a class="link" href="tutorial.php#idp9232416" title="2.4.2. The Transformation Catalog">Transformation
    Catalog</a>, and <a class="link" href="tutorial.php#idp7701216" title="2.4.3. The Replica Catalog">Replica
    Catalog</a>.</p>
<div class="section" title="2.4.1. The Site Catalog">
<div class="titlepage"><div><div><h3 class="title">
<a name="tut_site_catalog"></a>2.4.1. The Site Catalog</h3></div></div></div>
<p>The site catalog describes the sites where the workflow jobs are
      to be executed. Typically the sites in the site catalog describe remote
      clusters, such as PBS clusters or Condor pools. In this tutorial we
      assume that you have a Personal Condor pool running on localhost. If you
      are using one of the tutorial VMs this has already been set up for
      you.</p>
<p>The site catalog is in <code class="filename">sites.xml</code>:</p>
<pre class="programlisting">$ <span class="bold"><strong>more sites.xml</strong></span>
...
    &lt;!-- The local site contains information about the submit host --&gt;
    &lt;!-- The arch and os keywords are used to match binaries in the transformation catalog --&gt;
    &lt;site handle="local" arch="x86_64" os="LINUX"&gt;

        &lt;!-- These are the paths on the submit host where Pegasus stores data --&gt;
        &lt;!-- Scratch is where temporary files go --&gt;
        &lt;directory type="shared-scratch" path="/home/tutorial/run"&gt;
            &lt;file-server operation="all" url="file:///home/tutorial/run"/&gt;
        &lt;/directory&gt;
        &lt;!-- Storage is where pegasus stores output files --&gt;
        &lt;directory type="local-storage" path="/home/tutorial/outputs"&gt;
            &lt;file-server operation="all" url="file:///home/tutorial/outputs"/&gt;
        &lt;/directory&gt;

        &lt;!-- This profile tells Pegasus where to find the user's private key for SCP transfers --&gt;
        &lt;profile namespace="env" key="SSH_PRIVATE_KEY"&gt;/home/tutorial/.ssh/id_rsa&lt;/profile&gt;
    &lt;/site&gt;


...</pre>
<p>There are two sites defined in the site catalog: “local” and
      “PegasusVM”. The “local” site is used by Pegasus to learn about the
      submit host where the workflow management system runs. The “PegasusVM”
      site is the personal Condor pool running on your (virtual) machine. In
      this case, the local site and the PegasusVM site refer to the same
      machine, but they are logically separate as far as Pegasus is
      concerned.</p>
<p>The local site is configured with a “storage” file system that is
      mounted on the submit host (indicated by the file:// URL). This file
      system is where the output data from the workflow will be stored. When
      the workflow is planned we will tell Pegasus that the output site is
      “local”.</p>
<p>The PegasusVM site is configured with a “scratch” file system
      accessible via SCP (indicated by the scp:// URL). This file system is
      where the working directory will be created. When we plan the workflow
      we will tell Pegasus that the execution site is “PegasusVM”.</p>
<p>The local site also has an environment variable called
      SSH_PRIVATE_KEY that tells Pegasus where to find the private key to use
      for SCP transfers. If you are running this tutorial on your own machine
      you will need to set up a passwordless ssh key and add it to
      authorized_keys. If you are using the tutorial VM this has already been
      set up for you.</p>
<p>Pegasus supports many different file transfer protocols. In this
      case the site catalog is set up so that input and output files are
      transferred to/from the PegasusVM site using SCP. Since both the local
      site and the PegasusVM site are actually the same machine, this
      configuration will just SCP files to/from localhost, which is just a
      complicated way to copy the files.</p>
<p>Finally, the PegasusVM site is configured with two profiles that
      tell Pegasus that it is a plain Condor pool. Pegasus supports many ways
      of submitting tasks to a remote cluster. In this configuration it will
      submit vanilla Condor jobs.</p>
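<p>To see how the pieces of a site entry fit together, the “local” site
    shown earlier can be read with Python's standard XML parser. This is only
    a sketch: it parses the excerpt on its own, without the enclosing root
    element of <code class="filename">sites.xml</code> (which may declare an XML
    namespace that would have to be included in the tag names), and the
    <code class="literal">summarize_site</code> function is a hypothetical helper, not
    part of Pegasus.</p>

```python
import xml.etree.ElementTree as ET

# Excerpt of the "local" site from the tutorial's sites.xml, without the
# enclosing root element (and any XML namespace it may declare).
SITE_XML = """
<site handle="local" arch="x86_64" os="LINUX">
    <directory type="shared-scratch" path="/home/tutorial/run">
        <file-server operation="all" url="file:///home/tutorial/run"/>
    </directory>
    <directory type="local-storage" path="/home/tutorial/outputs">
        <file-server operation="all" url="file:///home/tutorial/outputs"/>
    </directory>
    <profile namespace="env" key="SSH_PRIVATE_KEY">/home/tutorial/.ssh/id_rsa</profile>
</site>
"""


def summarize_site(xml_text):
    """Collect the handle, directory URLs, and profiles of one <site> element."""
    site = ET.fromstring(xml_text.strip())
    # Map each directory type to the URL of its file server.
    directories = {
        d.get("type"): d.find("file-server").get("url")
        for d in site.findall("directory")
    }
    # Map each (namespace, key) pair to the profile value.
    profiles = {
        (p.get("namespace"), p.get("key")): p.text
        for p in site.findall("profile")
    }
    return {
        "handle": site.get("handle"),
        "directories": directories,
        "profiles": profiles,
    }


if __name__ == "__main__":
    print(summarize_site(SITE_XML))
```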
</div>
<div class="section" title="2.4.2. The Transformation Catalog">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp9232416"></a>2.4.2. The Transformation Catalog</h3></div></div></div>
<p>The transformation catalog describes all of the executables
      (called “transformations”) used by the workflow. This description
      includes the site(s) where they are located, the architecture and
      operating system they are compiled for, and any other information
      required to properly transfer them to the execution site and run
      them.</p>
<p>For this tutorial, the transformation catalog is in the file
      <code class="filename">tc.dat</code>:</p>
<pre class="programlisting">$ <span class="bold"><strong>more tc.dat</strong></span>
...
# This is the transformation catalog. It lists information about each of the
# executables that are used by the workflow.

tr preprocess {
    site PegasusVM {
        pfn "/home/tutorial/bin/preprocess"
        arch "x86_64"
        os "linux"
        type "INSTALLED"
    }
}


...</pre>
<p>The <code class="filename">tc.dat</code> file contains information about
      three transformations: preprocess, findrange, and analyze. These three
      transformations are referenced in the diamond DAX. The transformation
      catalog indicates that all three transformations are installed on the
      PegasusVM site, and are compiled for x86_64 Linux.</p>
<p>The actual executable files are located in the
      <code class="filename">bin</code> directory. All three executables are actually
      symlinked to the same Python script. This script is just an example
      transformation that sleeps for 30 seconds, and then writes its own name
      and the contents of all its input files to all of its output
      files.</p>
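<p>The structure of a transformation catalog entry can be made explicit
    with a small parser. The sketch below handles only the layout shown in the
    excerpt above (one <code class="literal">site</code> block per
    <code class="literal">tr</code> entry, with quoted attribute values); it is an
    illustration written for this tutorial, not the parser Pegasus uses, and
    <code class="literal">parse_tc</code> is a hypothetical name.</p>

```python
import re

# The entry shown in the tc.dat excerpt above.
TC_TEXT = '''
tr preprocess {
    site PegasusVM {
        pfn "/home/tutorial/bin/preprocess"
        arch "x86_64"
        os "linux"
        type "INSTALLED"
    }
}
'''

# One "tr NAME { site SITE { ... } }" entry; the body is captured lazily.
ENTRY = re.compile(r'\btr\s+(\S+)\s*\{\s*site\s+(\S+)\s*\{(.*?)\}\s*\}', re.DOTALL)
# One 'key "value"' attribute inside the site block.
ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_tc(text):
    """Map (transformation, site) to a dict of its attributes."""
    catalog = {}
    for name, site, body in ENTRY.findall(text):
        catalog[(name, site)] = dict(ATTRIBUTE.findall(body))
    return catalog


if __name__ == "__main__":
    for (name, site), attributes in parse_tc(TC_TEXT).items():
        print(name, site, attributes["pfn"])
```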
</div>
<div class="section" title="2.4.3. The Replica Catalog">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp7701216"></a>2.4.3. The Replica Catalog</h3></div></div></div>
<p>The final catalog is the Replica Catalog. This catalog tells
      Pegasus where to find each of the input files for the workflow.</p>
<p>All files in a Pegasus workflow are referred to in the DAX using
      their Logical File Name (LFN). These LFNs are mapped to Physical File
      Names (PFNs) when Pegasus plans the workflow. This level of indirection
      enables Pegasus to map abstract DAXes to different execution sites and
      plan out the required file transfers automatically.</p>
<p>The Replica Catalog for the diamond workflow is in the
      <code class="filename">rc.dat</code> file:</p>
<pre class="programlisting">$ <span class="bold"><strong>more rc.dat</strong></span>
# This is the replica catalog. It lists information about each of the
# input files used by the workflow.

# The format is:
# LFN     PFN    pool="SITE"

f.a    file:///home/tutorial/input/f.a    pool="local"</pre>
<p>This replica catalog contains only one entry for the diamond
      workflow’s only input file. This entry has an LFN of “f.a” with a PFN of
      “file:///home/tutorial/input/f.a” and the file is stored on the local
      site, which implies that it will need to be transferred to the PegasusVM
      site when the workflow runs. The Replica Catalog uses the keyword “pool”
      to refer to the site. Don't be confused by this: the value of
      <code class="literal">pool</code> must be the name of the site, as defined in the
      Site Catalog, where the file is located.</p>
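<p>The LFN-to-PFN mapping can be sketched in a few lines of Python. The
    code below reads the file format shown above (LFN, PFN, then
    <code class="literal">pool="SITE"</code>) and builds a lookup table. The function
    name <code class="literal">parse_replica_catalog</code> is an assumption made for
    this example, and the sketch does not handle PFNs that contain spaces.</p>

```python
RC_TEXT = '''\
# This is the replica catalog. It lists information about each of the
# input files used by the workflow.

# The format is:
# LFN     PFN    pool="SITE"

f.a    file:///home/tutorial/input/f.a    pool="local"
'''


def parse_replica_catalog(text):
    """Map each LFN to a (PFN, site) pair, skipping comments and blank lines."""
    replicas = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        lfn, pfn, *attributes = line.split()
        site = None
        for attribute in attributes:
            key, _, value = attribute.partition("=")
            if key == "pool":
                site = value.strip('"')
        replicas[lfn] = (pfn, site)
    return replicas


if __name__ == "__main__":
    pfn, site = parse_replica_catalog(RC_TEXT)["f.a"]
    print("f.a is stored at", pfn, "on site", site)
```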
</div>
</div>
<div class="section" title="2.5. Configuring Pegasus">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7707680"></a>2.5. Configuring Pegasus</h2></div></div></div>
<p>In addition to the information catalogs, Pegasus takes a
    configuration file that specifies settings that control how it plans the
    workflow.</p>
<p>For the diamond workflow, the Pegasus configuration file is
    relatively simple. It only contains settings to help Pegasus find the
    information catalogs. These settings are in the
    <code class="filename">pegasus.conf</code> file:</p>
<pre class="programlisting">$ <span class="bold"><strong>more pegasus.conf</strong></span>
# This tells Pegasus where to find the Site Catalog
pegasus.catalog.site=XML3
pegasus.catalog.site.file=sites.xml

# This tells Pegasus where to find the Replica Catalog
pegasus.catalog.replica=File
pegasus.catalog.replica.file=rc.dat

# This tells Pegasus where to find the Transformation Catalog
pegasus.catalog.transformation=Text
pegasus.catalog.transformation.file=tc.dat</pre>
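<p>The configuration file is a plain list of
    <code class="literal">key=value</code> properties. As a rough illustration of how
    the three catalog settings relate to each other, the following Python
    sketch reads the file shown above and collects the type and file name of
    each catalog. The helper names are hypothetical, and the sketch ignores
    features of the properties format (such as line continuations) that this
    file does not use.</p>

```python
CONF_TEXT = """\
# This tells Pegasus where to find the Site Catalog
pegasus.catalog.site=XML3
pegasus.catalog.site.file=sites.xml

# This tells Pegasus where to find the Replica Catalog
pegasus.catalog.replica=File
pegasus.catalog.replica.file=rc.dat

# This tells Pegasus where to find the Transformation Catalog
pegasus.catalog.transformation=Text
pegasus.catalog.transformation.file=tc.dat
"""


def parse_properties(text):
    """Read key=value lines, skipping comments and blank lines."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def catalog_settings(properties):
    """Return {catalog name: (catalog type, catalog file)} for the three catalogs."""
    settings = {}
    for name in ("site", "replica", "transformation"):
        prefix = "pegasus.catalog." + name
        settings[name] = (properties[prefix], properties[prefix + ".file"])
    return settings


if __name__ == "__main__":
    print(catalog_settings(parse_properties(CONF_TEXT)))
```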
</div>
<div class="section" title="2.6. Planning the Workflow">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7712304"></a>2.6. Planning the Workflow</h2></div></div></div>
<p>The planning stage is where Pegasus maps the abstract DAX to one or
    more execution sites. The planning step includes:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>Adding a job to create the remote working directory</p></li>
<li class="listitem"><p>Adding stage-in jobs to transfer input data to the remote
        working directory</p></li>
<li class="listitem"><p>Adding cleanup jobs to remove data from the remote working
        directory when it is no longer needed</p></li>
<li class="listitem"><p>Adding stage-out jobs to transfer data to the final output
        location as it is generated</p></li>
<li class="listitem"><p>Adding registration jobs to register the data in a replica
        catalog</p></li>
<li class="listitem"><p>Clustering tasks to combine several short-running jobs into a
        single, longer-running job. This reduces the scheduling overhead
        paid for each short-running job.</p></li>
<li class="listitem"><p>Adding wrappers to the jobs to collect provenance information so
        that statistics and plots can be created when the workflow is
        finished</p></li>
</ol></div>
<p>The <code class="literal">pegasus-plan</code> command is used to plan a
    workflow. This command takes quite a few arguments, so we created a
    <code class="filename">plan_dax.sh</code> wrapper script that has all of the
    arguments required for the diamond workflow:</p>
<pre class="programlisting">$ <span class="bold"><strong>more plan_dax.sh</strong></span>
...</pre>
<p>The script invokes the <code class="literal">pegasus-plan</code> command with
    arguments for the configuration file (<code class="literal">--conf</code>), the DAX
    file (<code class="literal">-d</code>), the submit directory
    (<code class="literal">--dir</code>), the execution site
    (<code class="literal">--sites</code>), the output site (<code class="literal">-o</code>) and
    two extra arguments that prevent Pegasus from removing any jobs from the
    workflow (<code class="literal">--force</code>) and that prevent Pegasus from adding
    cleanup jobs to the workflow (<code class="literal">--nocleanup</code>).</p>
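<p>As an illustration of how these arguments combine, the Python sketch
    below assembles a <code class="literal">pegasus-plan</code> argument list from the
    options described above. It only builds and prints the command; it does
    not run it. The default values are assumptions drawn from the rest of this
    tutorial (<code class="filename">pegasus.conf</code>, a submit directory named
    <code class="filename">submit</code>, execution site “PegasusVM”, output site
    “local”), and the actual <code class="filename">plan_dax.sh</code> script may
    order or spell its arguments differently.</p>

```python
import shlex


def build_plan_command(dax_path, conf="pegasus.conf", submit_dir="submit",
                       exec_site="PegasusVM", output_site="local"):
    """Assemble a pegasus-plan argument list (default values are assumptions)."""
    return [
        "pegasus-plan",
        "--conf", conf,          # configuration file
        "-d", dax_path,          # DAX file
        "--dir", submit_dir,     # submit directory
        "--sites", exec_site,    # execution site
        "-o", output_site,       # output site
        "--force",               # do not remove any jobs from the workflow
        "--nocleanup",           # do not add cleanup jobs
    ]


if __name__ == "__main__":
    # Print the command as it could be typed in a shell.
    print(shlex.join(build_plan_command("diamond.dax")))
```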
<p>To plan the diamond workflow, invoke the
    <code class="filename">plan_dax.sh</code> script with the path to the DAX
    file:</p>
<pre class="programlisting">$ <span class="bold"><strong>./plan_dax.sh diamond.dax</strong></span>
2012.07.24 21:11:03.256 EDT:   

I have concretized your abstract workflow. The workflow has been entered 
into the workflow database with a state of "planned". The next step is to 
start or execute your workflow. The invocation required is:

pegasus-run  /home/tutorial/submit/tutorial/pegasus/diamond/run0001


2012.07.24 21:11:03.257 EDT:   Time taken to execute is 1.103 seconds
</pre>
<p>Note the line in the output that starts with
    <code class="literal">pegasus-run</code>. That is the command that we will use to
    submit the workflow. The path it contains is the path to the submit
    directory where all of the files required to submit and monitor the
    workflow are stored.</p>
<p>This is what the diamond workflow looks like after Pegasus has
    finished planning the DAX:</p>
<div class="figure">
<a name="idp7735632"></a><p class="title"><b>Figure 2.2. Diamond DAG</b></p>
<div class="figure-contents"><div class="mediaobject"><img src="images/concepts-diamond-dag.png" width="378" alt="Diamond DAG"></div></div>
</div>
<br class="figure-break"><p>For this workflow the only jobs Pegasus needs to add are a directory
    creation job, a stage-in job (for f.a), and a stage-out job (for f.d). No
    registration jobs are added because all the files in the DAX are marked
    register="false", and no cleanup jobs are added because we passed the
    <code class="literal">--nocleanup</code> argument to
    <code class="literal">pegasus-plan</code>.</p>
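<p>The effect of planning on the job count can be mimicked with a toy
    model. The Python sketch below is not how
    <code class="literal">pegasus-plan</code> is implemented, and the auxiliary job names
    it generates are made up for illustration. It simply wraps the four compute
    jobs with one directory creation job, one stage-in job per workflow input,
    and one stage-out job per workflow output, which gives the seven jobs of
    the planned diamond workflow.</p>

```python
def plan(compute_jobs, workflow_inputs, workflow_outputs, cleanup=False):
    """Return the job list of a toy planned workflow (job names are hypothetical)."""
    jobs = ["create_dir"]
    jobs += ["stage_in_" + lfn for lfn in workflow_inputs]
    jobs += list(compute_jobs)
    jobs += ["stage_out_" + lfn for lfn in workflow_outputs]
    if cleanup:
        # Simplification: a single cleanup job stands in for all cleanup work.
        jobs.append("cleanup")
    return jobs


if __name__ == "__main__":
    diamond = ["preprocess", "findrange_left", "findrange_right", "analyze"]
    planned = plan(diamond, workflow_inputs=["f.a"], workflow_outputs=["f.d"])
    print(len(planned), "jobs:", planned)
```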
</div>
<div class="section" title="2.7. Submitting the Workflow">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7740880"></a>2.7. Submitting the Workflow</h2></div></div></div>
<p>Once the workflow has been planned, the next step is to submit it to
    DAGMan/Condor for execution. This is done using the
    <code class="literal">pegasus-run</code> command. This command takes the path to the
    submit directory as an argument. Run the command that was printed by the
    <code class="filename">plan_dax.sh</code> script:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-run submit/tutorial/pegasus/diamond/run0001</strong></span>
-----------------------------------------------------------------------
File for submitting this DAG to Condor       : diamond-0.dag.condor.sub
Log of DAGMan debugging messages             : diamond-0.dag.dagman.out
Log of Condor library output                 : diamond-0.dag.lib.out
Log of Condor library error messages         : diamond-0.dag.lib.err
Log of the life of condor_dagman itself      : diamond-0.dag.dagman.log

Submitting job(s).
1 job(s) submitted to cluster 19.
-----------------------------------------------------------------------

Your Workflow has been started and runs in base directory given below

cd submit/tutorial/pegasus/diamond/run0001

*** To monitor the workflow you can run ***

pegasus-status -l submit/tutorial/pegasus/diamond/run0001

*** To remove your workflow run ***
pegasus-remove submit/tutorial/pegasus/diamond/run0001
</pre>
</div>
<div class="section" title="2.8. Monitoring the Workflow">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7746144"></a>2.8. Monitoring the Workflow</h2></div></div></div>
<p>After the workflow has been submitted you can monitor it using the
    <code class="literal">pegasus-status</code> command:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-status submit/tutorial/pegasus/diamond/run0001</strong></span>
STAT  IN_STATE  JOB                                               
Run      01:48  diamond-0                                         
Run      00:05   |-findrange_ID0000002                            
Run      00:05   \_findrange_ID0000003                            
Summary: 3 Condor jobs total (R:3)

UNREADY   READY     PRE  QUEUED    POST SUCCESS FAILURE %DONE
      2       0       0       3       0       3       0  37.5
Summary: 1 DAG total (Running:1)
</pre>
<p>This command shows the workflow (diamond-0) and the running jobs (in
    the above output it shows the two findrange jobs). It also gives
    statistics on the number of jobs in each state and the percentage of the
    jobs in the workflow that have finished successfully.</p>
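<p>The <code class="literal">%DONE</code> value in the sample output is consistent
    with dividing the <code class="literal">SUCCESS</code> count by the sum of all the
    state columns. The Python sketch below reproduces that arithmetic for the
    numbers shown above. The formula is inferred from the sample output rather
    than taken from the <code class="literal">pegasus-status</code> source, and
    <code class="literal">percent_done</code> is a hypothetical helper.</p>

```python
COLUMNS = ["UNREADY", "READY", "PRE", "QUEUED", "POST", "SUCCESS", "FAILURE"]


def percent_done(counts):
    """Share of jobs in the SUCCESS state, as a percentage of all jobs."""
    total = sum(counts[column] for column in COLUMNS)
    if total == 0:
        return 0.0
    return 100.0 * counts["SUCCESS"] / total


if __name__ == "__main__":
    # Counts copied from the pegasus-status output shown above.
    running = dict(zip(COLUMNS, [2, 0, 0, 3, 0, 3, 0]))
    print(percent_done(running))
```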
<p>Use the <code class="literal">watch</code> command to continuously monitor the
    workflow:</p>
<pre class="programlisting">$ <span class="bold"><strong>watch pegasus-status submit/tutorial/pegasus/diamond/run0001</strong></span>
...</pre>
<p>You should see all of the jobs in the workflow run one after the
    other. After a few minutes you will see:</p>
<pre class="programlisting">(no matching jobs found in Condor Q)
UNREADY   READY     PRE  QUEUED    POST SUCCESS FAILURE %DONE
      0       0       0       0       0       8       0 100.0
Summary: 1 DAG total (Success:1)
</pre>
<p>That means the workflow has finished successfully. You can press
    <code class="literal">ctrl-c</code> to terminate the <code class="literal">watch</code>
    command.</p>
<p>If the workflow finished successfully you should see the output file
    <code class="filename">f.d</code> in the <code class="filename">output</code> directory.
    This file was created by the various transformations in the workflow and
    shows all of the executables that were invoked by the workflow:</p>
<pre class="programlisting">$ <span class="bold"><strong>more output/f.d</strong></span>
/home/tutorial/bin/analyze:
/home/tutorial/bin/findrange:
/home/tutorial/bin/preprocess:
This is the input file of the diamond workflow
/home/tutorial/bin/findrange:
/home/tutorial/bin/preprocess:
This is the input file of the diamond workflow
</pre>
<p>Remember that the example transformations in this workflow just
    print their name to all of their output files and then copy all of their
    input files to their output files.</p>
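<p>The contents of <code class="filename">f.d</code> can be reproduced with a
    small in-memory simulation. The Python sketch below is a stand-in for the
    tutorial's example script, not a copy of it: it skips the 30-second sleep,
    keeps file contents in strings instead of on disk, and assumes that the
    intermediate files are named f.b1, f.b2, f.c1 and f.c2. Each call writes
    the executable's name followed by the contents of all of its inputs.</p>

```python
BIN = "/home/tutorial/bin/"


def transform(executable, inputs):
    """Toy transformation: the executable's name, then the contents of all inputs."""
    return executable + ":\n" + "".join(inputs)


def run_diamond(f_a):
    """Run the four diamond jobs in dependency order and return the contents of f.d."""
    # preprocess writes the same content to both of its outputs.
    f_b1 = f_b2 = transform(BIN + "preprocess", [f_a])
    f_c1 = transform(BIN + "findrange", [f_b1])
    f_c2 = transform(BIN + "findrange", [f_b2])
    return transform(BIN + "analyze", [f_c1, f_c2])


if __name__ == "__main__":
    print(run_diamond("This is the input file of the diamond workflow\n"), end="")
```

<p>Reading the result from top to bottom shows the workflow in reverse:
    analyze wrote its name first, followed by the output of each findrange
    job, each of which in turn contains the output of preprocess and the
    original input.</p>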
</div>
<div class="section" title="2.9. Debugging the Workflow">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7761008"></a>2.9. Debugging the Workflow</h2></div></div></div>
<p>If one or more jobs fail, the output of the
    <code class="literal">pegasus-status</code> command above will have a non-zero value
    in the <code class="literal">FAILURE</code> column.</p>
<p>You can debug the failure using the
    <code class="literal">pegasus-analyzer</code> command. This command will identify
    the jobs that failed and show their output. Because the workflow
    succeeded, <code class="literal">pegasus-analyzer</code> will only show some basic
    statistics about the number of successful jobs:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-analyzer submit/tutorial/pegasus/diamond/run0001</strong></span>
pegasus-analyzer: initializing...

****************************Summary***************************

 Total jobs         :      7 (100.00%)
 # jobs succeeded   :      7 (100.00%)
 # jobs failed      :      0 (0.00%)
 # jobs unsubmitted :      0 (0.00%)
</pre>
<p>If the workflow had failed you would see something like this:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-analyzer submit/tutorial/pegasus/diamond/run0002</strong></span>
pegasus-analyzer: initializing...

**************************Summary*************************************

 Total jobs         :      7 (100.00%)
 # jobs succeeded   :      2 (28.57%)
 # jobs failed      :      1 (14.29%)
 # jobs unsubmitted :      4 (57.14%)

**********************Failed jobs' details****************************

====================preprocess_ID0000001==============================

 last state: POST_SCRIPT_FAILED
       site: PegasusVM
submit file: preprocess_ID0000001.sub
output file: preprocess_ID0000001.out.003
 error file: preprocess_ID0000001.err.003

-----------------------Task #1 - Summary-----------------------------

site        : PegasusVM
hostname    : ip-10-252-31-58.us-west-2.compute.internal
executable  : /home/tutorial/bin/preprocess
arguments   : -i f.a -o f.b1 -o f.b2
exitcode    : -128
working dir : -

-------------Task #1 - preprocess - ID0000001 - stderr---------------

FATAL: The main job specification is invalid or missing.
</pre>
<p>In this example we removed the <code class="filename">bin/preprocess</code>
    executable and re-planned and re-submitted the workflow (which is why the
    command refers to run0002). The output of <code class="literal">pegasus-analyzer</code>
    shows that the preprocess task failed, and the error message
    indicates that the executable could not be found.</p>
</div>
<div class="section" title="2.10. Collecting Statistics">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7772528"></a>2.10. Collecting Statistics</h2></div></div></div>
<p>The <code class="literal">pegasus-statistics</code> command can be used to
    gather statistics about the runtime of the workflow and its jobs. The
    <code class="literal">-s all</code> argument tells the program to generate all
    statistics it knows how to calculate:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-statistics -s all submit/tutorial/pegasus/diamond/run0001</strong></span>

**************************SUMMARY******************************
# legends
# Workflow summary:
#       Summary of the workflow execution. It shows total
#       tasks/jobs/sub workflows run, how many succeeded/failed etc.
#       In case of hierarchical workflow the calculation shows the 
#       statistics across all the sub workflows.It shows the following 
#       statistics about tasks, jobs and sub workflows.
#
#     * Succeeded - total count of succeeded tasks/jobs/sub workflows.
#     * Failed - total count of failed tasks/jobs/sub workflows.
#     * Incomplete - total count of tasks/jobs/sub workflows that are 
#       not in succeeded or failed state. This includes all the jobs 
#       that are not submitted, submitted but not completed etc. This  
#       is calculated as  difference between 'total' count and sum of 
#       'succeeded' and 'failed' count.
#     * Total - total count of tasks/jobs/sub workflows.
#     * Retries - total retry count of tasks/jobs/sub workflows.
#     * Total Run - total count of tasks/jobs/sub workflows executed 
#       during workflow run. This is the cumulative of retries, 
#       succeeded and failed count.
# Workflow wall time:
#       The walltime from the start of the workflow execution to the 
#       end as reported by the DAGMAN.In case of rescue dag the value
#       is the cumulative of all retries.
# Workflow cumulative job wall time:
#       The sum of the walltime of all jobs as reported by kickstart. 
#       In case of job retries the value is the cumulative of all retries.
#       For workflows having sub workflow jobs (i.e SUBDAG and SUBDAX 
#       jobs), the walltime value includes jobs from the sub workflows 
#       as well.
# Cumulative job walltime as seen from submit side:
#       The sum of the walltime of all jobs as reported by DAGMan.
#       This is similar to the regular cumulative job walltime, but 
#       includes job management overhead and delays. In case of job
#       retries the value is the cumulative of all retries. For workflows 
#       having sub workflow jobs (i.e SUBDAG and SUBDAX jobs), the 
#       walltime value includes jobs from the sub workflows as well.

-----------------------------------------------------------------------
Type            Succeeded  Failed  Incomplete  Total     Retries  Total Run
Tasks           4          0       0           4     ||  0        4                   
Jobs            7          0       0           7     ||  0        7                   
Sub Workflows   0          0       0           0     ||  0        0                   
-----------------------------------------------------------------------

Workflow wall time:                               3 mins, 25 secs, (205 s)
Workflow cumulative job wall time:                2 mins, 0 secs, (120 s)
Cumulative job walltime as seen from submit side: 2 mins, 0 secs, (120 s)

Summary: submit/tutorial/pegasus/diamond/run0001/statistics/summary.txt

************************************************************************
</pre>
<p>The output of <code class="literal">pegasus-statistics</code> begins with
    a legend that defines each of the reported values. Among these are the
    workflow wall time, which is the time from the start of the workflow
    execution until it finished, and the cumulative job wall time, which is
    the sum of the runtimes of all the jobs.</p>
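<p>The relationships stated in the legend can be checked against the summary
    table by hand. The sketch below does so with the numbers copied from the
    output above. The variable and function names are illustrative
    assumptions, not part of Pegasus.</p>

```python
# Rows copied from the summary table above:
# (succeeded, failed, incomplete, total, retries, total_run)
ROWS = {
    "Tasks": (4, 0, 0, 4, 0, 4),
    "Jobs": (7, 0, 0, 7, 0, 7),
    "Sub Workflows": (0, 0, 0, 0, 0, 0),
}

WORKFLOW_WALL_TIME = 205        # seconds, as reported by DAGMan
CUMULATIVE_JOB_WALL_TIME = 120  # seconds, as reported by kickstart


def check_row(succeeded, failed, incomplete, total, retries, total_run):
    """Apply the two relationships given in the legend."""
    incomplete_ok = incomplete == total - (succeeded + failed)
    total_run_ok = total_run == retries + succeeded + failed
    return incomplete_ok and total_run_ok


consistent = {name: check_row(*row) for name, row in ROWS.items()}
# Difference between the two wall time figures, in seconds.
difference = WORKFLOW_WALL_TIME - CUMULATIVE_JOB_WALL_TIME
print(consistent, difference)
```

<p>Because jobs can run in parallel, the difference between the two wall time
    figures is only a rough indicator of scheduling and management
    overhead.</p>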
<p>The <code class="literal">pegasus-statistics</code> command also writes out
    several reports in the <code class="filename">statistics</code> subdirectory of the
    workflow submit directory:</p>
<pre class="programlisting">$ <span class="bold"><strong>ls submit/tutorial/pegasus/diamond/run0001/statistics/</strong></span>
breakdown.csv  jobs.txt          summary.txt         time.txt
breakdown.txt  summary-time.csv  time-per-host.csv   workflow.csv
jobs.csv       summary.csv       time.csv            workflow.txt</pre>
<p>The file <code class="filename">breakdown.txt</code>, for example, has min,
    max, and mean runtimes for each transformation:</p>
<pre class="programlisting">$ <span class="bold"><strong>more submit/tutorial/pegasus/diamond/run0001/statistics/breakdown.txt</strong></span>
# legends
# Transformation - name of the transformation.
# Count          - the number of times the invocations corresponding to 
#                  the transformation was executed.
# Succeeded      - the count of the succeeded invocations corresponding 
#                  to the transformation.
# Failed         - the count of the failed invocations corresponding to 
#                  the transformation.
# Min(sec)       - the minimum invocation runtime value corresponding to 
#                  the transformation.
# Max(sec)       - the maximum invocation runtime value corresponding to 
#                  the transformation.
# Mean(sec)      - the mean of the invocation runtime corresponding to 
#                  the transformation.
# Total(sec)     - the cumulative of invocation runtime corresponding to 
#                  the transformation.

# a1f5ba03-a827-4d0a-8d59-9941cbfbd83d (diamond)
Transformation   Count  Succeeded  Failed  Min     Max     Mean     Total 
analyze          1      1          0       30.008  30.008  30.008   30.008 
dagman::post     7      7          0       5.0     6.0     5.143    36.0 
findrange        2      2          0       30.009  30.014  30.011   60.023 
pegasus::dirmanager 1   1          0       0.194   0.194   0.194    0.194 
pegasus::transfer 2     2          0       0.248   0.411   0.33     0.659 
preprocess       1      1          0       30.025  30.025  30.025   30.025

# All
Transformation   Count  Succeeded  Failed  Min     Max     Mean     Total 
analyze          1      1          0       30.008  30.008  30.008   30.008 
dagman::post     7      7          0       5.0     6.0     5.143    36.0 
findrange        2      2          0       30.009  30.014  30.011   60.023 
pegasus::dirmanager 1   1          0       0.194   0.194   0.194    0.194 
pegasus::transfer 2     2          0       0.248   0.411   0.33     0.659 
preprocess       1      1          0       30.025  30.025  30.025   30.025
</pre>
<p>In this case, because the example transformation sleeps for 30
    seconds, the min, mean, and max runtimes for each of the analyze,
    findrange, and preprocess transformations are all close to 30.</p>
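<p>As a rough illustration of how the breakdown report can be processed, the
    sketch below parses the rows of the "All" table shown above and checks
    that each reported mean matches total divided by count, up to rounding.
    <code class="literal">parse_breakdown</code> is a hypothetical helper. It
    assumes whitespace-separated columns exactly as in the sample.</p>

```python
import math

# Rows copied from the "# All" table in breakdown.txt above.
SAMPLE = """\
analyze          1      1          0       30.008  30.008  30.008   30.008
dagman::post     7      7          0       5.0     6.0     5.143    36.0
findrange        2      2          0       30.009  30.014  30.011   60.023
pegasus::dirmanager 1   1          0       0.194   0.194   0.194    0.194
pegasus::transfer 2     2          0       0.248   0.411   0.33     0.659
preprocess       1      1          0       30.025  30.025  30.025   30.025
"""


def parse_breakdown(text):
    """Parse whitespace-separated rows into dicts (hypothetical helper)."""
    rows = {}
    for line in text.splitlines():
        name, count, succeeded, failed, lo, hi, mean, total = line.split()
        rows[name] = {
            "count": int(count),
            "succeeded": int(succeeded),
            "failed": int(failed),
            "min": float(lo),
            "max": float(hi),
            "mean": float(mean),
            "total": float(total),
        }
    return rows


rows = parse_breakdown(SAMPLE)
# The reported mean should equal total / count up to rounding.
means_ok = all(
    math.isclose(r["total"] / r["count"], r["mean"], abs_tol=0.001)
    for r in rows.values()
)
# Transformations whose mean runtime is within one second of the 30 s sleep.
sleepers = sorted(
    name for name, r in rows.items() if math.isclose(r["mean"], 30, abs_tol=1)
)
print(means_ok, sleepers)
```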
</div>
<div class="section" title="2.11. Workflow Dashboard">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7789952"></a>2.11. Workflow Dashboard</h2></div></div></div>
<p>The VirtualBox image also includes the Pegasus Service bundle, which
    is available as a separate project on <a class="ulink" href="https://github.com/pegasus-isi/pegasus-service" target="_top">GitHub</a>. The
    pegasus-service-server is written in Python and uses the Flask framework
    to implement the web interface. Users can connect to this server
    with a browser to monitor and debug workflows.</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>The workflow dashboard can only monitor workflows that were
        executed using Pegasus 4.2.0 or later.</p>
<p>Currently, only the VirtualBox tutorial image for 4.3.0 has the
        dashboard enabled. It is not enabled in the EC2 and FutureGrid
        images.</p>
</div>
<p>By default, the server is configured to listen on all network
    interfaces on port 5000. A user can view the dashboard at
    http://&lt;IP_ADDRESS&gt;:5000/</p>
<p>By default, the dashboard server can only monitor workflows run by
    the current user, i.e., the user who is running the
    pegasus-service-server.</p>
<p>To access the workflow dashboard in the VirtualBox VM, launch
    Firefox by clicking the globe icon in the top menu of the desktop.
    The home page for the dashboard is accessible at
    http://localhost:5000</p>
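<p>If the page does not load, it can help to check whether anything is
    listening on the dashboard port. The following is a minimal sketch using
    only the Python standard library. The helper names are hypothetical, and
    the check only tests that a TCP connection can be opened, not that the
    dashboard itself is healthy.</p>

```python
import socket

DASHBOARD_PORT = 5000  # default port stated above


def dashboard_url(host, port=DASHBOARD_PORT):
    """Build the dashboard URL for a given host."""
    return "http://{}:{}/".format(host, port)


def is_listening(host, port=DASHBOARD_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    host = "localhost"
    state = "reachable" if is_listening(host) else "not reachable"
    print(dashboard_url(host), "is", state)
```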
<p>The dashboard's home page lists all workflows that have been run
    by the current user and shows the status of each workflow
    (Running, Successful, or Failed). Only top-level workflows are listed
    (Pegasus supports hierarchical workflows, i.e., workflows
    within a workflow). The rows in the table are color coded:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><span class="bold"><strong>Green</strong></span>: the workflow
        finished successfully.</p></li>
<li class="listitem"><p><span class="bold"><strong>Red</strong></span>: the workflow
        finished with a failure.</p></li>
<li class="listitem"><p><span class="bold"><strong>Blue</strong></span>: the workflow is
        currently running.</p></li>
</ul></div>
<div class="figure">
<a name="idp7800848"></a><p class="title"><b>Figure 2.3. Dashboard Home Page</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_home.png" width="100%" alt="Dashboard Home Page"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>To view details specific to a workflow, click the
    corresponding workflow label. The workflow details page lists
    workflow-specific information such as the workflow label, workflow
    status, and location of the submit directory. The details page also
    displays pie charts showing the distribution of jobs by status.</p>
<p>In addition, the details page displays a tab listing all
    sub-workflows and their statuses. Additional tabs exist which list
    information for all running, failed, and successful jobs.</p>
<p>The information displayed for a job depends on its status. For
    example, the failed jobs tab displays the job name, the exit code, and
    links to the available standard output and standard error contents.</p>
<div class="figure">
<a name="idp7805776"></a><p class="title"><b>Figure 2.4. Dashboard Workflow Page</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_workflow_details.png" width="100%" alt="Dashboard Workflow Page"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>To view details specific to a job, click the
    corresponding job label. The job details page lists information
    relevant to that job, such as the job name, exit code, and
    run time.</p>
<p>The job details page also shows tabs for failed and successful
    task invocations. (Pegasus allows users to group multiple smaller tasks
    into a single job, i.e., a job may consist of one or more tasks.)</p>
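<p>The relationship between jobs and tasks can be pictured with a small data
    model. The sketch below is purely illustrative. The class names, the task
    names, and the fixed-size grouping policy are assumptions made for this
    example and do not describe how Pegasus groups tasks.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    exitcode: int


@dataclass
class Job:
    name: str
    tasks: list = field(default_factory=list)

    def failed_tasks(self):
        return [t for t in self.tasks if t.exitcode != 0]

    def succeeded(self):
        # A job is treated as successful only if every task exited with 0.
        return not self.failed_tasks()


def group_tasks(tasks, size):
    """Pack tasks into jobs of at most `size` tasks each, in order."""
    jobs = []
    for start in range(0, len(tasks), size):
        name = "job_{}".format(len(jobs) + 1)
        jobs.append(Job(name, tasks[start:start + size]))
    return jobs


# Invented task names and exit codes, for illustration only.
tasks = [Task("t1", 0), Task("t2", 0), Task("t3", 1), Task("t4", 0), Task("t5", 0)]
jobs = group_tasks(tasks, 2)
status = [(j.name, len(j.tasks), j.succeeded()) for j in jobs]
print(status)
```

<p>In this model a single failed task marks its whole job as failed, which
    is why the dashboard's per-invocation tabs are useful for finding the
    task responsible.</p>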
<div class="figure">
<a name="idp7810016"></a><p class="title"><b>Figure 2.5. Dashboard Job Description Page</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_job_details.png" width="100%" alt="Dashboard Job Description Page"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>The task invocation details page provides task-specific
    information such as the task name, exit code, and duration. Task details
    are more granular than job details.</p>
<div class="figure">
<a name="idp7813568"></a><p class="title"><b>Figure 2.6. Dashboard Invocation Page</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_invocation_details.png" width="100%" alt="Dashboard Invocation Page"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>The dashboard also has pages for workflow statistics and
    workflow charts, which graphically render the information provided by the
    pegasus-statistics and pegasus-plots commands, respectively.</p>
<p>The Statistics page shows the following statistics.</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>Workflow level statistics</p></li>
<li class="listitem"><p>Job breakdown statistics</p></li>
<li class="listitem"><p>Job specific statistics</p></li>
</ol></div>
<div class="figure">
<a name="idp7821248"></a><p class="title"><b>Figure 2.7. Dashboard Statistics Page</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_statistics.png" width="100%" alt="Dashboard Statistics Page"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>The Charts page shows the following charts.</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>Job Distribution by Count/Time</p></li>
<li class="listitem"><p>Time Chart by Job/Invocation</p></li>
<li class="listitem"><p>Workflow Execution Gantt Chart</p></li>
</ol></div>
<p>The chart below shows the invocation distribution by count or
    time.</p>
<div class="figure">
<a name="idp7828816"></a><p class="title"><b>Figure 2.8. Dashboard Plots - Job Distribution</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_plots_job_dist.png" width="100%" alt="Dashboard Plots - Job Distribution"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>The time chart below shows the number of jobs/invocations in
    the workflow and their total runtime.</p>
<div class="figure">
<a name="idp7832288"></a><p class="title"><b>Figure 2.9. Dashboard Plots - Time Chart</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_plots_time_charts.png" width="100%" alt="Dashboard Plots - Time Chart"></td></tr></table></div></div>
</div>
<br class="figure-break"><p>The workflow Gantt chart lays out the execution of the jobs in the
    workflow over time.</p>
<div class="figure">
<a name="idp7835744"></a><p class="title"><b>Figure 2.10. Dashboard Plots - Workflow Gantt Chart</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="100%"><tr><td><img src="images/dashboard_plots_wf_gantt.png" width="100%" alt="Dashboard Plots - Workflow Gantt Chart"></td></tr></table></div></div>
</div>
<br class="figure-break">
</div>
<div class="section" title="2.12. Conclusion">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp7838832"></a>2.12. Conclusion</h2></div></div></div>
<p>Congratulations! You have completed the tutorial.</p>
<p>If you used Amazon EC2 or FutureGrid for this tutorial, make sure to
    terminate your VM. Refer to the <a class="link" href="tutorial_vm.php" title="Appendix A. Tutorial VM">appendix</a> for more information about how to do
    this.</p>
<p>Refer to the other chapters in this guide for more information about
    creating, planning, and executing workflows with Pegasus.</p>
<p>Please contact the Pegasus Users Mailing list at
    <code class="email">&lt;<a class="email" href="mailto:pegasus-users@isi.edu">pegasus-users@isi.edu</a>&gt;</code> if you need help.</p>
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="about.php">Prev</a> </td>
<td width="20%" align="center"> </td>
<td width="40%" align="right"> <a accesskey="n" href="installation.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Chapter 1. Introduction </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> Chapter 3. Installation</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
