<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="navheader">
<table width="100%" summary="Navigation header"><tr>
<td width="20%" align="left">
<a accesskey="p" href="data_transfers.php">Prev</a> </td>
<td width="60%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="20%" align="right"> <a accesskey="n" href="service.php">Next</a>
</td>
</tr></table>
<hr>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="job_throttling"></a>10.6. Job Throttling</h2></div></div></div>
<div class="toc"><dl class="toc"><dt><span class="section"><a href="job_throttling.php#job_throttling_across_workflows">10.6.1. Job Throttling Across Workflows</a></span></dt></dl></div>
<p><span class="emphasis"><em>Issue:</em></span> For large workflows you may want to
    control the number of jobs that DAGMan releases into the local HTCondor queue, or
    the number of jobs submitted to remote resources.</p>
<p><span class="emphasis"><em>Solution:</em></span> HTCondor DAGMan has knobs that can be
    tuned at a per workflow level to control its behavior. These knobs
    control how it interacts with the local HTCondor Schedd to which it
    submits jobs that are ready to run in a particular DAG. These knobs are
    exposed as <a class="link" href="profiles.php#dagman_profiles" title="12.2.8. The Dagman Profile Namespace">DAGMan profiles</a>
    (maxidle, maxjobs, maxpre and maxpost) that you can set in your properties
    files.</p>
<div class="table">
<a name="dagman_throttling_profiles"></a><p class="title"><b>Table 10.3. Useful DAGMan Commands that can be specified in the properties
        file.</b></p>
<div class="table-contents"><table summary="Useful DAGMan Commands that can be specified in the properties
        file." border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Property Key </strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>dagman.maxpre<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>MAXPRE<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String</p></div></td>
<td>Sets the maximum number of PRE scripts within the DAG
              that may be running at one time.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>dagman.maxpost<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>MAXPOST<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String</p></div></td>
<td>Sets the maximum number of POST scripts within the DAG
              that may be running at one time.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>dagman.maxjobs<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>MAXJOBS<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String</p></div></td>
<td>Sets the maximum number of jobs within the DAG that will
              be submitted to HTCondor at one time.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>dagman.maxidle<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>MAXIDLE<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String</p></div></td>
<td>Sets the maximum number of idle jobs allowed before
              HTCondor DAGMan stops submitting more jobs. Once idle jobs start
              to run, HTCondor DAGMan will resume submitting jobs. If the
              option is omitted, the number of idle jobs is unlimited.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>dagman.[CATEGORY-NAME].maxjobs<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>[CATEGORY-NAME].MAXJOBS<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String</p></div></td>
<td>Sets the value of maxjobs for a particular category. Users
              can assign categories to jobs on a per job
              basis. However, the value of a DAGMan knob for a category can
              only be specified on a per workflow basis in the
              properties.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>dagman.post.scope<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>POST.SCOPE<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String</p></div></td>
<td>Sets the scope for the postscripts. <div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>If set to <span class="bold"><strong>all</strong></span>,
                    each job in the workflow will have a postscript
                    associated with it.</p></li>
<li class="listitem"><p>If set to <span class="bold"><strong>none</strong></span>,
                    no job has a postscript associated with it. The none scope
                    should be used if you are running vanilla, standard or
                    local universe jobs, as in those cases HTCondor traps the
                    remote exitcode correctly. The none scope is not recommended
                    for grid universe jobs.</p></li>
<li class="listitem"><p>If set to <span class="bold"><strong>essential</strong></span>, only essential
                    jobs have postscripts associated with them. At present
                    the only non-essential job is the replica registration
                    job.</p></li>
</ol></div>
</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
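<p>As a minimal sketch of how these keys might appear in a properties file,
    the entries below throttle a single workflow. The numeric values are
    illustrative assumptions, not recommended defaults; choose values that
    suit your submit host and remote resources.</p>
<pre class="programlisting"># illustrative values only: tune them for your own submit host
# stop submitting new jobs once 50 jobs are idle in the queue
dagman.maxidle = 50
# never have more than 200 jobs of this workflow submitted at one time
dagman.maxjobs = 200
# run at most 10 PRE scripts and 10 POST scripts at one time
dagman.maxpre = 10
dagman.maxpost = 10
</pre>
<p>A low maxidle value keeps the local queue short while still letting the
    workflow use every slot that becomes free, so it is often the first knob
    worth trying.</p>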
<p>Within a single workflow, you can also control the number of jobs
    submitted per type (or category) of job. To use categories, you
    need to associate the dagman profile key named category with the jobs and
    specify the property dagman.[CATEGORY-NAME].* in the properties file. More
    information about HTCondor DAGMan categories can be found in the <a class="ulink" href="http://research.cs.wisc.edu/htcondor/manual/v8.3.5/2_10DAGMan_Applications.html#SECTION003108400000000000000" target="_top">HTCondor
    Documentation</a>.</p>
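<p>For illustration, the sketch below assumes a hypothetical category named
    short-running. It presumes that the relevant jobs already carry the dagman
    profile key category with that value; the properties entry then caps how
    many jobs of that category are submitted at one time. Both the category
    name and the limit are assumptions made for this example.</p>
<pre class="programlisting"># assumes jobs carry the dagman profile key "category" with the
# value "short-running" (a hypothetical category name)
# allow at most 2 jobs of that category to be submitted at one time
dagman.short-running.maxjobs = 2
</pre>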
<p>HTCondor also exposes useful configuration parameters that can be
    specified in its configuration file (condor_config_val -config will list
    the HTCondor configuration files) to control job submission across
    workflows. Some of the useful parameters that you may want to tune
    are:</p>
<div class="table">
<a name="condor_throttling_variables"></a><p class="title"><b>Table 10.4. Useful HTCondor Job Throttling Configuration Parameters</b></p>
<div class="table-contents"><table summary="Useful HTCondor Job Throttling Configuration Parameters" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>HTCondor Configuration
            Parameter</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Parameter Name: </strong></span></strong></span>START_LOCAL_UNIVERSE<span class="bold"><strong><span class="bold"><strong><br>
Sample Value  : </strong></span></strong></span>TotalLocalJobsRunning &lt; 20<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>Most of the auxiliary jobs that Pegasus adds (createdir,
            cleanup, registration and data cleanup) run in the local universe
            on the submit host. If you have a lot of workflows running,
            HTCondor may try to start too many local universe jobs, which may
            bring down your submit host. This global parameter is used to
            configure HTCondor to not launch too many local universe
            jobs.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Parameter Name: </strong></span></strong></span>GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE<span class="bold"><strong><span class="bold"><strong><br>
Sample Value  : </strong></span></strong></span>Integer<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>For grid jobs of type gt2, limits the number of
            globus-job-manager processes that the condor_gridmanager lets run
            at a time on the remote head node. Allowing too many
            globus-job-managers to run causes severe load on the head node,
            possibly making it non-functional. The default value in
            HTCondor (as of version 8.3.5) is 10.<p>This parameter is
            useful when you are doing remote job submissions using HTCondor-G.
            </p>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Parameter Name: </strong></span></strong></span>GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE<span class="bold"><strong><span class="bold"><strong><br>
Sample Value  : </strong></span></strong></span> Integer<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>An integer value that limits the number of jobs that a
            condor_gridmanager daemon will submit to a resource. A
            comma-separated list of pairs that follows this integer limit will
            specify limits for specific remote resources. Each pair is a host
            name and the job limit for that host. Consider the example
            <pre class="programlisting">GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE = 200, foo.edu, 50, bar.com, 100</pre>
<p>
            In this example, all resources have a job limit of 200, except
            foo.edu, which has a limit of 50, and bar.com, which has a limit
            of 100. Limits specific to grid types can be set by appending the
            name of the grid type to the configuration variable name, as in
            GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_CREAM = 300. In
            this example, the job limit for all CREAM resources is 300.
            The default is 1000 (as of version 8.3.5).</p>
<p>This
            parameter is useful when you are doing remote job submissions
            using HTCondor-G.</p>
</td>
</tr>
</tbody>
</table></div>
</div>
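<p>A minimal sketch of how these parameters might look in an HTCondor
    configuration file follows. The values reuse the sample value and the
    examples quoted in the table above; they are illustrative starting points
    rather than recommendations, and the host names are the placeholder names
    from the table.</p>
<pre class="programlisting"># illustrative values; adjust for your submit host and remote sites
# only start a local universe job while fewer than 20 are running
START_LOCAL_UNIVERSE = TotalLocalJobsRunning &lt; 20
# gt2 only: at most 10 globus-job-manager processes per remote head node
GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 10
# 200 jobs per resource, except foo.edu (50) and bar.com (100)
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE = 200, foo.edu, 50, bar.com, 100
</pre>
<p>HTCondor has to re-read its configuration before a change takes effect,
    for example with condor_reconfig, or with a restart if the parameter
    requires one.</p>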
<br class="table-break"><div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="job_throttling_across_workflows"></a>10.6.1. Job Throttling Across Workflows</h3></div></div></div>
<p><span class="emphasis"><em>Issue:</em></span> DAGMan throttling knobs are per
      workflow, and don't work across workflows. Is there any way to control
      how many jobs of each type run at a time across workflows?</p>
<p><span class="emphasis"><em>Solution:</em></span> While not possible in all cases, it
      is possible to throttle different types of jobs across workflows if you
      configure the jobs to run in the vanilla universe, by leveraging <a class="ulink" href="http://research.cs.wisc.edu/htcondor/manual/v8.2/3_12Setting_Up.html#SECTION0041215000000000000000" target="_top">HTCondor
      concurrency limits</a>. Most of the Pegasus generated jobs (data
      transfer jobs and auxiliary jobs such as create dir, cleanup and
      registration) execute in the local universe, where concurrency limits don't
      work. To use concurrency limits, you need to do the following:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
<p>Get the local universe jobs to run locally in the vanilla
          universe. You can do this by associating the condor profiles universe
          and requirements in the site catalog for the local site, or individually
          in the transformation catalog for each Pegasus executable. Here is
          an example local site catalog entry.</p>
<pre class="programlisting"> &lt;site handle="local" arch="x86_64" os="LINUX"&gt;
      &lt;directory type="shared-scratch" path="/shared-scratch/local"&gt;
         &lt;file-server operation="all" url="file:///shared-scratch/local"/&gt;
      &lt;/directory&gt;
      &lt;directory type="local-storage" path="/storage/local"&gt;
         &lt;file-server operation="all" url="file:///storage/local"/&gt;
      &lt;/directory&gt;

      &lt;!-- keys to make jobs scheduled to local site run on local site in vanilla universe --&gt;
      &lt;profile namespace="condor" key="universe"&gt;vanilla&lt;/profile&gt;
      &lt;profile namespace="condor" key="requirements"&gt;(Machine=="submit.example.com")&lt;/profile&gt;
   &lt;/site&gt;
</pre>
<p>Replace the Machine value in requirements with the hostname of
          your submit host.</p>
</li>
<li class="listitem"><p>Copy the condor_config.pegasus file from the share/pegasus/htcondor
          directory to your HTCondor config.d directory.</p></li>
</ol></div>
<p>Starting with the Pegasus 4.5.1 release, the following concurrency
      limits are associated with the different types of jobs Pegasus
      creates.</p>
<div class="table">
<a name="pegasus_concurrency_limits_mapping"></a><p class="title"><b>Table 10.5. Pegasus Job Types To Condor Concurrency Limits</b></p>
<div class="table-contents"><table summary="Pegasus Job Types To Condor Concurrency Limits" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Pegasus Job Type</strong></span></td>
<td><span class="bold"><strong>HTCondor Concurrency Limit
              Compatible with distributed condor_config.pegasus
              </strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p>Data Stagein Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_transfer.stagein</td>
</tr>
<tr>
<td><div class="literallayout"><p>Data Stageout Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_transfer.stageout</td>
</tr>
<tr>
<td><div class="literallayout"><p>Inter Site Data Transfer Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_transfer.inter</td>
</tr>
<tr>
<td><div class="literallayout"><p>Worker Package Staging Job</p></div></td>
<td>pegasus_transfer.worker</td>
</tr>
<tr>
<td><div class="literallayout"><p>Create Directory Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_auxillary.createdir</td>
</tr>
<tr>
<td><div class="literallayout"><p>Data Cleanup Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_auxillary.cleanup</td>
</tr>
<tr>
<td><div class="literallayout"><p>Replica Registration Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_auxillary.registration</td>
</tr>
<tr>
<td><div class="literallayout"><p>Set XBit Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_auxillary.chmod</td>
</tr>
<tr>
<td><div class="literallayout"><p>User Compute Job<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>pegasus_compute</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>Setting a limit for compute jobs is not recommended unless you
        know what you are doing.</p>
</div>
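<p>The limit names in the table only take effect once matching limits are
      defined in the HTCondor configuration. The fragment below is a sketch of
      what such a configuration could contain, assuming HTCondor's
      CONCURRENCY_LIMIT_DEFAULT_&lt;NAME&gt; setting, where the name is the part
      of the limit before the dot. The numbers are illustrative assumptions and
      this is not the content of condor_config.pegasus; the file shipped with
      Pegasus remains the authoritative reference.</p>
<pre class="programlisting"># illustrative values only; not the contents of condor_config.pegasus
# default for every limit whose name starts with pegasus_transfer.
CONCURRENCY_LIMIT_DEFAULT_PEGASUS_TRANSFER = 20
# default for every limit whose name starts with pegasus_auxillary.
CONCURRENCY_LIMIT_DEFAULT_PEGASUS_AUXILLARY = 20
</pre>
<p>No entry is given for pegasus_compute in this sketch, so compute jobs
      stay at HTCondor's default limit.</p>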
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="data_transfers.php">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="optimization.php">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="service.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">10.5. Optimizing Data Transfers </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> Chapter 11. Pegasus Service</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
