<?php  
            require('/srv/new-pegasus.isi.edu/includes/common.php'); 
            pegasus_header("11.5. Optimizing Data Transfers");
        ?><div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.php">Pegasus 4.8.0 User Guide</a></span> &gt; <span class="breadcrumb-link"><a href="optimization.php">Optimizing Workflows for Efficiency and Scalability</a></span> &gt; <span class="breadcrumb-node">Optimizing Data Transfers</span>
</div><hr><div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="data_transfers"></a>11.5. Optimizing Data Transfers</h2></div></div></div>
<p><span class="emphasis"><em>Issue:</em></span> When it comes to data transfers, Pegasus
    ships with a default configuration that tries to strike a balance
    between performance and aggressiveness. We want data transfers
    to be as quick as possible, but we also do not want them to
    overwhelm data services and systems.</p>
<p><span class="emphasis"><em>Solution:</em></span> Starting with the 4.8.0 release, the default
    configuration of Pegasus adds transfer jobs and cleanup jobs based on
    the number of jobs at a particular level of the workflow. For example, for
    every 10 compute jobs on a level of a workflow, one data transfer job
    (stage-in and stage-out) is created. The default configuration also sets
    how many threads such a pegasus-transfer job can spawn. Cleanup jobs are
    constructed similarly, with an internal ratio of 5.</p>
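<p>The ratios described above can be illustrated with a small calculation. This is
    only a sketch: the function name <code>auxiliary_jobs</code> is hypothetical, and
    both rounding up for partially filled groups and reading the cleanup ratio as
    "one cleanup job per 5 jobs" are assumptions, not documented planner behavior.</p>

```python
import math

# Ratios quoted in the text above; how Pegasus rounds is an assumption here.
TRANSFER_RATIO = 10  # compute jobs per stage-in / stage-out job
CLEANUP_RATIO = 5    # assumed meaning of the "internal ratio of 5"


def auxiliary_jobs(compute_jobs, ratio):
    """Estimate the auxiliary jobs added for one workflow level (rounds up)."""
    return math.ceil(max(compute_jobs, 0) / ratio)


# Print an estimate for a few hypothetical level sizes.
for level_size in (4, 10, 25, 100):
    print(
        level_size,
        auxiliary_jobs(level_size, TRANSFER_RATIO),
        auxiliary_jobs(level_size, CLEANUP_RATIO),
    )
```

<p>Under these assumptions, a level with 25 compute jobs would get three stage-in
    jobs, three stage-out jobs, and five cleanup jobs.</p>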
<p>Additionally, Pegasus makes use of DAGMan categories and associates
    the following default values with the transfer, cleanup, and registration
    jobs.
    </p>
<div class="table">
<a name="table-default-job-categories"></a><p class="title"><b>Table 11.3. Default category names assigned by Pegasus</b></p>
<div class="table-contents"><table class="table" summary="Default category names assigned by Pegasus" border="1">
<colgroup>
<col>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>DAGMan Category
            Name</strong></span></td>
<td><span class="bold"><strong>Auxiliary Job Applied
            To</strong></span></td>
<td><span class="bold"><strong>Default Value Assigned in Generated
            DAG File</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p>stage-in </p></div></td>
<td>data stage-in jobs</td>
<td>10</td>
</tr>
<tr>
<td><div class="literallayout"><p>stage-out</p></div></td>
<td>data stage-out jobs</td>
<td>10</td>
</tr>
<tr>
<td><div class="literallayout"><p>stage-inter</p></div></td>
<td>inter-site data transfer jobs</td>
<td>-</td>
</tr>
<tr>
<td><div class="literallayout"><p>cleanup</p></div></td>
<td>data cleanup jobs</td>
<td>4</td>
</tr>
<tr>
<td><div class="literallayout"><p>registration </p></div></td>
<td>registration jobs</td>
<td>1 (for file based RC)</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break">
    </p>
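<p>For readers unfamiliar with DAGMan categories: DAGMan throttles a category with
    a <code>MAXJOBS CategoryName Value</code> line in the DAG file. The sketch below
    renders such lines from the defaults in the table above. That the table values are
    written as <code>MAXJOBS</code> entries is an assumption, the helper name
    <code>maxjobs_lines</code> is hypothetical, and the exact layout of the
    Pegasus-generated DAG file is not shown here.</p>

```python
# Defaults copied from Table 11.3; None marks the category with no default ("-").
DEFAULT_CATEGORY_LIMITS = {
    "stage-in": 10,
    "stage-out": 10,
    "stage-inter": None,
    "cleanup": 4,
    "registration": 1,  # listed for a file-based replica catalog
}


def maxjobs_lines(limits):
    """Render one DAGMan MAXJOBS line for each category that has a limit."""
    return [
        f"MAXJOBS {name} {limit}"
        for name, limit in limits.items()
        if limit is not None
    ]


print("\n".join(maxjobs_lines(DEFAULT_CATEGORY_LIMITS)))
```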
<p>Information on how to manually control the maximum number of stage-in
    and stage-out jobs can be found in the <a class="link" href="transfer.php#data_movement_nodes" title="10.2.5. Addition of Separate Data Movement Nodes to Executable Workflow">Data Movement Nodes</a> section.</p>
<p>How to control the number of threads pegasus-transfer can use
    depends on whether you want to control standard transfer jobs or PegasusLite.
    For the former, see the <a class="link" href="properties.php#transfer_props" title="13.3.9. Transfer Configuration Properties">pegasus.transfer.threads</a> property; for
    the latter, see the <a class="link" href="properties.php#transfer_props" title="13.3.9. Transfer Configuration Properties">pegasus.transfer.lite.threads</a>
    property.</p>
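<p>As a rough illustration, the two properties named above could be set in a
    Pegasus properties file. The sketch below renders such entries; the helper
    <code>render_properties</code> is hypothetical, and the thread counts are
    placeholders chosen for the example, not the Pegasus defaults.</p>

```python
def render_properties(settings):
    """Render settings as 'key = value' lines in Java properties style."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


# Placeholder thread counts for illustration only; tune them for your site.
settings = {
    "pegasus.transfer.threads": 4,       # standard transfer jobs
    "pegasus.transfer.lite.threads": 2,  # transfers inside PegasusLite
}

print(render_properties(settings))
```

<p>Raising these values makes each transfer job more aggressive, so weigh them
    against what the data services involved can sustain.</p>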
</div><div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="hierarchial_workflows.php">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="optimization.php">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="job_throttling.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">11.4. Hierarchical Workflows </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> 11.6. Job Throttling</td>
</tr>
</table>
</div><?php  
            pegasus_footer();
        ?>
