globusrun-ws: Job failed: Staging error for RSL element fileStageOut. Error authenticating user at source/dest host
Server refused performing the request. Custom message: Server refused GSSAPI authentication. (error code 1) [Nested exception message: Custom message: Unexpected reply: 421 Idle Timeout: closing control connection.] [Caused by: Server refused performing the request. Custom message: Server refused GSSAPI authentication. (error code 1) [Nested exception message: Custom message: Unexpected reply: 421 Idle Timeout: closing control connection.]]

I was asked to test:
control_preauth_timeout value from its default of 30 seconds; the globus gridftp team was now suggesting 200 seconds as the default
control_preauth_timeout value to 300 seconds worked. I was able to push jobs through at a high rate without ever seeing the timeout error. I eventually hit different failure modes when I tried pushing 400 jobs at a time.

[23490] Mon Dec 10 09:11:22 2007 :: New connection from: XXXX
[23490] Mon Dec 10 09:11:22 2007 :: DN /DC=org/DC=doegrids/OU=People/CN= XXXX successfully authorized.

I evaluated the results as they depend on how many separate jobs I threw at the CE: 20 jobs, 100 jobs, and 400 jobs. It's unclear how much interaction goes on between the client and server between these log entries, so the meaning of the absolute time scale is also unclear. However, as seen in the plot below, the time difference was only weakly dependent on the load on the CE: mean values of 59, 57, and 87 seconds for the three cases. In any case, the value of 300 seconds (and probably 200 seconds) is large enough for this configuration.
| Time between server connect and authentication (View Image for details) |
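The time differences discussed above can be extracted from the gridftp server log by pairing each "New connection" entry with the "successfully authorized" entry that carries the same process ID. The following is a minimal Python sketch of that pairing, assuming every log line follows the `[pid] timestamp :: message` layout of the two entries quoted earlier. The function name `auth_delays` and the timestamps in `SAMPLE` are made up for illustration; they are not taken from the actual test logs.

```python
import re
from datetime import datetime
from statistics import mean

# Matches lines such as:
# [23490] Mon Dec 10 09:11:22 2007 :: New connection from: XXXX
LOG_LINE = re.compile(
    r"^\[(\d+)\] (\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) :: (.*)$"
)


def auth_delays(lines):
    """Return the seconds between connect and authorization, one value per connection."""
    connected = {}  # pid -> time of the "New connection" entry
    delays = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        pid, stamp, message = match.groups()
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        if message.startswith("New connection from"):
            connected[pid] = when
        elif message.endswith("successfully authorized.") and pid in connected:
            delays.append((when - connected.pop(pid)).total_seconds())
    return delays


# Synthetic sample entries (illustrative timestamps only).
SAMPLE = [
    "[23490] Mon Dec 10 09:11:22 2007 :: New connection from: XXXX",
    "[23491] Mon Dec 10 09:11:30 2007 :: New connection from: XXXX",
    "[23490] Mon Dec 10 09:12:21 2007 :: DN /DC=org/DC=doegrids/OU=People/CN= XXXX successfully authorized.",
    "[23491] Mon Dec 10 09:12:27 2007 :: DN /DC=org/DC=doegrids/OU=People/CN= XXXX successfully authorized.",
]

sample_delays = auth_delays(SAMPLE)
sample_mean = mean(sample_delays)
```

Running this over the log for each of the 20-, 100-, and 400-job cases separately would give the per-case delays to plot and average. Connections that never reach authorization are simply left unpaired, so timeouts would need to be counted separately.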
<job> ...
  <fileStageOut>
    <maxAttempts>5</maxAttempts>
    <transfer>
      ....
    </transfer>
  </fileStageOut>
This did NOT work. I continued to have authentication errors.
Delegating user credentials...Failed.
globusrun-ws: globus_i_delegate.c::1142: Error trying to delegate
globus_i_delegate.c::673: Error querying delegation factories
ManagedJobFactoryService_client.c::2209: Failed sending request ManagedJobFactoryPortType_GetMultipleResourceProperties.
globus_xio_system_select.c:globus_l_xio_system_cancel_cb:664: Operation was canceled
globus_xio_system_select.c:globus_l_xio_system_cancel_cb:664: Operation timed out

Delegating user credentials...Failed.
globusrun-ws: globus_i_delegate.c::1142: Error trying to delegate
globus_delegation_client_util.c:globus_l_delegation_client_util_get_cert_cb:488: DelegationFactoryPortType_GetResourceProperty callback failed.
DelegationFactoryService_client.c::720: Failed sending request DelegationFactoryPortType_GetResourceProperty.
globus_xio_system_select.c:globus_l_xio_system_cancel_cb:664: Operation was canceled
globus_xio_system_select.c:globus_l_xio_system_cancel_cb:664: Operation timed out

Delegating user credentials...Done.
Submitting job...Failed.
Cleaning up any delegated credentials...Done.
globusrun-ws: globus_i_submit.c::731: Error submitting job
ManagedJobFactoryService_client.c::5202: Failed sending request ManagedJobFactoryPortType_createManagedJob.
globus_xio_system_select.c:globus_l_xio_system_cancel_cb:664: Operation was canceled
globus_xio_system_select.c:globus_l_xio_system_cancel_cb:664: Operation timed out
control_preauth_timeout 200 into the $GLOBUS_LOCATION/etc/gridftp.conf file.
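A minimal Python sketch of that configuration edit, assuming gridftp.conf holds one whitespace-separated `option value` pair per line. The helper name `set_option` and the sample file content are hypothetical and only illustrate the idea; the sketch works on text in memory and does not touch the real file.

```python
def set_option(conf_text, key, value):
    """Return conf_text with `key value` set, replacing an existing entry or appending one."""
    out = []
    found = False
    for line in conf_text.splitlines():
        parts = line.split()
        is_comment = line.lstrip().startswith("#")
        if parts and not is_comment and parts[0] == key:
            if not found:
                out.append(f"{key} {value}")
                found = True
            continue  # drop the old entry (and any duplicates)
        out.append(line)
    if not found:
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


# Hypothetical existing content with the 30-second default written out explicitly.
before = "port 2811\ncontrol_preauth_timeout 30\n"
after = set_option(before, "control_preauth_timeout", 200)
```

To apply it for real, one would read `$GLOBUS_LOCATION/etc/gridftp.conf`, pass its content through `set_option`, and write the result back, keeping a backup of the original file.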
| Attachment | Size | Date | Who | Comment |
|---|---|---|---|---|
| Globusrun_Comparison_between_0.8_and_0.6 | 576.5 K | 10 Dec 2007 - 21:13 | SuchandraThapa | Results from comparing globusrun-ws at 1Hz with different settings |
| gftp_auth_dt.gif | 13.8 K | 10 Dec 2007 - 19:01 | JeffPorter | Time between gridftp server connect and authentication |