Service error handling control

With service error handling control, you configure events and corresponding actions to take on sessions, tasks, and service instances when a service is in a specific state of its lifecycle. This feature also enables you to configure timeouts for all methods within the service and actions to take when a timeout occurs.

For example, for the service onInvoke() method, you can configure the system to restart the service instance if the service process exits while the method is running.
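
As a minimal sketch of that example, the following Control fragment uses the Method and Exit elements and the actionOnSI and actionOnWorkload attributes described under "Configuration format" later in this topic. It mirrors the default Invoke behavior shown there; the ... stands for the other methods, which are omitted here.

<Control>
   ...
   <Method name="Invoke" >
      <!-- If the service process exits during onInvoke(), restart the service instance and retry the task -->
      <Exit actionOnSI="restartService" actionOnWorkload="retry"/>
   </Method>
</Control>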

Scope


Applicability

Details

Operating system

  • Linux, UNIX

  • Windows

Allows for

  • Configuration of actions the system should take upon failure exceptions, fatal exceptions, exit, and return for service lifecycle methods:

    • Register()

    • onCreateService()

    • onSessionEnter()

    • onSessionUpdate()

    • onInvoke()

    • onSessionLeave()

    • onDestroyService()

  • Configuration of timeout values for service lifecycle methods and actions to take.

  • Configuration of custom control codes upon which the system is to take specific actions.

Dependencies

n/a

Limitations

For backwards compatibility, Symphony accepts Symphony 3.1 and later formats in the <Control></Control> section of the application profile. Application profiles that are registered with the 3.1 format are modified by the system to the current format.

Note that you cannot use both Symphony 3.1 and a later format in the same application profile.


About service error handling control

Default and possible configurations for service error handling

The following table lists API methods that can be used in service code, possible events that can be configured for each method, and possible actions that can be taken on workload (session or task) and on service instances upon trigger of the event.

In the table, a default control code of 0 is assumed.



Service methods for which you can define events and actions

The following table lists the service methods for which you can define events and actions.

Method

Description

Register()

Register() is an internal method used by the system.

onCreateService()

Create the service container.

onSessionEnter()

Get common data and store it for later with the session context.

onSessionUpdate()

Get an update to existing common data.

onInvoke()

Process the input message.

onSessionLeave()

Free the common data and all updates to common data for the session. Used with onSessionEnter().

onDestroyService()

Destroy and unload the service instance from the service container. The service instance is no longer associated with the session.


Events that trigger actions on workload and service instances


Condition

Description

Return

Defines the action to take upon successful return of the method.

Timeout

Defines the action to take when the method times out.

If you know that a service method invocation should not exceed a certain amount of time, you can configure it to be terminated after a specific time period has elapsed.

Failure Exception

Defines the action to take when a failure exception occurs within the specified method.

A FailureException indicates that the operation failed in the service but is worth trying on a different compute host.

Fatal Exception

Defines the action to take when a fatal exception occurs within the specified method.

A FatalException indicates that the operation failed in the service and is not likely to be successful if attempted on a different compute host.

Exit

Defines the action to take on the service instance when the service exits while invoking the method.


Actions that can be taken on workload

The following actions are possible on workload (sessions and tasks). Note that the possible combinations of actions vary according to the method.


Possible action

Description

retry

When a specified event occurs, retry the method up to the number of times configured by the session and task retry limits in the application profile.

For SessionEnter and SessionUpdate, the system attempts to bind the session to the service instance up to the sessionRetryLimit in the application profile before the session is aborted.

Note:

The retry counts for these two methods are considered together. For example, if SessionEnter fails once and SessionUpdate fails twice, the session rerun count is 3. Therefore, set sessionRetryLimit to a value that accounts for both SessionEnter and SessionUpdate failures.

For Invoke, the system attempts to run the task up to the taskRetryLimit defined in the application profile before the task is failed.

fail

When a specified event occurs, abort the session or fail the task, and propagate errors to the client application.

For SessionEnter or SessionUpdate, immediately abort the session. Do not retry the method.

For Invoke, immediately fail the task. Do not retry the method.

succeed

When the method returns successfully, treat it as a normal return and let processing continue to completion. No further action is taken on workload.
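
The retry accounting described for the retry action can be illustrated with a hedged sketch. It assumes that sessionRetryLimit and taskRetryLimit are set as attributes of a session type in the application profile; the SessionTypes and Type element names and the session type name MySessionType are assumptions for illustration only, so check them against your own profile. With a limit of 3, one SessionEnter failure plus two SessionUpdate failures bring the session rerun count to 3, which reaches the limit.

<SessionTypes>
   <!-- Assumed placement of the retry limits; MySessionType is a hypothetical name -->
   <Type name="MySessionType" sessionRetryLimit="3" taskRetryLimit="3"/>
</SessionTypes>
<Control>
   ...
   <Method name="SessionEnter" >
      <!-- A failure exception here counts toward the session retry limit -->
      <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
   </Method>
   <Method name="SessionUpdate" >
      <!-- A failure exception here counts toward the same session retry limit -->
      <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
   </Method>
</Control>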


Actions that can be taken on service instances

The following actions are possible on the service instance. Note that the possible combinations of actions vary according to the method.


Possible action

Description

blockHost

When the specified event occurs, terminate the running service instance on this host and do not use this host to start any other service instance for the application.

The host on which the service instance was running is added to the blocked host list for the application. This host is no longer selected to run work for the application until it is explicitly unblocked through the EGO command line or the Platform Management Console.

restartService

When the specified condition occurs, terminate the service instance, start a new service instance on the same host, and recover state. There is no limit to the number of times that a service instance can be restarted.

keepAlive

When the specified condition occurs, take no action on the running service instance.


Control codes

The control code is an integer returned either normally through the return of the method, or returned when a fatal or failure exception occurs. You can configure specific numbers to trigger actions in all methods except Register() and DestroyService().

The default value for control codes is 0. If you do not explicitly set a control code in your service, the return is considered to have a control code of 0, and Symphony executes the behavior defined for controlCode="0" for the service event that occurs.

To make the behavior conditional, for example to restart the service instance only some of the time, use a control code other than 0. For example, to restart the service instance on every 10th invoke, have the service return a code of 1 on that invocation and configure the restart action for that code. A code of 0 triggers the default action.
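
A hedged sketch of that scenario follows. It assumes that the service returns control code 1 from onInvoke() on every 10th invocation and 0 otherwise, and that restartService combined with succeed is an allowed combination for a Return event on Invoke; verify both assumptions against the possible configurations for your version.

<Control>
   ...
   <Method name="Invoke" >
      <!-- Control code 0 (default): normal return, leave the service instance running -->
      <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
      <!-- Control code 1 (custom): the task still succeeds, but the service instance is restarted -->
      <Return controlCode="1" actionOnSI="restartService" actionOnWorkload="succeed"/>
   </Method>
</Control>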

Configuration to modify service error handling

You can modify service error handling behavior in the Control section of the application profile.

Configuration format

<Control>
    <Method name="Register" >
       <Timeout duration="60" actionOnSI="blockHost"/>
        <Exit actionOnSI="blockHost"/>
    </Method>
    <Method name="CreateService" >
       <Timeout duration="0" actionOnSI="blockHost"/>
       <Exit actionOnSI="blockHost"/>
       <Return controlCode="0" actionOnSI="keepAlive"/>
       <Exception type="failure" controlCode="0" actionOnSI="blockHost"/>
       <Exception type="fatal" controlCode="0" actionOnSI="blockHost"/>
   </Method>
   <Method name="SessionEnter" >
       <Timeout duration="0" actionOnSI="blockHost" actionOnWorkload="retry"/>
       <Exit actionOnSI="blockHost" actionOnWorkload="retry"/>
       <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
       <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
      <Exception type="fatal" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="fail"/>
   </Method>
   <Method name="SessionUpdate" >
       <Timeout duration="0" actionOnSI="blockHost" actionOnWorkload="retry"/>
       <Exit actionOnSI="blockHost" actionOnWorkload="retry"/>
       <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
       <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
      <Exception type="fatal" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="fail"/>
   </Method>
   <Method name="Invoke" >
      <Timeout duration="0" actionOnSI="restartService" actionOnWorkload="retry"/>
      <Exit actionOnSI="restartService" actionOnWorkload="retry"/>
      <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
      <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
      <Exception type="fatal" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="fail"/>
   </Method>
   <Method name="SessionLeave" >
     <Timeout duration="0" actionOnSI="restartService"/>
     <Exit actionOnSI="restartService"/>
     <Return controlCode="0" actionOnSI="keepAlive" customizedDebugAction="none"/>
     <Exception type="failure" controlCode="0" actionOnSI="keepAlive"/>
     <Exception type="fatal" controlCode="0" actionOnSI="keepAlive"/>
   </Method>
   <Method name="DestroyService" >
     <Timeout duration="15"/>
   </Method>
</Control>

Configuration example: Block the host when the service process exits during onInvoke()

<Control>
   ...
     <Method name="Invoke" >
        <Exit actionOnSI="blockHost"/>
     </Method>
</Control>

Configuration example: Do not rerun the task when a service times out

Configure a timeout value for the onInvoke() method and specify that the task fails when the method times out.

<Control>
   ...
   <Method name="Invoke" >
       <Timeout duration="15" actionOnWorkload="fail" />
   </Method>
</Control>

Configuration example: Never block hosts under any circumstances

In some environments, you might not want hosts to be blocked under any circumstances. This configuration is not recommended as a best practice: if a service keeps failing on a host that is never blocked, the system can end up in an endless loop, attempting to start the service on the same failing host and continuously writing errors to the log files. In a very short time, the log files can grow very large and take up too much disk space on your machine.

If you still need this behavior in your environment, set every actionOnSI attribute to a value other than blockHost, and set blockHostOnTimeout and blockHostOnVersionMismatch to false in the SIM element:

<SIM startUpTimeout="60" blockHostOnTimeout="false" blockHostOnVersionMismatch="false">
</SIM>
<Control>
      <Method name="Register" >
         <Timeout duration="60" actionOnSI="restartService"/>
         <Exit actionOnSI="restartService"/>
    </Method>
    <Method name="CreateService" >
        <Timeout duration="0" actionOnSI="restartService"/>
        <Exit actionOnSI="restartService"/>
        <Return controlCode="0" actionOnSI="keepAlive"/>
        <Exception type="failure" controlCode="0" actionOnSI="restartService"/>
        <Exception type="fatal" controlCode="0" actionOnSI="restartService"/>
    </Method>
    <Method name="SessionEnter" >
        <Timeout duration="0" actionOnSI="restartService" actionOnWorkload="retry"/>
        <Exit actionOnSI="restartService" actionOnWorkload="retry"/>
        <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
        <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
        <Exception type="fatal" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="fail"/>
    </Method>
    <Method name="SessionUpdate" >
        <Timeout duration="0" actionOnSI="restartService" actionOnWorkload="retry"/>
        <Exit actionOnSI="restartService" actionOnWorkload="retry"/>
        <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
        <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
        <Exception type="fatal" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="fail"/>
    </Method>
    <Method name="Invoke" >
       <Timeout duration="0" actionOnSI="restartService" actionOnWorkload="retry"/>
       <Exit actionOnSI="restartService" actionOnWorkload="retry"/>
       <Return controlCode="0" actionOnSI="keepAlive" actionOnWorkload="succeed"/>
       <Exception type="failure" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="retry"/>
       <Exception type="fatal" controlCode="0" actionOnSI="keepAlive" actionOnWorkload="fail"/>
    </Method>
    <Method name="SessionLeave" >
       <Timeout duration="0" actionOnSI="restartService"/>
       <Exit actionOnSI="restartService"/>
       <Return controlCode="0" actionOnSI="keepAlive"/>
       <Exception type="failure" controlCode="0" actionOnSI="keepAlive"/>
       <Exception type="fatal" controlCode="0" actionOnSI="keepAlive"/>
    </Method>
    <Method name="DestroyService" >
       <Timeout duration="15"/>
     </Method>
</Control>

Commands

Commands to display configuration


Command

Description

soamview app app_name -p

After the application is registered, use soamview app app_name -p to view the application profile XML. You can also view the XML configuration through the Platform Management Console.