github.com/argoproj/argo-cd/v3@v3.2.1/docs/proposals/sync-timeout.md

     1  ---
     2  title: Sync Operation Timeout & Termination Settings
     3  authors:
     4    - "@alexmt"
     5  sponsors:
     6    - "@jessesuen"
     7  reviewers:
     8    - "@ishitasequeira"
     9  approvers:
    10    - "@gdsoumya"
    11  
    12  creation-date: 2023-12-16
    13  last-updated: 2023-12-16
    14  ---
    15  
    16  # Sync Operation Timeout & Termination Settings
    17  
    18  The Sync Operation Timeout & Termination Settings feature introduces new sync operation settings that control automatic sync operation termination.
    19  
    20  ## Summary
    21  
    22  
    23  The feature includes two types of settings:
    24  
    25  * The sync timeout allows users to set a timeout for the sync operation. If the sync operation exceeds this timeout, it will be terminated.
    26  
    27  * The termination settings are an advanced set of options that terminate the sync operation early when a known resource remains stuck in a
    28  certain state for a specified amount of time.
    29  
    30  ## Motivation
    31  
    32  Complex synchronization operations that involve sync hooks and sync waves can be time-consuming and may occasionally become stuck in a specific state
    33  for an extended duration. In some cases, an operation might remain in that state indefinitely. This is particularly inconvenient when the
    34  synchronization is initiated by an automation tool such as a CI/CD pipeline, because the tool may end up waiting indefinitely for the
    35  synchronization process to complete.
    36  
    37  To address this issue, this feature enables users to establish a timeout for the sync operation. If the operation exceeds the specified time limit,
    38  it will be terminated, preventing extended periods of inactivity or indefinite waiting in automated processes.
    39  
    40  ### Goals
    41  
    42  The following goals are intended to be met by this enhancement:
    43  
    44  #### [G-1] Synchronization timeout
    45  
    46  The synchronization timeout feature should allow users to set a timeout for the sync operation. If the sync operation exceeds this timeout, it will be terminated.
    47  
    48  #### [G-2] Termination settings
    49  
    50  The termination settings should allow users to terminate the sync operation early when a known resource is stuck in a certain state for a specified amount of time.
    51  
    52  ## Proposal
    53  
    54  The proposed synchronization settings are to be added under a new `syncPolicy.terminate` field within the Application CRD. The following fields are to be added:
    55  
    56  * `timeout` - The timeout for the sync operation. If the sync operation exceeds this timeout, it will be terminated.
    57  * `resources` - A list of resources to monitor for termination. If any of the resources in the list are stuck in a
    58    certain state for a specified amount of time, the sync operation will be terminated.
    59  
    60  Example:
    61  
    62  ```yaml
    63  apiVersion: argoproj.io/v1alpha1
    64  kind: Application
    65  metadata:
    66    name: guestbook
    67  spec:
    68    ... # standard application spec
    69  
    70    syncPolicy:
    71      terminate:
    72        timeout: 10m # timeout for the sync operation
    73        resources:
    74          - kind: Deployment
    75            name: guestbook-ui
    76            timeout: 5m # timeout for the resource
    77            health: Progressing # health status of the resource
    78  ```
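
To make the shape of the proposed settings concrete, here is a minimal Go sketch of how the `syncPolicy.terminate` block above could be modeled and validated. The type names (`TerminatePolicy`, `ResourceTerminateRule`), the choice of string-typed durations, and the rule that an empty global `timeout` means "not set" are all assumptions made for illustration; the proposal does not define them.

```go
package main

import (
	"fmt"
	"time"
)

// ResourceTerminateRule is a hypothetical Go shape for one entry of
// terminate.resources; the proposal does not name this type.
type ResourceTerminateRule struct {
	Kind    string
	Name    string
	Timeout string // e.g. "5m"
	Health  string // e.g. "Progressing"
}

// TerminatePolicy is a hypothetical Go shape for syncPolicy.terminate.
type TerminatePolicy struct {
	Timeout   string // e.g. "10m"; empty is treated as "not set" (assumption)
	Resources []ResourceTerminateRule
}

// parseTimeout parses a duration string and rejects non-positive values.
func parseTimeout(field, value string) (time.Duration, error) {
	d, err := time.ParseDuration(value)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", field, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("%s: must be positive, got %q", field, value)
	}
	return d, nil
}

// Validate checks that every configured timeout is a positive duration
// and that each resource rule identifies a resource and a health status.
func (p TerminatePolicy) Validate() error {
	if p.Timeout != "" {
		if _, err := parseTimeout("terminate.timeout", p.Timeout); err != nil {
			return err
		}
	}
	for i, r := range p.Resources {
		if r.Kind == "" || r.Name == "" || r.Health == "" {
			return fmt.Errorf("terminate.resources[%d]: kind, name and health are required", i)
		}
		field := fmt.Sprintf("terminate.resources[%d].timeout", i)
		if _, err := parseTimeout(field, r.Timeout); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Same values as the guestbook example above.
	policy := TerminatePolicy{
		Timeout: "10m",
		Resources: []ResourceTerminateRule{
			{Kind: "Deployment", Name: "guestbook-ui", Timeout: "5m", Health: "Progressing"},
		},
	}
	fmt.Println("valid:", policy.Validate() == nil)
}
```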
    79  
    80  ### Use cases
    81  
    82  This enhancement is intended to cover the following use cases.
    83  
    84  #### Normal sync operation:
    85  As a user, I would like to trigger a sync operation and expect it to complete within a certain time limit.
    86  
    87  #### CI triggered sync operation:
    88  As a user, I would like to trigger a sync operation from a CI/CD pipeline and expect it to complete within a certain time limit.
    89  
    90  #### Preview Applications:
    91  As a user, I would like to leverage the ApplicationSet Pull Request generator to generate preview applications and expect the auto-sync operation to fail automatically
    92  if it exceeds a certain time limit.
    93  
    94  ### Implementation Details/Notes/Constraints
    95  
    96  The Application CRD status field already contains all the information required to implement the sync timeout.
    97  
    98  * Global sync timeout: only the operation start time is required to implement this functionality. It is provided by the `status.operationState.startedAt` field.
    99  * Resource state-based termination: this part is more complex and requires information about the resources affected/created during the sync operation. Most of
   100  the required information is already available in the Application CRD status field. The `status.operationState.syncResult.resources` field contains a list of resources
   101  affected/created during the sync operation. Each `resource` list item includes the resource name, kind, and health status. To measure accurately how long
   102  a resource has been in its current health status, it is proposed to add a `modifiedAt` field to the `resource` list item. This field will be updated every time the resource health/phase
   103  changes.
   104  
   105  ### Security Considerations
   106  
   107  Proposed changes don't expand the scope of the application CRD and don't introduce any new security concerns.
   108  
   109  ### Risks and Mitigations
   110  
   111  The execution of a synchronization operation is carried out in phases, each of which involves a series of Kubernetes API calls and typically takes up to a few seconds.
   112  There is no easy way to terminate the operation in the middle of a phase, so the operation might take a few seconds longer than the specified timeout. It does not seem
   113  reasonable to implement more complex logic to terminate the operation mid-phase, so it is proposed to simply document that the operation might be terminated
   114  a few seconds after the timeout is reached.
   115  
   116  ### Upgrade / Downgrade Strategy
   117  
   118  The proposed changes don't require any special upgrade/downgrade strategy. The new settings are optional and can be used by users only if they need them.
   119  
   120  ## Drawbacks
   121  
   122  A slight increase in the complexity of the application synchronization logic.
   123  
   124  ## Alternatives
   125  
   126  Rely on external tools to terminate the sync operation. For example, a CI/CD pipeline can terminate the sync operation if it exceeds a certain time limit.