Update Control Map can expire while Download (or any other state) is going on

Description

While each state in the client is executing, the timer for the Update Control Maps keeps running and is not refreshed. If the entire expiration interval elapses before the state finishes, the map is expired instead of refreshed, even if a refresh would have succeeded.

This affects the demo more severely than production, since the demo intervals are short.

Steps to reproduce:

  • Start a rootfs update which will take at least a few minutes (the download needs to take longer than UpdateControlMapExpirationTimeSeconds).

  • Set a pause in ArtifactCommit.

  • The update fails in ArtifactCommit instead of pausing.

Unfortunately, fixing this is going to be a bit tricky. Right now we refresh the map using explicit states, which can only run in between other states, in other words not in parallel with operations taking place inside states. But the expiration timer runs in a goroutine, so it can still fire while an operation is in progress. And once it has fired, we don't refresh again.
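To make the failure mode concrete, here is a minimal sketch of the race described above. It is not the client's actual code: `controlMap`, `runLongState` and `expiresDuringState` are hypothetical names, and the durations are arbitrary.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// controlMap is a hypothetical stand-in for an Update Control Map.
type controlMap struct {
	expired int32 // set to 1 by the expiration timer
}

// runLongState simulates a state (for example a download) that blocks the
// state machine for the given duration.
func runLongState(d time.Duration) {
	time.Sleep(d)
}

// expiresDuringState reports whether the map has already expired by the time
// the state finishes, which is the first point where a refresh state could run.
func expiresDuringState(expiration, stateDuration time.Duration) bool {
	m := &controlMap{}

	// The timer callback runs in its own goroutine, independently of
	// whatever the state machine is doing.
	timer := time.AfterFunc(expiration, func() {
		atomic.StoreInt32(&m.expired, 1)
	})
	defer timer.Stop()

	runLongState(stateDuration)

	// Only here, between states, would a refresh get its chance.
	return atomic.LoadInt32(&m.expired) == 1
}

func main() {
	// State outlasts the expiration interval: the map is gone before any refresh.
	fmt.Println(expiresDuringState(20*time.Millisecond, 200*time.Millisecond))
	// State is shorter than the interval: the map survives until the refresh.
	fmt.Println(expiresDuringState(time.Second, 10*time.Millisecond))
}
```

The first call models the reproduction steps: the download outlasts the expiration interval, so the pause in ArtifactCommit is lost.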

I can think of a couple of solutions:

  1. Instead of using a timer, store an expiration timestamp and check it only when needed, in particular after the refresh.

  2. Do the state operations in a goroutine (which should be relatively safe, since they are pretty isolated from the rest of the code), and run the refresh in the main goroutine while we are waiting for the state operations to finish.
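A rough sketch of how the second option could look, assuming the state operation can be expressed as a plain function. `runStateWithRefresh` and `slowOp` are hypothetical names for illustration, not existing client functions.

```go
package main

import (
	"fmt"
	"time"
)

// runStateWithRefresh runs op in its own goroutine and calls refresh from the
// calling goroutine once per interval until op returns. It returns the number
// of refreshes performed and the error returned by op.
func runStateWithRefresh(op func() error, refresh func(), interval time.Duration) (int, error) {
	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { done <- op() }()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	refreshes := 0
	for {
		select {
		case err := <-done:
			return refreshes, err
		case <-ticker.C:
			refresh()
			refreshes++
		}
	}
}

// slowOp returns a fake state operation that simply takes d to complete.
func slowOp(d time.Duration) func() error {
	return func() error {
		time.Sleep(d)
		return nil
	}
}

func main() {
	n, err := runStateWithRefresh(
		slowOp(100*time.Millisecond),
		func() { fmt.Println("refreshing control maps") },
		20*time.Millisecond,
	)
	fmt.Println("refreshes:", n, "err:", err)
}
```

The refresh stays on the main goroutine, so map handling itself needs no extra locking; only the state operation moves to a separate goroutine.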

Of the two approaches, I suspect that number 1 is a bit simpler, but I don't know for sure, especially the part about reordering the refresh and timeout.
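A minimal sketch of the first option, including the reordering: the refresh is attempted first, and the stored deadline is only consulted if the refresh fails. The names `controlMap`, `checkAfterState` and `expiredAfterLongState` are hypothetical, and the current time is passed in explicitly to keep the example deterministic.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// controlMap is a hypothetical stand-in that stores a deadline instead of
// owning a running timer.
type controlMap struct {
	expiresAt time.Time
}

// checkAfterState is meant to run between states. It tries the refresh first
// and reports the map as expired only if the refresh failed and the deadline
// has passed.
func (m *controlMap) checkAfterState(now time.Time, ttl time.Duration, refresh func() error) bool {
	if err := refresh(); err == nil {
		m.expiresAt = now.Add(ttl) // a successful refresh extends the deadline
		return false
	}
	return !now.Before(m.expiresAt)
}

// expiredAfterLongState simulates a state that ran for stateDuration with a
// map that was valid for ttl when the state started.
func expiredAfterLongState(stateDuration, ttl time.Duration, refreshOK bool) bool {
	start := time.Unix(0, 0)
	m := &controlMap{expiresAt: start.Add(ttl)}
	refresh := func() error {
		if refreshOK {
			return nil
		}
		return errors.New("refresh failed")
	}
	return m.checkAfterState(start.Add(stateDuration), ttl, refresh)
}

func main() {
	fmt.Println(expiredAfterLongState(5*time.Minute, time.Minute, true))     // false
	fmt.Println(expiredAfterLongState(5*time.Minute, time.Minute, false))    // true
	fmt.Println(expiredAfterLongState(30*time.Second, time.Minute, false))   // false
}
```

With this ordering, a long download no longer expires the map on its own; expiry requires both a passed deadline and a failed refresh.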

Activity

Ole Petter Orhagen, December 27, 2021 at 1:12 PM

Kristian Amlie, November 24, 2021 at 9:40 AM

I've put up an attempted fix for that here. The original issue in this ticket still remains though.

Jesus, November 18, 2021 at 8:05 AM

I was thinking the same solution.

UpdatePollInterval is how the user expresses the "responsiveness" of the system.

Fixed

Details

Days in progress: 11

Backlog: yes

Created September 22, 2021 at 11:33 AM
Updated June 25, 2024 at 12:02 PM
Resolved January 4, 2022 at 7:46 AM