Our current quota system has a race condition on reservations that only
happens when we are creating new entries in the quota_usages table.
We normally lock quota_usages rows using a SELECT ... FOR UPDATE query,
but that's only effective when the entries exist, and current code just
creates them and proceeds without a lock on them.
This, together with the table not having unique constraint means that we
can get duplicated entries and we can have one entry overwriting the
data written by another request.
The quota_usages table does soft deletes, so the project_id and resource
fields are not enough for a unique constraint, so we add a new column
called race_preventer so we can have a unique constraint with the
3 fields.
Additionally we have to make sure that we acquire the locks before doing
the reservation calculations or the syncs, so once we create any missing
entries we close the session and try to get the locks again.
With these 2 changes we'll avoid duplicated entries as well as avoid
getting our quota usage out of sync right from the start.
For the unique constraint part of the code there were 2 alternatives
(one was even used in an earlier patchset):
- Create a virtual/computed column for the Table that sets it to a fixed
value when deleted_at is NULL and to NULL in any other case, then use
this virtual/computed column together with project_id and resource
fields for a unique constraint.
This change was my preferred solution, but it requires bumping the
SQLAlchemy version to 1.3.11 where the feature was added as computed
columns [1] and in some DB engines requires a relatively new version,
for example for PostgreSQL is only supported on version 12 or later.
- Set deleted_at to a non NULL value by default on creation, and make
sure our code always uses the deleted field to filter values.
This is a bit nasty, but it has the advantage of not requiring new DB
fields, no DB data migrations for existing entries, and easy to
rollback once we figure out the underlying issue (although it may
require a DB data migration on rollback if we want to leave the
deleted_at entry at NULL).
The decision to add a new field was because one of the alternatives is
kind of hacky and the other one depends on specific DBMS versions and
requires a SQLAlchemy version bump.
[1]: https://docs.sqlalchemy.org/en/13/core/defaults.html#computed-generated-always-as-columns
Closes-Bug: #1484343
Change-Id: I9000c16c5b3e6f313f02256a10cb4bc0a26379f7