Class RdeStagingAction
- java.lang.Object
  - google.registry.rde.RdeStagingAction
All Implemented Interfaces: java.lang.Runnable
public final class RdeStagingAction extends java.lang.Object implements java.lang.Runnable
Action that kicks off a Dataflow job to stage escrow deposit XML files on GCS for RDE/BRDA for all TLDs.
Pending Deposits
This task starts by asking PendingDepositChecker which deposits need to be generated. If there's nothing to deposit, we return 204 No Content; otherwise, we fire off a job and redirect to its status GUI. The task can also be run in manual operation, as described below.
Dataflow
The Dataflow job finds the most recent history entry on or before the watermark for each resource type and loads the embedded resource from it, which is then projected to watermark time to account for things like pending transfer.
Only ContactResources and HostResources that are referenced by an included DomainBase will be included in the corresponding pending deposit.
Registrar entities, both active and inactive, are included in all deposits. They are not rewound point-in-time.
Afterward
The XML deposit files generated by this job are humongous. A tiny XML report file is generated for each deposit, telling us how much of what it contains.
Once a deposit is successfully generated, for RDE an RdeUploadAction is enqueued which will upload it via SFTP to the third-party escrow provider; for BRDA a BrdaCopyAction is enqueued which will copy it to a GCS bucket, from which it is rsynced to a third-party escrow provider.
To generate escrow deposits manually and locally, use the nomulus tool command GenerateEscrowDepositCommand.
Logging
To identify the reduce worker request for a deposit in App Engine's log viewer, you can use search text like tld=soy, watermark=2015-01-01, and mode=FULL.
Error Handling
Valid model objects might not be valid against the RDE XML schema. A single invalid object will cause the whole deposit to fail. You need to check the logs, find out which entities are broken, and perform database surgery.
If a deposit fails, an error is emitted to the logs for each broken entity. It tells you the key and shows you its representation in lenient XML.
Failed deposits will be retried indefinitely. This is because RDE and BRDA each have a Cursor for each TLD. Even if the cursor lags for days, it'll catch up gradually on its own, once the data becomes valid.
The third-party escrow provider will validate each deposit we send them. They do both schema validation and reference checking.
This job does not perform reference checking. Administrators can do this locally with the ValidateEscrowDepositCommand command in the nomulus tool.
Cursors
Deposits are generated serially for a given (tld, mode) pair. A deposit is never started beyond the cursor. Once a deposit is completed, its cursor is rolled forward transactionally.
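The serial, cursor-gated behavior described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the real Cursor entity or the actual transaction code: the class CursorSketch, its methods, and the one-day interval between watermarks are all assumptions made for the example.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical in-memory stand-in for the cursor of one (tld, mode) pair. */
final class CursorSketch {
  // The next watermark that still needs a deposit.
  private LocalDate cursor;

  CursorSketch(LocalDate start) {
    this.cursor = start;
  }

  LocalDate get() {
    return cursor;
  }

  /**
   * Rolls the cursor from watermark to next. If the cursor is no longer at
   * watermark (for example, a duplicate job already rolled it), nothing
   * changes. Returns true only if the cursor was actually moved.
   */
  boolean rollForward(LocalDate watermark, LocalDate next) {
    if (!cursor.equals(watermark)) {
      return false;
    }
    cursor = next;
    return true;
  }

  /**
   * Generates deposits one watermark at a time until the cursor passes today
   * or a deposit fails. A failed deposit leaves the cursor in place, so a
   * later run retries the same watermark. The daily step is an assumption.
   */
  List<LocalDate> catchUp(LocalDate today, Predicate<LocalDate> generateDeposit) {
    List<LocalDate> completed = new ArrayList<>();
    while (!cursor.isAfter(today)) {
      LocalDate watermark = cursor;
      if (!generateDeposit.test(watermark)) {
        break; // Leave the cursor where it is; the deposit will be retried.
      }
      rollForward(watermark, watermark.plusDays(1));
      completed.add(watermark);
    }
    return completed;
  }
}
```

In this sketch a lagging cursor catches up one watermark per iteration, and a repeated rollForward call for an already-completed watermark has no effect.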
The mode determines which cursor is used. Cursor.CursorType.RDE_STAGING is used for thick deposits and Cursor.CursorType.BRDA is used for thin deposits.
Use the ListCursorsCommand and UpdateCursorsCommand commands to administer these cursors.
Security
The deposit and report are encrypted using Ghostryde. Administrators can use the GhostrydeCommand command in the nomulus tool to view them.
Unencrypted XML fragments are stored temporarily between the map and reduce steps and between Dataflow transforms. The ghostryde encryption on the full archived deposits makes life a little more difficult for an attacker. But security ultimately depends on the bucket.
Idempotency
For the Dataflow job we do not employ a lock because it is difficult to span a lock across three subsequent transforms (save to GCS, roll forward cursor, enqueue next action). Instead, we get around the issue by saving the deposit to a unique folder named after the job name so there is no possibility of overwriting.
Deposits are generated serially for a given (watermark, mode) pair. A deposit is never started beyond the cursor. Once a deposit is completed, its cursor is rolled forward transactionally. Duplicate jobs may exist <=cursor. So a transaction will not bother changing the cursor if it's already been rolled forward.
Enqueuing RdeUploadAction or BrdaCopyAction is also part of the cursor transaction. This is necessary because the first thing the upload task does is check the staging cursor to verify it's been completed, so we can't enqueue before we roll. We also can't enqueue after the roll, because then if enqueuing fails, the upload might never be enqueued.
Determinism
The filename of an escrow deposit is deterministic for a given (TLD, watermark, mode) triplet. Its generated contents are deterministic in all the ways that we care about. Its view of the database is strongly consistent in Cloud SQL automatically by nature of the initial query for the history entry running at READ_COMMITTED transaction isolation level.
This is also true in Datastore because:
- EppResource queries are strongly consistent thanks to EppResourceIndex
- EppResource entities are rewound to the point-in-time of the watermark
Here's what's not deterministic:
- Ordering of XML fragments. We don't care about this.
- Information about registrars. There's no point-in-time for these objects. So in order to guarantee referential correctness of your deposits, you must never delete a registrar entity.
Manual Operation
The task can be run in manual operation by setting certain parameters. Rather than generating deposits which are currently outstanding, the task will generate specific deposits. The files will be stored in a subdirectory of the "manual" directory, to avoid overwriting regular deposit files. Cursors and revision numbers will not be updated, and the upload task will not be kicked off. The parameters are:
- manual: if present and true, manual operation is indicated
- directory: the subdirectory of "manual" into which the files should be placed
- mode: the mode(s) to generate: FULL for RDE deposits, THIN for BRDA deposits
- tld: the tld(s) for which deposits should be generated
- watermark: the date(s) for which deposits should be generated; dates should be start-of-day
- revision: optional; if not specified, the next available revision number will be used
The manual, directory, mode, tld and watermark parameters must be present for manual operation; they must all be absent for standard operation (except that manual can be present but set to false). The revision parameter is optional in manual operation, and must be absent for standard operation.
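The presence rules above can be expressed as a small check. The following is only a sketch under stated assumptions: the class ManualParamsSketch, the validate method, and the representation of request parameters as a Map<String, String> are hypothetical and are not taken from the action's real implementation.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical validator for the manual-operation parameter rules. */
final class ManualParamsSketch {
  // Parameters that must accompany manual=true and must be absent otherwise.
  private static final String[] MANUAL_ONLY = {"directory", "mode", "tld", "watermark"};

  /** Returns an error message for an invalid combination, or empty if valid. */
  static Optional<String> validate(Map<String, String> params) {
    // An absent "manual" parameter is treated the same as manual=false.
    boolean manual = Boolean.parseBoolean(params.get("manual"));
    if (manual) {
      for (String name : MANUAL_ONLY) {
        if (!params.containsKey(name)) {
          return Optional.of("Manual operation requires parameter: " + name);
        }
      }
      // "revision" is optional in manual operation, so it is not checked.
      return Optional.empty();
    }
    for (String name : MANUAL_ONLY) {
      if (params.containsKey(name)) {
        return Optional.of("Standard operation forbids parameter: " + name);
      }
    }
    if (params.containsKey("revision")) {
      return Optional.of("Standard operation forbids parameter: revision");
    }
    return Optional.empty();
  }
}
```

For example, a request carrying only manual=false passes this check, while a request carrying watermark without manual=true is rejected.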
Field Summary
Modifier and Type: static java.lang.String
Field: PATH
Method Summary
Modifier and Type: void
Method: run()
Field Detail
PATH
public static final java.lang.String PATH
See Also: Constant Field Values