- All Implemented Interfaces: java.lang.Runnable
public final class RdeStagingAction extends java.lang.Object implements java.lang.Runnable
Action that kicks off a Dataflow job to stage escrow deposit XML files on GCS for RDE/BRDA for all TLDs.
This task starts by asking PendingDepositChecker which deposits need to be generated. If there's nothing to deposit, we return 204 No Content; otherwise, we fire off a job and redirect to its status GUI. The task can also be run in manual operation, as described below.
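The decision described above can be sketched as follows. This is only an illustration: `StagingFlowSketch`, `PendingDeposit`, and the `launcher` callback are hypothetical names, not the real Nomulus classes, and the string results stand in for the actual HTTP response handling.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the "nothing pending vs. launch a job" decision. */
final class StagingFlowSketch {

  /** Hypothetical stand-in for one pending deposit. */
  record PendingDeposit(String tld, String mode, String watermark) {}

  /**
   * Returns "204 No Content" when nothing is pending. Otherwise hands the pending
   * deposits to the launcher (assumed to start the job and return the URL of its
   * status page) and returns a redirect marker pointing at that URL.
   */
  static String run(
      List<PendingDeposit> pending, Function<List<PendingDeposit>, String> launcher) {
    if (pending.isEmpty()) {
      return "204 No Content";
    }
    String statusUrl = launcher.apply(pending);
    return "redirect:" + statusUrl;
  }
}
```

The launcher is passed in as a function so the sketch stays independent of how the job is actually started.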
Dataflow
The Dataflow job finds the most recent history entry on or before watermark for each resource type and loads the embedded resource from it, which is then projected to watermark time to account for things like pending transfer.
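As a rough illustration of "most recent history entry on or before watermark", the selection can be written as a filter followed by a maximum. The `HistoryEntry` record and its fields are assumptions made for this sketch, and it operates on an in-memory list rather than a database query. The later projection to watermark time is not modeled here.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of selecting the latest history entry at or before a watermark. */
final class HistoryLookupSketch {

  /** Hypothetical history entry carrying a snapshot of the resource it describes. */
  record HistoryEntry(String resourceId, Instant modificationTime, String embeddedResource) {}

  /** Returns the newest entry for resourceId whose time is not after the watermark. */
  static Optional<HistoryEntry> latestAtOrBefore(
      List<HistoryEntry> entries, String resourceId, Instant watermark) {
    return entries.stream()
        .filter(e -> e.resourceId().equals(resourceId))
        // "On or before" the watermark: keep entries that are not strictly after it.
        .filter(e -> !e.modificationTime().isAfter(watermark))
        .max(Comparator.comparing(HistoryEntry::modificationTime));
  }
}
```

An empty result means the resource had no history at the watermark, so it would not appear in that deposit under this sketch's assumptions.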
Registrar entities, both active and inactive, are included in all deposits. They are not rewound to a point in time.
The XML deposit files generated by this job are humongous. A tiny XML report file is generated for each deposit, telling us how much of what it contains.
Once a deposit is successfully generated, the next action is enqueued. For RDE, an RdeUploadAction is enqueued, which will upload it via SFTP to the third-party escrow provider; for BRDA, a BrdaCopyAction is enqueued, which will copy it to a GCS bucket, from which it is rsynced to a third-party escrow provider.
To generate escrow deposits manually and locally, use the
To identify the reduce worker request for a deposit in App Engine's log viewer, you can use search text like
Valid model objects might not be valid against the RDE XML schema. A single invalid object will cause the whole deposit to fail. You need to check the logs, find out which entities are broken, and perform database surgery.
If a deposit fails, an error is emitted to the logs for each broken entity. It tells you the key and shows you its representation in lenient XML.
Failed deposits will be retried indefinitely. This is because RDE and BRDA each have a Cursor for each TLD. Even if the cursor lags for days, it'll catch up gradually on its own, once the data becomes valid.
The third-party escrow provider will validate each deposit we send them. They do both schema validation and reference checking.
This job does not perform reference checking. Administrators can do this locally with the ValidateEscrowDepositCommand command in the
Deposits are generated serially for a given (tld, mode) pair. A deposit is never started beyond the cursor. Once a deposit is completed, its cursor is rolled forward transactionally.
UpdateCursorsCommand commands to administer these cursors.
The deposit and report are encrypted using Ghostryde. Administrators can use the GhostrydeCommand command in the nomulus tool to view them.
Unencrypted XML fragments are stored temporarily between the map and reduce steps and between Dataflow transforms. The ghostryde encryption on the full archived deposits makes life a little more difficult for an attacker. But security ultimately depends on the bucket.
For the Dataflow job we do not employ a lock because it is difficult to span a lock across three subsequent transforms (save to GCS, roll forward cursor, enqueue next action). Instead, we get around the issue by saving the deposit to a unique folder named after the job name so there is no possibility of overwriting.
Deposits are generated serially for a given (watermark, mode) pair. A deposit is never started beyond the cursor. Once a deposit is completed, its cursor is rolled forward transactionally. Duplicate jobs may exist <= cursor. So a transaction will not bother changing the cursor if it's already been rolled forward.
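The "only roll forward" rule is what makes duplicate jobs harmless. A minimal sketch of that rule follows, with an in-memory compare-and-set standing in for the database transaction. `CursorRollSketch` is a hypothetical class, not the real Cursor API.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of an idempotent, forward-only cursor update. */
final class CursorRollSketch {

  private final AtomicReference<Instant> cursor;

  CursorRollSketch(Instant initial) {
    this.cursor = new AtomicReference<>(initial);
  }

  /**
   * Moves the cursor to newPosition only if that is strictly later than the current
   * value. Returns true if the cursor changed, false if it was already at or past
   * newPosition (for example because a duplicate job got there first).
   */
  boolean rollForward(Instant newPosition) {
    while (true) {
      Instant current = cursor.get();
      if (!newPosition.isAfter(current)) {
        return false; // Already rolled forward; leave the cursor alone.
      }
      // Retry if another thread changed the cursor between the read and the write.
      if (cursor.compareAndSet(current, newPosition)) {
        return true;
      }
    }
  }

  Instant position() {
    return cursor.get();
  }
}
```

Calling `rollForward` twice with the same position changes the cursor once and is a no-op the second time, which mirrors the behavior described for duplicate jobs.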
BrdaCopyAction is also part of the cursor transaction. This is necessary because the first thing the upload task does is check the staging cursor to verify it's been completed, so we can't enqueue before we roll. We also can't enqueue after the roll, because then if enqueuing fails, the upload might never be enqueued.
The filename of an escrow deposit is deterministic for a given (TLD, watermark, mode) triplet. Its generated contents are deterministic in all the ways that we care about. Its view of the database is strongly consistent in Cloud SQL automatically by nature of the initial query for the history entry running at READ_COMMITTED transaction isolation level.
This is also true in Datastore because:
- EppResource queries are strongly consistent thanks to
- EppResource entities are rewound to the point-in-time of the watermark
Here's what's not deterministic:
- Ordering of XML fragments. We don't care about this.
- Information about registrars. There's no point-in-time for these objects. So in order to guarantee referential correctness of your deposits, you must never delete a registrar entity.
The task can be run in manual operation by setting certain parameters. Rather than generating deposits which are currently outstanding, the task will generate specific deposits. The files will be stored in a subdirectory of the "manual" directory, to avoid overwriting regular deposit files. Cursors and revision numbers will not be updated, and the upload task will not be kicked off. The parameters are:
- manual: if present and true, manual operation is indicated
- directory: the subdirectory of "manual" into which the files should be placed
- mode: the mode(s) to generate: FULL for RDE deposits, THIN for BRDA deposits
- tld: the tld(s) for which deposits should be generated
- watermark: the date(s) for which deposits should be generated; dates should be start-of-day
- revision: optional; if not specified, the next available revision number will be used
The manual, directory, mode, tld and watermark parameters must be present for manual operation; they must all be absent for standard operation (except that manual can be present but set to false). The revision parameter is optional in manual operation, and must be absent for standard operation.
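The presence rules above can be checked mechanically. The following is a sketch of such a check over a plain parameter map; `ManualParamsSketch` and its method are hypothetical and do not reflect how the action actually parses its request.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the parameter rules for manual vs. standard operation. */
final class ManualParamsSketch {

  // Parameters that must accompany manual=true and must be absent otherwise.
  private static final List<String> MANUAL_ONLY = List.of("directory", "mode", "tld", "watermark");

  /**
   * Returns true if the parameters describe a valid manual run, false if they describe
   * a valid standard run, and throws IllegalArgumentException for any other combination.
   */
  static boolean isManual(Map<String, String> params) {
    // manual may be absent, or present but set to false, for standard operation.
    boolean manual = Boolean.parseBoolean(params.getOrDefault("manual", "false"));
    if (manual) {
      for (String name : MANUAL_ONLY) {
        if (!params.containsKey(name)) {
          throw new IllegalArgumentException("Manual operation requires parameter: " + name);
        }
      }
      return true; // revision is optional in manual operation.
    }
    for (String name : MANUAL_ONLY) {
      if (params.containsKey(name)) {
        throw new IllegalArgumentException("Standard operation forbids parameter: " + name);
      }
    }
    if (params.containsKey("revision")) {
      throw new IllegalArgumentException("Standard operation forbids parameter: revision");
    }
    return false;
  }
}
```

For example, a map containing only `manual=false` is accepted as a standard run, while `manual=true` without `directory` is rejected.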
Fields Modifier and Type Field Description
public static final java.lang.String PATH
- See Also:
- Constant Field Values