Protocol Documentation
configs.proto
ActionConfig
Action config defines the contents of actions.yaml configuration files.
ActionConfig.AssertionConfig
Field | Type | Label | Description |
name | string | | The name of the assertion. |
dataset | string | | The dataset (schema) of the assertion. |
project | string | | The Google Cloud project (database) of the assertion. |
dependency_targets | ActionConfig.Target | repeated | Targets of actions that this action depends on. |
filename | string | | Path to the source file from which the contents of the action are loaded. |
tags | string | repeated | A list of user-defined tags with which the action should be labeled. |
disabled | bool | | If set to true, this action will not be executed. However, the action can still be depended upon. Useful for temporarily turning off broken actions. |
description | string | | Description of the assertion. |
ActionConfig.ColumnDescriptor
Field | Type | Label | Description |
path | string | repeated | The identifier for the column, using multiple parts for nested records. |
description | string | | A text description of the column. |
bigquery_policy_tags | string | repeated | A list of BigQuery policy tags that will be applied to the column. |
ActionConfig.DeclarationConfig
Field | Type | Label | Description |
name | string | | The name of the declaration. |
dataset | string | | The dataset (schema) of the declaration. |
project | string | | The Google Cloud project (database) of the declaration. |
description | string | | Description of the declaration. |
columns | ActionConfig.ColumnDescriptor | repeated | Descriptions of columns within the declaration. |
ActionConfig.IncrementalTableConfig
Field | Type | Label | Description |
name | string | | The name of the incremental table. |
dataset | string | | The dataset (schema) of the incremental table. |
project | string | | The Google Cloud project (database) of the incremental table. |
dependency_targets | ActionConfig.Target | repeated | Targets of actions that this action depends on. |
filename | string | | Path to the source file from which the contents of the action are loaded. |
tags | string | repeated | A list of user-defined tags with which the action should be labeled. |
disabled | bool | | If set to true, this action will not be executed. However, the action can still be depended upon. Useful for temporarily turning off broken actions. |
pre_operations | string | repeated | Queries to run before query. This can be useful for granting permissions. |
post_operations | string | repeated | Queries to run after query. |
protected | bool | | If true, prevents the dataset from being rebuilt from scratch. |
unique_key | string | repeated | If set, the names of the columns that together act as the unique key. To enforce this, when updating the incremental table, Dataform merges rows that match on uniqueKey instead of appending them. |
description | string | | Description of the incremental table. |
columns | ActionConfig.ColumnDescriptor | repeated | Descriptions of columns within the table. |
partition_by | string | | The key by which to partition the table. Typically the name of a timestamp or date column. See https://cloud.google.com/dataform/docs/partitions-clusters. |
partition_expiration_days | int32 | | The number of days for which BigQuery stores data in each partition. The setting applies to all partitions in a table, but is calculated independently for each partition based on the partition time. |
require_partition_filter | bool | | Declares whether the partitioned table requires a WHERE clause predicate filter that filters the partitioning column. |
update_partition_filter | string | | SQL-based filter for when incremental updates are applied. |
cluster_by | string | repeated | The keys by which to cluster partitions. See https://cloud.google.com/dataform/docs/partitions-clusters. |
labels | ActionConfig.IncrementalTableConfig.LabelsEntry | repeated | Key-value pairs for BigQuery labels. If the label name contains special characters, e.g. hyphens, then quote its name, e.g. labels: { "label-name": "value" }. |
additional_options | ActionConfig.IncrementalTableConfig.AdditionalOptionsEntry | repeated | Key-value pairs of additional options to pass to the BigQuery API. Some options, for example partitionExpirationDays, have dedicated fields with type and validity checks; for such options, use the dedicated fields. String values must be enclosed in double quotes, for example: additionalOptions: {numeric_option: "5", string_option: '"string-value"'}. If the option name contains special characters, enclose the name in quotes, for example: additionalOptions: { "option-name": "value" }. |
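The unique_key description above contrasts merging with appending. The following is a minimal Python sketch of that difference only; it is not Dataform's implementation. The helper names (append_rows, merge_rows) and the sample rows are hypothetical, and the assumption that a later row replaces an earlier row with the same key is made here for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def append_rows(table: List[Row], new_rows: List[Row]) -> List[Row]:
    """Plain append: every incoming row is added, duplicates included."""
    return table + new_rows


def merge_rows(table: List[Row], new_rows: List[Row], unique_key: List[str]) -> List[Row]:
    """Merge on unique_key: an incoming row replaces the existing row with the same key.

    Assumption for illustration: the later row wins when keys collide.
    """
    merged: Dict[Tuple[object, ...], Row] = {}
    for row in table + new_rows:
        key = tuple(row[column] for column in unique_key)
        merged[key] = row  # later rows overwrite earlier ones with the same key
    return list(merged.values())


# Hypothetical sample data.
existing = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
incoming = [{"id": 2, "status": "shipped"}, {"id": 3, "status": "new"}]

appended = append_rows(existing, incoming)
merged = merge_rows(existing, incoming, unique_key=["id"])
print(len(appended), len(merged))  # prints: 4 3
```

Without a unique key the row with id 2 appears twice after the update; with the key set, the table holds one row per id.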
ActionConfig.IncrementalTableConfig.AdditionalOptionsEntry
ActionConfig.IncrementalTableConfig.LabelsEntry
ActionConfig.NotebookConfig
Field | Type | Label | Description |
name | string | | The name of the notebook. |
location | string | | The Google Cloud location of the notebook. |
project | string | | The Google Cloud project (database) of the notebook. |
dependency_targets | ActionConfig.Target | repeated | Targets of actions that this action depends on. |
filename | string | | Path to the source file from which the contents of the action are loaded. |
tags | string | repeated | A list of user-defined tags with which the action should be labeled. |
disabled | bool | | If set to true, this action will not be executed. However, the action can still be depended upon. Useful for temporarily turning off broken actions. |
description | string | | Description of the notebook. |
ActionConfig.OperationConfig
Field | Type | Label | Description |
name | string | | The name of the operation. |
dataset | string | | The dataset (schema) of the operation. |
project | string | | The Google Cloud project (database) of the operation. |
dependency_targets | ActionConfig.Target | repeated | Targets of actions that this action depends on. |
filename | string | | Path to the source file from which the contents of the action are loaded. |
tags | string | repeated | A list of user-defined tags with which the action should be labeled. |
disabled | bool | | If set to true, this action will not be executed. However, the action can still be depended upon. Useful for temporarily turning off broken actions. |
has_output | bool | | Declares that this action creates a dataset which should be referenceable as a dependency target, for example by using the ref function. |
description | string | | Description of the operation. |
columns | ActionConfig.ColumnDescriptor | repeated | Descriptions of columns within the operation. Can only be set if hasOutput is true. |
ActionConfig.TableConfig
Field | Type | Label | Description |
name | string | | The name of the table. |
dataset | string | | The dataset (schema) of the table. |
project | string | | The Google Cloud project (database) of the table. |
dependency_targets | ActionConfig.Target | repeated | Targets of actions that this action depends on. |
filename | string | | Path to the source file from which the contents of the action are loaded. |
tags | string | repeated | A list of user-defined tags with which the action should be labeled. |
disabled | bool | | If set to true, this action will not be executed. However, the action can still be depended upon. Useful for temporarily turning off broken actions. |
pre_operations | string | repeated | Queries to run before query. This can be useful for granting permissions. |
post_operations | string | repeated | Queries to run after query. |
description | string | | Description of the table. |
columns | ActionConfig.ColumnDescriptor | repeated | Descriptions of columns within the table. |
partition_by | string | | The key by which to partition the table. Typically the name of a timestamp or date column. See https://cloud.google.com/dataform/docs/partitions-clusters. |
partition_expiration_days | int32 | | The number of days for which BigQuery stores data in each partition. The setting applies to all partitions in a table, but is calculated independently for each partition based on the partition time. |
require_partition_filter | bool | | Declares whether the partitioned table requires a WHERE clause predicate filter that filters the partitioning column. |
cluster_by | string | repeated | The keys by which to cluster partitions. See https://cloud.google.com/dataform/docs/partitions-clusters. |
labels | ActionConfig.TableConfig.LabelsEntry | repeated | Key-value pairs for BigQuery labels. If the label name contains special characters, e.g. hyphens, then quote its name, e.g. labels: { "label-name": "value" }. |
additional_options | ActionConfig.TableConfig.AdditionalOptionsEntry | repeated | Key-value pairs of additional options to pass to the BigQuery API. Some options, for example partitionExpirationDays, have dedicated fields with type and validity checks; for such options, use the dedicated fields. String values must be enclosed in double quotes, for example: additionalOptions: {numeric_option: "5", string_option: '"string-value"'}. If the option name contains special characters, enclose the name in quotes, for example: additionalOptions: { "option-name": "value" }. |
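The quoting rule for additional_options string values is easier to see with a small sketch. The Python below assumes, purely for illustration, that each option value is inserted verbatim into an OPTIONS(...) clause. The render_options function is hypothetical and does not claim to reproduce how Dataform actually builds its SQL.

```python
from typing import Dict


def render_options(additional_options: Dict[str, str]) -> str:
    """Render key-value pairs as an OPTIONS(...) clause, inserting each value verbatim.

    Hypothetical helper: the verbatim insertion is an assumption made for illustration.
    """
    pairs = [f"{name}={value}" for name, value in additional_options.items()]
    return "OPTIONS(" + ", ".join(pairs) + ")"


options = {
    "numeric_option": "5",              # ends up as the bare number 5
    "string_option": '"string-value"',  # the inner double quotes survive as a string literal
}
rendered = render_options(options)
print(rendered)  # prints: OPTIONS(numeric_option=5, string_option="string-value")
```

Under this assumption, a string value written without the inner double quotes would reach the output as a bare token rather than a string literal, which is why the example in the table wraps it as '"string-value"'.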
ActionConfig.TableConfig.AdditionalOptionsEntry
ActionConfig.TableConfig.LabelsEntry
ActionConfig.Target
Target represents a unique action identifier.
Field | Type | Label | Description |
project | string | | The Google Cloud project (database) of the action. |
dataset | string | | The dataset (schema) of the action. For notebooks, this is the location. |
name | string | | The name of the action. |
ActionConfig.ViewConfig
Field | Type | Label | Description |
name | string | | The name of the view. |
dataset | string | | The dataset (schema) of the view. |
project | string | | The Google Cloud project (database) of the view. |
dependency_targets | ActionConfig.Target | repeated | Targets of actions that this action depends on. |
filename | string | | Path to the source file from which the contents of the action are loaded. |
tags | string | repeated | A list of user-defined tags with which the action should be labeled. |
disabled | bool | | If set to true, this action will not be executed. However, the action can still be depended upon. Useful for temporarily turning off broken actions. |
pre_operations | string | repeated | Queries to run before query. This can be useful for granting permissions. |
post_operations | string | repeated | Queries to run after query. |
materialized | bool | | Applies the materialized view optimization. See https://cloud.google.com/bigquery/docs/materialized-views-intro. |
description | string | | Description of the view. |
columns | ActionConfig.ColumnDescriptor | repeated | Descriptions of columns within the view. |
labels | ActionConfig.ViewConfig.LabelsEntry | repeated | Key-value pairs for BigQuery labels. If the label name contains special characters, e.g. hyphens, then quote its name, e.g. labels: { "label-name": "value" }. |
additional_options | ActionConfig.ViewConfig.AdditionalOptionsEntry | repeated | Key-value pairs of additional options to pass to the BigQuery API. Some options, for example partitionExpirationDays, have dedicated fields with type and validity checks; for such options, use the dedicated fields. String values must be enclosed in double quotes, for example: additionalOptions: {numeric_option: "5", string_option: '"string-value"'}. If the option name contains special characters, enclose the name in quotes, for example: additionalOptions: { "option-name": "value" }. |
ActionConfig.ViewConfig.AdditionalOptionsEntry
ActionConfig.ViewConfig.LabelsEntry
ActionConfigs
Action configs define the contents of actions.yaml configuration files.
TODO(ekrekr): consolidate these configuration options in the JS API.
NotebookRuntimeOptionsConfig
Field | Type | Label | Description |
output_bucket | string | | Storage bucket to output notebooks to after their execution. |
WorkflowSettings
Workflow Settings defines the contents of the workflow_settings.yaml configuration file.
Field | Type | Label | Description |
dataform_core_version | string | | The desired Dataform core version to compile against. |
default_project | string | | Required. The default Google Cloud project (database). |
default_dataset | string | | Required. The default dataset (schema). |
default_location | string | | Required. The default BigQuery location to use. For more information on BigQuery locations, see https://cloud.google.com/bigquery/docs/locations. |
default_assertion_dataset | string | | Required. The default dataset (schema) for assertions. |
vars | WorkflowSettings.VarsEntry | repeated | Optional. User-defined variables that are made available to project code during compilation. An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }. |
project_suffix | string | | Optional. The suffix to append to all Google Cloud project references. |
dataset_suffix | string | | Optional. The suffix to append to all dataset references. |
name_prefix | string | | Optional. The prefix to prepend to all action names. |
default_notebook_runtime_options | NotebookRuntimeOptionsConfig | | Optional. Default runtime options for Notebook actions. |
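The table above marks four fields as required. As a rough sketch, the Python below models a settings object as a plain dictionary and reports which required fields are absent. The camelCase key names are an assumption, chosen by analogy with the camelCase spellings this reference uses elsewhere (additionalOptions, hasOutput). The missing_required helper and all sample values are hypothetical.

```python
from typing import Dict, List

# Assumption: YAML keys are the camelCase forms of the proto field names above.
REQUIRED_FIELDS = [
    "defaultProject",
    "defaultDataset",
    "defaultLocation",
    "defaultAssertionDataset",
]


def missing_required(settings: Dict[str, object]) -> List[str]:
    """Return the required field names that are absent or empty in settings."""
    return [field for field in REQUIRED_FIELDS if not settings.get(field)]


# Hypothetical settings; the vars values reuse the example from the table.
settings = {
    "defaultProject": "my-project",
    "defaultDataset": "analytics",
    "defaultLocation": "US",
    "defaultAssertionDataset": "analytics_assertions",
    "vars": {"name": "wrench", "mass": "1.3kg", "count": "3"},
}

missing = missing_required(settings)
incomplete = missing_required({"defaultProject": "my-project"})
print(missing)     # prints: []
print(incomplete)  # prints: ['defaultDataset', 'defaultLocation', 'defaultAssertionDataset']
```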
WorkflowSettings.VarsEntry
Scalar Value Types
.proto Type | Notes | C++ | Java | Python | Go | C# | PHP | Ruby |
double | | double | double | float | float64 | double | float | Float |
float | | float | float | float | float32 | float | float | Float |
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 | long | integer/string | Bignum |
uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 | uint | integer | Bignum or Fixnum (as required) |
uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum or Fixnum (as required) |
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long | int64 | long | integer/string | Bignum |
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 | uint | integer | Bignum or Fixnum (as required) |
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum |
sfixed32 | Always four bytes. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
sfixed64 | Always eight bytes. | int64 | long | int/long | int64 | long | integer/string | Bignum |
bool | | bool | boolean | boolean | bool | bool | boolean | TrueClass/FalseClass |
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string | string | string | String (UTF-8) |
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte | ByteString | string | String (ASCII-8BIT) |
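The notes on int32, sint32 and fixed32 above follow from how variable-length integers are sized on the wire. The Python sketch below computes encoded sizes only, based on the base-128 varint and ZigZag schemes from the Protocol Buffers encoding rules. The function names are hypothetical, and this is not a full encoder.

```python
def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7  # each varint byte carries 7 payload bits
        size += 1
    return size


def int32_size(value: int) -> int:
    """int32: negative values are sign-extended to 64 bits before varint encoding."""
    return varint_size(value & 0xFFFFFFFFFFFFFFFF)


def sint32_size(value: int) -> int:
    """sint32: ZigZag maps small magnitudes, positive or negative, to small unsigned values."""
    zigzag = ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF
    return varint_size(zigzag)


FIXED32_SIZE = 4  # fixed32 and sfixed32 always occupy four bytes

print(int32_size(-1), sint32_size(-1))            # prints: 10 1
print(varint_size(2**28 - 1), varint_size(2**28))  # prints: 4 5
```

A negative int32 costs ten bytes while the same value as sint32 costs one, and a uint32 value of 2^28 or more needs five varint bytes, one more than the constant four bytes of fixed32.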