Schema Specification
This document describes how to define table schemas in DataCovey using JSON format.
Overview
A schema defines the structure of your table, including column names, data types, and privacy settings. The schema is specified using a JSON array where each element describes a column in your table.
JSON Schema Format
Each column in the schema is represented by a JSON object with the following properties:
name
: The name of the column (required)type
: The data type of the column (required)mode
: Whether the column is NULLABLE, REQUIRED, or REPEATED (optional, defaults to NULLABLE)description
: A description of the column (optional)defaultValueExpression
: The default value for the column (optional)private
: Whether the column should be private to the tenant (optional, defaults to false)
Example schema:
[
{
"name": "user_id",
"type": "STRING",
"mode": "REQUIRED",
"description": "Unique identifier for the user"
},
{
"name": "credit_card_token",
"type": "STRING",
"mode": "REQUIRED",
"private": true,
"description": "Tokenized credit card number - only visible to tenant"
},
{
"name": "transaction_amount",
"type": "NUMERIC",
"mode": "NULLABLE",
"defaultValueExpression": "0.00",
"description": "Amount of the transaction"
}
]
Column Names
Column names must follow these rules:
- Must start with a letter (a-z, A-Z) or underscore (_)
- Can contain letters, numbers, and underscores
- Maximum length of 300 characters
- Cannot use reserved prefixes
- Case-sensitive
Data Types
DataCovey supports the following data types:
STRING
: Variable-length character dataBYTES
: Variable-length binary dataINT64
: 64-bit integerFLOAT64
: Double precision floating pointNUMERIC
: Exact numeric values with 38 digits of precision and 9 decimal digitsBIGNUMERIC
: Exact numeric values with 76.76 digits of precision and 38 decimal digitsBOOL
: TRUE or FALSEDATE
: Calendar date (year, month, day)DATETIME
: Date and timeTIME
: Time of dayTIMESTAMP
: Absolute point in timeSTRUCT
: Container of ordered fieldsGEOGRAPHY
: Geography data typeJSON
: JSON data type
Column Modes
NULLABLE
: Column can contain NULL values (default)REQUIRED
: Column cannot contain NULL valuesREPEATED
: Column contains an array of values
Privacy Settings
DataCovey allows columns to be marked as private using the private
property:
{
"name": "sensitive_data",
"type": "STRING",
"private": true,
"description": "This column is only visible to the tenant who owns the data"
}
When a column is marked as private:
- Only users from the tenant who owns the data can see the actual values
- Other tenants can use the column in computations but cannot see the raw values
- The privacy setting cannot be changed after table creation
Schema Change Constraints
Once a table is created, there are certain limitations on schema modifications:
Allowed Changes
- Adding new columns
- Making a required column nullable
- Relaxing length limits on STRING or BYTES columns
- Adding or updating column descriptions
Prohibited Changes
- Renaming existing columns
- Changing column data types
- Making a nullable column required
- Removing columns
- Changing privacy settings of columns
- Modifying the precision or scale of NUMERIC columns
Best Practices
-
Plan Your Schema Carefully: Since many schema changes are restricted after table creation, it's important to plan your schema carefully.
-
Use Descriptions: Add clear descriptions to your columns to help other users understand the data.
-
Consider Privacy Requirements: Identify sensitive columns that should be marked as private during schema creation.
-
Default Values: Use default values when appropriate to handle missing data consistently.
-
Choose Appropriate Data Types: Select the most appropriate data type for your data to ensure optimal storage and query performance.