Model Inputs

Every model receives its input data as files. The platform tells the Docker container where each file lives through environment variables: each input you define in your .pd4castrrc.json becomes an INPUT_<KEY>_URL variable holding a URL, which your container fetches with a standard HTTP GET request. Inputs are either static (uploaded manually) or fetched (pulled automatically from an external data source).

How inputs work

When a model run starts, the platform makes each input file available at a unique URL. Your container reads these URLs from environment variables and downloads the data. The environment variable name is derived from the input’s key field, converted to uppercase. For example, an input with "key": "demand" becomes INPUT_DEMAND_URL.

Your model code fetches the file from that URL, parses it, and uses it as input data for the forecast.
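As a minimal sketch of this flow in Python, assuming the input is served as JSON and using only the standard library. The helper names input_env_var and load_json_input are illustrative and not part of any pd4castr SDK.

```python
import json
import os
import urllib.request


def input_env_var(key: str) -> str:
    """Build the variable name for an input key, e.g. "demand" -> INPUT_DEMAND_URL."""
    return f"INPUT_{key.upper()}_URL"


def load_json_input(key: str, timeout: float = 30.0):
    """Fetch one input with an HTTP GET and parse it as JSON."""
    # Raises KeyError if no URL was set for this key.
    url = os.environ[input_env_var(key)]
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Inside the container, load_json_input("demand") would read INPUT_DEMAND_URL and return the parsed payload. For an input served as csv or parquet, only the parsing step would change; the download step stays the same.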

Input configuration fields

Each entry in the inputs array in .pd4castrrc.json supports the following fields.

| Field | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | Identifier for this input. Becomes the environment variable name (INPUT_<KEY>_URL). |
| trigger | string | Yes | How new data triggers runs: WAIT_FOR_LATEST_FILE or USE_MOST_RECENT_FILE. |
| uploadFileFormat | string | No | Format of the uploaded file: json, csv, or parquet. Defaults to json. |
| targetFileFormat | string | No | Format served to the container. If it differs from uploadFileFormat, the platform converts the file automatically. |
| inputSource | string | No | UUID of the storage bucket source. Provided by the pd4castr team. Defaults to the standard shared source. |
| fetcher | object | No | Data fetcher configuration. Omit this field for static inputs. See Fetched inputs. |
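The required fields and allowed values listed above can be checked before publishing. The following Python sketch is one hypothetical way to do that; check_input_entry is an illustrative helper, not a pd4castr CLI feature, and it only covers rules stated in the table (it does not validate targetFileFormat, inputSource, or fetcher).

```python
ALLOWED_TRIGGERS = {"WAIT_FOR_LATEST_FILE", "USE_MOST_RECENT_FILE"}
ALLOWED_FORMATS = {"json", "csv", "parquet"}


def check_input_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one inputs entry (empty if none)."""
    problems = []
    key = entry.get("key")
    if not isinstance(key, str) or not key:
        problems.append("key is required and must be a non-empty string")
    if entry.get("trigger") not in ALLOWED_TRIGGERS:
        problems.append("trigger must be WAIT_FOR_LATEST_FILE or USE_MOST_RECENT_FILE")
    # uploadFileFormat is optional and defaults to json.
    upload_format = entry.get("uploadFileFormat", "json")
    if upload_format not in ALLOWED_FORMATS:
        problems.append(f"uploadFileFormat {upload_format!r} is not json, csv, or parquet")
    return problems
```

Running this over each element of the inputs array gives an early, local signal about typos in trigger names or file formats.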

Static inputs

Static inputs are files you upload manually. They’re ideal for reference data, lookup tables, or configuration files that don’t change with every model run.

Static input files live in your project’s test_input/ directory. When you run pd4castr publish, the CLI uploads these files to the platform via signed S3 URLs. You can also update them later through the API without republishing the entire model.

A minimal static input configuration looks like this:

{
  "key": "reference_prices",
  "trigger": "USE_MOST_RECENT_FILE",
  "uploadFileFormat": "csv",
  "targetFileFormat": "csv"
}

Fetched inputs (data fetchers)

Fetched inputs pull data automatically from external sources on a schedule. This is useful for production models that need fresh market data without manual intervention.

pd4castr currently supports one data source: AEMO MMS (the Australian Energy Market Operator's Market Management System). The fetcher runs its queries against the AEMO Postgres replica database.

To configure a fetched input, add a fetcher block to the input entry:

{
  "key": "predispatch_price",
  "trigger": "WAIT_FOR_LATEST_FILE",
  "uploadFileFormat": "json",
  "targetFileFormat": "json",
  "fetcher": {
    "type": "AEMO_MMS",
    "checkInterval": 300,
    "config": {
      "checkQuery": "queries/data-fetchers/predispatch-price-check.sql",
      "fetchQuery": "queries/data-fetchers/predispatch-price-fetch.sql"
    }
  }
}

The fetcher fields are:

| Field | Type | Description |
|---|---|---|
| type | string | The data source type. Currently only AEMO_MMS is supported. |
| checkInterval | number | How often (in seconds) to poll for new data. Minimum value is 60. |
| config.checkQuery | string | Path to a SQL file that checks whether new data is available. |
| config.fetchQuery | string | Path to a SQL file that retrieves the data when new data is detected. |

The platform runs the check query at the configured interval. When the results differ from the previous check, it executes the fetch query and writes the output to storage. This new data can then trigger an automatic model run.
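The check-then-fetch cycle described above can be pictured with a short Python sketch. This is not the platform's implementation: poll_once and fingerprint are hypothetical names, the queries are stood in for by plain callables, and comparing hashes of the check results is just one possible way to detect that they differ. The sketch also assumes the very first check counts as a change.

```python
import hashlib
import json
from typing import Callable, Optional


def fingerprint(rows: list) -> str:
    """Hash check-query results so two checks can be compared cheaply."""
    canonical = json.dumps(rows, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def poll_once(run_check: Callable[[], list],
              run_fetch: Callable[[], list],
              previous: Optional[str]):
    """Run one check; run the fetch only if the check results changed.

    Returns (new_fingerprint, fetched_rows_or_None).
    """
    current = fingerprint(run_check())
    if current == previous:
        return current, None  # nothing new, so the fetch query is skipped
    return current, run_fetch()
```

A scheduler would call poll_once every checkInterval seconds, carrying the returned fingerprint into the next call. This is why a cheap check query (for example, one returning only the latest timestamp) keeps polling inexpensive while the heavier fetch query runs only when needed.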

Trigger types and run modes

Each input has a trigger that controls how it interacts with automatic model runs:

  • WAIT_FOR_LATEST_FILE — The platform waits until this input has received new data before triggering a run. All inputs with this trigger type must have fresh data before the run starts.
  • USE_MOST_RECENT_FILE — The platform uses whatever data was last available for this input. It doesn’t block a run from starting.

For models with runMode set to AUTOMATIC, the platform monitors all inputs. A run is triggered only when every WAIT_FOR_LATEST_FILE input has received new data. USE_MOST_RECENT_FILE inputs are included in the run using their most recently available file.
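The rule for automatic runs can be expressed as a small predicate. The Python sketch below is illustrative only: ready_to_run and the fresh_keys set (the keys of inputs that have received new data since the last run) are hypothetical names. The text does not say what happens when a model has no WAIT_FOR_LATEST_FILE inputs; the sketch assumes no automatic run is triggered in that case.

```python
def ready_to_run(inputs: list[dict], fresh_keys: set[str]) -> bool:
    """Return True when every WAIT_FOR_LATEST_FILE input has new data."""
    waiting = [i["key"] for i in inputs if i["trigger"] == "WAIT_FOR_LATEST_FILE"]
    # Assumption: with no WAIT_FOR_LATEST_FILE inputs, nothing triggers a run.
    if not waiting:
        return False
    # USE_MOST_RECENT_FILE inputs are never consulted, so they cannot block a run.
    return all(key in fresh_keys for key in waiting)
```

Note that new data arriving only on a USE_MOST_RECENT_FILE input leaves the result unchanged, which matches the description that such inputs are simply included using their most recent file.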

Example configuration

Here’s a complete inputs array with both a static and a fetched input:

{
  "inputs": [
    {
      "key": "dispatch_price",
      "trigger": "WAIT_FOR_LATEST_FILE",
      "uploadFileFormat": "json",
      "targetFileFormat": "json",
      "fetcher": {
        "type": "AEMO_MMS",
        "checkInterval": 300,
        "config": {
          "checkQuery": "queries/data-fetchers/dispatch-price-check.sql",
          "fetchQuery": "queries/data-fetchers/dispatch-price-fetch.sql"
        }
      }
    },
    {
      "key": "regional_boundaries",
      "trigger": "USE_MOST_RECENT_FILE",
      "uploadFileFormat": "csv",
      "targetFileFormat": "csv"
    }
  ]
}

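In this example, the platform checks AEMO for new dispatch_price data every 5 minutes (checkInterval is 300 seconds) and fetches it whenever the check query's results change. regional_boundaries is a static CSV file uploaded during publish. Because dispatch_price uses WAIT_FOR_LATEST_FILE, the model won’t run automatically until fresh dispatch price data arrives. regional_boundaries uses USE_MOST_RECENT_FILE, so each run takes the latest available regional boundaries file without waiting for a new one.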

Next steps

  • Use pd4castr fetch to pull live data from your configured fetchers into your local test_input/ directory.
  • See the Configuration file reference for the full schema.