Overview

The Standardization node is a transformation node in a Weave Workflow that scales the values of selected numerical columns to their respective Z scores. It sits in the middle of a Workflow’s processing path between source nodes (Data Import) and terminal nodes (Result, Output) and emits the transformed data downstream.

The node applies to the columns selected in the Standardization popup. Each selected numerical column’s values are replaced by their Z scores in the output; unselected columns pass through unchanged.
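
For reference, the standard definition of a Z score is given below. Here x_i is a value in a selected column, \mu is that column's mean, and \sigma is its standard deviation. This reference does not state whether the node uses the population or the sample standard deviation, so treat the choice of \sigma as an assumption to confirm against actual output.

```latex
z_i = \frac{x_i - \mu}{\sigma}
```

A value equal to the column mean maps to 0, and a value one standard deviation above the mean maps to 1.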

Note: “Workflow” is the in-UI term for what some Weave documentation calls a pipeline. This reference uses “Workflow.”

When to use it

  • Standardizing numerical columns to a common scale expressed in standard deviations from the mean.
  • Preparing numerical features for downstream processing that expects zero-centered, unit-variance input.
  • Comparing columns with different units or magnitudes on a shared Z-score scale.
  • Standardizing one or more selected columns while leaving the remaining columns unchanged.

Configuration

Step                Description
Select              Opens the Standardization popup listing the columns in the upstream data.
Select All          Selects every column in the popup.
Column checkboxes   Choose the columns to standardize. Unselected columns pass through unchanged.
Apply               Commits the column selection.

Output

Output element         Description
Standardized columns   Selected numerical columns are replaced by their Z scores, producing positive and negative values centered on zero.
Pass-through columns   Columns not selected retain their original values.
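
The output described above can be approximated outside Weave with a short Python sketch. This is not the node's implementation: the `standardize` helper, the sample rows, and the column names are hypothetical, the population standard deviation is assumed, and the handling of a constant column (mapped to 0.0 here) is a choice made for the sketch, since this reference does not document that case.

```python
from statistics import mean, pstdev


def standardize(rows, selected):
    """Return new rows in which each selected column holds Z scores.

    Hypothetical helper, not a Weave API. Assumes the population standard
    deviation; the node may use the sample standard deviation instead.
    """
    # Copy every row so unselected columns pass through with their original values.
    out = [dict(row) for row in rows]
    for col in selected:
        values = [row[col] for row in rows]
        mu = mean(values)
        sigma = pstdev(values)
        for row in out:
            # Assumption of this sketch: a constant column (sigma == 0) maps to 0.0.
            row[col] = (row[col] - mu) / sigma if sigma else 0.0
    return out


# Illustrative data only.
rows = [
    {"id": "a", "height_cm": 150.0, "weight_kg": 50.0},
    {"id": "b", "height_cm": 160.0, "weight_kg": 60.0},
    {"id": "c", "height_cm": 170.0, "weight_kg": 70.0},
]

# Only height_cm is selected; id and weight_kg are left untouched.
result = standardize(rows, selected=["height_cm"])
```

With this sample data, `height_cm` becomes roughly -1.22, 0.0, and 1.22, while `id` and `weight_kg` keep their original values.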

Key behaviors

Standardization is a transformation node. Unlike Data Import (source) and Result / Output (terminal), it sits in the middle of the Workflow, receiving data upstream, transforming the selected columns, and passing the result downstream.

Selection-scoped. Only the columns selected in the Standardization popup are standardized; all other columns pass through unchanged.

Z-score output. Values are expressed as the number of standard deviations from the column’s mean, so the result depends on the distribution of the data in the selected column.
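
To illustrate the dependence on distribution, the sketch below computes the Z score of the same raw value in two columns that share a mean but differ in spread. The `z_score` helper and both columns are hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev


def z_score(value, column):
    # Population standard deviation is an assumption of this sketch.
    return (value - mean(column)) / pstdev(column)


narrow = [8.0, 9.0, 10.0, 11.0, 12.0]  # mean 10, small spread
wide = [0.0, 5.0, 10.0, 15.0, 20.0]    # mean 10, large spread

# The same raw value, 12.0, is scored against each column's own distribution.
z_narrow = z_score(12.0, narrow)
z_wide = z_score(12.0, wide)
```

The value 12.0 scores about 1.41 in the narrow column and about 0.28 in the wide one, so identical raw values can produce different Z scores depending on the column they belong to.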