Bidirectional Data Transformation

Generic transformation engine that chains primitives (decode, decompress, parse, patch, serialize, compress, encode) declared in YAML. Domain-agnostic: knows nothing about SAS, Viya, or any specific platform.

Overview

The transform module provides a declarative pipeline for round-trip data transformations. The typical use case is decoding a nested binary blob (e.g. base64 + zlib + JSON envelope + XML payload), patching the inner content, and re-encoding back to the exact same format so the result can be re-uploaded to the source system.

from kstlib.transform import TransformChain, TransformChainConfig, PrimitiveConfig, PatchConfig

chain = TransformChain(
    TransformChainConfig(
        name="decode_report",
        forward=(
            PrimitiveConfig(name="base64"),
            PrimitiveConfig(name="zlib", options={"skip_bytes": 3}),
            PrimitiveConfig(name="json", options={"extract": "transferableContent.content"}),
        ),
        backward=(
            PrimitiveConfig(name="json", options={"wrap": "transferableContent.content"}),
            PrimitiveConfig(name="zlib", options={"prepend_bytes": "4d1504"}),
            PrimitiveConfig(name="base64"),
        ),
        patch=PatchConfig(replace={"old-host.example.com": "new-host.example.com"}),
    )
)

# Full round-trip: forward + patch + backward
patched_blob = chain.transform(blob_b64_string)

Benefits:

  • Single source of truth: chain definitions live in YAML, not scattered in code

  • Lossless round-trip: JSON envelopes are preserved during patching

  • Composable: presets can be reused and overridden with custom patches

  • Hardened: zlib bomb protection, XML security, callable whitelist, size limits

Primitives

The transform engine ships with 5 built-in primitives. Each one is bidirectional (forward + backward) and can be chained in any order.

Primitive   Forward               Backward              Common options
base64      str -> bytes          bytes -> str          strict, strip_prefix, prefix
bytes       bytes -> str          str -> bytes          encoding (default utf-8)
zlib        compressed -> bytes   bytes -> compressed   skip_bytes, prepend_bytes, level
json        str -> dict           dict -> str           extract, wrap (dot-notation)
xml         str -> Element        Element -> str        encoding

zlib special options

The zlib primitive supports two options for handling SAS-style proprietary headers that precede the actual zlib stream:

# Forward: skip the first 3 bytes (proprietary header)
- zlib:
    skip_bytes: 3

# Backward: re-prepend the same 3 bytes (hex-encoded)
- zlib:
    prepend_bytes: "4d1504"   # M\x15\x04

skip_bytes cannot be auto-reversed, so a chain that uses it must declare an explicit backward: block with prepend_bytes.
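
As a rough illustration of why the backward path needs prepend_bytes, the sketch below reproduces both options with the standard library only. The helper names decompress_skipping and compress_prepending are hypothetical and are not part of kstlib; the header value is the one used in the example above.

```python
import zlib

HEADER_HEX = "4d1504"  # same value as prepend_bytes in the example above


def decompress_skipping(blob: bytes, skip_bytes: int) -> bytes:
    """Drop the leading header bytes, then inflate the remaining zlib stream."""
    return zlib.decompress(blob[skip_bytes:])


def compress_prepending(payload: bytes, prepend_hex: str) -> bytes:
    """Deflate the payload and put the hex-encoded header back in front."""
    return bytes.fromhex(prepend_hex) + zlib.compress(payload)


# Build a blob shaped like the wire format: 3 header bytes + zlib stream.
original = bytes.fromhex(HEADER_HEX) + zlib.compress(b'{"k": "v"}')

payload = decompress_skipping(original, skip_bytes=3)
rebuilt = compress_prepending(payload, HEADER_HEX)
```

Skipping bytes discards them, so nothing in the decoded payload says what the header was. That is why the value has to be declared again on the backward side. Recompression is not guaranteed to reproduce the original compressed bytes exactly; the sketch only relies on the header and the decompressed payload surviving.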

json extract/wrap for envelope-style payloads

The json primitive lets you drill into a nested envelope on the forward path and reconstruct it on the backward path:

forward:
  - base64
  - zlib
  - json:
      extract: "transferableContent.content"   # Drill into the envelope

backward:
  - json:
      wrap: "transferableContent.content"      # Restore the envelope
  - zlib
  - base64

The original envelope is stored internally during forward execution (in _ChainContext.json_envelopes) and restored on backward, ensuring the round-trip is lossless even when only the inner payload was patched.
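
The extract/wrap pairing can be pictured with a minimal sketch, assuming plain dict envelopes and a dot-separated path. The functions extract and wrap below are hypothetical stand-ins written for this illustration; they do not show how kstlib stores envelopes internally.

```python
import copy


def extract(document: dict, path: str):
    """Return (inner_value, envelope) for a dot-notation path."""
    node = document
    for key in path.split("."):
        node = node[key]
    return node, document


def wrap(envelope: dict, path: str, value):
    """Return a copy of the envelope with value stored at path."""
    result = copy.deepcopy(envelope)
    *parents, leaf = path.split(".")
    node = result
    for key in parents:
        node = node[key]
    node[leaf] = value
    return result


doc = {"version": 2, "transferableContent": {"id": 7, "content": "host=old"}}

inner, envelope = extract(doc, "transferableContent.content")
patched = inner.replace("old", "new")  # only the inner payload is touched
restored = wrap(envelope, "transferableContent.content", patched)
```

Every sibling field of the envelope (version, id) comes back untouched, which is the property the round-trip depends on.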

Configuration

In kstlib.conf.yml

Define chains in your main configuration file under the transforms: key:

transforms:
  security:
    allowed_callable_modules:
      - myproject.transforms

  chains:
    sas_report:
      forward:
        - base64
        - zlib:
            skip_bytes: 3
        - json:
            extract: "transferableContent.content"
      backward:
        - json:
            wrap: "transferableContent.content"
        - zlib:
            prepend_bytes: "4d1504"
        - base64

    patch_report:
      preset: sas_report      # inherit forward + backward from sas_report
      patch:
        scope: blob           # blob | outer | all (default: blob)
        replace:
          "https://old-host.example.com/": "https://new-host.example.com/"
          'library="CASUSER"': 'library="PUBLIC"'

Preset inheritance

A chain can inherit forward + backward from another chain via preset:. The child overrides only patch (or composed_patch):

chains:
  sas_report:
    forward: [...]
    backward: [...]

  patch_report:
    preset: sas_report      # forward + backward inherited
    patch:
      scope: blob
      replace:
        "old": "new"

Chained presets are not supported: a preset cannot itself reference another preset. Validation enforces this at config-load time.
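
A minimal sketch of the inheritance rule, assuming chains are held as plain dicts shaped like the YAML above. resolve_chain is a hypothetical helper for this illustration, and the ValueError stands in for whatever error the real validation raises.

```python
def resolve_chain(name: str, chains: dict) -> dict:
    """Resolve one level of preset inheritance for a named chain."""
    chain = chains[name]
    preset_name = chain.get("preset")
    if preset_name is None:
        return dict(chain)  # standalone chain, nothing to inherit

    parent = chains[preset_name]
    if parent.get("preset") is not None:
        # A preset that points at another preset is rejected outright.
        raise ValueError(f"chained presets are not supported: {name} -> {preset_name}")

    # forward + backward come from the parent; the child keeps its own patch.
    resolved = {"forward": parent.get("forward"), "backward": parent.get("backward")}
    for key in ("patch", "composed_patch"):
        if key in chain:
            resolved[key] = chain[key]
    return resolved


chains = {
    "sas_report": {"forward": ["base64", "zlib"], "backward": ["zlib", "base64"]},
    "patch_report": {"preset": "sas_report", "patch": {"replace": {"old": "new"}}},
}
resolved = resolve_chain("patch_report", chains)
```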

Patches: replace vs callable

A PatchConfig takes either replace: or callable:, never both:

# Option 1: simple string substitution
patch:
  scope: blob
  replace:
    "old-value": "new-value"
    'library="CASUSER"': 'library="PUBLIC"'

# Option 2: external Python callable
patch:
  scope: blob
  callable: myproject.transforms:patch_function
  args:
    target_host: "{{target_host}}"   # Resolved from pipeline context
    cas_mapping: "{{cas_mapping}}"

The scope: field is one of blob (default), outer, or all. See Transform for the full scope semantics table and replace_outer_uris helper.

Note

Deprecated alias: the field name mapping: is still accepted as a deprecated alias for replace: and emits a DeprecationWarning when used. Migrate existing configs to replace:.
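
The alias behavior described in the note can be sketched with a small dataclass. PatchSketch is a hypothetical class written for this illustration; it is not the library's PatchConfig, and the ValueError is an assumption about how the conflict is reported.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatchSketch:
    replace: Optional[dict] = None
    mapping: Optional[dict] = None  # deprecated alias for replace

    def __post_init__(self) -> None:
        if self.mapping is None:
            return
        if self.replace is not None:
            raise ValueError("set either replace or mapping, not both")
        # stacklevel=3 skips __post_init__ and the generated __init__,
        # so the warning points at the caller's line.
        warnings.warn(
            "mapping is deprecated; use replace",
            DeprecationWarning,
            stacklevel=3,
        )
        self.replace = self.mapping


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = PatchSketch(mapping={"old": "new"})
```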

The callable target follows the module.path:function_name convention. Allowed callable modules must be whitelisted in transforms.security.allowed_callable_modules.

{{variable}} references in args are resolved against the chain’s context dict, allowing dynamic values to be injected from a pipeline step or any caller.
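
One way to picture the {{variable}} lookup is the sketch below. resolve_args is a hypothetical helper; the details it chooses (whitespace tolerated inside the braces, a value that is exactly one placeholder keeps the context value's type, a missing variable raises KeyError) are assumptions made for the example, not documented kstlib behavior.

```python
import re

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve_args(args: dict, context: dict) -> dict:
    """Substitute {{name}} placeholders in string values from context."""
    resolved = {}
    for key, value in args.items():
        if not isinstance(value, str):
            resolved[key] = value  # non-string values pass through
            continue
        whole = _PLACEHOLDER.fullmatch(value)
        if whole:
            # Assumption: a lone placeholder keeps the original type.
            resolved[key] = context[whole.group(1)]
        else:
            resolved[key] = _PLACEHOLDER.sub(
                lambda match: str(context[match.group(1)]), value
            )
    return resolved


context = {"target_host": "new-host.example.com", "cas_mapping": {"CASUSER": "PUBLIC"}}
args = {
    "target_host": "{{target_host}}",
    "cas_mapping": "{{cas_mapping}}",
    "url": "https://{{ target_host }}/reports",
}
resolved_args = resolve_args(args, context)
```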

Composed patches: surgical multi-object workflows

When a transformation needs to apply different patches to different objects (e.g. some reports need a specific caslib while others need the generic one), use composed_patch instead of an inline patch:

chains:
  remap_host:
    patch:
      replace:
        "https://source.res.private/": "https://target.res.private/"

  remap_caslib_global:
    patch:
      replace:
        'library="CASUSER"': 'library="PROD_GLOBAL_LIB"'

  remap_caslib_r220:
    patch:
      replace:
        'library="CASUSER"': 'library="R220_DEDICATED_LIB"'

  patch_report_composed:
    preset: sas_report

    global_patches:
      - remap_host             # Applied to every object

    targeted_patches:
      - filter:
          content_type: report
          name: "R220_*"
        patches:
          - remap_caslib_r220

      - filter:
          content_type: report
          name: "*"            # Fallback for other reports
        patches:
          - remap_caslib_global

Warning

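Cascade is by declaration order, NOT by filter specificity. Unlike CSS, where a more specific selector takes precedence over declaration order, only the position of an entry matters here. Order your targeted_patches from most general to most specific, because the last applied patch wins on conflict.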

A “patch-only” chain (one with only patch and no forward/preset) is allowed and is designed to be referenced from a composed_patch. While it can be instantiated and invoked directly, its primary use case is as a named patch target for composed patch orchestration.

See Transform for the full decision matrix and runtime behavior.
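
The selection step can be sketched as follows, assuming filters and entries are plain dicts shaped like the YAML above. filter_matches and select_patches are hypothetical helpers; the sketch uses fnmatch.fnmatchcase so matching is case-sensitive on every platform, which is an assumption about the library's behavior.

```python
from fnmatch import fnmatchcase


def filter_matches(flt: dict, metadata: dict) -> bool:
    """All filter fields are ANDed; "*" means any value."""
    content_type = flt.get("content_type", "*")
    name_pattern = flt.get("name", "*")
    type_ok = content_type == "*" or content_type == metadata.get("content_type")
    name_ok = fnmatchcase(str(metadata.get("name", "")), name_pattern)
    return type_ok and name_ok


def select_patches(global_patches, targeted_patches, metadata) -> list:
    """Return patch chain names in the order they would be applied."""
    selected = list(global_patches)  # globals always run first
    for entry in targeted_patches:   # then targeted entries, in declaration order
        if filter_matches(entry["filter"], metadata):
            selected.extend(entry["patches"])
    return selected


targeted = [
    {"filter": {"content_type": "report", "name": "R220_*"}, "patches": ["remap_caslib_r220"]},
    {"filter": {"content_type": "report", "name": "*"}, "patches": ["remap_caslib_global"]},
]
order = select_patches(["remap_host"], targeted, {"content_type": "report", "name": "R220_ASTRO"})
```

For an R220_* report both targeted entries match, so the resulting list contains both patch names in declaration order. No entry is dropped for being less specific.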

Python API

Quick Functions

from kstlib.transform import transform, load_transform_config

# Convenience function: loads config from kstlib.conf.yml and applies
result = transform(blob_b64, "patch_report")

# Pass metadata for composed_patch filter matching
result = transform(
    blob_b64,
    "patch_report_composed",
    metadata={"content_type": "report", "name": "R220_ASTRO"},
)

Client Instance

from kstlib.transform import TransformChain, load_transform_config

config = load_transform_config()

# Build a chain from a named config entry (resolves preset inheritance)
chain = TransformChain.from_config("patch_report", config)

# Forward only
decoded = chain.forward(blob_b64)

# Backward only (must be called after forward to restore envelopes)
re_encoded = chain.backward(decoded)

# Full round-trip
patched = chain.transform(blob_b64)

# With composed_patch metadata
patched = chain.transform(
    blob_b64,
    metadata={"content_type": "report", "name": "R220_ASTRO"},
)

Programmatic Construction

from kstlib.transform import (
    TransformChain,
    TransformChainConfig,
    PrimitiveConfig,
    PatchConfig,
)

chain = TransformChain(
    TransformChainConfig(
        name="my_chain",
        forward=(
            PrimitiveConfig(name="base64"),
            PrimitiveConfig(name="json"),
        ),
        patch=PatchConfig(replace={"foo": "bar"}),
    )
)

result = chain.transform(blob_b64_string)

Security and Hard Limits

The transform engine applies defense in depth against malformed or malicious input.

Callable whitelist

External callables can only be invoked if their module is listed in transforms.security.allowed_callable_modules:

transforms:
  security:
    allowed_callable_modules:
      - myproject.transforms
      - myproject.viya.patches

A callable target whose module is not in the whitelist raises TransformConfigError at config-load time, before any transformation runs.
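
A whitelist check of this kind might look like the sketch below. split_target and is_allowed are hypothetical helpers, and the prefix rule (exact module or a dotted submodule of an allowed entry) is an assumption about how "module prefixes" are interpreted.

```python
def split_target(target: str) -> tuple:
    """Split "module.path:function_name" into its two parts."""
    module, sep, func = target.partition(":")
    if not sep or not module or not func:
        raise ValueError(f"expected module.path:function_name, got {target!r}")
    return module, func


def is_allowed(target: str, allowed_modules: frozenset) -> bool:
    """True if the target's module is an allowed module or one of its submodules."""
    module, _ = split_target(target)
    return any(
        module == allowed or module.startswith(allowed + ".")
        for allowed in allowed_modules
    )


allowed = frozenset({"myproject.transforms", "myproject.viya.patches"})
```

Comparing against allowed + "." rather than the bare prefix keeps a look-alike module such as myproject.transforms_evil from slipping through. An empty whitelist rejects everything, which matches the fail-closed behavior described for direct construction.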

Size limits

Limit                              Default   Hard Max
Input data size                    100 MB    100 MB
JSON payload size                  50 MB     50 MB
XML payload size                   50 MB     50 MB
Decompressed size                  200 MB    200 MB
Decompression ratio                100x      100x
Mapping entries per patch          100       100
Named chains                       50        50
Global patches per composition     10        10
Targeted patches per composition   50        50
Patches per targeted entry         10        10

Zlib bomb protection

The zlib_decompress primitive enforces both an absolute decompressed size limit and a maximum decompression ratio. A zlib stream that expands beyond either threshold raises DecompressError immediately.
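
The two thresholds can be enforced without inflating the whole stream, as in this sketch built on zlib.decompressobj. bounded_decompress and DecompressLimitError are hypothetical names; the defaults mirror the table above, but the code is not the library's implementation.

```python
import zlib


class DecompressLimitError(Exception):
    """Raised when a stream expands past the configured limits (hypothetical)."""


def bounded_decompress(
    data: bytes,
    max_size: int = 200 * 1024 * 1024,
    max_ratio: int = 100,
) -> bytes:
    """Inflate data, refusing output beyond max_size or max_ratio x input size."""
    limit = min(max_size, len(data) * max_ratio)
    inflater = zlib.decompressobj()
    # Request at most one byte more than the limit: enough to detect an
    # overrun without ever materializing the full expanded output.
    out = inflater.decompress(data, limit + 1)
    if len(out) > limit:
        raise DecompressLimitError(f"output exceeds {limit} bytes")
    if not inflater.eof:
        raise DecompressLimitError("truncated or incomplete zlib stream")
    return out


small = bounded_decompress(zlib.compress(b"Hello"))
```

A ratio cap computed from the input length is strict for very small inputs; a real implementation may combine it with a minimum allowance.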

XML security

The xml_parse primitive uses defusedxml if available (recommended). DOCTYPE declarations are rejected by default to prevent XXE attacks and billion-laughs expansion.
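
For readers without defusedxml at hand, the DOCTYPE rule can be illustrated with the standard library's expat bindings. parse_without_doctype and DoctypeForbidden are hypothetical names, and this is only a sketch of the idea; defusedxml remains the recommended route.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when the document carries a DOCTYPE declaration (hypothetical)."""


def parse_without_doctype(xml_text: str) -> ET.Element:
    """Parse XML, rejecting any document that declares a DOCTYPE."""
    checker = expat.ParserCreate()

    def _reject(name, system_id, public_id, has_internal_subset):
        # Fires at the start of the DOCTYPE, before any entity is declared.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker.StartDoctypeDeclHandler = _reject
    checker.Parse(xml_text, True)  # pre-scan; the handler's exception propagates

    return ET.fromstring(xml_text)  # safe to build the tree now


root = parse_without_doctype("<report library='CASUSER'/>")
```

The document is scanned twice here, which keeps the example short at the cost of some extra work.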

Integration with kstlib.pipeline

The transform engine integrates with kstlib.pipeline via the CallableStep. A pipeline step can invoke kstlib.transform.transform with the loaded data as the first argument and the chain name as the second, matching the transform(data, chain_name, ...) signature:

pipelines:
  patch-and-upload:
    steps:
      - name: patch
        type: callable
        callable: kstlib.transform:transform
        args:
          - "{{loaded_blob}}"
          - "patch_report"

      - name: upload
        type: shell
        command: "kstlib rapi upload --body @result.json"

API Reference

Chain

class kstlib.transform.TransformChain(config, *, context=None, transform_config=None, allowed_modules=None)

Bases: object

Execute a chain of transform primitives with optional patching.

Warning

TransformChain instances are not reentrant. Do not call transform() concurrently from multiple threads on the same instance. Each call resets internal state (_chain_context). Create one instance per thread if concurrent execution is needed.

Parameters:
  • config (TransformChainConfig) – Resolved chain configuration (no preset references).

  • context (Mapping[str, Any] | None) – Optional external context for {{variable}} resolution.

Examples

>>> from kstlib.transform import TransformChain
>>> from kstlib.transform.config import PrimitiveConfig, TransformChainConfig
>>> chain = TransformChain(TransformChainConfig(
...     name="test",
...     forward=(PrimitiveConfig(name="base64"),),
... ))
>>> chain.forward("SGVsbG8=")
b'Hello'

__init__(self, config: TransformChainConfig, *, context: Mapping[str, Any] | None = None, transform_config: TransformConfig | None = None, allowed_modules: frozenset[str] | None = None) -> None

Initialize TransformChain.

Parameters:
  • config (TransformChainConfig) – Resolved chain configuration.

  • context (Mapping[str, Any] | None) – External context for variable resolution.

  • transform_config (TransformConfig | None) – Top-level config used to resolve chain references in composed_patch. Required when config.composed_patch is set.

  • allowed_modules (frozenset[str] | None) – Whitelist of allowed callable module prefixes. When None (direct construction without from_config), any callable patch is rejected (fail-closed). Pass an explicit frozenset to allow specific modules.

Raises:

TransformConfigError – If composed_patch is set but transform_config was not provided.

classmethod from_config(name: str, transform_config: TransformConfig, *, context: Mapping[str, Any] | None = None) -> TransformChain

Create a TransformChain from a named config entry.

Resolves presets and returns a ready-to-use chain.

Parameters:
  • name (str) – Chain name to look up in transform_config.

  • transform_config (TransformConfig) – Top-level TransformConfig.

  • context (Mapping[str, Any] | None) – External context for variable resolution.

Returns:

Configured TransformChain.

Raises:

TransformChainError – If chain name not found.

Return type:

TransformChain

forward(self, data: Any) -> Any

Apply forward primitives in order.

Parameters:

data (Any) – Input data (typically base64 string).

Returns:

Decoded/parsed data ready for patching.

Raises:

TransformChainError – If any primitive fails.

Return type:

Any

backward(self, data: Any) -> Any

Apply backward primitives in order.

Uses stored envelopes from forward for lossless JSON restoration.

Parameters:

data (Any) – Data to re-encode (typically patched XML string or Element).

Returns:

Re-encoded data (same format as original input).

Raises:

TransformChainError – If any primitive fails.

Return type:

Any

patch(self, data: Any, *, metadata: Mapping[str, Any] | None = None) -> Any

Apply patch to decoded data.

If the chain has an inline patch, applies it directly. If the chain has a composed_patch, applies the global patches then the targeted patches whose filter matches metadata (in declaration order, last applied wins).

Parameters:
  • data (Any) – Decoded data from forward.

  • metadata (Mapping[str, Any] | None) – Object metadata. Used for filter matching in composed patches (typical keys: content_type, name). May also carry "outer" referencing the JSON wrapper to mutate when the patch declares scope: outer or scope: all.

Returns:

Patched data.

Return type:

Any

transform(self, data: Any, *, metadata: Mapping[str, Any] | None = None) -> Any

Full round-trip: forward -> patch -> backward.

This is the main entry point for most use cases.

Parameters:
  • data (Any) – Raw input data.

  • metadata (Mapping[str, Any] | None) – Object metadata used for filter matching in composed patches. For an inline patch, only the "outer" key is used, and only when the patch declares scope: outer or scope: all.

Returns:

Transformed data (same format as input).

Raises:

TransformChainError – If any stage fails.

Return type:

Any

kstlib.transform.transform(data: Any, chain_name: str, config: TransformConfig | None = None, context: dict[str, Any] | None = None, *, metadata: Mapping[str, Any] | None = None) -> Any

Apply a named transform chain to data.

Convenience function for use in CallableStep pipelines. Loads config from kstlib.conf.yml if not provided.

Parameters:
  • data (Any) – Raw input data.

  • chain_name (str) – Name of the transform chain to apply.

  • config (TransformConfig | None) – Transform config (loads from kstlib.conf.yml if None).

  • context (dict[str, Any] | None) – Variables for {{variable}} resolution in callable args.

  • metadata (Mapping[str, Any] | None) – Object metadata for filter matching in composed patches (typical keys: content_type, name).

Returns:

Transformed data.

Return type:

Any

Examples

>>> transform("SGVsbG8=", "my_chain")  

kstlib.transform.replace_outer_uris(obj: Any, replace_map: Mapping[str, str], *, protected_paths: frozenset[str] = frozenset({'connectors[*].hints.xpath'}), additional_protected_paths: frozenset[str] | None = None) -> int

Recursively patch string values in a JSON-like object, in place.

Walks the object tree and applies str.replace(old, new) for every entry of replace_map to every string value, skipping any path that matches protected_paths. The object is mutated in place.

Path syntax: each protected path is a dotted string. [*] matches any list index. Dict keys are matched literally. For example, "connectors[*].hints.xpath" matches obj["connectors"][i]["hints"]["xpath"] for every i.

Note

Keys containing a literal "." cannot be expressed in the dotted-path syntax and are therefore not protectable via protected_paths. This is a known limitation.

This helper is meant to be called from caller code that knows about the wrapper structure (e.g. SAS Viya transfer packages where the BIRD XML lives inside an encoded blob but connectors[].uri and connectors[].hints.orig-uri live in the outer JSON wrapper). Use it together with PatchConfig(scope='outer') or scope='all'.

Parameters:
  • obj (Any) – JSON-like nested structure (dict, list, str, int, …). Mutated in place. Non-string scalars are left unchanged.

  • replace_map (Mapping[str, str]) – Mapping of old -> new substring replacements. Replacements are applied in iteration order.

  • protected_paths (frozenset[str]) – Dotted-path patterns that must NOT be patched. Defaults to PROTECTED_OUTER_PATHS.

  • additional_protected_paths (frozenset[str] | None) – Extra patterns merged with protected_paths. Provides an additive API so callers can extend the defaults without accidentally wiping them.

Returns:

Total number of string values that were modified.

Return type:

int

Examples

>>> wrapper = {"connectors": [{"uri": "library=CASUSER", "hints": {"xpath": "/foo/CASUSER"}}]}
>>> replace_outer_uris(wrapper, {"CASUSER": "PUBLIC"})
1
>>> wrapper["connectors"][0]["uri"]
'library=PUBLIC'
>>> wrapper["connectors"][0]["hints"]["xpath"]
'/foo/CASUSER'
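
The path-matching rule described above can be sketched as a small recursive walker. replace_strings is a hypothetical function written for this illustration, not the library's implementation; it assumes that a protected path shields the whole subtree beneath it and that list indices are always written as [*].

```python
def replace_strings(obj, replace_map, protected, path=""):
    """Replace substrings in place; return how many string values changed."""
    if isinstance(obj, dict):
        children = [(key, f"{path}.{key}" if path else str(key)) for key in obj]
    elif isinstance(obj, list):
        children = [(index, f"{path}[*]") for index in range(len(obj))]
    else:
        return 0  # scalars have no children to visit

    changed = 0
    for key, child_path in children:
        if child_path in protected:
            continue  # protected paths are never patched
        value = obj[key]
        if isinstance(value, str):
            patched = value
            for old, new in replace_map.items():
                patched = patched.replace(old, new)
            if patched != value:
                obj[key] = patched
                changed += 1
        else:
            changed += replace_strings(value, replace_map, protected, child_path)
    return changed


wrapper = {"connectors": [{"uri": "library=CASUSER", "hints": {"xpath": "/foo/CASUSER"}}]}
count = replace_strings(wrapper, {"CASUSER": "PUBLIC"}, frozenset({"connectors[*].hints.xpath"}))
```

Because paths are built by joining keys with ".", a key that itself contains a dot produces an ambiguous path, which is the limitation noted for protected_paths.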

Constants

kstlib.transform.PATCH_SCOPE_VALUES = frozenset({'all', 'blob', 'outer'})

kstlib.transform.PROTECTED_OUTER_PATHS = frozenset({'connectors[*].hints.xpath'})

Configuration

class kstlib.transform.TransformConfig(chains=<factory>, allowed_callable_modules=frozenset({}))

Bases: object

Top-level transform configuration from kstlib.conf.yml.

chains

Named transform chain configurations.

Type:

dict[str, kstlib.transform.config.TransformChainConfig]

allowed_callable_modules

Whitelist of allowed callable module prefixes.

Type:

frozenset[str]

Examples

>>> TransformConfig(chains={"decode": TransformChainConfig(name="decode", forward=(PrimitiveConfig(name="base64"),))})
TransformConfig(chains={...}, allowed_callable_modules=frozenset())

chains: dict[str, TransformChainConfig]
allowed_callable_modules: frozenset[str]

__post_init__(self) -> None

Validate top-level configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, chains: dict[str, TransformChainConfig] = <factory>, allowed_callable_modules: frozenset[str] = frozenset()) -> None

class kstlib.transform.TransformChainConfig(name, forward=(), backward=None, patch=None, composed_patch=None, preset=None)

Bases: object

Configuration for a named transform chain.

A chain declares how to transform data. It must provide at least one of:

  • forward: explicit forward primitive chain

  • preset: inherit forward/backward from another chain

  • patch or composed_patch: patch-only chain (no forward/backward). Such chains are meant to be referenced by a composed_patch of another chain, not instantiated directly.

name

Chain name (for logging and error messages).

Type:

str

forward

Ordered tuple of forward primitives.

Type:

tuple[kstlib.transform.config.PrimitiveConfig, …]

backward

Ordered tuple of backward primitives (None = auto-reverse).

Type:

tuple[kstlib.transform.config.PrimitiveConfig, …] | None

patch

Patch configuration (None = no inline patching).

Type:

kstlib.transform.config.PatchConfig | None

composed_patch

Composed patches referencing other chains (None = absent).

Type:

kstlib.transform.config.ComposedPatchConfig | None

preset

Name of preset to inherit from (None = standalone).

Type:

str | None

Examples

>>> TransformChainConfig(
...     name="decode",
...     forward=(PrimitiveConfig(name="base64"),),
... )
TransformChainConfig(name='decode', forward=(...), backward=None, patch=None, composed_patch=None, preset=None)

name: str
forward: tuple[PrimitiveConfig, ...]
backward: tuple[PrimitiveConfig, ...] | None
patch: PatchConfig | None
composed_patch: ComposedPatchConfig | None
preset: str | None

__post_init__(self) -> None

Validate chain configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, name: str, forward: tuple[PrimitiveConfig, ...] = (), backward: tuple[PrimitiveConfig, ...] | None = None, patch: PatchConfig | None = None, composed_patch: ComposedPatchConfig | None = None, preset: str | None = None) -> None

class kstlib.transform.PrimitiveConfig(name, options=<factory>)

Bases: object

Configuration for a single transform primitive.

name

Primitive name (base64, zlib, json, xml, bytes).

Type:

str

options

Primitive-specific options dict.

Type:

dict[str, Any]

Examples

>>> PrimitiveConfig(name="base64")
PrimitiveConfig(name='base64', options={})
>>> PrimitiveConfig(name="zlib", options={"skip_bytes": 3})
PrimitiveConfig(name='zlib', options={'skip_bytes': 3})

name: str
options: dict[str, Any]

__post_init__(self) -> None

Validate primitive configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, name: str, options: dict[str, Any] = <factory>) -> None

class kstlib.transform.PatchConfig(replace=None, scope='blob', callable=None, args=<factory>, mapping=None)

Bases: object

Configuration for the patch stage between forward and backward.

A patch operates either as a string-replacement mapping (replace) or as a Python callable (callable). The two modes are mutually exclusive.

The scope field controls WHERE replacements apply:

  • "blob" (default): patch the data decoded by the forward chain (e.g. the BIRD XML extracted from a SAS Viya report blob). Preserves the historical behavior.

  • "outer": patch the wrapper dict passed in metadata["outer"] to chain.transform(). The wrapper is mutated in place; the blob itself is not modified beyond the normal forward+backward round-trip. Useful for fields like connectors[].uri that live outside the encoded blob.

  • "all": do both, blob first then outer.

replace

String replacement mapping {old: new}. Mutually exclusive with callable.

Type:

dict[str, str] | None

scope

Where to apply the replace mapping. One of "blob" (default), "outer", "all".

Type:

Literal['blob', 'outer', 'all']

callable

Import target module.path:function for complex patch logic. Mutually exclusive with replace.

Type:

str | None

args

Keyword arguments passed to the callable as **kwargs.

Type:

dict[str, Any]

mapping

Deprecated alias for replace. Setting it emits a DeprecationWarning and copies the value to replace. Will be removed in a future version. Setting both mapping and replace raises an error.

Type:

dict[str, str] | None

Examples

>>> PatchConfig(replace={"old": "new"})
PatchConfig(replace={'old': 'new'}, scope='blob', callable=None, args={}, mapping=None)
>>> PatchConfig(replace={"a": "b"}, scope="all")
PatchConfig(replace={'a': 'b'}, scope='all', callable=None, args={}, mapping=None)

replace: dict[str, str] | None
scope: Literal['blob', 'outer', 'all']
callable: str | None
args: dict[str, Any]
mapping: dict[str, str] | None

__post_init__(self) -> None

Validate patch configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, replace: dict[str, str] | None = None, scope: Literal['blob', 'outer', 'all'] = 'blob', callable: str | None = None, args: dict[str, Any] = <factory>, mapping: dict[str, str] | None = None) -> None

class kstlib.transform.FilterConfig(content_type='*', name='*')

Bases: object

Filter used by TargetedPatchConfig to select matching objects.

All fields are ANDed: an object matches only if every field matches. A value of "*" means “any value”.

content_type

Object content type (“report”, “folder”, “file”, or “*”).

Type:

str

name

fnmatch glob pattern on the object name (e.g. "R220_*").

Type:

str

Examples

>>> FilterConfig(content_type="report", name="R220_*")
FilterConfig(content_type='report', name='R220_*')
>>> FilterConfig()
FilterConfig(content_type='*', name='*')

content_type: str
name: str

__post_init__(self) -> None

Validate filter configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, content_type: str = '*', name: str = '*') -> None

class kstlib.transform.TargetedPatchConfig(filter, patches)

Bases: object

A filter plus a list of patch chain names to apply when it matches.

filter

Filter describing which objects this entry applies to.

Type:

kstlib.transform.config.FilterConfig

patches

Ordered tuple of chain names whose .patch is applied.

Type:

tuple[str, …]

Examples

>>> TargetedPatchConfig(
...     filter=FilterConfig(content_type="report", name="R220_*"),
...     patches=("remap_caslib_r220",),
... )
TargetedPatchConfig(filter=FilterConfig(...), patches=('remap_caslib_r220',))

filter: FilterConfig
patches: tuple[str, ...]

__post_init__(self) -> None

Validate targeted patch configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, filter: FilterConfig, patches: tuple[str, ...]) -> None

class kstlib.transform.ComposedPatchConfig(global_patches=(), targeted_patches=())

Bases: object

Composition of global and targeted patch chain references.

Execution order (per object):

  1. global_patches: applied to every object, in declaration order.

  2. targeted_patches: for each entry in declaration order, if the filter matches the object metadata, apply all its patches in order.

Last applied wins on conflict, following kstlib cascade philosophy (kwargs > user config > preset > defaults). Ordering is by declaration, not by filter specificity. Order your targeted_patches from most general to most specific.

global_patches

Chain names applied to every object.

Type:

tuple[str, …]

targeted_patches

Conditional entries applied when their filter matches.

Type:

tuple[kstlib.transform.config.TargetedPatchConfig, …]

Examples

>>> ComposedPatchConfig(
...     global_patches=("remap_host",),
...     targeted_patches=(
...         TargetedPatchConfig(
...             filter=FilterConfig(name="R220_*"),
...             patches=("remap_caslib_r220",),
...         ),
...     ),
... )
ComposedPatchConfig(global_patches=('remap_host',), targeted_patches=(...))

global_patches: tuple[str, ...]
targeted_patches: tuple[TargetedPatchConfig, ...]

__post_init__(self) -> None

Validate composed patch configuration.

Raises:

TransformConfigError – If configuration is invalid.

__init__(self, global_patches: tuple[str, ...] = (), targeted_patches: tuple[TargetedPatchConfig, ...] = ()) -> None

kstlib.transform.load_transform_config() -> TransformConfig

Load transform configuration from kstlib.conf.yml.

Reads the transforms: section from the global config.

Returns:

Parsed TransformConfig, or empty config if section absent.

Return type:

TransformConfig

Examples

>>> config = load_transform_config()  

Primitives

kstlib.transform.base64_decode(data: str, config: PrimitiveConfig) -> bytes

Decode base64 string to bytes.

Supports SAS Viya wire formats and other proprietary base64 variants via two opt-in options:

  • strip_prefix: a literal string that, if present at the start of the input, is removed before decoding. Useful for SAS Viya report blobs which begin with "TRUE###" (the TRUE part decodes to the 3-byte SAS proprietary header and ### is a separator that lenient base64 decoders skip).

  • strict: when True (default), the underlying decoder runs with validate=True and rejects any character outside the base64 alphabet. When False, non-alphabet characters are stripped before decoding (matches the de facto behavior of Python’s stdlib base64.b64decode and most other tools).

Parameters:
  • data (str) – Base64-encoded string. May include a configurable prefix and (in lenient mode) embedded whitespace or separators.

  • config (PrimitiveConfig) –

    Primitive config. Recognized options:

    • strip_prefix (str | None): literal prefix to remove before decoding. Default None. Max 32 chars. If the input does not start with this prefix, the option is a no-op (does NOT raise) so the same chain can handle mixed blobs that sometimes carry the prefix.

    • strict (bool): when True (default) reject any non-alphabet character; when False strip them silently before decoding.

Returns:

Decoded bytes.

Raises:

DecodeError – If data is not a string, exceeds the input size limit, or fails to decode after the configured pre-processing.

Return type:

bytes

Examples

>>> base64_decode("SGVsbG8=", PrimitiveConfig(name="base64"))
b'Hello'
>>> # SAS Viya pattern: strip the proprietary "TRUE###" prefix
>>> cfg = PrimitiveConfig(name="base64",
...     options={"strip_prefix": "TRUE###", "strict": False})
>>> base64_decode("TRUE###SGVsbG8=", cfg)
b'Hello'
>>> # strip_prefix is a no-op when the input does not start with it
>>> cfg2 = PrimitiveConfig(name="base64",
...     options={"strip_prefix": "TRUE###"})
>>> base64_decode("SGVsbG8=", cfg2)
b'Hello'
>>> # Lenient mode strips embedded non-alphabet noise
>>> cfg3 = PrimitiveConfig(name="base64", options={"strict": False})
>>> base64_decode("SGVs###bG8=", cfg3)
b'Hello'
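
The strict and lenient modes map onto the validate flag of the standard library's base64.b64decode, which the sketch below uses directly. It also shows where the 3-byte header comes from when the "TRUE###" prefix is left in place. The variable names are illustrative only.

```python
import base64
import binascii

blob = "TRUE###SGVsbG8="
prefix = "TRUE###"

# Lenient decoding (validate=False is the stdlib default) discards "#",
# so "TRUE" is decoded as ordinary base64 and yields three header bytes.
lenient = base64.b64decode(blob)

# Strict decoding rejects the same input because "#" is outside the alphabet.
try:
    base64.b64decode(blob, validate=True)
    strict_error = None
except binascii.Error as exc:
    strict_error = exc

# Stripping the literal prefix first makes the input valid in strict mode.
stripped = blob[len(prefix):] if blob.startswith(prefix) else blob
strict_after_strip = base64.b64decode(stripped, validate=True)
```

The three bytes produced by "TRUE" are 4d 15 04, the same value used for prepend_bytes in the zlib examples.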

kstlib.transform.base64_encode(data: bytes, config: PrimitiveConfig) -> str

Encode bytes to base64 string with an optional literal prefix.

The prefix option allows reattaching a proprietary marker after encoding, mirroring base64_decode’s strip_prefix on the forward path. The typical SAS Viya use case is "TRUE###": the forward chain strips it before decoding, the backward chain re-prepends it after encoding so the wire format is preserved bit-for-bit.

Parameters:
  • data (bytes) – Raw bytes to encode.

  • config (PrimitiveConfig) –

    Primitive config. Recognized options:

    • prefix (str | None): literal string prepended to the base64 result. Default None. Max 32 chars.

Returns:

Base64-encoded string, optionally prefixed.

Raises:

EncodeError – If data is not bytes.

Return type:

str

Examples

>>> base64_encode(b"Hello", PrimitiveConfig(name="base64"))
'SGVsbG8='
>>> # Reattach the SAS Viya proprietary prefix
>>> cfg = PrimitiveConfig(name="base64", options={"prefix": "TRUE###"})
>>> base64_encode(b"Hello", cfg)
'TRUE###SGVsbG8='

kstlib.transform.bytes_decode(data: bytes, config: PrimitiveConfig) -> str

Decode bytes to string.

Parameters:
  • data (bytes) – Raw bytes.

  • config (PrimitiveConfig) – Options: encoding (default utf-8).

Returns:

Decoded string.

Raises:

DecodeError – If decoding fails.

Return type:

str

Examples

>>> bytes_decode(b"Hello", PrimitiveConfig(name="bytes"))
'Hello'

kstlib.transform.bytes_encode(data: str, config: PrimitiveConfig) -> bytes

Encode string to bytes.

Parameters:
  • data (str) – String to encode.

  • config (PrimitiveConfig) – Options: encoding (default utf-8).

Returns:

Encoded bytes.

Raises:

EncodeError – If encoding fails.

Return type:

bytes

Examples

>>> bytes_encode("Hello", PrimitiveConfig(name="bytes"))
b'Hello'
kstlib.transform.zlib_compress(data: bytes, config: PrimitiveConfig) -> bytes

Compress data with zlib, optionally prepending a header.

Parameters:
  • data (bytes) – Raw bytes to compress.

  • config (PrimitiveConfig) –

    Primitive config. Recognized options:

    • prepend_bytes (str | None): hex string prepended before the compressed bytes. Default None.

    • level (int): compression level passed to zlib.compress. Range -1 to 9, where -1 means “use the Python zlib default level” (typically 6), 0 means no compression, and 9 means maximum compression. Default -1. Higher values produce smaller output but are slower.

Returns:

Compressed bytes with optional header prefix.

Raises:

CompressError – If compression fails or prepend_bytes hex is invalid.

Return type:

bytes

Examples

>>> result = zlib_compress(b"Hello", PrimitiveConfig(name="zlib"))
>>> import zlib
>>> zlib.decompress(result)
b'Hello'
>>> # Maximum compression level
>>> cfg = PrimitiveConfig(name="zlib", options={"level": 9})
>>> result9 = zlib_compress(b"Hello world " * 100, cfg)
>>> zlib.decompress(result9) == b"Hello world " * 100
True
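
How prepend_bytes and level interact can be shown with a small stdlib-only sketch. The helper compress_with_header is hypothetical and only approximates the documented behaviour; the header value "4d1504" is borrowed from the overview example.

```python
import zlib


def compress_with_header(data: bytes, header_hex: str = "", level: int = -1) -> bytes:
    """Hypothetical helper: compress, then prepend a header given as a hex string."""
    header = bytes.fromhex(header_hex)  # raises ValueError on invalid hex
    return header + zlib.compress(data, level)


payload = b"Hello world " * 100
framed = compress_with_header(payload, "4d1504", level=9)

# The header is not part of the zlib stream, so it must be skipped before decompressing.
header, body = framed[:3], framed[3:]
print(header.hex(), zlib.decompress(body) == payload)

# Level 0 stores the data uncompressed; level 9 compresses it as much as zlib can.
stored = zlib.compress(payload, 0)
smallest = zlib.compress(payload, 9)
print(len(stored) > len(smallest))
```

The prepended bytes are opaque to zlib, which is why the forward path needs a matching skip_bytes value on zlib_decompress.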
kstlib.transform.zlib_decompress(data: bytes, config: PrimitiveConfig) -> bytes

Decompress zlib data with optional header skip.

Parameters:
  • data (bytes) – Compressed bytes (possibly with prefix header).

  • config (PrimitiveConfig) – Options: skip_bytes (int), the number of leading bytes to strip before decompression.

Returns:

Decompressed bytes.

Raises:

DecompressError – If decompression fails or the input is invalid.

Return type:

bytes

Examples

>>> import zlib
>>> compressed = zlib.compress(b"Hello")
>>> zlib_decompress(compressed, PrimitiveConfig(name="zlib"))
b'Hello'
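
The effect of skip_bytes can be sketched with plain zlib. The helper decompress_skipping is hypothetical and stands in for the documented option; the three-byte header reuses the "4d1504" value from the overview example.

```python
import zlib


def decompress_skipping(data: bytes, skip_bytes: int = 0) -> bytes:
    """Hypothetical helper: drop N leading bytes, then decompress the rest."""
    if skip_bytes < 0 or skip_bytes > len(data):
        raise ValueError("skip_bytes out of range")
    return zlib.decompress(data[skip_bytes:])


framed = bytes.fromhex("4d1504") + zlib.compress(b"Hello")
result = decompress_skipping(framed, skip_bytes=3)
print(result)

# Without skipping, the proprietary header is read as a broken zlib header.
try:
    decompress_skipping(framed)
    failed_without_skip = False
except zlib.error:
    failed_without_skip = True
print(failed_without_skip)
```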
kstlib.transform.json_parse(data: str | bytes, config: PrimitiveConfig) -> tuple[Any, dict[str, Any] | None]

Parse JSON string, optionally extracting a nested field.

Returns a tuple of (value, envelope). If extract is set, envelope contains the original parsed dict so the backward path can restore it losslessly. If extract is not set, envelope is None.

Parameters:
  • data (str | bytes) – JSON string or bytes.

  • config (PrimitiveConfig) – Options: extract (dot path).

Returns:

Tuple of (extracted_or_full_value, original_envelope_or_None).

Raises:

ParseError – If JSON parsing fails or the extract path is not found.

Return type:

tuple[Any, dict[str, Any] | None]

Examples

>>> val, env = json_parse('{"a": 1}', PrimitiveConfig(name="json"))
>>> val
{'a': 1}
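
The (value, envelope) contract with a dot path can be sketched as follows. The helper parse_and_extract is hypothetical and uses only the json module; the path "transferableContent.content" comes from the overview example, while the sample document is made up for illustration.

```python
import json
from typing import Any, Optional, Tuple


def parse_and_extract(text: str, path: Optional[str] = None) -> Tuple[Any, Optional[dict]]:
    """Hypothetical helper: parse JSON and optionally walk a dot-notation path."""
    parsed = json.loads(text)
    if path is None:
        return parsed, None  # nothing extracted, so no envelope to keep
    value = parsed
    for key in path.split("."):
        value = value[key]  # KeyError here means the path does not exist
    return value, parsed  # keep the full document as the envelope


doc = '{"transferableContent": {"content": "<root/>", "version": 2}, "name": "r1"}'
value, envelope = parse_and_extract(doc, "transferableContent.content")
print(value)
print(sorted(envelope))
```

Keeping the whole parsed document as the envelope is what lets the backward path put a patched value back without losing sibling fields such as version or name.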
kstlib.transform.json_serialize(data: Any, config: PrimitiveConfig, *, envelope: dict[str, Any] | None = None) -> str

Serialize Python object to JSON string.

If a wrap path and an envelope are both provided, the value is restored into the original envelope structure before serialization (lossless round-trip).

Parameters:
  • data (Any) – Python object to serialize.

  • config (PrimitiveConfig) –

    Primitive config. Recognized options:

    • wrap (str | None): dot-notation path used together with envelope to restore the value inside its original envelope structure. Default None.

    • minify (bool): when True, output uses compact separators=(",", ":") (no whitespace). When False (default), uses Python’s default separators (", ", ": "). Useful before zlib compression (denser input compresses better).

    • ensure_ascii (bool): when True, escape every non-ASCII character as \uXXXX. When False (the default), non-ASCII characters are emitted verbatim. This differs from the Python stdlib, where ensure_ascii defaults to True; kstlib defaults to False to preserve Unicode content (French, Japanese, etc.) without bloating the output.

  • envelope (dict[str, Any] | None) – Original envelope for lossless restoration when wrap is set.

Returns:

JSON string.

Raises:

SerializeError – If serialization fails.

Return type:

str

Examples

>>> json_serialize({"a": 1}, PrimitiveConfig(name="json"))
'{"a": 1}'
>>> # Minified output (no spaces after , and :)
>>> cfg = PrimitiveConfig(name="json", options={"minify": True})
>>> json_serialize({"a": 1, "b": 2}, cfg)
'{"a":1,"b":2}'
>>> # Preserve Unicode content (default behavior)
>>> json_serialize({"k": "café"}, PrimitiveConfig(name="json"))
'{"k": "café"}'
>>> # Force ASCII escapes
>>> cfg2 = PrimitiveConfig(name="json", options={"ensure_ascii": True})
>>> json_serialize({"k": "café"}, cfg2)
'{"k": "caf\\u00e9"}'
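
Restoring a value into its envelope via wrap can be sketched with the standard library. The helper wrap_and_serialize is hypothetical and only mirrors the documented options (wrap, minify, ensure_ascii=False); the sample envelope is made up for illustration.

```python
import copy
import json
from typing import Any


def wrap_and_serialize(value: Any, path: str, envelope: dict, minify: bool = False) -> str:
    """Hypothetical helper: write value back at a dot path, then dump the envelope."""
    restored = copy.deepcopy(envelope)  # leave the caller's envelope untouched
    *parents, leaf = path.split(".")
    node = restored
    for key in parents:
        node = node[key]
    node[leaf] = value
    separators = (",", ":") if minify else None  # None keeps json's default spacing
    return json.dumps(restored, separators=separators, ensure_ascii=False)


envelope = {"transferableContent": {"content": "old", "version": 2}, "name": "café"}
out = wrap_and_serialize("new", "transferableContent.content", envelope, minify=True)
print(out)
```

Sibling keys and their order survive because the patched value is written into a copy of the parsed envelope rather than into a freshly built dict.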
kstlib.transform.xml_parse(data: str, config: PrimitiveConfig) -> Element

Parse XML string to ElementTree Element.

Uses defusedxml.ElementTree.fromstring for XXE protection. defusedxml raises EntitiesForbidden, DTDForbidden, or ExternalReferenceForbidden on malicious payloads; all are wrapped in a ParseError here.

Parameters:
  • data (str) – XML string.

  • config (PrimitiveConfig) – Primitive config.

Returns:

ElementTree root Element.

Raises:

ParseError – If XML parsing fails or input is unsafe.

Return type:

Element

Examples

>>> root = xml_parse("<root><a>1</a></root>", PrimitiveConfig(name="xml"))
>>> root.tag
'root'
kstlib.transform.xml_serialize(data: Element, config: PrimitiveConfig) -> str

Serialize ElementTree Element to XML string.

Parameters:
  • data (Element) – ElementTree root Element.

  • config (PrimitiveConfig) – Primitive config.

Returns:

XML string.

Raises:

SerializeError – If serialization fails.

Return type:

str

Examples

>>> from xml.etree.ElementTree import Element
>>> root = Element("root")
>>> xml_serialize(root, PrimitiveConfig(name="xml"))
'<root />'
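
A parse, patch, serialize cycle on an XML payload can be sketched with the standard library. This assumes the serializer behaves like xml.etree.ElementTree.tostring with encoding="unicode", which matches the '<root />' output above but is not confirmed by the source. The stdlib parser is used here on a trusted literal only; the documented xml_parse relies on defusedxml for untrusted input.

```python
import xml.etree.ElementTree as ET

# Trusted literal only: the stdlib parser does not provide defusedxml's protections.
root = ET.fromstring('<report><source host="old-host.example.com" /></report>')

# Patch step: rewrite the host attribute on every <source> element.
for node in root.iter("source"):
    node.set("host", "new-host.example.com")

# encoding="unicode" makes tostring return str instead of bytes.
serialized = ET.tostring(root, encoding="unicode")
empty = ET.tostring(ET.Element("root"), encoding="unicode")
print(serialized)
print(empty)
```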

Exceptions

exception kstlib.transform.TransformError

Bases: KstlibError

Base exception for all transform module errors.

exception kstlib.transform.TransformConfigError

Bases: TransformError, ValueError

Transform configuration is invalid.

Raised when the transform chain or primitive configuration contains invalid values, missing required fields, or constraint violations.

exception kstlib.transform.TransformChainError(message, *, chain_name=None)

Bases: TransformError

Transform chain execution failed.

chain_name

Name of the chain that failed.

__init__(self, message: str, *, chain_name: str | None = None) -> None

Initialize TransformChainError.

Parameters:
  • message (str) – Human-readable error message.

  • chain_name (str | None) – Name of the chain that failed.

exception kstlib.transform.PrimitiveError(message, *, primitive_name=None, chain_name=None)

Bases: TransformChainError

A single transform primitive failed.

primitive_name

Name of the primitive that failed.

__init__(self, message: str, *, primitive_name: str | None = None, chain_name: str | None = None) -> None

Initialize PrimitiveError.

Parameters:
  • message (str) – Human-readable error message.

  • primitive_name (str | None) – Name of the primitive that failed.

  • chain_name (str | None) – Name of the chain that was running.

exception kstlib.transform.DecodeError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

Base64 or bytes decoding failed.

exception kstlib.transform.DecompressError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

Zlib decompression failed.

exception kstlib.transform.ParseError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

JSON or XML parsing failed.

exception kstlib.transform.PatchError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

Patch stage failed.

exception kstlib.transform.SerializeError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

JSON or XML serialization failed.

exception kstlib.transform.CompressError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

Zlib compression failed.

exception kstlib.transform.EncodeError(message, *, primitive_name=None, chain_name=None)

Bases: PrimitiveError

Base64 or bytes encoding failed.

exception kstlib.transform.CallableError(target, reason, *, chain_name=None)

Bases: TransformChainError

Callable raised an exception during execution.

target

The callable target string.

__init__(self, target: str, reason: str, *, chain_name: str | None = None) -> None

Initialize CallableError.

Parameters:
  • target (str) – The callable target string.

  • reason (str) – Description of the error.

  • chain_name (str | None) – Name of the chain that was running.

exception kstlib.transform.CallableImportError(target, *, chain_name=None)

Bases: TransformChainError

Callable target could not be imported.

target

The import target string that failed.

__init__(self, target: str, *, chain_name: str | None = None) -> None

Initialize CallableImportError.

Parameters:
  • target (str) – The import target string (e.g. “module.path:function”).

  • chain_name (str | None) – Name of the chain that was running.
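
The hierarchy documented in this section lets callers catch errors at whichever level they care about. The sketch below uses local stand-in classes that mirror the documented bases and keyword arguments so that it runs without kstlib installed; they are not the library's classes, and the describe helper is hypothetical.

```python
class KstlibError(Exception):
    """Stand-in for the kstlib root exception."""


class TransformError(KstlibError):
    """Stand-in: base for all transform errors."""


class TransformChainError(TransformError):
    """Stand-in: chain execution failed; carries chain_name."""

    def __init__(self, message, *, chain_name=None):
        super().__init__(message)
        self.chain_name = chain_name


class PrimitiveError(TransformChainError):
    """Stand-in: a single primitive failed; adds primitive_name."""

    def __init__(self, message, *, primitive_name=None, chain_name=None):
        super().__init__(message, chain_name=chain_name)
        self.primitive_name = primitive_name


class DecodeError(PrimitiveError):
    """Stand-in: base64 or bytes decoding failed."""


def describe(exc: TransformChainError) -> str:
    """Hypothetical helper: build a log line from the documented attributes."""
    if isinstance(exc, PrimitiveError):
        return f"primitive {exc.primitive_name!r} failed in chain {exc.chain_name!r}: {exc}"
    return f"chain {exc.chain_name!r} failed: {exc}"


try:
    raise DecodeError("invalid base64 input", primitive_name="base64", chain_name="decode_report")
except TransformChainError as exc:  # also catches every PrimitiveError subclass
    summary = describe(exc)

print(summary)
```

Catching TransformChainError covers primitive and callable failures alike, while TransformConfigError sits outside that branch and, per its documented bases, can also be caught as a ValueError.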