Bidirectional Data Transformation

Generic transformation engine that chains primitives (decode, decompress, parse, patch, serialize, compress, encode) declared in YAML. Domain-agnostic: knows nothing about SAS, Viya, or any specific platform.

Overview

The transform module provides a declarative pipeline for round-trip data transformations. The typical use case is decoding a nested binary blob (e.g. base64 + zlib + JSON envelope + XML payload), patching the inner content, and re-encoding back to the exact same format so the result can be re-uploaded to the source system.

from kstlib.transform import TransformChain, TransformChainConfig, PrimitiveConfig, PatchConfig

chain = TransformChain(
    TransformChainConfig(
        name="decode_report",
        forward=(
            PrimitiveConfig(name="base64"),
            PrimitiveConfig(name="zlib", options={"skip_bytes": 3}),
            PrimitiveConfig(name="json", options={"extract": "transferableContent.content"}),
        ),
        backward=(
            PrimitiveConfig(name="json", options={"wrap": "transferableContent.content"}),
            PrimitiveConfig(name="zlib", options={"prepend_bytes": "4d1504"}),
            PrimitiveConfig(name="base64"),
        ),
        patch=PatchConfig(replace={"old-host.example.com": "new-host.example.com"}),
    )
)

# Full round-trip: forward + patch + backward
patched_blob = chain.transform(blob_b64_string)

Benefits:

Single source of truth: chain definitions live in YAML, not scattered in code
Lossless round-trip: JSON envelopes are preserved during patching
Composable: presets can be reused and overridden with custom patches
Hardened: zlib bomb protection, XML security, callable whitelist, size limits

Primitives

The transform engine ships with five built-in bidirectional primitives. Each one has a forward and a backward direction, and they can be chained in any order, provided each primitive's output type matches the next one's input type.

Primitive	Forward	Backward	Common options
`base64`	str -> bytes	bytes -> str	`strict`, `strip_prefix`, `prefix`
`bytes`	bytes -> str	str -> bytes	`encoding` (default `utf-8`)
`zlib`	compressed -> bytes	bytes -> compressed	`skip_bytes`, `prepend_bytes`, `level`
`json`	str -> dict	dict -> str	`extract`, `wrap` (dot-notation)
`xml`	str -> Element	Element -> str	`encoding`

The engine also ships 4 forward-only string extractors for slicing and cleaning decoded text. They have no backward path (the extraction is terminal), so a chain that uses one declares only forward:.

Primitive	Forward	Common options
`split`	str -> str | list[str]	`sep` (required), `index`, `maxsplit`, `keep_empty`
`tr`	str -> str	`delete` or `map` (mutually exclusive)
`removeprefix`	str -> str	`prefix` (required)
`removesuffix`	str -> str	`suffix` (required)

See Transform for the decision guide (“when to use which”) and the keep_empty divergence from str.split.

zlib special options

The zlib primitive supports two options to handle SAS-style proprietary headers prepended before the actual zlib stream:

# Forward: skip the first 3 bytes (proprietary header)
- zlib:
    skip_bytes: 3

# Backward: re-prepend the same 3 bytes (hex-encoded)
- zlib:
    prepend_bytes: "4d1504"   # M\x15\x04

skip_bytes cannot be auto-reversed, so a chain that uses it must declare an explicit backward: block with prepend_bytes.
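As a rough illustration of why the header has to be declared on both sides, the following standard-library sketch mimics the two options. The helper names strip_and_inflate and deflate_and_prepend are hypothetical and are not part of kstlib; the point is only that the skipped bytes are discarded on the way in, so they must be supplied explicitly on the way out.

```python
import zlib


def strip_and_inflate(raw: bytes, skip_bytes: int) -> bytes:
    """Drop the leading header bytes, then inflate the remaining zlib stream."""
    return zlib.decompress(raw[skip_bytes:])


def deflate_and_prepend(payload: bytes, prepend_hex: str) -> bytes:
    """Compress the payload and put the hex-encoded header back in front."""
    return bytes.fromhex(prepend_hex) + zlib.compress(payload)


# Hypothetical payload; "4d1504" is the header value used in the YAML above.
framed = deflate_and_prepend(b'{"k": "v"}', "4d1504")
restored = strip_and_inflate(framed, 3)
```

Here framed starts with the three header bytes and restored equals the original payload.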

json extract/wrap for envelope-style payloads

The json primitive lets you drill into a nested envelope on the forward path and reconstruct it on the backward path:

forward:
  - base64
  - zlib
  - json:
      extract: "transferableContent.content"   # Drill into the envelope

backward:
  - json:
      wrap: "transferableContent.content"      # Restore the envelope
  - zlib
  - base64

The original envelope is stored internally during forward execution (in _ChainContext.json_envelopes) and restored on backward, ensuring the round-trip is lossless even when only the inner payload was patched.
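The envelope mechanism can be pictured with a minimal sketch, assuming a plain dict envelope and dot-separated keys. The functions extract and wrap below are illustrative stand-ins, not the kstlib implementation, and the sample envelope is invented.

```python
import copy
import json
from typing import Any


def extract(document: dict, dotted: str) -> Any:
    """Follow a dot-notation path and return the value found there."""
    node = document
    for key in dotted.split("."):
        node = node[key]
    return node


def wrap(envelope: dict, dotted: str, inner: Any) -> dict:
    """Return a copy of the stored envelope with the value at the path replaced."""
    result = copy.deepcopy(envelope)
    *parents, leaf = dotted.split(".")
    node = result
    for key in parents:
        node = node[key]
    node[leaf] = inner
    return result


# Hypothetical envelope with sibling fields that must survive the round-trip.
raw = json.dumps(
    {"version": 2, "transferableContent": {"kind": "report", "content": "<r host='old'/>"}}
)

stored = json.loads(raw)                                  # kept aside during forward
inner = extract(stored, "transferableContent.content")    # forward: drill in
patched = inner.replace("old", "new")                     # patch the inner payload only
rebuilt = json.dumps(wrap(stored, "transferableContent.content", patched))  # backward
```

Because the backward step starts from the stored envelope, fields such as version and kind come back untouched even though only the inner payload was handled during patching.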

Configuration

In kstlib.conf.yml

Define chains in your main configuration file under transforms:

transforms:
  security:
    allowed_callable_modules:
      - myproject.transforms

  chains:
    sas_report:
      forward:
        - base64
        - zlib:
            skip_bytes: 3
        - json:
            extract: "transferableContent.content"
      backward:
        - json:
            wrap: "transferableContent.content"
        - zlib:
            prepend_bytes: "4d1504"
        - base64

    patch_report:
      preset: sas_report      # inherit forward + backward from sas_report
      patch:
        scope: blob           # blob | outer | all (default: blob)
        replace:
          "https://old-host.example.com/": "https://new-host.example.com/"
          'library="CASUSER"': 'library="PUBLIC"'

Preset inheritance

A chain can inherit forward + backward from another chain via preset:. The child overrides only patch (or composed_patch):

chains:
  sas_report:
    forward: [...]
    backward: [...]

  patch_report:
    preset: sas_report      # forward + backward inherited
    patch:
      scope: blob
      replace:
        "old": "new"

Chained presets are not supported (a preset cannot itself reference another preset). The validation enforces this at config-load time.
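A sketch of single-level preset resolution over plain dicts, to make the rule concrete. The function resolve_preset and the use of ValueError are assumptions made for illustration; the library performs its own validation at config-load time and raises its own error type.

```python
from typing import Any, Dict


def resolve_preset(chains: Dict[str, Dict[str, Any]], name: str) -> Dict[str, Any]:
    """Merge a chain with its preset; reject presets that reference another preset."""
    chain = chains[name]
    parent_name = chain.get("preset")
    if parent_name is None:
        return dict(chain)  # standalone chain, nothing to inherit
    parent = chains[parent_name]
    if "preset" in parent:
        raise ValueError(f"chained presets are not supported: {name} -> {parent_name}")
    # Inherit forward and backward, then let the child's own keys take over.
    resolved = {"forward": parent["forward"], "backward": parent.get("backward")}
    resolved.update({key: value for key, value in chain.items() if key != "preset"})
    return resolved


chains = {
    "sas_report": {"forward": ["base64", "zlib"], "backward": ["zlib", "base64"]},
    "patch_report": {"preset": "sas_report", "patch": {"replace": {"old": "new"}}},
    "too_deep": {"preset": "patch_report"},
}
resolved = resolve_preset(chains, "patch_report")
```

Resolving too_deep raises in this sketch, because its preset itself points at another preset.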

Patches: replace vs callable

A PatchConfig accepts either replace: or callable:, never both:

# Option 1: simple string substitution
patch:
  scope: blob
  replace:
    "old-value": "new-value"
    'library="CASUSER"': 'library="PUBLIC"'

# Option 2: external Python callable
patch:
  scope: blob
  callable: myproject.transforms:patch_function
  args:
    target_host: "{{target_host}}"   # Resolved from pipeline context
    cas_mapping: "{{cas_mapping}}"

The scope: field is one of blob (default), outer, or all. See Transform for the full scope semantics table and the replace_outer_uris helper.

Note

Deprecated alias: mapping: is still accepted in place of replace:, but it emits a DeprecationWarning when used. Migrate existing configs to replace:.

The callable target follows the module.path:function_name convention. Allowed callable modules must be whitelisted in transforms.security.allowed_callable_modules.

{{variable}} references in args are resolved against the chain’s context dict, allowing dynamic values to be injected from a pipeline step or any caller.
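To make the callable option more tangible, here is a hedged sketch of what a patch function and the variable resolution might look like. Two things are assumptions, not documented behavior: that the decoded data arrives as the first positional argument, and that a placeholder occupies the whole value. The helper resolve_args is hypothetical and only imitates the substitution step.

```python
import re
from typing import Any, Dict, Mapping

_PLACEHOLDER = re.compile(r"^\{\{\s*(\w+)\s*\}\}$")


def resolve_args(args: Mapping[str, Any], context: Mapping[str, Any]) -> Dict[str, Any]:
    """Swap whole-value {{name}} placeholders for the matching context entry."""
    resolved = {}
    for key, value in args.items():
        match = _PLACEHOLDER.match(value) if isinstance(value, str) else None
        resolved[key] = context[match.group(1)] if match else value
    return resolved


def patch_function(data: str, *, target_host: str, cas_mapping: Mapping[str, str]) -> str:
    """Example callable: receives the resolved args as keyword arguments."""
    patched = data.replace("old-host.example.com", target_host)
    for old, new in cas_mapping.items():
        patched = patched.replace(f'library="{old}"', f'library="{new}"')
    return patched


context = {"target_host": "new-host.example.com", "cas_mapping": {"CASUSER": "PUBLIC"}}
kwargs = resolve_args(
    {"target_host": "{{target_host}}", "cas_mapping": "{{cas_mapping}}"}, context
)
result = patch_function(
    '<r url="https://old-host.example.com/x" library="CASUSER"/>', **kwargs
)
```

A function like this would live in a whitelisted module such as myproject.transforms and be referenced as myproject.transforms:patch_function.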

Composed patches: surgical multi-object workflows

When a transformation needs to apply different patches to different objects (e.g. some reports need a specific caslib while others need the generic one), use composed_patch instead of an inline patch:

chains:
  remap_host:
    patch:
      replace:
        "https://source.res.private/": "https://target.res.private/"

  remap_caslib_global:
    patch:
      replace:
        'library="CASUSER"': 'library="PROD_GLOBAL_LIB"'

  remap_caslib_r220:
    patch:
      replace:
        'library="CASUSER"': 'library="R220_DEDICATED_LIB"'

  patch_report_composed:
    preset: sas_report

    global_patches:
      - remap_host             # Applied to every object

    targeted_patches:
      - filter:
          content_type: report
          name: "R220_*"
        patches:
          - remap_caslib_r220

      - filter:
          content_type: report
          name: "*"            # Fallback for other reports
        patches:
          - remap_caslib_global

Warning

Cascade is by declaration order, NOT by filter specificity. This is the inverse of CSS. Order your targeted_patches from most general to most specific because the last applied patch wins on conflict.

A “patch-only” chain (one with only patch and no forward/preset) is allowed and is designed to be referenced from a composed_patch. While it can be instantiated and invoked directly, its primary use case is as a named patch target for composed patch orchestration.
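The selection order described above (global patches first, then every targeted entry whose filter matches, in declaration order) can be sketched with the standard fnmatch module. The function names and the dict layout are illustrative assumptions; using fnmatchcase rather than fnmatch is also a choice made here to keep matching case-sensitive on every platform.

```python
from fnmatch import fnmatchcase
from typing import Any, List, Mapping


def matches(filter_: Mapping[str, str], metadata: Mapping[str, Any]) -> bool:
    """All filter fields are ANDed; "*" accepts any value."""
    content_type = filter_.get("content_type", "*")
    type_ok = content_type == "*" or content_type == metadata.get("content_type")
    name_ok = fnmatchcase(str(metadata.get("name", "")), filter_.get("name", "*"))
    return type_ok and name_ok


def select_patches(composed: Mapping[str, Any], metadata: Mapping[str, Any]) -> List[str]:
    """Return patch chain names in the order they would be applied."""
    selected = list(composed.get("global_patches", ()))
    for entry in composed.get("targeted_patches", ()):
        if matches(entry["filter"], metadata):
            selected.extend(entry["patches"])
    return selected


composed = {
    "global_patches": ["remap_host"],
    "targeted_patches": [
        {"filter": {"content_type": "report", "name": "R220_*"}, "patches": ["remap_caslib_r220"]},
        {"filter": {"content_type": "report", "name": "*"}, "patches": ["remap_caslib_global"]},
    ],
}
order = select_patches(composed, {"content_type": "report", "name": "R220_SALES"})
```

For R220_SALES both targeted entries match, so the sketch selects all three patches in declaration order; nothing is reordered by how specific a filter is.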

See Transform for the full decision matrix and runtime behavior.

Python API

Quick Functions

from kstlib.transform import transform, load_transform_config

# Convenience function: loads config from kstlib.conf.yml and applies
result = transform(blob_b64, "patch_report")

# Pass metadata for composed_patch filter matching
result = transform(
    blob_b64,
    "patch_report_composed",
    metadata={"content_type": "report", "name": "R220_SALES"},
)

Client Instance

from kstlib.transform import TransformChain, load_transform_config

config = load_transform_config()

# Build a chain from a named config entry (resolves preset inheritance)
chain = TransformChain.from_config("patch_report", config)

# Forward only
decoded = chain.forward(blob_b64)

# Backward only (must be called after forward to restore envelopes)
re_encoded = chain.backward(decoded)

# Full round-trip
patched = chain.transform(blob_b64)

# With composed_patch metadata
patched = chain.transform(
    blob_b64,
    metadata={"content_type": "report", "name": "R220_SALES"},
)

Programmatic Construction

from kstlib.transform import (
    TransformChain,
    TransformChainConfig,
    PrimitiveConfig,
    PatchConfig,
)

chain = TransformChain(
    TransformChainConfig(
        name="my_chain",
        forward=(
            PrimitiveConfig(name="base64"),
            PrimitiveConfig(name="json"),
        ),
        patch=PatchConfig(replace={"foo": "bar"}),
    )
)

result = chain.transform(blob_b64_string)

Security and Hard Limits

The transform engine applies defense in depth against malformed or malicious input.

Callable whitelist

External callables can only be invoked if their module is listed in transforms.security.allowed_callable_modules:

transforms:
  security:
    allowed_callable_modules:
      - myproject.transforms
      - myproject.viya.patches

A callable target whose module is not in the whitelist raises TransformConfigError at config-load time, before any transformation runs.
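A minimal sketch of a prefix-based whitelist check, assuming that a listed module also covers its submodules and that matching stops at a dot boundary. Both points are assumptions about the semantics, and is_callable_allowed is a hypothetical helper, not a kstlib function.

```python
from typing import Iterable


def is_callable_allowed(target: str, allowed_modules: Iterable[str]) -> bool:
    """Check a module.path:function target against whitelisted module prefixes."""
    module, separator, function = target.partition(":")
    if not separator or not module or not function:
        return False  # malformed target
    return any(
        module == prefix or module.startswith(prefix + ".")
        for prefix in allowed_modules
    )


allowed = frozenset({"myproject.transforms"})
ok = is_callable_allowed("myproject.transforms:patch_function", allowed)
lookalike = is_callable_allowed("myproject.transforms_evil:run", allowed)
```

Comparing against prefix + "." keeps a lookalike module such as myproject.transforms_evil from slipping through, and an empty whitelist rejects everything.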

Size limits

Limit	Default	Hard Max
Input data size	100 MB	100 MB
JSON payload size	50 MB	50 MB
XML payload size	50 MB	50 MB
Decompressed size	200 MB	200 MB
Decompression ratio	100x	100x
Mapping entries per patch	100	100
Named chains	50	50
Global patches per composition	10	10
Targeted patches per composition	50	50
Patches per targeted entry	10	10

Zlib bomb protection

The zlib_decompress primitive enforces both an absolute decompressed size limit and a maximum decompression ratio. A zlib stream that expands beyond either threshold raises DecompressError immediately.
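One way to enforce both thresholds without inflating the whole stream first is the max_length argument of the standard zlib decompression object. The sketch below is an assumption about the general technique, not a description of how zlib_decompress is implemented; bounded_decompress and DecompressLimitError are hypothetical names.

```python
import zlib


class DecompressLimitError(Exception):
    """Raised when a stream exceeds the size or ratio budget (illustrative)."""


def bounded_decompress(data: bytes, *, max_size: int, max_ratio: float) -> bytes:
    """Inflate data, refusing output beyond max_size or max_ratio times the input."""
    limit = min(max_size, int(len(data) * max_ratio))
    inflater = zlib.decompressobj()
    # Ask for one byte more than allowed: receiving it proves the limit was crossed.
    output = inflater.decompress(data, limit + 1)
    if len(output) > limit:
        raise DecompressLimitError(f"output exceeds {limit} bytes")
    if not inflater.eof:
        raise DecompressLimitError("truncated or incomplete zlib stream")
    return output


small = bounded_decompress(zlib.compress(b"hello"), max_size=1024, max_ratio=100)
bomb = zlib.compress(b"\x00" * 1_000_000)  # highly repetitive, expands about 1000x
```

Calling bounded_decompress on bomb with max_ratio=100 raises, because the stream would expand far beyond one hundred times its compressed size, and at most limit + 1 bytes are ever produced before the check.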

XML security

The xml_parse primitive uses defusedxml if available (recommended). DOCTYPE declarations are rejected by default to prevent XXE attacks and billion-laughs expansion.
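For readers without defusedxml at hand, the idea of rejecting DOCTYPE declarations can be illustrated with the standard library alone. This is a sketch of the principle only: it is not how xml_parse works, it is not a substitute for defusedxml, and the names DoctypeForbidden and parse_xml_no_doctype are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration (illustrative)."""


def parse_xml_no_doctype(text: str) -> ET.Element:
    """Parse XML, aborting as soon as a DOCTYPE declaration starts."""

    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = reject
    checker.Parse(text, True)  # the handler fires before any entity is defined
    return ET.fromstring(text)


root = parse_xml_no_doctype("<report><title>ok</title></report>")
hostile = '<?xml version="1.0"?><!DOCTYPE r [<!ENTITY x "y">]><r>&x;</r>'
```

Passing hostile to parse_xml_no_doctype raises DoctypeForbidden before the entity definition is processed, which is the property that blocks both XXE and entity-expansion payloads.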

Integration with kstlib.pipeline

The transform engine integrates with kstlib.pipeline through CallableStep. A pipeline step can invoke kstlib.transform.transform, passing the loaded data as the first argument and the chain name as the second:

pipelines:
  patch-and-upload:
    steps:
      - name: patch
        type: callable
        callable: kstlib.transform:transform
        args:
          - "{{loaded_blob}}"
          - "patch_report"

      - name: upload
        type: shell
        command: "kstlib rapi upload --body @result.json"

API Reference

Chain

class kstlib.transform.TransformChain(config, *, context=None, transform_config=None, allowed_modules=None)

Bases: object

Execute a chain of transform primitives with optional patching.

Warning

TransformChain instances are not reentrant. Do not call transform() concurrently from multiple threads on the same instance. Each call resets internal state (_chain_context). Create one instance per thread if concurrent execution is needed.

Parameters:

config (TransformChainConfig) – Resolved chain configuration (no preset references).
context (Mapping[str, Any] | None) – Optional external context for {{variable}} resolution.

Examples

>>> from kstlib.transform.config import PrimitiveConfig, TransformChainConfig
>>> chain = TransformChain(TransformChainConfig(
...     name="test",
...     forward=(PrimitiveConfig(name="base64"),),
... ))
>>> chain.forward("SGVsbG8=")
b'Hello'

__init__(self, config: TransformChainConfig, *, context: Mapping[str, Any] | None = None, transform_config: TransformConfig | None = None, allowed_modules: frozenset[str] | None = None) → None

Initialize TransformChain.

Parameters:

config (TransformChainConfig) – Resolved chain configuration.
context (Mapping[str, Any] | None) – External context for variable resolution.
transform_config (TransformConfig | None) – Top-level config used to resolve chain references in composed_patch. Required when config.composed_patch is set.
allowed_modules (frozenset[str] | None) – Whitelist of allowed callable module prefixes. When None (direct construction without from_config), any callable patch is rejected (fail-closed). Pass an explicit frozenset to allow specific modules.

Raises:

TransformConfigError – If composed_patch is set but transform_config was not provided.

classmethod from_config(name: str, transform_config: TransformConfig, *, context: Mapping[str, Any] | None = None) → TransformChain

Create a TransformChain from a named config entry.

Resolves presets and returns a ready-to-use chain.

Parameters:

name (str) – Chain name to look up in transform_config.
transform_config (TransformConfig) – Top-level TransformConfig.
context (Mapping[str, Any] | None) – External context for variable resolution.

Returns:

Configured TransformChain.

Raises:

TransformChainError – If chain name not found.

Return type:

TransformChain

forward(self, data: Any) → Any

Apply forward primitives in order.

Parameters:

data (Any) – Input data (typically base64 string).

Returns:

Decoded/parsed data ready for patching.

Raises:

TransformChainError – If any primitive fails.

Return type:

Any

backward(self, data: Any) → Any

Apply backward primitives in order.

Uses stored envelopes from forward for lossless JSON restoration.

Parameters:

data (Any) – Data to re-encode (typically patched XML string or Element).

Returns:

Re-encoded data (same format as original input).

Raises:

TransformConfigError – If the chain is forward-only (lossy extractor) and therefore has no backward path.
TransformChainError – If any primitive fails.

Return type:

Any

patch(self, data: Any, *, metadata: Mapping[str, Any] | None = None) → Any

Apply patch to decoded data.

If the chain has an inline patch, applies it directly. If the chain has a composed_patch, applies the global patches then the targeted patches whose filter matches metadata (in declaration order, last applied wins).

Parameters:

data (Any) – Decoded data from forward.
metadata (Mapping[str, Any] | None) – Object metadata. Used for filter matching in composed patches (typical keys: content_type, name). May also carry "outer" referencing the JSON wrapper to mutate when the patch declares scope: outer or scope: all.

Returns:

Patched data.

Raises:

PatchError – If patching fails.
CallableError – If a callable patch raises.
CallableImportError – If a callable cannot be imported.

Return type:

Any

transform(self, data: Any, *, metadata: Mapping[str, Any] | None = None) → Any

Full round-trip: forward -> patch -> backward.

This is the main entry point for most use cases. Forward-only chains (lossy extractors like split) skip the backward leg: the patched, extracted value is returned as-is.

Parameters:

data (Any) – Raw input data.
metadata (Mapping[str, Any] | None) – Object metadata used for filter matching in composed patches. Ignored for inline patch.

Returns:

Transformed data (same format as input).

Raises:

TransformChainError – If any stage fails.

Return type:

Any

kstlib.transform.transform(data: Any, chain_name: str, config: TransformConfig | None = None, context: dict[str, Any] | None = None, *, metadata: Mapping[str, Any] | None = None) → Any

Apply a named transform chain to data.

Convenience function for use in CallableStep pipelines. Loads config from kstlib.conf.yml if not provided.

Parameters:

data (Any) – Raw input data.
chain_name (str) – Name of the transform chain to apply.
config (TransformConfig | None) – Transform config (loads from kstlib.conf.yml if None).
context (dict[str, Any] | None) – Variables for {{variable}} resolution in callable args.
metadata (Mapping[str, Any] | None) – Object metadata for filter matching in composed patches (typical keys: content_type, name).

Returns:

Transformed data.

Return type:

Any

Examples

>>> transform("SGVsbG8=", "my_chain")  

kstlib.transform.replace_outer_uris(obj: Any, replace_map: Mapping[str, str], *, protected_paths: frozenset[str] = frozenset({'connectors[*].hints.xpath'}), additional_protected_paths: frozenset[str] | None = None) → int

Recursively patch string values in a JSON-like object, in place.

Walks the object tree and applies str.replace(old, new) for every entry of replace_map to every string value, skipping any path that matches protected_paths. The object is mutated in place.

Path syntax: each protected path is a dotted string. [*] matches any list index. Dict keys are matched literally. For example, "connectors[*].hints.xpath" matches obj["connectors"][i]["hints"]["xpath"] for every i.

Note

Keys containing a literal "." cannot be expressed in the dotted-path syntax and are therefore not protectable via protected_paths. This is a known limitation.

This helper is meant to be called from caller code that knows about the wrapper structure (e.g. SAS Viya transfer packages where the BIRD XML lives inside an encoded blob but connectors[].uri and connectors[].hints.orig-uri live in the outer JSON wrapper). Use it together with PatchConfig(scope='outer') or scope='all'.

Parameters:

obj (Any) – JSON-like nested structure (dict, list, str, int, …). Mutated in place. Non-string scalars are left unchanged.
replace_map (Mapping[str, str]) – Mapping of old -> new substring replacements. Replacements are applied in iteration order.
protected_paths (frozenset[str]) – Dotted-path patterns that must NOT be patched. Defaults to PROTECTED_OUTER_PATHS.
additional_protected_paths (frozenset[str] | None) – Extra patterns merged with protected_paths. Provides an additive API so callers can extend the defaults without accidentally wiping them.

Returns:

Total number of string values that were modified.

Return type:

int

Examples

>>> wrapper = {"connectors": [{"uri": "library=CASUSER", "hints": {"xpath": "/foo/CASUSER"}}]}
>>> replace_outer_uris(wrapper, {"CASUSER": "PUBLIC"})
1
>>> wrapper["connectors"][0]["uri"]
'library=PUBLIC'
>>> wrapper["connectors"][0]["hints"]["xpath"]
'/foo/CASUSER'
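The dotted-path matching described for protected_paths can be approximated with a short recursive walk. This is a sketch under stated assumptions, not the library code: patch_strings is a hypothetical function, protected patterns are compared as exact strings, and every list index is rendered as [*] while the path is being built.

```python
from typing import Any, Iterable, Mapping


def patch_strings(
    obj: Any, replace_map: Mapping[str, str], protected: Iterable[str], path: str = ""
) -> int:
    """Patch string values in place and return how many were modified."""
    if isinstance(obj, dict):
        children = [(key, f"{path}.{key}" if path else str(key)) for key in obj]
    elif isinstance(obj, list):
        children = [(index, f"{path}[*]") for index in range(len(obj))]
    else:
        return 0  # scalars are handled through their parent container
    changed = 0
    for key, child_path in children:
        if child_path in protected:
            continue  # never touch a protected path or anything beneath it
        value = obj[key]
        if isinstance(value, str):
            patched = value
            for old, new in replace_map.items():
                patched = patched.replace(old, new)
            if patched != value:
                obj[key] = patched
                changed += 1
        else:
            changed += patch_strings(value, replace_map, protected, child_path)
    return changed


wrapper = {"connectors": [{"uri": "library=CASUSER", "hints": {"xpath": "/foo/CASUSER"}}]}
count = patch_strings(wrapper, {"CASUSER": "PUBLIC"}, {"connectors[*].hints.xpath"})
```

On the wrapper from the doctest above, the sketch modifies the uri value and leaves the xpath hint alone, because the path built for it is connectors[*].hints.xpath.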

Constants

kstlib.transform.PATCH_SCOPE_VALUES = frozenset({'all', 'blob', 'outer'})

Allowed values for the scope field of PatchConfig.

kstlib.transform.PROTECTED_OUTER_PATHS = frozenset({'connectors[*].hints.xpath'})

Default value of protected_paths in replace_outer_uris: dotted-path patterns that are never patched.

Configuration

class kstlib.transform.TransformConfig(chains=<factory>, allowed_callable_modules=frozenset({}))

Bases: object

Top-level transform configuration from kstlib.conf.yml.

chains

Named transform chain configurations.

Type: dict[str, kstlib.transform.config.TransformChainConfig]

allowed_callable_modules

Whitelist of allowed callable module prefixes.

Type: frozenset[str]

Examples

>>> TransformConfig(chains={"decode": TransformChainConfig(name="decode", forward=(PrimitiveConfig(name="base64"),))})
TransformConfig(chains={...}, allowed_callable_modules=frozenset())

chains: dict[str, TransformChainConfig]

allowed_callable_modules: frozenset[str]

__post_init__(self) → None

Validate top-level configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, chains: dict[str, TransformChainConfig] = <factory>, allowed_callable_modules: frozenset[str] = frozenset()) → None

class kstlib.transform.TransformChainConfig(name, forward=(), backward=None, patch=None, composed_patch=None, preset=None)

Bases: object

Configuration for a named transform chain.

A chain declares how to transform data. It must provide at least one of:

forward: explicit forward primitive chain
preset: inherit forward/backward from another chain
patch or composed_patch: patch-only chain (no forward/backward). Such chains are meant to be referenced by a composed_patch of another chain, not instantiated directly.

name

Chain name (for logging and error messages).

Type: str

forward

Ordered tuple of forward primitives.

Type: tuple[kstlib.transform.config.PrimitiveConfig, …]

backward

Ordered tuple of backward primitives (None = auto-reverse).

Type: tuple[kstlib.transform.config.PrimitiveConfig, …] | None

patch

Patch configuration (None = no inline patching).

Type: kstlib.transform.config.PatchConfig | None

composed_patch

Composed patches referencing other chains (None = absent).

Type: kstlib.transform.config.ComposedPatchConfig | None

preset

Name of preset to inherit from (None = standalone).

Type: str | None

Examples

>>> TransformChainConfig(
...     name="decode",
...     forward=(PrimitiveConfig(name="base64"),),
... )
TransformChainConfig(name='decode', forward=(...), backward=None, patch=None, composed_patch=None, preset=None)

name: str

forward: tuple[PrimitiveConfig, ...]

backward: tuple[PrimitiveConfig, ...] | None

patch: PatchConfig | None

composed_patch: ComposedPatchConfig | None

preset: str | None

__post_init__(self) → None

Validate chain configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, name: str, forward: tuple[PrimitiveConfig, ...] = (), backward: tuple[PrimitiveConfig, ...] | None = None, patch: PatchConfig | None = None, composed_patch: ComposedPatchConfig | None = None, preset: str | None = None) → None

class kstlib.transform.PrimitiveConfig(name, options=<factory>)

Bases: object

Configuration for a single transform primitive.

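name

Primitive name: one of the bidirectional primitives (base64, zlib, json, xml, bytes) or one of the forward-only extractors (split, tr, removeprefix, removesuffix).

Type: str

options

Primitive-specific options dict.

Type: dict[str, Any]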

Examples

>>> PrimitiveConfig(name="base64")
PrimitiveConfig(name='base64', options={})
>>> PrimitiveConfig(name="zlib", options={"skip_bytes": 3})
PrimitiveConfig(name='zlib', options={'skip_bytes': 3})

name: str

options: dict[str, Any]

__post_init__(self) → None

Validate primitive configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, name: str, options: dict[str, Any] = <factory>) → None

class kstlib.transform.PatchConfig(replace=None, scope='blob', callable=None, args=<factory>, mapping=None)

Bases: object

Configuration for the patch stage between forward and backward.

A patch operates either as a string-replacement mapping (replace) or as a Python callable (callable). The two modes are mutually exclusive.

The scope field controls WHERE replacements apply:

"blob" (default): patch the data decoded by the forward chain (e.g. the BIRD XML extracted from a SAS Viya report blob). Preserves the historical behavior.
"outer": patch the wrapper dict passed in metadata["outer"] to chain.transform(). The wrapper is mutated in place; the blob itself is not modified beyond the normal forward+backward round-trip. Useful for fields like connectors[].uri that live outside the encoded blob.
"all": do both, blob first then outer.

replace

String replacement mapping {old: new}. Mutually exclusive with callable.

Type: dict[str, str] | None

scope

Where to apply the replace mapping. One of "blob" (default), "outer", "all".

Type: Literal['blob', 'outer', 'all']

callable

Import target module.path:function for complex patch logic. Mutually exclusive with replace.

Type: str | None

args

Keyword arguments passed to the callable as **kwargs.

Type: dict[str, Any]

mapping

Deprecated alias for replace. Setting it emits a DeprecationWarning and copies the value to replace. Will be removed in a future version. Do not set both mapping and replace (raises).

Type: dict[str, str] | None

Examples

>>> PatchConfig(replace={"old": "new"})
PatchConfig(replace={'old': 'new'}, scope='blob', callable=None, args={}, mapping=None)
>>> PatchConfig(replace={"a": "b"}, scope="all")
PatchConfig(replace={'a': 'b'}, scope='all', callable=None, args={}, mapping=None)

replace: dict[str, str] | None

scope: Literal['blob', 'outer', 'all']

callable: str | None

args: dict[str, Any]

mapping: dict[str, str] | None

__post_init__(self) → None

Validate patch configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, replace: dict[str, str] | None = None, scope: Literal['blob', 'outer', 'all'] = 'blob', callable: str | None = None, args: dict[str, Any] = <factory>, mapping: dict[str, str] | None = None) → None

class kstlib.transform.FilterConfig(content_type='*', name='*')

Bases: object

Filter used by TargetedPatchConfig to select matching objects.

All fields are ANDed: an object matches only if every field matches. A value of "*" means “any value”.

content_type

Object content type (“report”, “folder”, “file”, or “*”).

Type: str

name

fnmatch glob pattern on the object name (e.g. "R220_*").

Type: str

Examples

>>> FilterConfig(content_type="report", name="R220_*")
FilterConfig(content_type='report', name='R220_*')
>>> FilterConfig()
FilterConfig(content_type='*', name='*')

content_type: str

name: str

__post_init__(self) → None

Validate filter configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, content_type: str = '*', name: str = '*') → None

class kstlib.transform.TargetedPatchConfig(filter, patches)

Bases: object

A filter plus a list of patch chain names to apply when it matches.

filter

Filter describing which objects this entry applies to.

Type: kstlib.transform.config.FilterConfig

patches

Ordered tuple of chain names whose .patch is applied.

Type: tuple[str, …]

Examples

>>> TargetedPatchConfig(
...     filter=FilterConfig(content_type="report", name="R220_*"),
...     patches=("remap_caslib_r220",),
... )
TargetedPatchConfig(filter=FilterConfig(...), patches=('remap_caslib_r220',))

filter: FilterConfig

patches: tuple[str, ...]

__post_init__(self) → None

Validate targeted patch configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, filter: FilterConfig, patches: tuple[str, ...]) → None

class kstlib.transform.ComposedPatchConfig(global_patches=(), targeted_patches=())

Bases: object

Composition of global and targeted patch chain references.

Execution order (per object):

global_patches: applied to every object, in declaration order.
targeted_patches: for each entry in declaration order, if the filter matches the object metadata, apply all its patches in order.

Last applied wins on conflict, following kstlib cascade philosophy (kwargs > user config > preset > defaults). Ordering is by declaration, not by filter specificity. Order your targeted_patches from most general to most specific.

global_patches

Chain names applied to every object.

Type: tuple[str, …]

targeted_patches

Conditional entries applied when their filter matches.

Type: tuple[kstlib.transform.config.TargetedPatchConfig, …]

Examples

>>> ComposedPatchConfig(
...     global_patches=("remap_host",),
...     targeted_patches=(
...         TargetedPatchConfig(
...             filter=FilterConfig(name="R220_*"),
...             patches=("remap_caslib_r220",),
...         ),
...     ),
... )
ComposedPatchConfig(global_patches=('remap_host',), targeted_patches=(...))

global_patches: tuple[str, ...]

targeted_patches: tuple[TargetedPatchConfig, ...]

__post_init__(self) → None

Validate composed patch configuration.

Raises: TransformConfigError – If configuration is invalid.

__init__(self, global_patches: tuple[str, ...] = (), targeted_patches: tuple[TargetedPatchConfig, ...] = ()) → None

kstlib.transform.load_transform_config() → TransformConfig

Load transform configuration from kstlib.conf.yml.

Reads the transforms: section from the global config.

Returns: Parsed TransformConfig, or empty config if section absent.
Return type: TransformConfig

Examples

>>> config = load_transform_config()  

Primitives

kstlib.transform.base64_decode(data: str, config: PrimitiveConfig) → bytes

Decode base64 string to bytes.

Supports SAS Viya wire formats and other proprietary base64 variants via two opt-in options:

strip_prefix: a literal string that, if present at the start of the input, is removed before decoding. Useful for SAS Viya report blobs which begin with "TRUE###" (the TRUE part decodes to the 3-byte SAS proprietary header and ### is a separator that lenient base64 decoders skip).
strict: when True (default), the underlying decoder runs with validate=True and rejects any character outside the base64 alphabet. When False, non-alphabet characters are stripped before decoding (matches the de facto behavior of Python’s stdlib base64.b64decode and most other tools).

Parameters:

data (str) – Base64-encoded string. May include a configurable prefix and (in lenient mode) embedded whitespace or separators.
config (PrimitiveConfig) –
Primitive config. Recognized options:
- strip_prefix (str | None): literal prefix to remove before decoding. Default None. Max 32 chars. If the input does not start with this prefix, the option is a no-op (does NOT raise) so the same chain can handle mixed blobs that sometimes carry the prefix.
- strict (bool): when True (default) reject any non-alphabet character; when False strip them silently before decoding.

Returns:

Decoded bytes.

Raises:

DecodeError – If data is not a string, exceeds the input size limit, or fails to decode after the configured pre-processing.

Return type:

bytes

Examples

>>> base64_decode("SGVsbG8=", PrimitiveConfig(name="base64"))
b'Hello'
>>> # SAS Viya pattern: strip the proprietary "TRUE###" prefix
>>> cfg = PrimitiveConfig(name="base64",
...     options={"strip_prefix": "TRUE###", "strict": False})
>>> base64_decode("TRUE###SGVsbG8=", cfg)
b'Hello'
>>> # strip_prefix is a no-op when the input does not start with it
>>> cfg2 = PrimitiveConfig(name="base64",
...     options={"strip_prefix": "TRUE###"})
>>> base64_decode("SGVsbG8=", cfg2)
b'Hello'
>>> # Lenient mode strips embedded non-alphabet noise
>>> cfg3 = PrimitiveConfig(name="base64", options={"strict": False})
>>> base64_decode("SGVs###bG8=", cfg3)
b'Hello'
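The remark that the TRUE part of the prefix decodes to the 3-byte proprietary header can be checked with the standard base64 module alone. The snippet below uses only stdlib calls and says nothing about kstlib internals; it also shows the strict versus lenient behavior that the strict option mirrors.

```python
import base64
import binascii

# "TRUE" is itself valid base64 and decodes to the header bytes 4d 15 04.
header = base64.b64decode("TRUE")

# Lenient decoding (validate=False, the stdlib default) discards "#" characters,
# so the whole prefixed blob decodes to header + payload.
lenient = base64.b64decode("TRUE###SGVsbG8=")

# Strict decoding rejects the same input because "#" is outside the alphabet.
try:
    base64.b64decode("TRUE###SGVsbG8=", validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True
```

This is why a lenient decoder sees three extra leading bytes on such blobs, and why stripping the prefix before decoding yields the payload alone.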

kstlib.transform.base64_encode(data: bytes, config: PrimitiveConfig) → str

Encode bytes to base64 string with an optional literal prefix.

The prefix option allows reattaching a proprietary marker after encoding, mirroring base64_decode’s strip_prefix on the forward path. The typical SAS Viya use case is "TRUE###": the forward chain strips it before decoding, the backward chain re-prepends it after encoding so the wire format is preserved bit-for-bit.

Parameters:

data (bytes) – Raw bytes to encode.
config (PrimitiveConfig) –
Primitive config. Recognized options:
- prefix (str | None): literal string prepended to the base64 result. Default None. Max 32 chars.

Returns:

Base64-encoded string, optionally prefixed.

Raises:

EncodeError – If data is not bytes.

Return type:

str

Examples

>>> base64_encode(b"Hello", PrimitiveConfig(name="base64"))
'SGVsbG8='
>>> # Reattach the SAS Viya proprietary prefix
>>> cfg = PrimitiveConfig(name="base64", options={"prefix": "TRUE###"})
>>> base64_encode(b"Hello", cfg)
'TRUE###SGVsbG8='
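
The prefix round trip described above can be sketched with the standard library alone. This is a minimal illustration, not the kstlib implementation: the helper names `decode_with_prefix` and `encode_with_prefix` are hypothetical, and the lenient-mode filter assumes the standard base64 alphabet.

```python
import base64
import re

PREFIX = "TRUE###"  # marker taken from the examples above


def decode_with_prefix(text: str, prefix: str = PREFIX, strict: bool = True) -> bytes:
    """Hypothetical stand-in for the forward step (strip prefix, then decode)."""
    # Remove the marker only when it is present; otherwise this is a no-op.
    if text.startswith(prefix):
        text = text[len(prefix):]
    if not strict:
        # Lenient mode: silently drop anything outside the base64 alphabet.
        text = re.sub(r"[^A-Za-z0-9+/=]", "", text)
    # validate=True makes the stdlib reject non-alphabet characters.
    return base64.b64decode(text, validate=True)


def encode_with_prefix(raw: bytes, prefix: str = PREFIX) -> str:
    """Hypothetical stand-in for the backward step (encode, then re-prepend)."""
    return prefix + base64.b64encode(raw).decode("ascii")


wire = "TRUE###SGVsbG8="
round_tripped = encode_with_prefix(decode_with_prefix(wire))
```

If the forward and backward steps use the same marker, `round_tripped` should equal `wire`, which is the property the backward chain relies on.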

kstlib.transform.bytes_decode(data: bytes, config: PrimitiveConfig) → str¶

Decode bytes to string.

Parameters:

data (bytes) – Raw bytes.
config (PrimitiveConfig) – Options: encoding (default utf-8).

Returns:

Decoded string.

Raises:

DecodeError – If decoding fails.

Return type:

str

Examples

>>> bytes_decode(b"Hello", PrimitiveConfig(name="bytes"))
'Hello'

kstlib.transform.bytes_encode(data: str, config: PrimitiveConfig) → bytes¶

Encode string to bytes.

Parameters:

data (str) – String to encode.
config (PrimitiveConfig) – Options: encoding (default utf-8).

Returns:

Encoded bytes.

Raises:

EncodeError – If encoding fails.

Return type:

bytes

Examples

>>> bytes_encode("Hello", PrimitiveConfig(name="bytes"))
b'Hello'

kstlib.transform.zlib_compress(data: bytes, config: PrimitiveConfig) → bytes¶

Compress data with zlib, optionally prepending a header.

Parameters:

data (bytes) – Raw bytes to compress.
config (PrimitiveConfig) –
Primitive config. Recognized options:
- prepend_bytes (str | None): hex string prepended before the compressed bytes. Default None.
- level (int): compression level passed to zlib.compress. Range -1 to 9, where -1 means “use the Python zlib default level” (typically 6), 0 means no compression, and 9 means maximum compression. Default -1. Higher values produce smaller output but are slower.

Returns:

Compressed bytes with optional header prefix.

Raises:

CompressError – If compression fails or prepend_bytes hex is invalid.

Return type:

bytes

Examples

>>> result = zlib_compress(b"Hello", PrimitiveConfig(name="zlib"))
>>> import zlib
>>> zlib.decompress(result)
b'Hello'
>>> # Maximum compression level
>>> cfg = PrimitiveConfig(name="zlib", options={"level": 9})
>>> result9 = zlib_compress(b"Hello world " * 100, cfg)
>>> zlib.decompress(result9) == b"Hello world " * 100
True

kstlib.transform.zlib_decompress(data: bytes, config: PrimitiveConfig) → bytes¶

Decompress zlib data with optional header skip.

Parameters:

data (bytes) – Compressed bytes (possibly with prefix header).
config (PrimitiveConfig) – Options: skip_bytes (int) strips N leading bytes.

Returns:

Decompressed bytes.

Raises:

DecompressError – If decompression fails or the input is invalid.

Return type:

bytes

Examples

>>> import zlib
>>> compressed = zlib.compress(b"Hello")
>>> zlib_decompress(compressed, PrimitiveConfig(name="zlib"))
b'Hello'
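
The interplay between prepend_bytes on the backward path and skip_bytes on the forward path can be sketched with the standard library. This is an illustrative sketch under assumptions, not the kstlib code: the helper names are hypothetical, the header value is borrowed from the overview example, and the `max_size` cap is a hypothetical stand-in for the bomb protection mentioned in the overview.

```python
import zlib

HEADER_HEX = "4d1504"  # 3-byte header borrowed from the overview example


def compress_with_header(raw: bytes, header_hex: str = HEADER_HEX, level: int = -1) -> bytes:
    """Hypothetical backward step: hex header followed by the zlib stream."""
    header = bytes.fromhex(header_hex)  # raises ValueError on invalid hex
    return header + zlib.compress(raw, level)


def decompress_skipping(blob: bytes, skip_bytes: int = 3, max_size: int = 1_000_000) -> bytes:
    """Hypothetical forward step: drop the header, then inflate with a size cap."""
    inflater = zlib.decompressobj()
    # Ask for at most max_size + 1 bytes so an oversized payload is detectable
    # without inflating all of it.
    out = inflater.decompress(blob[skip_bytes:], max_size + 1)
    if len(out) > max_size:
        raise ValueError("decompressed payload exceeds max_size")
    return out


blob = compress_with_header(b"Hello")
restored = decompress_skipping(blob)
```

The number of skipped bytes has to match the length of the decoded header (three bytes for `"4d1504"`), otherwise the zlib stream starts at the wrong offset and inflation fails.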

kstlib.transform.json_parse(data: str | bytes, config: PrimitiveConfig) → tuple[Any, dict[str, Any] | None]¶

Parse JSON string, optionally extracting a nested field.

Returns a tuple of (value, envelope). When extract is set, envelope holds the original parsed dict so the backward step can restore it losslessly. When extract is not set, envelope is None.

Parameters:

data (str | bytes) – JSON string or bytes.
config (PrimitiveConfig) – Options: extract (dot-notation path to a nested field).

Returns:

Tuple of (extracted_or_full_value, original_envelope_or_None).

Raises:

ParseError – If JSON parsing fails or the extract path is not found.

Return type:

tuple[Any, dict[str, Any] | None]

Examples

>>> val, env = json_parse('{"a": 1}', PrimitiveConfig(name="json"))
>>> val
{'a': 1}
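
The (value, envelope) contract is easier to see with a dot path in play. The following is a minimal stdlib sketch of that behaviour, assuming the path is resolved key by key through nested dicts. `parse_and_extract` is a hypothetical helper, not part of kstlib, and the sample document is made up for illustration.

```python
import json
from typing import Any, Optional


def parse_and_extract(text: str, extract: Optional[str] = None) -> tuple[Any, Optional[dict[str, Any]]]:
    """Hypothetical stand-in: parse JSON and optionally follow a dot path."""
    parsed = json.loads(text)
    if extract is None:
        # No extraction requested: return the full value and no envelope.
        return parsed, None
    value = parsed
    for key in extract.split("."):
        value = value[key]  # KeyError here corresponds to "path not found"
    # Keep the whole parsed document so it can be restored later.
    return value, parsed


doc = '{"transferableContent": {"content": "<xml/>"}, "version": 2}'
value, envelope = parse_and_extract(doc, "transferableContent.content")
```

Here `value` is the inner payload only, while `envelope` still carries sibling fields such as `version`, which is what makes a lossless backward step possible.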

kstlib.transform.json_serialize(data: Any, config: PrimitiveConfig, *, envelope: dict[str, Any] | None = None) → str¶

Serialize Python object to JSON string.

If both the wrap path and an envelope are provided, the value is restored into the original envelope structure (lossless round-trip).

Parameters:

data (Any) – Python object to serialize.
config (PrimitiveConfig) –
Primitive config. Recognized options:
- wrap (str | None): dot-notation path used together with envelope to restore the value inside its original envelope structure. Default None.
- minify (bool): when True, output uses compact separators=(",", ":") (no whitespace). When False (default), uses Python’s default separators (", ", ": "). Minifying before zlib compression usually yields a smaller compressed payload, since the redundant whitespace is never emitted.
- ensure_ascii (bool): when True, escape every non-ASCII character as \uXXXX. When False (the default, unlike Python’s json.dumps, which defaults to True), non-ASCII characters are emitted verbatim, which preserves Unicode content (French, Japanese, etc.) without bloating the output.
envelope (dict[str, Any] | None) – Original envelope for lossless restoration when wrap is set.

Returns:

JSON string.

Raises:

SerializeError – If serialization fails.

Return type:

str

Examples

>>> json_serialize({"a": 1}, PrimitiveConfig(name="json"))
'{"a": 1}'
>>> # Minified output (no spaces after , and :)
>>> cfg = PrimitiveConfig(name="json", options={"minify": True})
>>> json_serialize({"a": 1, "b": 2}, cfg)
'{"a":1,"b":2}'
>>> # Preserve Unicode content (default behavior)
>>> json_serialize({"k": "café"}, PrimitiveConfig(name="json"))
'{"k": "café"}'
>>> # Force ASCII escapes
>>> cfg2 = PrimitiveConfig(name="json", options={"ensure_ascii": True})
>>> json_serialize({"k": "café"}, cfg2)
'{"k": "caf\\u00e9"}'
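
The wrap option is the mirror image of extract: the patched value is written back at the same dot path inside the saved envelope. A minimal stdlib sketch of that idea follows. `serialize_wrapped` is a hypothetical helper and the envelope is made-up sample data; the real primitive may differ in details such as copying or error handling.

```python
import copy
import json
from typing import Any


def serialize_wrapped(value: Any, wrap: str, envelope: dict[str, Any],
                      minify: bool = False, ensure_ascii: bool = False) -> str:
    """Hypothetical stand-in: put value back at the wrap path, then dump."""
    restored = copy.deepcopy(envelope)  # leave the caller's envelope untouched
    *parents, leaf = wrap.split(".")
    node = restored
    for key in parents:
        node = node[key]
    node[leaf] = value
    # separators=None lets json.dumps fall back to (", ", ": ").
    separators = (",", ":") if minify else None
    return json.dumps(restored, separators=separators, ensure_ascii=ensure_ascii)


envelope = {"transferableContent": {"content": "old"}, "version": 2}
output = serialize_wrapped("new", "transferableContent.content", envelope, minify=True)
```

Because Python dicts preserve insertion order, the restored document keeps the key order of the envelope, so only the patched field changes.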

kstlib.transform.xml_parse(data: str, config: PrimitiveConfig) → Element¶

Parse XML string to ElementTree Element.

Uses defusedxml.ElementTree.fromstring for XXE protection. defusedxml raises EntitiesForbidden, DTDForbidden, or ExternalReferenceForbidden on malicious payloads; all are wrapped in a ParseError here.

Parameters:

data (str) – XML string.
config (PrimitiveConfig) – Primitive config.

Returns:

ElementTree root Element.

Raises:

ParseError – If XML parsing fails or input is unsafe.

Return type:

Element

Examples

>>> root = xml_parse("<root><a>1</a></root>", PrimitiveConfig(name="xml"))
>>> root.tag
'root'

kstlib.transform.xml_serialize(data: Element, config: PrimitiveConfig) → str¶

Serialize ElementTree Element to XML string.

Parameters:

data (Element) – ElementTree root Element.
config (PrimitiveConfig) – Primitive config.

Returns:

XML string.

Raises:

SerializeError – If serialization fails.

Return type:

str

Examples

>>> from xml.etree.ElementTree import Element
>>> root = Element("root")
>>> xml_serialize(root, PrimitiveConfig(name="xml"))
'<root />'
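
For orientation, a parse, edit, serialize cycle on an Element can be sketched with the standard library. Note the assumption: this sketch uses `xml.etree.ElementTree.fromstring`, which lacks the defusedxml protections described for xml_parse, so it is only appropriate for trusted input and is not how the kstlib primitive parses.

```python
import xml.etree.ElementTree as ET

# Trusted, hand-written input only: the stdlib parser is used here purely
# to keep the sketch free of third-party dependencies.
root = ET.fromstring("<root><a>1</a></root>")

# Edit the tree in place, as a patch step might.
root.find("a").text = "2"

# encoding="unicode" returns a str instead of bytes.
serialized = ET.tostring(root, encoding="unicode")
empty = ET.tostring(ET.Element("root"), encoding="unicode")
```

An element without children or text serializes in the short `<root />` form, which matches the doctest above.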

kstlib.transform.split_extract(data: str, config: PrimitiveConfig) → str | list[str]¶

Split a string by a literal separator and extract segment(s).

Forward-only string extractor. The separator is treated literally (no regex). Options are read from config.options:

sep (str, required): the literal separator. Validated at config time (must be a non-empty string).
index (int | None): which segment to return. None (default) returns the full list of segments; an integer returns that single segment (0-based, negatives count from the end: -1 is the last).
maxsplit (int): forwarded to str.split. Default -1 (no limit).
keep_empty (bool): when False (default), empty segments are filtered out before indexing, which is path-friendly (a leading separator does not produce an empty first segment). This diverges from str.split, which keeps empty segments. Set to True to keep them.

Parameters:

data (str) – Input string to split.
config (PrimitiveConfig) – Primitive config carrying the options described above.

Returns:

The selected segment (str) when index is set, otherwise the list of segments (list[str]).

Raises:

ParseError – If the input is not a string, exceeds the input size limit, contains a null byte, or index is out of range.

Return type:

str | list[str]

Examples

>>> cfg = PrimitiveConfig(name="split", options={"sep": "/", "index": -1})
>>> split_extract("/reports/reports/abc", cfg)
'abc'
>>> split_extract("a/b/c", PrimitiveConfig(name="split", options={"sep": "/"}))
['a', 'b', 'c']
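
The effect of keep_empty on indexing is the least obvious part of the options above. The following stdlib sketch shows one way the documented behaviour could be reproduced. `split_segments` is a hypothetical helper, and it raises plain `ValueError`/`IndexError` rather than the library's ParseError.

```python
from typing import Optional, Union


def split_segments(text: str, sep: str, index: Optional[int] = None,
                   maxsplit: int = -1, keep_empty: bool = False) -> Union[str, list[str]]:
    """Hypothetical stand-in for the documented split options."""
    if not sep:
        raise ValueError("sep must be a non-empty string")
    parts = text.split(sep, maxsplit)  # literal separator, no regex
    if not keep_empty:
        # Filter before indexing so a leading separator does not shift indices.
        parts = [part for part in parts if part]
    if index is None:
        return parts
    return parts[index]  # IndexError when index is out of range


first_default = split_segments("/a/b", "/", index=0)
first_kept = split_segments("/a/b", "/", index=0, keep_empty=True)
```

With the default filtering, index 0 of `"/a/b"` is `"a"`; with keep_empty enabled, the leading separator yields an empty first segment instead.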

kstlib.transform.tr_translate(data: str, config: PrimitiveConfig) → str¶

Translate or delete characters in a string (character level).

Forward-only string transformer operating character by character, like the Unix tr command. This is distinct from the substring replace of the patch stage: tr works on individual characters. Exactly one of the following options must be set (enforced at config time by PrimitiveConfig):

delete (str): every character in this set is removed from the input.
map (dict[str, str]): a single-character to single-character translation table applied to the input.

Parameters:

data (str) – Input string to translate.
config (PrimitiveConfig) – Primitive config carrying the delete or map option.

Returns:

The translated string.

Raises:

ParseError – If the input is not a string, exceeds the input size limit, or contains a null byte.

Return type:

str

Examples

>>> tr_translate("a\nb\n", PrimitiveConfig(name="tr", options={"delete": "\n"}))
'ab'
>>> tr_translate("aaa", PrimitiveConfig(name="tr", options={"map": {"a": "b"}}))
'bbb'
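
Character-level translation of this kind maps naturally onto `str.maketrans` and `str.translate`. The sketch below assumes that mapping; `translate_chars` is a hypothetical helper, and it enforces the "exactly one option" rule at call time rather than at config time.

```python
from typing import Optional


def translate_chars(text: str, delete: Optional[str] = None,
                    mapping: Optional[dict[str, str]] = None) -> str:
    """Hypothetical stand-in for the delete / map options."""
    if (delete is None) == (mapping is None):
        raise ValueError("set exactly one of delete or mapping")
    if delete is not None:
        # Third argument of maketrans lists characters to remove.
        table = str.maketrans("", "", delete)
    else:
        # Dict form: each single-character key maps to its replacement.
        table = str.maketrans(mapping)
    return text.translate(table)


flattened = translate_chars("a\nb\n", delete="\n")
swapped = translate_chars("a-b-c", mapping={"-": "_"})
```

Unlike a substring replace, every occurrence of each listed character is handled independently, so a multi-character delete set removes each character wherever it appears.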

kstlib.transform.remove_prefix(data: str, config: PrimitiveConfig) → str¶

Strip a known literal prefix from the start of a string.

Forward-only string extractor wrapping str.removeprefix(). The prefix option (required, validated at config time) is removed from the input only when the input starts with it; otherwise the string is returned unchanged (no error). This mirrors the standard-library semantics and lets the same chain handle mixed inputs that sometimes carry the prefix.

Parameters:

data (str) – Input string to strip.
config (PrimitiveConfig) – Primitive config carrying the required prefix option.

Returns:

The input with prefix removed from its start, or the input unchanged when it does not start with prefix.

Raises:

ParseError – If the input is not a string, exceeds the input size limit, or contains a null byte.

Return type:

str

Examples

>>> cfg = PrimitiveConfig(name="removeprefix", options={"prefix": "reports/"})
>>> remove_prefix("reports/xxx", cfg)
'xxx'
>>> # No-op when the input does not start with the prefix
>>> remove_prefix("other/xxx", cfg)
'other/xxx'

kstlib.transform.remove_suffix(data: str, config: PrimitiveConfig) → str¶

Strip a known literal suffix from the end of a string.

Forward-only string extractor wrapping str.removesuffix(). The suffix option (required, validated at config time) is removed from the input only when the input ends with it; otherwise the string is returned unchanged (no error). This mirrors the standard-library semantics and lets the same chain handle mixed inputs that sometimes carry the suffix.

Parameters:

data (str) – Input string to strip.
config (PrimitiveConfig) – Primitive config carrying the required suffix option.

Returns:

The input with suffix removed from its end, or the input unchanged when it does not end with suffix.

Raises:

ParseError – If the input is not a string, exceeds the input size limit, or contains a null byte.

Return type:

str

Examples

>>> cfg = PrimitiveConfig(name="removesuffix", options={"suffix": ".json"})
>>> remove_suffix("data.json", cfg)
'data'
>>> # No-op when the input does not end with the suffix
>>> remove_suffix("data.yml", cfg)
'data.yml'
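
Since both primitives are described as wrappers around `str.removeprefix()` and `str.removesuffix()` (Python 3.9+), chaining them behaves like chaining the string methods. A short sketch of that equivalence, with `strip_affixes` as a hypothetical helper and the default affixes taken from the doctests above:

```python
def strip_affixes(text: str, prefix: str = "reports/", suffix: str = ".json") -> str:
    """Hypothetical helper: remove a known prefix, then a known suffix."""
    # Each call is a no-op when the affix is absent, so mixed inputs are safe.
    return text.removeprefix(prefix).removesuffix(suffix)


both = strip_affixes("reports/data.json")
neither = strip_affixes("other/data.yml")
```

Inputs that carry only one of the two affixes lose just that one, which is what lets a single chain process a mixed batch without branching.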

Exceptions¶

exception kstlib.transform.TransformError¶

Bases: KstlibError

Base exception for all transform module errors.

exception kstlib.transform.TransformConfigError¶

Bases: TransformError, ValueError

Transform configuration is invalid.

Raised when the transform chain or primitive configuration contains invalid values, missing required fields, or constraint violations.

exception kstlib.transform.TransformChainError(message, *, chain_name=None)¶

Bases: TransformError

Transform chain execution failed.

chain_name¶: Name of the chain that failed.

__init__(self, message: str, *, chain_name: str | None = None) → None¶

Initialize TransformChainError.

Parameters:

message (str) – Human-readable error message.
chain_name (str | None) – Name of the chain that failed.

exception kstlib.transform.PrimitiveError(message, *, primitive_name=None, chain_name=None)¶

Bases: TransformChainError

A single transform primitive failed.

primitive_name¶: Name of the primitive that failed.

__init__(self, message: str, *, primitive_name: str | None = None, chain_name: str | None = None) → None¶

Initialize PrimitiveError.

Parameters:

message (str) – Human-readable error message.
primitive_name (str | None) – Name of the primitive that failed.
chain_name (str | None) – Name of the chain that was running.

exception kstlib.transform.DecodeError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

Base64 or bytes decoding failed.

exception kstlib.transform.DecompressError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

Zlib decompression failed.

exception kstlib.transform.ParseError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

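JSON or XML parsing failed. Also raised by the string primitives (split_extract, tr_translate, remove_prefix, remove_suffix) when their input is invalid.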

exception kstlib.transform.PatchError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

Patch stage failed.

exception kstlib.transform.SerializeError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

JSON or XML serialization failed.

exception kstlib.transform.CompressError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

Zlib compression failed.

exception kstlib.transform.EncodeError(message, *, primitive_name=None, chain_name=None)¶

Bases: PrimitiveError

Base64 or bytes encoding failed.

exception kstlib.transform.CallableError(target, reason, *, chain_name=None)¶

Bases: TransformChainError

Callable raised an exception during execution.

target¶: The callable target string.

__init__(self, target: str, reason: str, *, chain_name: str | None = None) → None¶

Initialize CallableError.

Parameters:

target (str) – The callable target string.
reason (str) – Description of the error.
chain_name (str | None) – Name of the chain that was running.

exception kstlib.transform.CallableImportError(target, *, chain_name=None)¶

Bases: TransformChainError

Callable target could not be imported.

target¶: The import target string that failed.

__init__(self, target: str, *, chain_name: str | None = None) → None¶

Initialize CallableImportError.

Parameters:

target (str) – The import target string (e.g. “module.path:function”).
chain_name (str | None) – Name of the chain that was running.
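
The hierarchy above determines which `except` clause or `isinstance` check catches what. The sketch below illustrates this with local stand-in classes that mirror part of the documented inheritance, so that it runs without kstlib installed. In real code the classes would be imported from kstlib.transform, and deriving `KstlibError` from `Exception` is an assumption.

```python
# Local stand-ins mirroring the documented hierarchy (assumption: in real
# code these come from kstlib.transform and KstlibError derives from Exception).
class KstlibError(Exception):
    pass


class TransformError(KstlibError):
    pass


class TransformChainError(TransformError):
    def __init__(self, message, *, chain_name=None):
        super().__init__(message)
        self.chain_name = chain_name


class PrimitiveError(TransformChainError):
    def __init__(self, message, *, primitive_name=None, chain_name=None):
        super().__init__(message, chain_name=chain_name)
        self.primitive_name = primitive_name


class DecodeError(PrimitiveError):
    pass


def describe(exc: TransformError) -> str:
    """Hypothetical helper: check the most specific class first."""
    if isinstance(exc, PrimitiveError):
        return f"primitive {exc.primitive_name!r} failed in chain {exc.chain_name!r}"
    if isinstance(exc, TransformChainError):
        return f"chain {exc.chain_name!r} failed"
    return "transform error"


try:
    raise DecodeError("bad padding", primitive_name="base64", chain_name="decode_report")
except TransformError as exc:  # the base class catches every transform failure
    summary = describe(exc)
```

Ordering matters: testing for `TransformChainError` before `PrimitiveError` would swallow primitive failures and lose the `primitive_name` detail.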