mustash.core – Core definitions

mustash.core.Document = Document

Type representing a document to process.

mustash.core.Element = Element

Document element.

This is a recursive type defining a document element that can represent a JSON value with extra types supported by backends such as ElasticSearch, including:

  • Dictionaries associating string keys with document elements;

  • Lists of document elements;

  • Strings;

  • Numbers (integers, floating-point, booleans);

  • None.
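As a rough illustration of what such a recursive type can look like, the sketch below defines an alias covering only the kinds listed above. It is an assumption for illustration, not the library's actual definition (which may include extra backend-specific types), and `is_element` is a hypothetical helper.

```python
from __future__ import annotations

from typing import Union

# Recursive alias: the string "Element" is a forward reference to the alias.
# Illustrative only; the real mustash.core.Element may accept more types.
Element = Union[dict[str, "Element"], list["Element"], str, int, float, bool, None]


def is_element(value: object) -> bool:
    """Check recursively that a value only uses the kinds listed above."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return True
    if isinstance(value, list):
        return all(is_element(item) for item in value)
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_element(item)
            for key, item in value.items()
        )
    return False
```

For instance, `is_element({"hello": {"world": [1, "x", None]}})` holds, whereas a dictionary with a non-string key does not qualify.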

class mustash.core.FieldPath(path: FieldPath | str | Iterable[str], /)

Bases: object

Object representing the path to a field in a JSON document.

This object can be used in a similar fashion to pathlib.Path. For example, in order to create a field path out of several components, the following can be used:

>>> FieldPath("hello.world")
FieldPath('hello.world')
>>> FieldPath("hello") / "world"
FieldPath('hello.world')
>>> FieldPath(["hello", "world"])
FieldPath('hello.world')

Field paths can also be used in Pydantic models:

>>> from pydantic import BaseModel
>>> class MyModel(BaseModel):
...     field: FieldPath
...
>>> MyModel(field="hello.world")
MyModel(field=FieldPath('hello.world'))

Parameters:

path (FieldPath | str | Iterable[str])

property parent: FieldPath

Get the field path parent.

Returns:

Parent.

property parts: tuple[str, ...]

Get the parts of the current path.

Returns:

Parts.

get(element: Element, /, *, cls: type | None = None, default: Any = NO_VALUE) Any

Get the value in a document element using the path.

An example usage with a given element is the following:

>>> path = FieldPath("hello.world")
>>> document = {"hello": {"world": [1, 2, 3]}}
>>> path.get(document)
[1, 2, 3]

You can also set a default value, in case any intermediate element does not exist:

>>> path = FieldPath("hello.world")
>>> document = {"hello": {}}
>>> path.get(document, default="my_default")
'my_default'

If you expect a specific type, such as an integer, you can also set the cls parameter to the expected type, and this method will attempt to convert the value using pydantic:

>>> path = FieldPath("hello.world")
>>> document = {"hello": {"world": "42"}}
>>> path.get(document, cls=int)
42

You can also add validators to the provided class:

>>> from annotated_types import Le
>>> from typing import Annotated
>>> path = FieldPath("hello")
>>> document = {"hello": "101"}
>>> path.get(document, cls=Annotated[int, Le(100)])
Traceback (most recent call last):
...
pydantic_core._pydantic_core.ValidationError: ...
  Input should be less than or equal to 100 [...]
    ...

Parameters:
  • element (Element) – Element from which to get the value.

  • cls (type | None) – Optional type to validate the obtained value with, using pydantic.

  • default (Any) – Default value to return if no value exists at the path.

Returns:

Found value, or default value if one has been set.

Raises:

KeyError – No default value was provided, and the value did not exist.

Return type:

Any
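The lookup described above can be approximated in plain Python. The following is a minimal sketch, assuming dotted paths over nested dictionaries only; `get_by_path` and `_NO_VALUE` are hypothetical names, and type conversion through cls is left out.

```python
from typing import Any

# Hypothetical sentinel standing in for NO_VALUE, so that None can still
# be used as a legitimate default.
_NO_VALUE = object()


def get_by_path(element: Any, path: str, default: Any = _NO_VALUE) -> Any:
    """Follow a dotted path through nested dictionaries."""
    current = element
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            # The path cannot be followed any further.
            if default is _NO_VALUE:
                raise KeyError(path)
            return default
        current = current[part]
    return current
```

Using a dedicated sentinel rather than None is what lets a caller distinguish "no default given" from "default is None".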

set(element: Element, value: Element, /, *, override: bool = True) None

Set the value in a document element using the path.

An example usage with a given element is the following:

>>> path = FieldPath("hello.world")
>>> document = {}
>>> path.set(document, 42)
>>> document
{'hello': {'world': 42}}

Parameters:
  • element (Element) – Element at which to set the value.

  • value (Element) – Value to set at the path.

  • override (bool) – Whether to override the field if it already exists.

Raises:

KeyError – A non-indexable object was found in the way.

Return type:

None
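To make the role of intermediate elements and of override more concrete, here is a minimal plain-Python sketch. `set_by_path` is a hypothetical helper, and leaving an existing value untouched when override is false is an assumption, not confirmed library behaviour.

```python
from typing import Any


def set_by_path(
    element: dict, path: str, value: Any, override: bool = True
) -> None:
    """Set a value at a dotted path, creating intermediate dictionaries."""
    *parents, last = path.split(".")
    current = element
    for part in parents:
        # Create the intermediate dictionary if it does not exist yet.
        current = current.setdefault(part, {})
        if not isinstance(current, dict):
            # A non-indexable object was found in the way.
            raise KeyError(part)
    # Assumption: with override=False, an existing value is kept as-is.
    if override or last not in current:
        current[last] = value
```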

delete(element: Element, /) None

Delete the value in a document element using the path.

An example usage with a given element is the following:

>>> path = FieldPath("hello.world")
>>> document = {"hello": {"world": 42}}
>>> path.delete(document)
>>> document
{'hello': {}}

Parameters:

element (Element) – Element at which to delete the value.

Raises:

KeyError – A non-indexable object was found in the way.

Return type:

None
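A comparable plain-Python sketch for deletion follows. `delete_by_path` is a hypothetical helper, and treating a missing key as a no-op is an assumption made for the example; the library may behave differently in that case.

```python
def delete_by_path(element: dict, path: str) -> None:
    """Delete the value at a dotted path, leaving parent dictionaries in place."""
    *parents, last = path.split(".")
    current = element
    for part in parents:
        if part not in current:
            return  # Assumption: nothing to delete, so do nothing.
        current = current[part]
        if not isinstance(current, dict):
            # A non-indexable object was found in the way.
            raise KeyError(part)
    current.pop(last, None)
```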

pydantic model mustash.core.Condition

Bases: BaseModel, ABC

Condition to execute one or more processors.

Config:
  • extra: str = forbid

abstract verify(document: Document, /) bool

Verify whether the condition is true or not.

Parameters:

document (Document) – Document on which to verify the condition.

Returns:

Whether the condition is verified or not.

Return type:

bool
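The shape of a concrete condition can be sketched without pydantic, using a plain abstract base class as a stand-in. The `Document` alias, the stand-in `Condition` class and the `HasKey` subclass below are all assumptions for illustration; a real subclass would derive from mustash.core.Condition and declare its parameters as pydantic fields.

```python
from abc import ABC, abstractmethod
from typing import Any

# Simplified stand-in: a document is assumed to be a dictionary here.
Document = dict[str, Any]


class Condition(ABC):
    """Plain-Python stand-in for the abstract condition."""

    @abstractmethod
    def verify(self, document: Document, /) -> bool:
        """Return whether the condition holds for the document."""


class HasKey(Condition):
    """Hypothetical condition: true when a top-level key is present."""

    def __init__(self, key: str) -> None:
        self.key = key

    def verify(self, document: Document, /) -> bool:
        return self.key in document
```

As with any abstract base class, instantiating the stand-in `Condition` directly raises a TypeError; only subclasses that implement verify can be created.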

pydantic model mustash.core.PainlessCondition

Bases: Condition

Condition written in Painless.

See Painless scripting language for more information.

Config:
  • extra: str = forbid

Fields:
field script: Annotated[str, StringConstraints(min_length=1)] [Required]

Painless script to run.

Constraints:
  • min_length = 1

verify(document: Document, /) bool

Verify whether the condition is true or not.

Parameters:

document (Document) – Document on which to verify the condition.

Returns:

Whether the condition is verified or not.

Return type:

bool

pydantic model mustash.core.Pipeline

Bases: BaseModel

Pipeline, as a set of processors and metadata.

Config:
  • extra: str = forbid

  • arbitrary_types_allowed: bool = True

Fields:
field name: str | None = None

Name of the pipeline.

field processors: list[Processor] [Required]

List of processors constituting the pipeline.

async apply(document: Document, /) None

Apply the pipeline to the document, in-place.

Parameters:

document (Document) – Document to which to apply the pipeline.

Return type:

None
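Conceptually, applying a pipeline amounts to awaiting each processor in order on the same document. The sketch below shows that idea with plain coroutines instead of Processor instances; `apply_pipeline`, `add_greeting` and `shout` are hypothetical names, and failure handling is deliberately ignored.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Simplified stand-ins: a document is a dictionary, a step is a coroutine
# function mutating it in place.
Document = dict[str, Any]
Step = Callable[[Document], Awaitable[None]]


async def apply_pipeline(steps: list[Step], document: Document) -> None:
    """Await each step in order; every step mutates the same document."""
    for step in steps:
        await step(document)


async def add_greeting(document: Document) -> None:
    document["greeting"] = "hello"


async def shout(document: Document) -> None:
    document["greeting"] = document["greeting"].upper()


document: Document = {}
asyncio.run(apply_pipeline([add_greeting, shout], document))
```

Because each step works in place, later steps see the changes made by earlier ones.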

pydantic model mustash.core.Processor

Bases: BaseModel, ABC

Processor, for transforming data.

For a guide on how to create your own processors based on this class, see Creating processors.

Config:
  • extra: str = forbid

  • arbitrary_types_allowed: bool = True

Fields:
field condition: Condition | None = None

Condition depending on which the processor is executed.

field description: str | None = None

Optional description of the processor.

field ignore_failure: bool = False

Whether to ignore failures for the processor.

field on_failure: list[Processor] | None = None

Processors to execute when a failure occurs.

field tag: str | None = None

Identifier for the processor, included in debugging and metrics.

abstract async apply(document: Document, /) None

Apply the processor to the document, in-place.

Parameters:

document (Document) – Document to which to apply the processor.

Return type:

None
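How condition, ignore_failure and on_failure might interact can be sketched as follows. This is an assumption modelled on how Elasticsearch ingest processors are commonly described, not a statement of what mustash does; `run_guarded` and the precedence of ignore_failure over on_failure are choices made for this example only.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

Document = dict[str, Any]
Step = Callable[[Document], Awaitable[None]]


async def run_guarded(
    step: Step,
    document: Document,
    *,
    condition: Optional[Callable[[Document], bool]] = None,
    ignore_failure: bool = False,
    on_failure: Optional[list[Step]] = None,
) -> None:
    """Run a step, honouring a condition and failure-handling options."""
    if condition is not None and not condition(document):
        return  # Condition not verified: skip the step entirely.
    try:
        await step(document)
    except Exception:
        if ignore_failure:
            return  # Assumed precedence: the failure is swallowed.
        if on_failure is None:
            raise
        for handler in on_failure:
            await handler(document)


async def fail(document: Document) -> None:
    raise ValueError("boom")


async def mark_failed(document: Document) -> None:
    document["failed"] = True


document: Document = {}
asyncio.run(run_guarded(fail, document, on_failure=[mark_failed]))
```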

pydantic model mustash.core.FieldProcessor

Bases: Processor, Generic[FieldType]

Processor that processes a field, expected to be a given type.

For a guide on how to create such processors, see Creating field processors.

This uses the same idea as ElasticSearch’s abstract string processor, used for a few of their processors.

Config:
  • extra: str = forbid

  • arbitrary_types_allowed: bool = True

Fields:
Validators:
  • _validate » all fields

field condition: Condition | None = None

Condition depending on which the processor is executed.

field description: str | None = None

Optional description of the processor.

field field: FieldPath [Required]

Field from which to get the value to process.

field ignore_failure: bool = False

Whether to ignore failures for the processor.

field ignore_missing: bool = False

Whether to avoid failing if the field is not present in the document.

field on_failure: list[Processor] | None = None

Processors to execute when a failure occurs.

field remove_if_successful: bool = False

Whether to remove the source field after processing it.

field tag: str | None = None

Identifier for the processor, included in debugging and metrics.

field target_field: FieldPath | None = None

Target field to set with the result.

async apply(document: Document, /) None

Apply the processor to the document, in-place.

Parameters:

document (Document) – Document to which to apply the processor.

Return type:

None

abstract async process(value: Element, /) Element

Process the field into the expected type.

Parameters:

value (Element) – Value to process.

Returns:

Processed value.

Return type:

Element
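The way the fields above could combine around process can be sketched in plain Python. Everything here is an assumption for illustration: `apply_to_field` is a hypothetical function, it only handles top-level keys rather than full field paths, and writing the result back to the source field when target_field is unset is a guess at the intended behaviour.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

Document = dict[str, Any]


async def apply_to_field(
    document: Document,
    process: Callable[[Any], Awaitable[Any]],
    *,
    field: str,
    target_field: Optional[str] = None,
    ignore_missing: bool = False,
    remove_if_successful: bool = False,
) -> None:
    """Read a field, process its value, and store the result."""
    if field not in document:
        if ignore_missing:
            return  # Missing field tolerated: leave the document as-is.
        raise KeyError(field)
    result = await process(document[field])
    # Assumption: without a target field, the source field is overwritten.
    destination = target_field if target_field is not None else field
    if remove_if_successful and destination != field:
        del document[field]
    document[destination] = result


async def to_upper(value: Any) -> Any:
    return str(value).upper()


document: Document = {"name": "ada"}
asyncio.run(
    apply_to_field(
        document,
        to_upper,
        field="name",
        target_field="shout",
        remove_if_successful=True,
    )
)
```

Subclasses of the real class would only need to supply the equivalent of `to_upper` as their process method, with the surrounding bookkeeping handled by apply.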