dimcat.steps.slicers package#

Submodules#

dimcat.steps.slicers.base module#

class dimcat.steps.slicers.base.Slicer(level_name: str = 'slice', **kwargs)[source]#

Bases: ResourceTransformation

class Schema(*, only: Optional[Union[Sequence[str], AbstractSet[str]]] = None, exclude: Union[Sequence[str], AbstractSet[str]] = (), many: Optional[bool] = None, load_only: Union[Sequence[str], AbstractSet[str]] = (), dump_only: Union[Sequence[str], AbstractSet[str]] = (), partial: Optional[Union[bool, Sequence[str], AbstractSet[str]]] = None, unknown: Optional[Literal['exclude', 'include', 'raise']] = None)[source]#

Bases: Schema

dump_fields: dict[str, Field]#
exclude: set[Any] | MutableSet[Any]#
fields: dict[str, Field]#

Dictionary mapping field_names -> Field objects

load_fields: dict[str, Field]#
opts: Any = <marshmallow.schema.SchemaOpts object>#
unknown: types.UnknownOption#
check_resource(resource: DimcatResource) None[source]#

Check if the resource is eligible for processing.

Raises:
get_slice_intervals(resource: DimcatResource) SliceIntervals[source]#
property level_name: str#
property required_feature: Optional[FeatureName]#
transform_resource(resource: DimcatResource) DataFrame[source]#

Apply the slicer to a Feature.

dimcat.steps.slicers.feature_dimensions module#

class dimcat.steps.slicers.feature_dimensions.FeatureDimensionsSlicer(level_name: str = 'adjacency_group', slice_intervals: Optional[SliceIntervals] = None, **kwargs)[source]#

Bases: Slicer

This slicer and its subclasses slice resources according to the dimensions of a particular Feature. Its previous name, AdjacencyGroupSlicer, expresses an important characteristic: the dimensions it computes are ranges over which a given feature does not change. For example, a LocalKeySlicer uses the dimensions of uninterrupted segments, each consisting of a single local key. A LocalKeyGrouper, by contrast, would group elements by local key regardless of which key segment they come from.
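The difference between adjacency groups and ordinary grouping can be illustrated with plain pandas. The following is a minimal sketch that does not use dimcat; the series `localkeys` and its values are hypothetical, and the code only demonstrates the idea of "ranges where a feature does not change".

```python
import pandas as pd

# Hypothetical sequence of local keys, one per event (illustration only).
localkeys = pd.Series(["I", "I", "V", "V", "I", "vi"], name="localkey")

# Adjacency groups: a new group starts whenever the value differs from
# the previous one, so the second "I" segment gets its own group.
adjacency_id = (localkeys != localkeys.shift()).cumsum()
segments = localkeys.groupby(adjacency_id).agg(["first", "size"])

# Ordinary grouping: all "I" events land in one group, regardless of
# which uninterrupted segment they come from.
grouped_sizes = localkeys.groupby(localkeys).size()

print(segments)
print(grouped_sizes)
```

In this sketch the adjacency-based view yields four segments (I, V, I, vi), whereas plain grouping yields three groups (I, V, vi).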

This type of slicer needs to be set up with the dimensions of the required_feature. It can be set up in one of three ways: by processing a Dataset that provides the relevant Feature (which results in a call to fit_to_dataset()), by calling process() on the relevant feature before any others, or by setting slice_intervals manually, for example upon initialization.

Like all slicers, FeatureDimensionsSlicers append a new index level with slice intervals to the processed features. Items whose timespans overlap with a slice interval are split. If several items occur within a given slice interval, they will share that same interval in the new index level. In most cases, you will want to group by this new level. If you need to know which feature value(s) each slice interval corresponds to, you can use FeatureDimensionsSlicer.slice_metadata.
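The splitting behaviour described above can be sketched without dimcat. The function `slice_items`, the tuple layout of `items`, and the column names below are assumptions made for illustration; they are not part of the library and say nothing about how it implements slicing internally.

```python
import pandas as pd


def slice_items(items, slice_intervals):
    """Clip (start, end, label) items to each (start, end) slice interval.

    Hypothetical helper: items crossing a slice boundary are split, and
    every resulting row records the slice interval it falls into.
    """
    rows = []
    for s_start, s_end in slice_intervals:
        for start, end, label in items:
            lo, hi = max(start, s_start), min(end, s_end)
            if lo < hi:  # keep only the part that lies inside the slice
                rows.append(
                    {
                        "slice": pd.Interval(s_start, s_end, closed="left"),
                        "start": lo,
                        "end": hi,
                        "label": label,
                    }
                )
    # Append the slice interval as an additional index level.
    return pd.DataFrame(rows).set_index("slice", append=True)


items = [(0, 2, "a"), (2, 6, "b"), (6, 8, "c")]
sliced = slice_items(items, slice_intervals=[(0, 4), (4, 8)])
print(sliced)
```

Here item "b" spans both slice intervals and is therefore split into two rows, while "a" and "b" share the first interval in the appended "slice" level. Grouping by that level would be the usual next step.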

class Schema(*, only: Optional[Union[Sequence[str], AbstractSet[str]]] = None, exclude: Union[Sequence[str], AbstractSet[str]] = (), many: Optional[bool] = None, load_only: Union[Sequence[str], AbstractSet[str]] = (), dump_only: Union[Sequence[str], AbstractSet[str]] = (), partial: Optional[Union[bool, Sequence[str], AbstractSet[str]]] = None, unknown: Optional[Literal['exclude', 'include', 'raise']] = None)[source]#

Bases: Schema

dump_fields: dict[str, Field]#
exclude: set[Any] | MutableSet[Any]#
fields: dict[str, Field]#

Dictionary mapping field_names -> Field objects

load_fields: dict[str, Field]#
opts: Any = <marshmallow.schema.SchemaOpts object>#
unknown: types.UnknownOption#
fit_to_dataset(dataset: Dataset) None[source]#

Set the slice intervals to the intervals provided by the relevant feature.

get_slice_intervals(resource: Feature) SliceIntervals[source]#

Get the slice intervals from the relevant feature.

property required_feature: FeatureName#
property slice_intervals: Optional[SliceIntervals]#
slice_metadata: Optional[Feature]#

Reference to the processed Feature that determines the slice intervals of the current fit. This feature, sliced, serves as metadata and will be joined with Metadata features whenever they are processed.

transform_resource(resource: DimcatResource) DataFrame[source]#

Apply the slicer to a Feature.

class dimcat.steps.slicers.feature_dimensions.HarmonyLabelSlicer(level_name: str = 'harmony_label_slice', slice_intervals: Optional[SliceIntervals] = None, **kwargs)[source]#

Bases: FeatureDimensionsSlicer

Slices resources using intervals from the HarmonyLabels feature.

class dimcat.steps.slicers.feature_dimensions.KeySlicer(level_name: str = 'localkey_slice', slice_intervals: Optional[SliceIntervals] = None, **kwargs)[source]#

Bases: FeatureDimensionsSlicer

Slices resources by key.

class dimcat.steps.slicers.feature_dimensions.PhraseSlicer(level_name: str = 'phrase_slice', slice_intervals: Optional[SliceIntervals] = None, **kwargs)[source]#

Bases: FeatureDimensionsSlicer

Slices resources by phrase.

Module contents#