dimcat.data package#

Submodules#

dimcat.data.base module#

class dimcat.data.base.AbsolutePathStr[source]#

Bases: str

A plain str subclass. If the path includes the HOME directory, that part is represented by a leading ‘~’.
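The behaviour described above can be sketched with a small str subclass. This is only an illustration, not dimcat's implementation: the class name HomeAwarePath is hypothetical, and the choice to abbreviate in __repr__ is an assumption.

```python
import os


class HomeAwarePath(str):
    """Hypothetical stand-in for AbsolutePathStr (not dimcat's actual code)."""

    def __repr__(self) -> str:
        # Strip a trailing separator so that HOME == "/" is handled as well.
        home = os.path.expanduser("~").rstrip(os.sep)
        # Abbreviate only if the string is HOME itself or lies below it.
        if self == home or self.startswith(home + os.sep):
            return repr("~" + self[len(home):])
        return super().__repr__()


# The value stays an ordinary absolute path; only its representation changes.
path = HomeAwarePath(os.path.join(os.path.expanduser("~"), "corpus", "notes.tsv"))
print(repr(path))
```

Because the object is still a str, it can be passed to os.path functions or open() unchanged.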

class dimcat.data.base.Data(basepath: Optional[str] = None)[source]#

Bases: DimcatObject

Base class for all classes that contain data in some form.

class PickleSchema(*, only: Optional[Union[Sequence[str], AbstractSet[str]]] = None, exclude: Union[Sequence[str], AbstractSet[str]] = (), many: Optional[bool] = None, load_only: Union[Sequence[str], AbstractSet[str]] = (), dump_only: Union[Sequence[str], AbstractSet[str]] = (), partial: Optional[Union[bool, Sequence[str], AbstractSet[str]]] = None, unknown: Optional[Literal['exclude', 'include', 'raise']] = None)[source]#

Bases: Schema

When serializing data objects, the basepath is used as location, but it is not included in the descriptor, according to the frictionless specification.

dump_fields: dict[str, Field]#
exclude: set[Any] | MutableSet[Any]#
fields: dict[str, Field]#

Dictionary mapping field_names -> Field objects

load_fields: dict[str, Field]#
opts: Any = <marshmallow.schema.SchemaOpts object>#
unknown: types.UnknownOption#
validate_dump(data, **kwargs)[source]#

Make sure to never return invalid serialization data. Identical to PipelineStep.Schema.validate_dump().

class Schema(*, only: Optional[Union[Sequence[str], AbstractSet[str]]] = None, exclude: Union[Sequence[str], AbstractSet[str]] = (), many: Optional[bool] = None, load_only: Union[Sequence[str], AbstractSet[str]] = (), dump_only: Union[Sequence[str], AbstractSet[str]] = (), partial: Optional[Union[bool, Sequence[str], AbstractSet[str]]] = None, unknown: Optional[Literal['exclude', 'include', 'raise']] = None)[source]#

Bases: Schema

dump_fields: dict[str, Field]#
exclude: set[Any] | MutableSet[Any]#
fields: dict[str, Field]#

Dictionary mapping field_names -> Field objects

load_fields: dict[str, Field]#
opts: Any = <marshmallow.schema.SchemaOpts object>#
unknown: types.UnknownOption#
property basepath: str#
get_basepath(set_default_if_missing: bool = False) → str[source]#

Get the basepath of the resource. If not specified, the default basepath is returned. If set_default_if_missing is set to True and no basepath has been set (e.g. during initialization), the basepath is permanently set to the default basepath.
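The effect of set_default_if_missing can be illustrated with a minimal stand-in class. The name BasepathHolder and the attribute _basepath are hypothetical, and using the current working directory as the default basepath is an assumption made only for this sketch; the documentation above does not state what the default is.

```python
import os
from typing import Optional


class BasepathHolder:
    """Hypothetical stand-in illustrating the documented get_basepath() behaviour."""

    def __init__(self, basepath: Optional[str] = None) -> None:
        self._basepath = basepath

    def get_basepath(self, set_default_if_missing: bool = False) -> str:
        if self._basepath is not None:
            return self._basepath
        # Assumption: the default is the current working directory.
        default = os.getcwd()
        if set_default_if_missing:
            # Persist the default so that later calls return the same value.
            self._basepath = default
        return default


holder = BasepathHolder()
holder.get_basepath()                             # returns the default, stores nothing
holder.get_basepath(set_default_if_missing=True)  # returns the default and stores it
```

Without the flag the call is read-only; with it, the object is modified as a side effect.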

class property pickle_schema[source]#

Returns the (instantiated) PickleSchema singleton object for this class. It is different from the ‘normal’ Schema in that it stores the tabular data to disk and returns the path to its descriptor.

to_config(pickle=False) → DimcatConfig[source]#

If pickle is set to True,

to_dict(pickle=False) → dict[source]#
static treat_new_basepath(basepath: str, filepath=None, other_logger=None) → AbsolutePathStr[source]#
dimcat.data.base.resolve_path(path) → Optional[AbsolutePathStr][source]#

Resolves ‘~’ to HOME directory and turns path into an absolute path. This is an identical copy of the function in dimcat.utils.
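The two steps named above (expanding ‘~’, then making the path absolute) can be sketched with the standard library. The function name resolve_path_sketch is hypothetical, and passing None through unchanged is an assumption inferred from the Optional return type; the real function returns an AbsolutePathStr rather than a plain str.

```python
import os
from typing import Optional


def resolve_path_sketch(path: Optional[str]) -> Optional[str]:
    """Expand '~' and return an absolute path (illustrative only)."""
    if path is None:
        # Assumption: a missing path is passed through as None.
        return None
    # expanduser() replaces a leading '~'; abspath() anchors relative
    # paths at the current working directory and normalizes the result.
    return os.path.abspath(os.path.expanduser(path))


print(resolve_path_sketch("~/corpus"))
print(resolve_path_sketch("relative/dir"))
```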

dimcat.data.utils module#

dimcat.data.utils.check_descriptor_filename_argument(descriptor_filename) → str[source]#

Check if the descriptor_filename is a filename (not path) and warn if it doesn’t have the extension .json or .yaml.

Parameters:

descriptor_filename

Raises:

ValueError – If the descriptor_filename is absolute.
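A minimal sketch of the documented checks, using only the standard library. The function name check_descriptor_filename_sketch and the message texts are hypothetical. Returning the filename unchanged is an assumption, and how the real function treats relative paths that contain directories is not stated above, so the sketch leaves them alone.

```python
import os
import warnings


def check_descriptor_filename_sketch(descriptor_filename: str) -> str:
    """Reject absolute paths and warn about unexpected extensions (illustrative)."""
    if os.path.isabs(descriptor_filename):
        raise ValueError(
            f"Expected a filename, got an absolute path: {descriptor_filename!r}"
        )
    _, extension = os.path.splitext(descriptor_filename)
    if extension not in (".json", ".yaml"):
        # A warning, not an error: the caller may still proceed.
        warnings.warn(
            f"{descriptor_filename!r} does not end in .json or .yaml",
            stacklevel=2,
        )
    # Assumption: the validated filename is returned unchanged.
    return descriptor_filename


check_descriptor_filename_sketch("datapackage.json")
```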

dimcat.data.utils.check_rel_path(rel_path, basepath)[source]#
dimcat.data.utils.is_default_package_descriptor_path(filepath: str) → bool[source]#
dimcat.data.utils.is_default_resource_descriptor_path(filepath: str) → bool[source]#
dimcat.data.utils.make_fl_resource(name: Optional[str] = None, **options) → Resource[source]#

Creates a frictionless.Resource by passing the **options to the constructor.

dimcat.data.utils.make_rel_path(path: str, start: str)[source]#

Like os.path.relpath() but ensures that path is contained within start.
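One way to add such a containment check on top of os.path.relpath() is sketched below. The function name make_rel_path_sketch is hypothetical, and raising a ValueError for paths outside start is an assumption; the documentation above does not say how the real function reacts.

```python
import os


def make_rel_path_sketch(path: str, start: str) -> str:
    """Return path relative to start, refusing paths outside of start."""
    rel_path = os.path.relpath(path, start)
    # relpath() climbs out of start via '..' components when path lies outside.
    if rel_path == os.pardir or rel_path.startswith(os.pardir + os.sep):
        # Assumption: signal the violation with a ValueError.
        raise ValueError(f"{path!r} is not contained in {start!r}")
    return rel_path


print(make_rel_path_sketch(os.path.join("base", "sub", "f.tsv"), "base"))
```

Plain os.path.relpath() would instead return a path starting with ‘..’ for a file outside start, which is what the check guards against.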

dimcat.data.utils.store_as_json_or_yaml(descriptor_dict: dict, descriptor_path: str, create_dirs: bool = True)[source]#
dimcat.data.utils.warn_about_potentially_unrelated_descriptor(basepath: str, descriptor_filename: str)[source]#
