transforms - Transforming variables, scales and coordinates

"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.

  • Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.

  • Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.

  • Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.

Variable and scale transformations are similar in that they lead to plotted objects that are indistinguishable. Typically, variable transformation is done outside the graphics system, so the system cannot provide transformation-specific guides and decorations for the plot. The trans class is aimed at being useful for scale and coordinate transformations.
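The difference between the two can be sketched with the standard library alone. The snippet below does not call mizani; the names values, variable_labels and scale_labels are illustrative. Both approaches place the points at the same positions, but only the scale transformation leaves the system able to label the guide in the original units.

```python
import math

values = [1, 10, 100, 1000]

# Variable transformation: the data is changed before plotting, so the
# graphics system only ever sees the logged values and labels with them.
variable_positions = [math.log10(v) for v in values]
variable_labels = [f"{p:g}" for p in variable_positions]

# Scale transformation: the system applies log10 itself, so it can put the
# points at the same positions and still label the guide in original units.
scale_positions = [math.log10(v) for v in values]
scale_labels = [f"{v:g}" for v in values]

print(variable_positions == scale_positions)
print(variable_labels)
print(scale_labels)
```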

class mizani.transforms.asn_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = True, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Arc-sin square-root Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x
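The page does not state the formula, so the following is only a sketch assuming the common definition of the arc-sine square-root transform, 2·asin(√x), for proportions in [0, 1]. The function names asn and asn_inverse are hypothetical and are not part of mizani.

```python
import math


def asn(x):
    """Arc-sine square-root: maps a proportion in [0, 1] to [0, pi]."""
    return 2 * math.asin(math.sqrt(x))


def asn_inverse(y):
    """Inverse: maps a value in [0, pi] back to a proportion."""
    return math.sin(y / 2) ** 2


proportions = [0.0, 0.01, 0.5, 0.99, 1.0]
round_trip = [asn_inverse(asn(p)) for p in proportions]
print(round_trip)
```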

class mizani.transforms.atanh_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = True, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Inverse hyperbolic tangent (arc-tanh) Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x
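As a rough illustration of the transform and inverse pair that the name atanh suggests, the sketch below uses math.atanh and math.tanh from the standard library. It assumes the transform is the inverse hyperbolic tangent; it is not mizani's code.

```python
import math

# atanh is only defined on the open interval (-1, 1);
# math.atanh raises ValueError at -1, 1 and beyond.
xs = [-0.9, -0.5, 0.0, 0.5, 0.9]

transformed = [math.atanh(x) for x in xs]
recovered = [math.tanh(t) for t in transformed]
print(recovered)
```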

class mizani.transforms.boxcox_trans(p: float, offset: int = 0, *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Boxcox Transformation

The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.

The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for \(y \gt 0\) and \(\lambda \neq 0\):

\[y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}\]

When \(\lambda = 0\), the natural log transform is used.

Parameters:
p : float

Transformation exponent \(\lambda\).

offset : int

Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.

See also

modulus_trans()

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x
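The Box-Cox formula can be sketched for scalar values as follows. This is an illustration under stated assumptions, not mizani's implementation: the helper names are hypothetical, the offset is added to the value before the power is taken, and the log branch is taken when the exponent is within an assumed tolerance of zero.

```python
import math

ZERO_TOLERANCE = 1e-7  # assumed threshold for treating the exponent as 0


def boxcox(y, p, offset=0):
    """Box-Cox transform of one value; requires y + offset > 0."""
    if abs(p) < ZERO_TOLERANCE:
        return math.log(y + offset)
    return ((y + offset) ** p - 1) / p


def boxcox_inverse(z, p, offset=0):
    """Undo boxcox for the same p and offset."""
    if abs(p) < ZERO_TOLERANCE:
        return math.exp(z) - offset
    return (z * p + 1) ** (1 / p) - offset


print(boxcox(4, 0.5))
print(boxcox_inverse(boxcox(4, 0.5), 0.5))
```

With offset left at 0 this corresponds to type 1; a positive offset gives the type 2 form described in the parameter list.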

class mizani.transforms.modulus_trans(p: float, offset: int = 1, *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Modulus Transformation

The modulus transformation generalises Box-Cox to work with both positive and negative values.

When \(\lambda \neq 0\)

\[y^{(\lambda)} = \operatorname{sign}(y) \cdot \frac{(|y| + 1)^\lambda - 1}{\lambda}\]

and when \(\lambda = 0\)

\[y^{(\lambda)} = \operatorname{sign}(y) \cdot \ln{(|y| + 1)}\]

Parameters:
p : float

Transformation exponent \(\lambda\).

offset : int

Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.

See also

boxcox_trans()

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x
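A scalar sketch of the modulus transformation is given below. It is not mizani's implementation: the helper names are hypothetical, the constant 1 in the formulas is generalised to the offset parameter, and the log branch is taken when the exponent is within an assumed tolerance of zero.

```python
import math

ZERO_TOLERANCE = 1e-7  # assumed threshold for treating the exponent as 0


def _sign(v):
    """Return -1.0, 0.0 or 1.0 according to the sign of v."""
    return 0.0 if v == 0 else math.copysign(1.0, v)


def modulus(y, p, offset=1):
    """Modulus transform of one value; works for negative y as well."""
    if abs(p) < ZERO_TOLERANCE:
        return _sign(y) * math.log(abs(y) + offset)
    return _sign(y) * ((abs(y) + offset) ** p - 1) / p


def modulus_inverse(z, p, offset=1):
    """Undo modulus for the same p and offset."""
    if abs(p) < ZERO_TOLERANCE:
        return _sign(z) * (math.exp(abs(z)) - offset)
    return _sign(z) * ((abs(z) * p + 1) ** (1 / p) - offset)


print(modulus(3, 0.5), modulus(-3, 0.5))
```

The transform is odd, so a value and its negation map to results of equal magnitude and opposite sign.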

class mizani.transforms.datetime_trans(tz: tzinfo | str | None = None, *, domain: DomainType = (datetime.datetime(1, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')), datetime.datetime(9999, 12, 31, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC'))), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Datetime Transformation

Parameters:
tz : str | ZoneInfo

Timezone information

Examples

>>> from datetime import datetime
>>> from zoneinfo import ZoneInfo
>>> from mizani.transforms import datetime_trans
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = [datetime(2022, 1, 20, tzinfo=UTC)]
>>> x2 = t.inverse(t.transform(x))
>>> list(x) == list(x2)
True
>>> x[0].tzinfo == x2[0].tzinfo
False
>>> x[0].tzinfo.key
'UTC'
>>> x2[0].tzinfo.key
'EST'

breaks_func: BreaksFunction

Callable to calculate breaks

format_func: FormatFunction

Function to format breaks

transform(x: DatetimeArrayLike) NDArrayFloat[source]

Transform from date to a numerical format

The transformed values have a unit of [days].

inverse(x: FloatArrayLike) NDArrayDatetime[source]

Transform to date from numerical format

property tzinfo

Alias of tz

diff_type_to_num(x: TimedeltaArrayLike) FloatArrayLike[source]

Convert timedelta to numerical format

The timedeltas are converted to a unit of [days].

class mizani.transforms.exp_trans(base: float = np.float64(2.718281828459045), *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Create an exponential transform class for base

This is the inverse of the log transform.

Parameters:
base : float

Base of the exponential

Returns:
out : type

Exponential transform class

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x
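The relationship between the exponential and log transforms can be sketched as below. The helpers exp_base and log_base are hypothetical names written with the standard library; they only illustrate that each undoes the other for a given base.

```python
import math


def exp_base(x, base=math.e):
    """Raise base to the power x."""
    return base ** x


def log_base(x, base=math.e):
    """Logarithm of x in the given base (x must be positive)."""
    return math.log(x) / math.log(base)


for base in (2, 10, math.e):
    for x in (-2.0, 0.0, 0.5, 3.0):
        # Applying the exponential and then the log recovers x.
        assert math.isclose(log_base(exp_base(x, base), base), x, abs_tol=1e-12)

print(exp_base(3, base=2))
```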

class mizani.transforms.identity_trans(transform_is_linear: bool = True, *, domain: DomainType = (-inf, inf), breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Identity Transformation

Examples

The default trans returns one minor break between every pair of major breaks.

>>> from mizani.breaks import minor_breaks
>>> from mizani.transforms import identity_trans
>>> major = [0, 1, 2]
>>> t = identity_trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])

Create a trans that returns 4 minor breaks

>>> t = identity_trans(minor_breaks_func=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x
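The minor-break behaviour shown in the Examples for this class can be approximated in plain Python. The function below is a hypothetical sketch of evenly spaced minor breaks between consecutive major breaks; it is not the code behind minor_breaks_func.

```python
def evenly_spaced_minor_breaks(major, n=1):
    """Return n evenly spaced minor breaks between each pair of major breaks."""
    minor = []
    for lo, hi in zip(major, major[1:]):
        step = (hi - lo) / (n + 1)
        minor.extend(lo + step * k for k in range(1, n + 1))
    return minor


major = [0, 1, 2]
print(evenly_spaced_minor_breaks(major))       # one break per interval
print(evenly_spaced_minor_breaks(major, n=4))  # four breaks per interval
```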

class mizani.transforms.log10_trans(base: float = 10, *, domain: DomainType = (2.2250738585072014e-308, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Log 10 Transformation

class mizani.transforms.log1p_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Log plus one Transformation

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x
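Assuming "log plus one" means log(1 + x), the standard library pair math.log1p and math.expm1 shows why a dedicated function is useful: it stays accurate for values very close to zero, where adding 1 first would lose the value entirely. This is an illustration, not mizani's code.

```python
import math

tiny = 1e-18

# 1 + tiny rounds to exactly 1.0 in double precision, so the value is lost.
naive = math.log(1 + tiny)

# log1p computes log(1 + x) without forming 1 + x explicitly.
accurate = math.log1p(tiny)

# expm1 computes exp(y) - 1 and is the matching inverse.
recovered = math.expm1(accurate)

print(naive, accurate, recovered)
```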

class mizani.transforms.log2_trans(base: float = 2, *, domain: DomainType = (2.2250738585072014e-308, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Log 2 Transformation

class mizani.transforms.log_trans(base: float = np.float64(2.718281828459045), *, domain: DomainType = (2.2250738585072014e-308, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Create a log transform class for base

Parameters:
base : float

Base for the logarithm. If None, then the natural log is used.

Returns:
out : type

Log transform class

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x

class mizani.transforms.logit_trans[source]

Logit Transformation

class mizani.transforms.probability_trans(distribution: str, *args, **kwargs)[source]

Probability Transformation

Parameters:
distribution : str

Name of the distribution. Any of the continuous or discrete distributions listed at scipy.stats is valid.

args : tuple

Arguments passed to the distribution functions.

kwargs : dict

Keyword arguments passed to the distribution functions.

Keyword arguments passed to the distribution functions.

Notes

Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x

class mizani.transforms.probit_trans[source]

Probit Transformation
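The sketch below illustrates the logit and probit cases using only the standard library. It assumes that a probability transformation maps a probability to a quantile with the distribution's inverse CDF and maps back with the CDF; the page does not spell this out, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def logit(p):
    """Quantile function of the standard logistic distribution."""
    return math.log(p / (1 - p))


def logistic(z):
    """CDF of the standard logistic distribution; undoes logit."""
    return 1 / (1 + math.exp(-z))


def probit(p):
    """Quantile function of the standard normal distribution."""
    return std_normal.inv_cdf(p)


def probit_inverse(z):
    """CDF of the standard normal distribution; undoes probit."""
    return std_normal.cdf(z)


probabilities = [0.025, 0.25, 0.5, 0.75, 0.975]
print([round(logit(p), 3) for p in probabilities])
print([round(probit(p), 3) for p in probabilities])
```

Both functions are only defined for probabilities strictly between 0 and 1.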

class mizani.transforms.reverse_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = True, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Reverse Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x

class mizani.transforms.sqrt_trans(*, domain: DomainType = (0, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Square-root Transformation

transform(x)[source]

Transform of x

inverse(x)[source]

Inverse of x

class mizani.transforms.symlog_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <mizani.breaks.breaks_symlog object>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Symmetric Log Transformation

The symmetric logarithmic transformation is defined as

f(x) = log(x+1) for x >= 0
       -log(-x+1) for x < 0

It can be useful for data that has a wide range of both positive and negative values (including zero).

breaks_func: BreaksFunction = <mizani.breaks.breaks_symlog object>

Callable to calculate breaks

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x
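The piecewise definition given for this class can be written compactly with the standard library, as sketched below. The sketch assumes the natural logarithm and uses hypothetical function names; it is not mizani's implementation.

```python
import math


def symlog(x):
    """log(x + 1) for x >= 0 and -log(-x + 1) for x < 0."""
    return math.copysign(math.log1p(abs(x)), x)


def symlog_inverse(y):
    """Undo symlog."""
    return math.copysign(math.expm1(abs(y)), y)


data = [-1000.0, -1.0, 0.0, 1.0, 1000.0]
transformed = [symlog(x) for x in data]
recovered = [symlog_inverse(y) for y in transformed]
print(transformed)
```

Zero maps to zero and the function is odd, which is what lets it handle data that spans both signs.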

class mizani.transforms.timedelta_trans(*, domain: DomainType = (datetime.timedelta(days=-999999999), datetime.timedelta(days=999999999, seconds=86399, microseconds=999999)), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Timedelta Transformation

breaks_func: BreaksFunction

Callable to calculate breaks

format_func: FormatFunction

Function to format breaks

transform(x: TimedeltaArrayLike) NDArrayFloat[source]

Transform from Timedelta to numerical format

The transformed values have a unit of [days].

inverse(x: FloatArrayLike) Sequence[pd.Timedelta][source]

Transform to Timedelta from numerical format

diff_type_to_num(x: TimedeltaArrayLike) FloatArrayLike[source]

Convert timedelta to numerical format

The timedeltas are converted to a unit of [days].
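Converting timedeltas to a unit of days and back can be sketched with the standard library alone. The helper names below are hypothetical and the code uses datetime.timedelta rather than any mizani or pandas type.

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60


def timedelta_to_days(td):
    """Express a timedelta as a float number of days."""
    return td.total_seconds() / SECONDS_PER_DAY


def days_to_timedelta(days):
    """Build a timedelta from a float number of days."""
    return timedelta(days=days)


deltas = [timedelta(hours=12), timedelta(days=2, hours=6), timedelta(minutes=-90)]
as_days = [timedelta_to_days(td) for td in deltas]
round_trip = [days_to_timedelta(d) for d in as_days]
print(as_days)
```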

class mizani.transforms.pd_timedelta_trans(*, domain: DomainType = (pd.Timedelta.min, pd.Timedelta.max), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Pandas timedelta Transformation

class mizani.transforms.pseudo_log_trans(sigma: float = 1, base: float = np.float64(2.718281828459045), *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Pseudo-log transformation

A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.

Parameters:
sigma : float

Scaling factor for the linear part.

base : float

Base of the approximate logarithm. If None, then the natural log is used.

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x

minor_breaks(major: FloatArrayLike, limits: tuple[float, float] | None = None, n: int | None = None) NDArrayFloat[source]

Calculate minor_breaks
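One common formulation of a pseudo-log transformation, the one used by the pseudo_log_trans function in R's scales package, is asinh(x / (2·sigma)) / log(base). The sketch below assumes that formulation; the page does not state the formula and the function names are hypothetical.

```python
import math


def pseudo_log(x, sigma=1, base=math.e):
    """Signed log-like transform that is roughly linear near zero."""
    return math.asinh(x / (2 * sigma)) / math.log(base)


def pseudo_log_inverse(y, sigma=1, base=math.e):
    """Undo pseudo_log for the same sigma and base."""
    return 2 * sigma * math.sinh(y * math.log(base))


# Near zero the transform behaves like x / (2 * sigma).
print(pseudo_log(0.001))

# Far from zero it approaches the signed logarithm in the given base.
print(pseudo_log(1e6, base=10), pseudo_log(-1e6, base=10))
```

Unlike a plain logarithm, this function is defined at zero and for negative values.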

class mizani.transforms.reciprocal_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

Reciprocal Transformation

transform(x: FloatArrayLike) NDArrayFloat[source]

Transform of x

inverse(x: FloatArrayLike) NDArrayFloat[source]

Inverse of x

class mizani.transforms.trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)[source]

transform_is_linear: bool = False

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
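The distinction can be illustrated with a simple midpoint check: a linear (affine) function maps the midpoint of an interval to the midpoint of the mapped endpoints. The helper below is a hypothetical sketch, and passing the check on one interval is a necessary condition only, not a proof of linearity.

```python
import math


def passes_midpoint_check(f, a, b):
    """True if f maps the midpoint of [a, b] to the midpoint of f(a), f(b)."""
    mid_in = (a + b) / 2
    mid_out = (f(a) + f(b)) / 2
    return math.isclose(f(mid_in), mid_out)


print(passes_midpoint_check(lambda x: 2 * x, 1, 9))  # 2x
print(passes_midpoint_check(lambda x: 1 / x, 1, 9))  # 1/x
print(passes_midpoint_check(math.log, 1, 9))         # log(x)
```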

breaks_func: BreaksFunction

Callable to calculate breaks

format_func: FormatFunction

Function to format breaks

minor_breaks_func: MinorBreaksFunction | None = None

Callable to calculate minor breaks

abstract transform(x: TFloatArrayLike) TFloatArrayLike[source]

Transform of x

abstract inverse(x: TFloatArrayLike) TFloatArrayLike[source]

Inverse of x

property domain_is_numerical: bool

Return True if transformation acts on numerical data. e.g. int, float, and imag are numerical but datetime is not.

minor_breaks(major: FloatArrayLike, limits: tuple[float, float] | None = None, n: int | None = None) NDArrayFloat[source]

Calculate minor_breaks

breaks(limits: DomainType) NDArrayFloat[source]

Calculate breaks in data space and return them in transformed space.

Expects limits to be in transform space. This is the same space as that in which the domain is specified.

This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. For example, for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.

Parameters:
limits : tuple

The scale limits. Size 2.

Returns:
out : array_like

Major breaks
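One way to keep breaks inside a domain is to clamp the requested limits to the domain before computing the breaks. The sketch below shows that idea with hypothetical helpers; clamp_limits and even_breaks are not mizani functions, and the page does not say which approach breaks() takes internally.

```python
def clamp_limits(limits, domain):
    """Shrink limits so that they do not extend beyond the domain."""
    return (max(domain[0], limits[0]), min(domain[1], limits[1]))


def even_breaks(limits, n=5):
    """Hypothetical break calculator: n evenly spaced values across limits."""
    lo, hi = limits
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


# Limits padded beyond the [0, 1] domain of a probability scale.
expanded_limits = (-0.05, 1.05)
domain = (0, 1)

safe_limits = clamp_limits(expanded_limits, domain)
print(safe_limits)
print(even_breaks(safe_limits))
```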

format(x: Any) Sequence[str][source]

Format breaks

When subclassing, you can override this function, or you can just define format_func.

diff_type_to_num(x: Any) FloatArrayLike[source]

Convert the difference between two points in the domain to a numeric value

This function is necessary for some arithmetic operations in the transform space of a domain when the difference between any two points in that domain is not numeric.

For example, for a domain of datetime value types, the difference on the domain is of type timedelta. In this case this function should expect timedeltas and convert them to float values that are compatible with (same units as) the transform value of datetimes.

Parameters:
x

Differences
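The requirement described here is that a width computed from two transformed points agrees with the converted difference of the original points. The sketch below checks that property with the standard library; the epoch, the unit of days and the helper names are assumptions made for the illustration.

```python
from datetime import datetime, timedelta, timezone

ONE_DAY = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed reference point


def datetime_to_num(d):
    """Days elapsed since EPOCH, as a float."""
    return (d - EPOCH) / ONE_DAY


def timedelta_to_num(td):
    """A timedelta expressed in days, the same unit as datetime_to_num."""
    return td / ONE_DAY


a = datetime(2022, 1, 20, tzinfo=timezone.utc)
b = datetime(2022, 1, 21, 12, tzinfo=timezone.utc)

width_from_points = datetime_to_num(b) - datetime_to_num(a)
width_from_difference = timedelta_to_num(b - a)
print(width_from_points, width_from_difference)
```

Because both helpers use days, the two widths agree, so a difference taken in the domain can be used directly in arithmetic on transformed values.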

mizani.transforms.gettrans(t: str | Type[trans] | trans | None = None)[source]

Return a trans object

Parameters:
t : str | type | trans

Name of transformation function. If None, returns an identity transform.

Returns:
out : trans
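The lookup pattern that gettrans describes (accept a name, an existing object, or None) can be sketched with a toy registry. Everything below is hypothetical: TRANSFORMS, get_transform and the use of (transform, inverse) pairs are stand-ins for illustration and do not reflect how mizani resolves names.

```python
import math

# Hypothetical registry mapping a name to a (transform, inverse) pair.
TRANSFORMS = {
    "identity": (lambda x: x, lambda y: y),
    "log10": (math.log10, lambda y: 10 ** y),
    "sqrt": (math.sqrt, lambda y: y ** 2),
    "reverse": (lambda x: -x, lambda y: -y),
}


def get_transform(t=None):
    """Resolve None, a name, or a ready-made pair to a (transform, inverse) pair."""
    if t is None:
        return TRANSFORMS["identity"]
    if isinstance(t, str):
        try:
            return TRANSFORMS[t]
        except KeyError:
            raise ValueError(f"unknown transform name: {t!r}") from None
    return t  # assume the caller passed a usable pair


transform, inverse = get_transform("log10")
print(transform(100), inverse(2))
```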