transforms - Transforming variables, scales and coordinates
"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.
Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.
Variable and scale transformations are similar in that they lead to
plotted objects that are indistinguishable. Typically, variable
transformation is done outside the graphics system, so the system
cannot provide transformation-specific guides and decorations for the
plot. The trans class is aimed at being useful for scale and
coordinate transformations.
- class mizani.transforms.atanh_trans(**kwargs: Any)[source]¶
Inverse Hyperbolic Tangent (arctanh) Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'arctanh'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'tanh'>¶
- mizani.transforms.boxcox_trans(p, offset=0, **kwargs)[source]¶
Boxcox Transformation
The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.
The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for \(y \gt 0\):
\[y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}\]
When \(\lambda = 0\), the natural log transform is used.
- Parameters:
- p
float
Transformation exponent \(\lambda\).
- offset
int
Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
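As a rough illustration of the Box-Cox formula, the following sketch computes the transform and its inverse for single values in plain Python. The names `boxcox` and `boxcox_inverse` are made up for this example and are not part of mizani. Adding the offset to `y` before raising to the power is an assumption based on the type 2 form described in the cited references.

```python
import math


def boxcox(y: float, p: float, offset: float = 0.0) -> float:
    """Box-Cox transform of a single value (illustrative, not mizani's code)."""
    shifted = y + offset
    if shifted <= 0:
        raise ValueError("y + offset must be strictly positive")
    if p == 0:
        # Limiting case of (shifted**p - 1) / p as p approaches 0
        return math.log(shifted)
    return (shifted**p - 1) / p


def boxcox_inverse(z: float, p: float, offset: float = 0.0) -> float:
    """Undo boxcox() for the same p and offset."""
    if p == 0:
        return math.exp(z) - offset
    return (z * p + 1) ** (1 / p) - offset


# Transform a value with exponent 0.5, then map it back
roundtrip = boxcox_inverse(boxcox(4.0, 0.5), 0.5)
```

The sketch works on scalars only; an array-based version would follow the same arithmetic element by element.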
See also
References
Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
- mizani.transforms.modulus_trans(p, offset=1, **kwargs)[source]¶
Modulus Transformation
The modulus transformation generalises Box-Cox to work with both positive and negative values.
When \(\lambda \neq 0\)
\[y^{(\lambda)} = \operatorname{sign}(y) \cdot \frac{(|y| + 1)^\lambda - 1}{\lambda}\]
and when \(\lambda = 0\)
\[y^{(\lambda)} = \operatorname{sign}(y) \cdot \ln{(|y| + 1)}\]
- Parameters:
- p
float
Transformation exponent \(\lambda\).
- offset
int
Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
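The modulus formulas can be sketched for single values as follows. The function name `modulus` is hypothetical and this is not mizani's implementation. The `offset` argument stands in for the constant 1 in the formulas, which is an assumption about how the parameter enters the computation.

```python
import math


def modulus(y: float, p: float, offset: float = 1.0) -> float:
    """Modulus transform of a single value (illustrative, not mizani's code)."""
    magnitude = abs(y) + offset
    if p == 0:
        result = math.log(magnitude)
    else:
        result = (magnitude**p - 1) / p
    # sign is -1, 0 or 1, so the output keeps the sign of the input
    sign = (y > 0) - (y < 0)
    return sign * result
```

Because the magnitude is computed from `abs(y)`, the transform is odd: `modulus(-y, p)` equals `-modulus(y, p)`.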
See also
References
Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
- class mizani.transforms.datetime_trans(tz=None, **kwargs)[source]¶
Datetime Transformation
- Parameters:
- tz
str | ZoneInfo
Timezone information
Examples
>>> from datetime import datetime
>>> from zoneinfo import ZoneInfo
>>> from mizani.transforms import datetime_trans
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = [datetime(2022, 1, 20, tzinfo=UTC)]
>>> x2 = t.inverse(t.transform(x))
>>> list(x) == list(x2)
True
>>> x[0].tzinfo == x2[0].tzinfo
False
>>> x[0].tzinfo.key
'UTC'
>>> x2[0].tzinfo.key
'EST'
- breaks_: BreaksFunction = <mizani.breaks.breaks_date object>¶
Callable to calculate breaks
- format: FormatFunction = label_date(fmt='%Y-%m-%d', tz=None)¶
Function to format breaks
- property tzinfo¶
Alias of tz
- mizani.transforms.exp_trans(base: float | None = None, **kwargs: Any)[source]¶
Create an exponential transform class for base
This is the inverse of the log transform.
- Parameters:
- base
float
Base of the logarithm
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns:
- out
type
Exponential transform class
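The inverse relationship between the log and exponential transforms can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, using the standard `math` module rather than mizani, and the variable names are chosen for this example.

```python
import math

base = 2.0
x = 10.0

# Forward log transform in the chosen base, then the exponential undoes it
logged = math.log(x, base)
restored = base**logged
```

Up to floating-point rounding, `restored` matches `x`.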
- class mizani.transforms.identity_trans(**kwargs: Any)[source]¶
Identity Transformation
Examples
The default trans returns one minor break between every pair of major breaks
>>> major = [0, 1, 2]
>>> t = identity_trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])
Create a trans that returns 4 minor breaks
>>> t = identity_trans(minor_breaks=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
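The placement of minor breaks in the examples above can be reproduced with a short plain-Python sketch. The helper `minor_breaks_between` is hypothetical and is not how mizani computes minor breaks internally; it ignores limits and assumes the major breaks are sorted.

```python
def minor_breaks_between(major: list[float], n: int = 1) -> list[float]:
    """Place n evenly spaced minor breaks between each pair of major breaks."""
    minor = []
    for lo, hi in zip(major, major[1:]):
        step = (hi - lo) / (n + 1)
        minor.extend(lo + step * i for i in range(1, n + 1))
    return minor


one_each = minor_breaks_between([0, 1, 2])      # one minor break per interval
four_each = minor_breaks_between([0, 1, 2], 4)  # four minor breaks per interval
```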
- transform_is_linear: bool = True¶
Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
- static transform(*args)¶
Return whatever is passed in
- static inverse(*args)¶
Return whatever is passed in
- class mizani.transforms.log10_trans(**kwargs: Any)¶
Log 10 Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_log object>¶
Callable to calculate breaks
- format: FormatFunction = label_log(base=10, exponent_limits=(-4, 4), mathtex=False)¶
Function to format breaks
- static inverse(x)¶
Inverse of x
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log10'>¶
- class mizani.transforms.log1p_trans(**kwargs: Any)[source]¶
Log plus one Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log1p'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'expm1'>¶
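The signatures above show that this transform uses the `log1p` and `expm1` ufuncs. The sketch below uses the equivalent functions from Python's `math` module to show why a dedicated log-plus-one function can matter for values close to zero. The variable names are chosen for this example.

```python
import math

tiny = 1e-12

naive = math.log(1 + tiny)   # 1 + tiny is rounded before the log is taken
stable = math.log1p(tiny)    # computed without forming 1 + tiny
restored = math.expm1(stable)
```

Here `stable` stays very close to `tiny`, while `naive` carries a visible rounding error, and `expm1` maps the result back to the original value.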
- class mizani.transforms.log2_trans(**kwargs: Any)¶
Log 2 Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_log object>¶
Callable to calculate breaks
- format: FormatFunction = label_log(base=2, exponent_limits=(-4, 4), mathtex=False)¶
Function to format breaks
- static inverse(x)¶
Inverse of x
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log2'>¶
- mizani.transforms.log_trans(base: float | None = None, **kwargs: Any) trans [source]¶
Create a log transform class for base
- Parameters:
- base
float
Base for the logarithm. If None, then the natural log is used.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns:
- out
type
Log transform class
- class mizani.transforms.logit_trans(**kwargs: Any)¶
Logit Transformation
- static inverse(x: FloatArrayLike) NDArrayFloat ¶
Inverse of x
- static transform(x: FloatArrayLike) NDArrayFloat ¶
Transform of x
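For reference, the standard mathematical definition of the logit and its inverse can be sketched as below. The function names are hypothetical, and mizani's implementation may differ in details such as array handling or behaviour at the boundaries 0 and 1.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def inverse_logit(x: float) -> float:
    """Logistic function, mapping log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))
```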
- mizani.transforms.probability_trans(distribution: str, *args, **kwargs) trans [source]¶
Probability Transformation
- Parameters:
- distribution
str
Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.
- args
tuple
Arguments passed to the distribution functions.
- kwargs
dict
Keyword arguments passed to the distribution functions.
Notes
Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.
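A probability transformation can be pictured with the standard normal distribution. The sketch below assumes that the forward transform is the distribution's quantile function and the inverse is its cumulative distribution function; that pairing is an assumption here, not something stated in this section. It uses `statistics.NormalDist` from the standard library rather than scipy.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

p = 0.975
z = standard_normal.inv_cdf(p)    # probability -> quantile
back = standard_normal.cdf(z)     # quantile -> probability
```

Probabilities of exactly 0 or 1 have no finite quantile under the normal distribution, which is one way such computations can run into errors.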
- mizani.transforms.probit_trans¶
alias of norm_trans
- class mizani.transforms.reverse_trans(**kwargs: Any)[source]¶
Reverse Transformation
- transform_is_linear: bool = True¶
Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>¶
- class mizani.transforms.sqrt_trans(**kwargs: Any)[source]¶
Square-root Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'square'>¶
- class mizani.transforms.symlog_trans(**kwargs: Any)[source]¶
Symmetric Log Transformation
The symmetric logarithmic transformation is defined as
f(x) = log(x+1) for x >= 0
       -log(-x+1) for x < 0
It can be useful for data that has a wide range of both positive and negative values (including zero).
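The definition above translates directly into a few lines of plain Python. The names `symlog` and `symlog_inverse` are made up for this sketch, which works on single values and is not mizani's implementation.

```python
import math


def symlog(x: float) -> float:
    """Symmetric log: log(x + 1) for x >= 0, -log(-x + 1) for x < 0."""
    return math.copysign(math.log1p(abs(x)), x)


def symlog_inverse(y: float) -> float:
    """Undo symlog()."""
    return math.copysign(math.expm1(abs(y)), y)
```

Zero maps to zero, and negative inputs mirror their positive counterparts, so the transform is defined over the whole real line.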
- class mizani.transforms.timedelta_trans(**kwargs: Any)[source]¶
Timedelta Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_timedelta object>¶
Callable to calculate breaks
- format: FormatFunction = label_timedelta(units=None, show_units=True, zero_has_units=True, usetex=False, space=True, use_plurals=True)¶
Function to format breaks
- class mizani.transforms.pd_timedelta_trans(**kwargs: Any)[source]¶
Pandas timedelta Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_timedelta object>¶
Callable to calculate breaks
- format: FormatFunction = label_timedelta(units=None, show_units=True, zero_has_units=True, usetex=False, space=True, use_plurals=True)¶
Function to format breaks
- class mizani.transforms.pseudo_log_trans(sigma=1, base=None, **kwargs)[source]¶
Pseudo-log transformation
A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.
- Parameters:
- sigma
float
Scaling factor for the linear part.
- base
int
Approximate logarithm used. If None, then the natural log is used.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
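One common way to get a signed log scale with a smooth linear region around 0 is the inverse hyperbolic sine. The sketch below uses the form found in R's scales package. Whether mizani uses exactly this formula is an assumption, and the function names are hypothetical.

```python
import math


def pseudo_log(x: float, sigma: float = 1.0, base: float = math.e) -> float:
    """Signed pseudo-logarithm: linear near 0, logarithmic for large |x|."""
    return math.asinh(x / (2 * sigma)) / math.log(base)


def pseudo_log_inverse(y: float, sigma: float = 1.0, base: float = math.e) -> float:
    """Undo pseudo_log() for the same sigma and base."""
    return 2 * sigma * math.sinh(y * math.log(base))
```

Under this form, small inputs behave like `x / (2 * sigma)` when the natural log is used, and large inputs approach the logarithm of `x / sigma` in the chosen base.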
- class mizani.transforms.trans(**kwargs: Any)[source]¶
Base class for all transforms
This class is used to transform data and also tell the x and y axes how to create and label the tick locations.
The key methods to override are trans.transform() and trans.inverse(). Alternatively, you can quickly create a transform class using the trans_new() function.
- Parameters:
- kwargs
dict
Attributes of the class to set/override
- transform_is_linear: bool = False¶
Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
- breaks_: BreaksFunction = <mizani.breaks.breaks_extended object>¶
Callable to calculate breaks
- format: FormatFunction = label_number(accuracy=None, precision=None, scale=1, prefix='', suffix='', big_mark=',', decimal_mark='.', fill='', style_negative='-', style_positive='', align='>', width=None)¶
Function to format breaks
- property domain_is_numerical: bool¶
Return True if transformation acts on numerical data. e.g. int, float, and imag are numerical but datetime is not.
- minor_breaks(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) NDArrayFloat [source]¶
Calculate minor_breaks
- breaks(limits: tuple[Any, Any]) NDArrayAny [source]¶
Calculate breaks in data space and return them in transformed space.
Expects limits to be in transform space; this is the same space as that where the domain is specified.
This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. For example, for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.
- Parameters:
- limits
tuple
The scale limits. Size 2.
- Returns:
- out
array_like
Major breaks
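The idea of keeping breaks inside the domain can be sketched in plain Python. Both functions below are hypothetical: `quarter_steps` stands in for a breaks callable, and `breaks_within_domain` shows one possible way to discard breaks outside the domain. How mizani enforces this internally is not shown in this section, so treat the approach as an assumption.

```python
import math


def quarter_steps(limits):
    """Hypothetical breaks function: multiples of 0.25 covering the limits."""
    start = math.floor(limits[0] / 0.25)
    stop = math.ceil(limits[1] / 0.25)
    return [i * 0.25 for i in range(start, stop + 1)]


def breaks_within_domain(breaks_func, limits, domain):
    """Call breaks_func, then drop any breaks that fall outside the domain."""
    lower, upper = domain
    return [b for b in breaks_func(limits) if lower <= b <= upper]


# Limits padded beyond the probability domain [0, 1]
padded_limits = (-0.05, 1.05)
raw = quarter_steps(padded_limits)
kept = breaks_within_domain(quarter_steps, padded_limits, (0.0, 1.0))
```

With the padded limits, `raw` includes values below 0 and above 1, while `kept` retains only the breaks from 0 to 1.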
- mizani.transforms.trans_new(name: str, transform: TransformFunction, inverse: InverseFunction, breaks: BreaksFunction | None = None, minor_breaks: MinorBreaksFunction | None = None, _format: FormatFunction | None = None, domain=(-inf, inf), doc: str = '', **kwargs) trans [source]¶
Create a transformation class object
- Parameters:
- name
str
Name of the transformation
- transform
callable()
f(x)
A function (preferably a ufunc) that computes the transformation.
- inverse
callable()
f(x)
A function (preferably a ufunc) that computes the inverse of the transformation.
- breaks
callable()
f(limits)
Function to compute the breaks for this transform. If None, then a default good enough for a linear domain is used.
- minor_breaks
callable()
f(major, limits)
Function to compute the minor breaks for this transform. If None, then a default good enough for a linear domain is used.
- _format
callable()
f(breaks)
Function to format the generated breaks.
- domain
array_like
Domain over which the transformation is valid. It should be of length 2.
- doc
str
Docstring for the class.
- **kwargs
dict
Attributes of the transform, e.g. if base is passed in kwargs, then t.base would be a valid attribute.
- Returns:
- out
trans
Transform class
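To make the idea of a class-creating function concrete, here is a minimal plain-Python sketch of a factory that builds a class from a pair of functions and turns extra keyword arguments into attributes. `Trans` and `make_trans` are hypothetical stand-ins written for this example; they are not mizani's `trans` or `trans_new`, and the `_trans` suffix on the class name is an assumption.

```python
import math


class Trans:
    """Hypothetical stand-in for a transform base class (not mizani.trans)."""

    @staticmethod
    def transform(x):
        return x

    @staticmethod
    def inverse(x):
        return x


def make_trans(name, transform, inverse, **kwargs):
    """Build a new class named '<name>_trans' carrying the given functions."""
    attrs = {
        "transform": staticmethod(transform),
        "inverse": staticmethod(inverse),
        **kwargs,  # extra keyword arguments become class attributes
    }
    return type(f"{name}_trans", (Trans,), attrs)


cube_trans = make_trans(
    "cube",
    transform=lambda x: x**3,
    inverse=lambda x: math.copysign(abs(x) ** (1 / 3), x),
    power=3,
)
t = cube_trans()
```

In this sketch, `t.power` is available as an attribute in the same way the parameter description says `t.base` would be when `base` is passed in kwargs.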