transforms - Transforming variables, scales and coordinates¶
"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.
Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.
Variable and scale transformations are similar in that they lead to
plotted objects that are indistinguishable. Typically, variable
transformation is done outside the graphics system and so the system
cannot provide transformation-specific guides and decorations for the
plot. The trans class is aimed at being useful for scale and
coordinate transformations.
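As a rough illustration of the distinction, the plain-Python sketch below (not mizani code; all names are illustrative) log-transforms a variable directly, then shows what a scale transformation adds: break positions in transformed space paired with labels recovered through the inverse.

```python
import math

data = [1, 10, 100, 1000]

# Variable transformation: done outside the graphics system.
# The plot only ever sees the transformed numbers.
positions = [math.log10(v) for v in data]

# Scale transformation: the graphics system knows the transform and its
# inverse, so it can place breaks in transformed space and label them
# with the corresponding values in data space.
break_positions = [0, 1, 2, 3]
break_labels = [10 ** b for b in break_positions]

print(positions)
print(break_labels)
```

The plotted positions are the same in both cases; only the scale transformation keeps enough information to label the axis in the original units.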
- class mizani.transforms.atanh_trans(**kwargs)[source]¶
Inverse hyperbolic tangent (arctanh) Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'arctanh'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'tanh'>¶
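The signatures above show that the transform and inverse are the NumPy ufuncs arctanh and tanh. A minimal standard-library sketch of the same pair of functions, assuming scalar inputs strictly between -1 and 1:

```python
import math


def transform(x):
    # Inverse hyperbolic tangent; defined only for -1 < x < 1.
    return math.atanh(x)


def inverse(y):
    # Hyperbolic tangent maps any real number back into (-1, 1).
    return math.tanh(y)


values = [-0.9, -0.5, 0.0, 0.5, 0.9]
round_trip = [inverse(transform(v)) for v in values]
```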
- mizani.transforms.boxcox_trans(p, offset=0, **kwargs)[source]¶
Boxcox Transformation
The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.
The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for \(y \gt 0\) and \(\lambda \neq 0\):
\[y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}\]
When \(\lambda = 0\), the natural log transform is used.
- Parameters:
- p
float
Transformation exponent \(\lambda\).
- offset
int
Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
See also
References
Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
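The formula can be sketched in plain Python as follows. This is an illustration of the Box-Cox definition, not mizani's implementation; the function names are hypothetical, and applying the offset by adding it to y before the power is an assumption based on the usual type 2 form.

```python
import math


def boxcox(y, p, offset=0):
    """Box-Cox transform of a single value; requires y + offset > 0."""
    shifted = y + offset
    if shifted <= 0:
        raise ValueError("Box-Cox requires y + offset > 0")
    if p == 0:
        # Limiting case of (shifted**p - 1) / p as p approaches 0.
        return math.log(shifted)
    return (shifted ** p - 1) / p


def boxcox_inverse(z, p, offset=0):
    """Undo boxcox() for the same p and offset."""
    if p == 0:
        return math.exp(z) - offset
    return (z * p + 1) ** (1 / p) - offset
```

For example, `boxcox(4, 0.5)` evaluates (4^0.5 - 1) / 0.5, and for a very small exponent the result approaches the natural log.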
- mizani.transforms.modulus_trans(p, offset=1, **kwargs)[source]¶
Modulus Transformation
The modulus transformation generalises Box-Cox to work with both positive and negative values.
When \(\lambda \neq 0\)
\[y^{(\lambda)} = \operatorname{sign}(y) \cdot \frac{(|y| + 1)^\lambda - 1}{\lambda}\]
and when \(\lambda = 0\)
\[y^{(\lambda)} = \operatorname{sign}(y) \cdot \ln{(|y| + 1)}\]
- Parameters:
- p
float
Transformation exponent \(\lambda\).
- offset
int
Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
See also
References
Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
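A plain-Python sketch of the modulus transformation of John and Draper, with the offset of 1 from the formulas generalised to an `offset` argument. The function names are hypothetical and this is not mizani's code.

```python
import math


def _sign(v):
    # sign(0) is taken to be 0, so zero maps to zero.
    return math.copysign(1.0, v) if v != 0 else 0.0


def modulus(y, p, offset=1):
    """Modulus transform of a single value (works for negative y too)."""
    shifted = abs(y) + offset
    if p == 0:
        return _sign(y) * math.log(shifted)
    return _sign(y) * (shifted ** p - 1) / p


def modulus_inverse(z, p, offset=1):
    """Undo modulus() for the same p and offset."""
    if p == 0:
        return _sign(z) * (math.exp(abs(z)) - offset)
    return _sign(z) * ((abs(z) * p + 1) ** (1 / p) - offset)
```

Because the magnitude is transformed and the sign is restored afterwards, the function is odd: `modulus(-y, p) == -modulus(y, p)`.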
- class mizani.transforms.datetime_trans(tz=None, **kwargs)[source]¶
Datetime Transformation
- Parameters:
- tz
str | ZoneInfo
Timezone information
Examples
>>> import datetime
>>> from zoneinfo import ZoneInfo
>>> # from backports.zoneinfo import ZoneInfo  # for python < 3.9
>>> from mizani.transforms import datetime_trans
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = datetime.datetime(2022, 1, 20, tzinfo=UTC)
>>> x2 = t.inverse(t.transform(x))
>>> x == x2
True
>>> x.tzinfo == x2.tzinfo
False
>>> x.tzinfo.key
'UTC'
>>> x2.tzinfo.key
'EST'
- dataspace_is_numerical = False¶
Whether the untransformed data is numerical
- domain = (datetime.datetime(1, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')), datetime.datetime(9999, 12, 31, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')))¶
Limits of the transformed data
- breaks_ = <mizani.breaks.date_breaks object>¶
Callable to calculate breaks
- format = <mizani.formatters.date_format object>¶
Function to format breaks
- property tzinfo¶
Alias of tz
- mizani.transforms.exp_trans(base=None, **kwargs)[source]¶
Create an exponential transform class for base
This is the inverse of the log transform.
- Parameters:
- base
float
Base of the exponential, i.e. the base of the logarithm that this transform inverts
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns:
- out
type
Exponential transform class
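The relationship to the log transform can be sketched with two small functions. This is an illustration only: the names are hypothetical, and defaulting to base e is an assumption made by analogy with log_trans, where None selects the natural log.

```python
import math


def exp_transform(x, base=math.e):
    # Forward direction: raise the base to the power x.
    return base ** x


def exp_inverse(y, base=math.e):
    # Inverse direction: logarithm of y in the same base.
    return math.log(y) / math.log(base)


eight = exp_transform(3, base=2)
three = exp_inverse(eight, base=2)
```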
- class mizani.transforms.log10_trans(**kwargs)¶
Log 10 Transformation
- breaks_ = <mizani.breaks.log_breaks object>¶
Callable to calculate breaks
- domain = (2.2250738585072014e-308, inf)¶
Limits of the transformed data
- format = <mizani.formatters.log_format object>¶
Function to format breaks
- static inverse(x)¶
Inverse of x
- minor_breaks = <mizani.breaks.trans_minor_breaks object>¶
Callable to calculate minor_breaks
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log10'>¶
- class mizani.transforms.log1p_trans(**kwargs)[source]¶
Log plus one Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log1p'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'expm1'>¶
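The reason for using log1p and expm1 rather than composing log and exp by hand is numerical accuracy for values close to zero. A small standard-library demonstration (the variable names are illustrative):

```python
import math

x = 1e-12

# Adding 1 first rounds away most of the digits of x.
naive = math.log(1 + x)

# log1p computes log(1 + x) without forming 1 + x explicitly.
accurate = math.log1p(x)

# expm1 is the matching inverse, exp(y) - 1, with the same care.
round_trip = math.expm1(accurate)

print(naive, accurate, round_trip)
```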
- class mizani.transforms.log2_trans(**kwargs)¶
Log 2 Transformation
- breaks_ = <mizani.breaks.log_breaks object>¶
Callable to calculate breaks
- domain = (2.2250738585072014e-308, inf)¶
Limits of the transformed data
- format = <mizani.formatters.log_format object>¶
Function to format breaks
- static inverse(x)¶
Inverse of x
- minor_breaks = <mizani.breaks.trans_minor_breaks object>¶
Callable to calculate minor_breaks
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log2'>¶
- mizani.transforms.log_trans(base=None, **kwargs)[source]¶
Create a log transform class for base
- Parameters:
- base
float
Base for the logarithm. If None, then the natural log is used.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns:
- out
type
Log transform class
- class mizani.transforms.logit_trans(**kwargs)¶
Logit Transformation
- domain = (0, 1)¶
Limits of the transformed data
- static inverse(x)¶
Inverse of x
- static transform(x)¶
Transform of x
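For reference, the standard logit function and its inverse (the logistic function) can be written as below. This is the textbook definition on the open interval (0, 1), offered as a sketch rather than a copy of mizani's static methods.

```python
import math


def logit(p):
    # Log-odds of a probability p, with 0 < p < 1.
    return math.log(p / (1 - p))


def logistic(x):
    # Inverse of logit: maps any real number into (0, 1).
    return 1 / (1 + math.exp(-x))


probabilities = [0.1, 0.5, 0.9]
recovered = [logistic(logit(p)) for p in probabilities]
```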
- mizani.transforms.probability_trans(distribution, *args, **kwargs)[source]¶
Probability Transformation
- Parameters:
- distribution
str
Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.
- args
tuple
Arguments passed to the distribution functions.
- kwargs
dict
Keyword arguments passed to the distribution functions.
Notes
Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.
- mizani.transforms.probit_trans¶
alias of
norm_trans
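In the statistical sense, probit is the quantile function of the standard normal distribution and its inverse is the normal CDF. The sketch below uses the standard library's statistics.NormalDist instead of scipy.stats; which of the two directions mizani assigns to transform and which to inverse is not stated in this section, so the function names here are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def probit(p):
    # Quantile function: probability in (0, 1) -> z-score.
    return std_normal.inv_cdf(p)


def probit_inverse(z):
    # Cumulative distribution function: z-score -> probability.
    return std_normal.cdf(z)


z = probit(0.975)
p = probit_inverse(z)
```

Values of p at or outside 0 and 1 are rejected by `inv_cdf`, which mirrors the note above about computations failing when the data do not match the distribution.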
- class mizani.transforms.reverse_trans(**kwargs)[source]¶
Reverse Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>¶
- class mizani.transforms.sqrt_trans(**kwargs)[source]¶
Square-root Transformation
- transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>¶
- inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'square'>¶
- domain = (0, inf)¶
Limits of the transformed data
- class mizani.transforms.timedelta_trans(**kwargs)[source]¶
Timedelta Transformation
- dataspace_is_numerical = False¶
Whether the untransformed data is numerical
- domain = (datetime.timedelta(days=-999999999), datetime.timedelta(days=999999999, seconds=86399, microseconds=999999))¶
Limits of the transformed data
- breaks_ = <mizani.breaks.timedelta_breaks object>¶
Callable to calculate breaks
- format = <mizani.formatters.timedelta_format object>¶
Function to format breaks
- class mizani.transforms.pd_timedelta_trans(**kwargs)[source]¶
Pandas timedelta Transformation
- dataspace_is_numerical = False¶
Whether the untransformed data is numerical
- domain = (pandas.Timedelta.min, pandas.Timedelta.max)¶
Limits of the transformed data
- breaks_ = <mizani.breaks.timedelta_breaks object>¶
Callable to calculate breaks
- format = <mizani.formatters.timedelta_format object>¶
Function to format breaks
- mizani.transforms.pseudo_log_trans(sigma=1, base=None, **kwargs)[source]¶
Pseudo-log transformation
A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.
- Parameters:
- sigma
float
Scaling factor for the linear part.
- base
int
Approximate logarithm used. If None, then the natural log is used.
- kwargs
dict
Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
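One common way to get this behaviour is an inverse hyperbolic sine scaled by the log of the base. The formula below is the one used by the pseudo-log transform in R's scales package; that mizani uses exactly the same expression is an assumption, and the function names are hypothetical.

```python
import math


def pseudo_log(x, sigma=1, base=math.e):
    # Roughly x / (2 * sigma) near zero, roughly sign(x) * log_base(|x|) far from it.
    return math.asinh(x / (2 * sigma)) / math.log(base)


def pseudo_log_inverse(y, sigma=1, base=math.e):
    return 2 * sigma * math.sinh(y * math.log(base))


near_zero = pseudo_log(0.01)          # close to 0.01 / 2
far_out = pseudo_log(1e6, base=10)    # close to log10(1e6)
negative = pseudo_log(-1e6, base=10)  # signed: close to -log10(1e6)
```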
- class mizani.transforms.trans(**kwargs)[source]¶
Base class for all transforms
This class is used to transform data and also tell the x and y axes how to create and label the tick locations.
The key methods to override are trans.transform() and trans.inverse(). Alternatively, you can quickly create a transform class using the trans_new() function.
- Parameters:
- kwargs
dict
Attributes of the class to set/override
Examples
By default trans returns one minor break between every pair of major breaks
>>> from mizani.transforms import trans
>>> major = [0, 1, 2]
>>> t = trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])
Create a trans that returns 4 minor breaks
>>> from mizani.breaks import minor_breaks
>>> t = trans(minor_breaks=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
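The arithmetic behind those outputs can be reproduced with a few lines of plain Python. This is a sketch of the idea for a linear scale, using a hypothetical helper; it is not the mizani.breaks implementation, which may also take the limits into account.

```python
def minor_breaks_between(major, n=1):
    """Return n equally spaced minor breaks between each pair of major breaks."""
    minor = []
    for lo, hi in zip(major[:-1], major[1:]):
        step = (hi - lo) / (n + 1)
        minor.extend(lo + step * k for k in range(1, n + 1))
    return minor


major = [0, 1, 2]
one_each = minor_breaks_between(major)        # midpoints of each interval
four_each = minor_breaks_between(major, n=4)  # four per interval
```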
- aesthetic = None¶
Aesthetic that the transform works on
- dataspace_is_numerical = True¶
Whether the untransformed data is numerical
- domain = (-inf, inf)¶
Limits of the transformed data
- format = <mizani.formatters.mpl_format object>¶
Function to format breaks
- breaks_ = None¶
Callable to calculate breaks
- minor_breaks = None¶
Callable to calculate minor_breaks
- breaks(limits)[source]¶
Calculate breaks in data space and return them in transformed space.
Expects limits to be in transform space, which is the same space in which the domain is specified.
This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. For example, for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.
- Parameters:
- limits
tuple
The scale limits. Size 2.
- Returns:
- out
array_like
Major breaks
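The clamping idea described above can be sketched as follows. Both helpers are hypothetical stand-ins (`simple_breaks` plays the role of breaks_), and the sketch skips the conversion between data space and transformed space that the real method performs.

```python
def clamp_limits(limits, domain):
    # Pull padded limits back inside the domain of the transform.
    lo = max(limits[0], domain[0])
    hi = min(limits[1], domain[1])
    return lo, hi


def simple_breaks(limits, n=5):
    # Stand-in break calculator: n evenly spaced values across the limits.
    lo, hi = limits
    step = (hi - lo) / (n - 1)
    return [lo + step * k for k in range(n)]


def breaks(limits, domain):
    return simple_breaks(clamp_limits(limits, domain))


padded_limits = (-0.05, 1.05)  # limits expanded beyond [0, 1] for padding
result = breaks(padded_limits, domain=(0, 1))
```

Without the clamp, the first and last breaks would fall at -0.05 and 1.05, which are outside the domain of a probability transform.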
- mizani.transforms.trans_new(name, transform, inverse, breaks=None, minor_breaks=None, _format=None, domain=(-inf, inf), doc='', **kwargs)[source]¶
Create a transformation class object
- Parameters:
- name
str
Name of the transformation
- transform
callable()
f(x)
A function (preferably a ufunc) that computes the transformation.
- inverse
callable()
f(x)
A function (preferably a ufunc) that computes the inverse of the transformation.
- breaks
callable()
f(limits)
Function to compute the breaks for this transform. If None, then a default good enough for a linear domain is used.
- minor_breaks
callable()
f(major, limits)
Function to compute the minor breaks for this transform. If None, then a default good enough for a linear domain is used.
- _format
callable()
f(breaks)
Function to format the generated breaks.
- domain
array_like
Domain over which the transformation is valid. It should be of length 2.
- doc
str
Docstring for the class.
- **kwargs
dict
Attributes of the transform, e.g. if base is passed in kwargs, then t.base would be a valid attribute.
- Returns:
- out
trans
Transform class
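To illustrate the idea of building a transform class from a pair of functions, here is a small factory written with `type()`. It is a hypothetical stand-in named `make_trans`, not mizani's trans_new; the `_trans` naming suffix and the omission of breaks and formatting are assumptions made to keep the sketch short.

```python
import math


def make_trans(name, transform, inverse, domain=(-math.inf, math.inf), **kwargs):
    """Build a simple class exposing transform, inverse, domain and extra attributes."""
    attrs = {
        "transform": staticmethod(transform),
        "inverse": staticmethod(inverse),
        "domain": domain,
        **kwargs,  # e.g. base=3 becomes the class attribute `base`
    }
    return type(f"{name}_trans", (object,), attrs)


def cube(x):
    return x ** 3


def cube_root(y):
    # Real cube root that also handles negative values.
    return math.copysign(abs(y) ** (1 / 3), y)


cube_trans = make_trans("cube", cube, cube_root, base=3)
t = cube_trans()
```

As with the kwargs described above, the extra keyword ends up as an attribute, so `t.base` is available on instances.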