vaep.plotting package#
- vaep.plotting.make_large_descriptors(size='xx-large')[source]#
Helper function to have very large titles, labels and tick texts for matplotlib plots by default.
- size: str
Font size or an allowed size category. Change the default if necessary; default is ‘xx-large’.
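A minimal sketch of what such a helper might do, assuming it works by updating matplotlib's font-size entries in `rcParams`. The name `set_large_descriptors` and the exact list of keys are illustrative and not taken from the vaep source.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt


def set_large_descriptors(size="xx-large"):
    """Hypothetical stand-in: enlarge titles, labels and tick texts globally."""
    plt.rcParams.update({
        "axes.titlesize": size,
        "axes.labelsize": size,
        "xtick.labelsize": size,
        "ytick.labelsize": size,
        "legend.fontsize": size,
        "figure.titlesize": size,
    })


set_large_descriptors("xx-large")
```

Because `rcParams` is global, a call like this affects every figure created afterwards in the same session.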
- vaep.plotting.plot_cutoffs(df: pd.DataFrame, feat_completness_over_samples: int = None, min_feat_in_sample: int = None) tuple[matplotlib.figure.Figure, np.array[matplotlib.axes.Axes]] [source]#
Plot the number of available features along index and columns (features vs. samples), optionally including cutoffs.
- Parameters:
- Returns:
Figure and the array of Axes drawn on it.
- Return type:
tuple[matplotlib.figure.Figure, np.array[matplotlib.axes.Axes]]
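The counts that such a plot is based on can be sketched with plain pandas. This is an illustration only: the toy table is made up, and treating both cutoffs as minimum counts is an assumption, not something the signature confirms.

```python
import numpy as np
import pandas as pd

# Toy intensity table: 4 samples (rows) x 3 features (columns), NaN = missing.
df = pd.DataFrame(
    [[1.0, np.nan, 3.0],
     [4.0, 5.0, np.nan],
     [7.0, np.nan, np.nan],
     [1.0, 2.0, 3.0]],
    index=["s1", "s2", "s3", "s4"],
    columns=["f1", "f2", "f3"],
)

notna = df.notna()
feat_per_sample = notna.sum(axis=1)   # features available in each sample
samples_per_feat = notna.sum(axis=0)  # samples in which each feature is observed

# Assumed interpretation: both cutoffs are minimum counts.
min_feat_in_sample = 2
feat_completness_over_samples = 3

samples_kept = feat_per_sample[feat_per_sample >= min_feat_in_sample].index.tolist()
feats_kept = samples_per_feat[
    samples_per_feat >= feat_completness_over_samples
].index.tolist()
```

Here `samples_kept` holds the samples that meet the per-sample cutoff and `feats_kept` the features that meet the per-feature cutoff.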
- vaep.plotting.plot_feat_counts(df_counts: DataFrame, feat_name: str, n_samples: int, ax=None, figsize=(15, 10), count_col='counts', **kwargs)[source]#
- vaep.plotting.plot_rolling_error(errors: DataFrame, metric_name: str, window: int = 200, min_freq=None, freq_col: str = 'freq', colors_to_use=None, ax=None)[source]#
- vaep.plotting.savefig(fig, name, folder: Path = '.', pdf=True, dpi=300, tight_layout=True)#
Save a matplotlib Figure (any object with a savefig method) as PDF and PNG.
- vaep.plotting.select_dates(date_series: Series, max_ticks=30) array [source]#
Get unique dates (single days) for selection in pd.plot.line with xticks argument.
- Parameters:
date_series (pd.Series) – datetime series to use (values, not index)
max_ticks (int, optional) – maximum number of unique ticks to select, by default 30
- Returns:
Array of the selected unique dates.
- Return type:
np.array
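One way to pick at most `max_ticks` unique days from a datetime series is sketched below. The function name `pick_dates` and the even-stride thinning are assumptions made for illustration; the actual selection rule in vaep may differ.

```python
import numpy as np
import pandas as pd


def pick_dates(date_series: pd.Series, max_ticks: int = 30) -> np.ndarray:
    """Hypothetical stand-in: unique days, thinned to at most max_ticks."""
    # Drop the time of day, then keep each day once (np.unique also sorts).
    days = np.unique(date_series.dt.normalize().to_numpy())
    step = max(1, -(-len(days) // max_ticks))  # ceiling division
    return days[::step]


# 200 timestamps, 12 hours apart: two per day over 100 consecutive days.
stamps = pd.Series(
    pd.Timestamp("2021-01-01") + pd.to_timedelta(np.arange(200) * 12, unit="h")
)
ticks = pick_dates(stamps, max_ticks=30)
```

The resulting array can then be passed as the `xticks` argument of a pandas line plot.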
- vaep.plotting.select_xticks(ax: Axes, max_ticks: int = 50) list [source]#
Limit the number of xticks displayed.
- Parameters:
ax (matplotlib.axes.Axes) – Axes object to manipulate
max_ticks (int, optional) – maximum number of set ticks on x-axis, by default 50
- Returns:
List of current ticks on the x-axis, either new or old (depending on whether anything was changed).
- Return type:
list
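Limiting the ticks on an Axes can be sketched with `get_xticks` and `set_xticks`. The helper name `limit_xticks` and the even-stride thinning are assumptions for illustration, not the vaep implementation.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt


def limit_xticks(ax, max_ticks=50):
    """Hypothetical stand-in: keep at most max_ticks evenly strided x-ticks."""
    ticks = list(ax.get_xticks())
    if len(ticks) > max_ticks:
        step = -(-len(ticks) // max_ticks)  # ceiling division
        ticks = ticks[::step]
        ax.set_xticks(ticks)
    return ticks


fig, ax = plt.subplots()
ax.plot(range(100), range(100))
ax.set_xticks(range(100))          # deliberately crowded axis
new_ticks = limit_xticks(ax, max_ticks=10)
```

If the Axes already has no more than `max_ticks` ticks, the sketch returns them unchanged.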
Submodules#
vaep.plotting.data module#
Plot data distribution based on pandas DataFrames or Series.
- vaep.plotting.data.get_min_max_iterable(series: Iterable[Series]) Tuple[int] [source]#
Get the min and max as integer from an iterable of pandas.Series.
- vaep.plotting.data.min_max(s: Series) Tuple[int] [source]#
Get the min and max as integer from a pandas.Series.
- Parameters:
s (pd.Series) – Series of intensities.
- Returns:
Minimum and maximum of the Series as integers.
- Return type:
Tuple[int]
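A sketch of integer bounds for one Series and for an iterable of Series follows. The helper names are hypothetical, and this page does not say whether the bounds are truncated or rounded; the sketch assumes rounding outward, so that every value lies inside the returned range.

```python
from typing import Iterable, Tuple

import numpy as np
import pandas as pd


def min_max_int(s: pd.Series) -> Tuple[int, int]:
    """Hypothetical stand-in: integer bounds enclosing all values of s."""
    # Assumption: round the minimum down and the maximum up.
    return int(np.floor(s.min())), int(np.ceil(s.max()))


def min_max_over(series: Iterable[pd.Series]) -> Tuple[int, int]:
    """Hypothetical stand-in: overall integer bounds across several Series."""
    bounds = [min_max_int(s) for s in series]
    return min(lo for lo, _ in bounds), max(hi for _, hi in bounds)


a = pd.Series([15.2, 27.9, 31.4])
b = pd.Series([12.5, 20.0])
single = min_max_int(a)
combined = min_max_over([a, b])
```

Shared bounds of this kind are useful when several histograms should use the same bin range.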
- vaep.plotting.data.plot_feat_median_over_prop_missing(data: DataFrame, type: str = 'scatter', ax: Optional[Axes] = None, s: int = 1, return_plot_data: bool = False) Union[Axes, Tuple[Axes, DataFrame]] [source]#
Plot feature medians against the proportion of missing values in each feature, with features sorted by median into bins.
- vaep.plotting.data.plot_histogram_intensities(s: Series, interval_bins=1, min_max=(15, 40), ax=None, **kwargs) Tuple[Axes, range] [source]#
Plot intensities in Series in a certain range and equally spaced intervals.
- vaep.plotting.data.plot_missing_dist_boxplots(data: DataFrame, min_feat_per_sample=None, min_samples_per_feat=None) Figure [source]#
- vaep.plotting.data.plot_missing_dist_highdim(data: DataFrame, min_feat_per_sample: Optional[int] = None, min_samples_per_feat: Optional[int] = None) Figure [source]#
Plot missing distribution (cdf) in high dimensional data.
- Parameters:
data (pd.DataFrame) – Intensity table with samples in rows and features in columns.
min_feat_per_sample (int, optional) – Show the minimum required features a sample has to have, by default None
min_samples_per_feat (int, optional) – Show the minimum required number of samples a feature has to be found in, by default None
- Returns:
Figure with two plots (Axes).
- Return type:
Figure
- vaep.plotting.data.plot_missing_pattern_histogram(data: DataFrame, bins: int = 20, min_feat_per_sample=None, min_samples_per_feat=None) Figure [source]#
- vaep.plotting.data.plot_missing_pattern_violinplot(data: DataFrame, min_feat_per_sample=None, min_samples_per_feat=None) Figure [source]#
- vaep.plotting.data.plot_observations(df: DataFrame, ax: Optional[Axes] = None, title: str = '', axis: int = 1, size: int = 1, ylabel: str = 'Frequency', xlabel: Optional[str] = None) Axes [source]#
Plot non-missing observations by row (axis=1) or column (axis=0), ordered by the number of available observations. No binning is applied; only counts of non-missing values are plotted.
- Parameters:
df (pd.DataFrame) – DataFrame on which notna is applied
ax (Axes, optional) – Axes to plot on, by default None
title (str, optional) – Axes title, by default ‘’
axis (int, optional) – dimension to sum over, by default 1
ylabel (str, optional) – y-Axis label, by default ‘Frequency’
xlabel (str, optional) – x-Axis label, by default None
- Returns:
Axes on which plot was plotted
- Return type:
Axes
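The ordering described above can be sketched as follows: count non-missing values per row, sort the counts, and draw one small marker per row. The toy data and the choice of a scatter plot are assumptions; only the counting and ordering are taken from the description.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy table: 3 samples (rows) x 3 features (columns), NaN = missing.
df = pd.DataFrame(
    [[1.0, np.nan, 3.0],
     [np.nan, np.nan, 2.0],
     [4.0, 5.0, 6.0]],
    index=["s1", "s2", "s3"],
    columns=["f1", "f2", "f3"],
)

# Non-missing values per row (axis=1), ordered from fewest to most.
counts = df.notna().sum(axis=1).sort_values().reset_index(drop=True)

fig, ax = plt.subplots()
ax.scatter(counts.index, counts.values, s=1)  # one marker per row, no binning
ax.set_ylabel("Frequency")
```

Switching to `axis=0` in the sum gives the same view per column instead of per row.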
vaep.plotting.defaults module#
vaep.plotting.errors module#
Plot errors based on DataFrame with model predictions.
- vaep.plotting.errors.get_data_for_errors_by_median(errors: DataFrame, feat_name, metric_name)[source]#
- Extract bars with confidence intervals from a seaborn plot.
Confidence intervals are calculated by bootstrapping (sampling the mean).
Relies on an internal seaborn class. Only used for reporting source data in the paper.
- vaep.plotting.errors.plot_errors_binned(pred: DataFrame, target_col='observed', ax: Optional[Axes] = None, palette: Optional[dict] = None, metric_name: Optional[str] = None, errwidth: float = 1.2) Axes [source]#