vaep.plotting package#

vaep.plotting.make_large_descriptors(size='xx-large')[source]#

Helper function that sets very large titles, labels and tick texts as the default for matplotlib plots.

Parameters:

size (str) – font size or allowed size category. Change the default if necessary, by default ‘xx-large’

vaep.plotting.plot_cutoffs(df: pd.DataFrame, feat_completness_over_samples: int = None, min_feat_in_sample: int = None) tuple[matplotlib.figure.Figure, np.array[matplotlib.axes.Axes]][source]#

Plot the number of available features along index and columns (features vs. samples), optionally including cutoffs.

Parameters:
  • df (pd.DataFrame) – DataFrame in wide data format.

  • feat_completness_over_samples (int, optional) – horizontal line to plot as cutoff for features, by default None

  • min_feat_in_sample (int, optional) – horizontal line to plot as cutoff for samples, by default None

Returns:

The Figure and the array of Axes that make up the plot.

Return type:

tuple[matplotlib.figure.Figure, np.array[matplotlib.axes.Axes]]
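The counts that such a plot displays can be sketched with plain pandas. This is a minimal illustration, not vaep's implementation: the toy table, the assumption that rows are samples and columns are features, and the cutoff values are all hypothetical; only the two parameter names are taken from the signature above.

```python
import numpy as np
import pandas as pd

# Toy wide-format table: rows are samples, columns are features (assumed layout).
df = pd.DataFrame(
    {
        "feat_a": [20.1, 21.3, np.nan, 19.8],
        "feat_b": [np.nan, 25.0, np.nan, 24.2],
        "feat_c": [30.5, 31.1, 29.9, 30.0],
    },
    index=["s1", "s2", "s3", "s4"],
)

notna = df.notna()
samples_per_feat = notna.sum(axis=0)  # number of samples each feature was observed in
feats_per_sample = notna.sum(axis=1)  # number of features observed in each sample

# Hypothetical cutoffs, named after the function's parameters.
feat_completness_over_samples = 3
min_feat_in_sample = 2

kept_feats = samples_per_feat[
    samples_per_feat >= feat_completness_over_samples
].index.tolist()
kept_samples = feats_per_sample[
    feats_per_sample >= min_feat_in_sample
].index.tolist()
```

Here `kept_feats` and `kept_samples` list what would remain above each cutoff line.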

vaep.plotting.plot_feat_counts(df_counts: DataFrame, feat_name: str, n_samples: int, ax=None, figsize=(15, 10), count_col='counts', **kwargs)[source]#

vaep.plotting.plot_rolling_error(errors: DataFrame, metric_name: str, window: int = 200, min_freq=None, freq_col: str = 'freq', colors_to_use=None, ax=None)[source]#

vaep.plotting.savefig(fig, name, folder: Path = '.', pdf=True, dpi=300, tight_layout=True)#

Save matplotlib Figure (having method savefig) as pdf and png.

vaep.plotting.select_dates(date_series: Series, max_ticks=30) array[source]#

Get unique dates (single days) for selection in pd.plot.line with xticks argument.

Parameters:
  • date_series (pd.Series) – datetime series to use (values, not index)

  • max_ticks (int, optional) – maximum number of unique ticks to select, by default 30

Returns:

Array of the selected unique dates (at most max_ticks).

Return type:

np.array
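One way to pick a limited number of unique days from a datetime Series is sketched below. The helper name `select_unique_days` and the even-stride selection are assumptions for illustration; vaep's own selection strategy may differ.

```python
import numpy as np
import pandas as pd


def select_unique_days(date_series: pd.Series, max_ticks: int = 30) -> np.ndarray:
    """Illustrative stand-in, not vaep's implementation."""
    # Reduce timestamps to whole days and keep each day once.
    days = date_series.dt.floor("D").drop_duplicates().sort_values()
    # Take every step-th day so that at most max_ticks remain (ceiling division).
    step = max(1, -(-len(days) // max_ticks))
    return days.iloc[::step].to_numpy()


# 240 timestamps spaced 6 hours apart, covering 60 distinct days.
dates = pd.Series(
    pd.to_datetime("2021-01-01") + pd.to_timedelta(np.arange(240) * 6, unit="h")
)
ticks = select_unique_days(dates, max_ticks=30)
```

The resulting array could then be passed as the `xticks` argument of a pandas line plot.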

vaep.plotting.select_xticks(ax: Axes, max_ticks: int = 50) list[source]#

Limit the number of xticks displayed.

Parameters:
  • ax (matplotlib.axes.Axes) – Axes object to manipulate

  • max_ticks (int, optional) – maximum number of set ticks on x-axis, by default 50

Returns:

List of current ticks for the x-axis: the new ticks if the number was reduced, otherwise the unchanged old ticks.

Return type:

list
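The thinning logic can be sketched without matplotlib. The helper `thin_ticks` is hypothetical and only shows one plausible strategy (an even stride); it is not taken from vaep's source.

```python
from math import ceil


def thin_ticks(ticks: list, max_ticks: int = 50) -> list:
    """Return ticks unchanged if few enough, else an evenly strided subset."""
    if len(ticks) <= max_ticks:
        return ticks  # nothing to change, old ticks are kept
    step = ceil(len(ticks) / max_ticks)
    return ticks[::step]


old = list(range(120))
new = thin_ticks(old, max_ticks=50)
```

With a real Axes object, the same idea would amount to something like `ax.set_xticks(thin_ticks(list(ax.get_xticks()), 50))`.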

Submodules#

vaep.plotting.data module#

Plot data distribution based on pandas DataFrames or Series.

vaep.plotting.data.get_min_max_iterable(series: Iterable[Series]) Tuple[int][source]#

Get the min and max as integer from an iterable of pandas.Series.

vaep.plotting.data.min_max(s: Series) Tuple[int][source]#

Get the min and max as integer from a pandas.Series.

Parameters:

s (pd.Series) – Series of intensities.

Returns:

Minimum and maximum of the Series as integers.

Return type:

Tuple[int]
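A minimal sketch of integer bounds for one Series and for an iterable of Series follows. The helper names are hypothetical, and the rounding direction (floor for the minimum, ceiling for the maximum) is an assumption; the documentation above does not state how vaep converts to integers.

```python
import math

import pandas as pd


def min_max_int(s: pd.Series) -> tuple:
    """Integer bounds enclosing all values (assumed rounding: floor / ceil)."""
    return math.floor(s.min()), math.ceil(s.max())


def min_max_int_iterable(series) -> tuple:
    """Overall integer bounds across several Series."""
    bounds = [min_max_int(s) for s in series]
    return min(lo for lo, _ in bounds), max(hi for _, hi in bounds)


s1 = pd.Series([15.2, 28.7, 31.4])
s2 = pd.Series([12.9, 20.0])
single = min_max_int(s1)
overall = min_max_int_iterable([s1, s2])
```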

vaep.plotting.data.plot_feat_median_over_prop_missing(data: DataFrame, type: str = 'scatter', ax: Optional[Axes] = None, s: int = 1, return_plot_data: bool = False) Union[Axes, Tuple[Axes, DataFrame]][source]#

Plot the feature median over the proportion of missing values in that feature. Features are sorted by their median into bins.
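The two quantities being plotted can be computed as follows. This is a sketch on toy data that assumes features are in columns; the binning step is omitted and the column names `median` and `prop_missing` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy intensity table: rows are samples, columns are features (assumed layout).
data = pd.DataFrame(
    {
        "feat_a": [20.0, 22.0, np.nan, np.nan],
        "feat_b": [25.0, 27.0, 26.0, np.nan],
        "feat_c": [30.0, 31.0, 32.0, 33.0],
    }
)

plot_data = pd.DataFrame(
    {
        "median": data.median(),             # per-feature median, NaNs are skipped
        "prop_missing": data.isna().mean(),  # share of missing values per feature
    }
).sort_values("median")
```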

vaep.plotting.data.plot_histogram_intensities(s: Series, interval_bins=1, min_max=(15, 40), ax=None, **kwargs) Tuple[Axes, range][source]#

Plot intensities in Series in a certain range and equally spaced intervals.
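Equally spaced integer bins over a fixed range can be built from a `range` object, which matches the `range` in the return type. Whether the upper bound is inclusive is an assumption here, and the values are toy data.

```python
import numpy as np

min_max = (15, 40)   # default from the signature
interval_bins = 1    # default from the signature

# Bin edges 15, 16, ..., 40 (upper bound included by assumption).
bins = range(min_max[0], min_max[1] + 1, interval_bins)

values = np.array([15.5, 16.2, 16.8, 39.9, 41.0])
# Values outside the outermost edges (here 41.0) are not counted.
counts, edges = np.histogram(values, bins=list(bins))
```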

vaep.plotting.data.plot_missing_dist_boxplots(data: DataFrame, min_feat_per_sample=None, min_samples_per_feat=None) Figure[source]#

vaep.plotting.data.plot_missing_dist_highdim(data: DataFrame, min_feat_per_sample: Optional[int] = None, min_samples_per_feat: Optional[int] = None) Figure[source]#

Plot missing distribution (cdf) in high dimensional data.

Parameters:
  • data (pd.DataFrame) – Intensity table with samples in rows and features in columns.

  • min_feat_per_sample (int, optional) – Show the minimum required features a sample has to have, by default None

  • min_samples_per_feat (int, optional) – Show the minimum required number of samples a feature has to be found in, by default None

Returns:

Figure with two plots (Axes).

Return type:

matplotlib.figure.Figure
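An empirical cumulative distribution of the number of observed features per sample can be sketched like this. The helper `ecdf` and the toy data are illustrative assumptions, not vaep's code.

```python
import numpy as np
import pandas as pd


def ecdf(counts: pd.Series) -> pd.Series:
    """Share of entries with a count less than or equal to each observed value."""
    return counts.value_counts(normalize=True).sort_index().cumsum()


# Samples in rows, features in columns, as described for `data` above.
data = pd.DataFrame(
    {
        "feat_a": [20.1, 21.3, np.nan, 19.8],
        "feat_b": [np.nan, 25.0, np.nan, 24.2],
        "feat_c": [30.5, 31.1, 29.9, 30.0],
    }
)

feats_per_sample = data.notna().sum(axis=1)
cdf = ecdf(feats_per_sample)
```

The same helper applied to `data.notna().sum(axis=0)` would give the distribution of samples per feature.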

vaep.plotting.data.plot_missing_pattern_histogram(data: DataFrame, bins: int = 20, min_feat_per_sample=None, min_samples_per_feat=None) Figure[source]#

vaep.plotting.data.plot_missing_pattern_violinplot(data: DataFrame, min_feat_per_sample=None, min_samples_per_feat=None) Figure[source]#

vaep.plotting.data.plot_observations(df: DataFrame, ax: Optional[Axes] = None, title: str = '', axis: int = 1, size: int = 1, ylabel: str = 'Frequency', xlabel: Optional[str] = None) Axes[source]#

Plot non-missing observations by row (axis=1) or column (axis=0), ordered by the number of available observations. No binning is applied; only counts of non-missing values are plotted.

Parameters:
  • df (pd.DataFrame) – DataFrame on which notna is applied

  • ax (Axes, optional) – Axes to plot on, by default None

  • title (str, optional) – Axes title, by default ‘’

  • axis (int, optional) – dimension to sum over, by default 1

  • ylabel (str, optional) – y-Axis label, by default ‘Frequency’

  • xlabel (str, optional) – x-Axis label, by default None

Returns:

Axes on which plot was plotted

Return type:

Axes
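The effect of the axis argument on the counts can be illustrated with pandas alone. The ascending sort order is an assumption; the description only says the counts are ordered.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "feat_a": [1.0, np.nan, 3.0],
        "feat_b": [np.nan, np.nan, 6.0],
        "feat_c": [7.0, 8.0, 9.0],
    },
    index=["s1", "s2", "s3"],
)

# axis=1 sums across the columns, giving one count per row.
per_row = df.notna().sum(axis=1).sort_values()
# axis=0 sums down the rows, giving one count per column.
per_col = df.notna().sum(axis=0).sort_values()
```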

vaep.plotting.defaults module#

class vaep.plotting.defaults.ModelColorVisualizer(models, palette)[source]#

Bases: object

as_hex()[source]#

Return a color palette with hex codes instead of RGB values.
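Converting RGB tuples with channels in the range 0 to 1 into hex codes can be sketched as below. The helper `rgb_to_hex` is hypothetical and does not reflect how ModelColorVisualizer stores or converts its palette.

```python
def rgb_to_hex(rgb: tuple) -> str:
    """Convert an (r, g, b) tuple with floats in [0, 1] to a '#rrggbb' string."""
    r, g, b = (round(255 * channel) for channel in rgb)
    return f"#{r:02x}{g:02x}{b:02x}"


palette = [(1.0, 0.0, 0.0), (0.0, 0.5, 1.0)]
hex_codes = [rgb_to_hex(color) for color in palette]
```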

vaep.plotting.defaults.assign_colors(models)[source]#

vaep.plotting.errors module#

Plot errors based on DataFrame with model predictions.

vaep.plotting.errors.get_data_for_errors_by_median(errors: DataFrame, feat_name, metric_name)[source]#

Extract bars with confidence intervals from a seaborn plot.

Confidence intervals are calculated with bootstrapping (sampling the mean).

Relies on an internal seaborn class. Only used for reporting source data in the paper.
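The idea of a bootstrapped confidence interval for a mean can be sketched with the standard library. This is a plain percentile bootstrap written for illustration; the helper name, the number of resamples and the error values are assumptions, and seaborn's internal procedure may differ in detail.

```python
import random
import statistics


def bootstrap_ci_mean(values, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    # Cut off (100 - ci) / 2 percent of the resampled means on each side.
    lower_idx = int(n_boot * (100 - ci) / 200)
    upper_idx = int(n_boot * (100 + ci) / 200) - 1
    return means[lower_idx], means[upper_idx]


errors = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2, 0.7, 1.4]  # toy absolute errors
lower, upper = bootstrap_ci_mean(errors)
```

The pair `(lower, upper)` corresponds to the error bar that would be drawn around the bar height `statistics.fmean(errors)`.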

vaep.plotting.errors.plot_errors_binned(pred: DataFrame, target_col='observed', ax: Optional[Axes] = None, palette: Optional[dict] = None, metric_name: Optional[str] = None, errwidth: float = 1.2) Axes[source]#

vaep.plotting.errors.plot_errors_by_median(pred: pd.DataFrame, feat_medians: pd.Series, target_col='observed', ax: Axes = None, palette: dict = None, feat_name: str = None, metric_name: Optional[str] = None, errwidth: float = 1.2) tuple[Axes, pd.DataFrame][source]#

vaep.plotting.errors.plot_rolling_error(errors: DataFrame, metric_name: str, window: int = 200, min_freq=None, freq_col: str = 'freq', colors_to_use=None, ax=None)[source]#

vaep.plotting.plotly module#

vaep.plotting.plotly.apply_default_layout(fig)[source]#