- Cyclic
- Diverging
- Miscellaneous
- Perceptually Uniform Sequential
- Qualitative
- Sequential
- Sequential (2)
plotting
Work with color maps
plot_cmap_collections
plot_cmap_collections (cmap_collections:str|list[str]=None)
Plot all color maps in the collections passed as cmap_collections
| | Type | Default | Details |
|---|---|---|---|
| cmap_collections | str \| list[str] | None | list of color map collections to display (from cmaps.keys()) |
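The following color map collections are defined: Cyclic, Diverging, Miscellaneous, Perceptually Uniform Sequential, Qualitative, Sequential, and Sequential (2).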
plot_cmap_collections will plot a color bar for each color map in the selected collections:
- A single collection
plot_cmap_collections('Cyclic')
- Several collections
plot_cmap_collections(['Qualitative', 'Sequential'])
- All the collections
plot_cmap_collections()
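To illustrate what "a color bar for each color map" can look like, here is a minimal sketch that uses plain matplotlib. It is not the implementation of plot_cmap_collections: the helper name sketch_plot_cmaps is hypothetical, and the gradient-image technique is an assumption borrowed from the usual matplotlib colormap reference plots. The names 'twilight', 'twilight_shifted' and 'hsv' are matplotlib's cyclic color maps.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np


def sketch_plot_cmaps(cmap_names):
    """Hypothetical helper: draw one horizontal color bar per color map name."""
    # A 1 x 256 image whose values run from 0 to 1, i.e. across the whole cmap
    gradient = np.linspace(0, 1, 256).reshape(1, -1)
    fig, axes = plt.subplots(
        nrows=len(cmap_names),
        figsize=(6, 0.4 * len(cmap_names)),
        squeeze=False,  # always get a 2D array of axes, even for one cmap
    )
    for ax, name in zip(axes[:, 0], cmap_names):
        ax.imshow(gradient, aspect="auto", cmap=name)
        ax.set_axis_off()
        ax.set_title(name, fontsize=8, loc="left")
    return fig


fig = sketch_plot_cmaps(["twilight", "twilight_shifted", "hsv"])
```

Each axes holds a single gradient image, so the same loop could be repeated once per collection to mimic the grouped display.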
plot_color_bar
plot_color_bar (cmap:str, series:list[int|float]=None)
Plot a color bar with value overlay from series based on cmap
| | Type | Default | Details |
|---|---|---|---|
| cmap | str | | string name of one of the cmaps |
| series | list[int \| float] | None | series of numerical values to show for each color |
plot_color_bar('tab10', range(10))
plot_color_bar('tab10', series=range(6))
plot_color_bar('tab10', series=[0, 1, 2])
get_color_mapper
get_color_mapper (series:list[int|float], cmap:str='tab10')
Return color mapper based on a color map and a series of values
| | Type | Default | Details |
|---|---|---|---|
| series | list[int \| float] | | series of values to map to colors |
| cmap | str | tab10 | name of the cmap to use |
Usage
This function is used to ensure coherent colors for different plots.
- Define a color mapper based on values and cmap:
clr_mapper = get_color_mapper([1, 2, 3, 4], cmap='Paired')
- Call the color mapper and have it return the appropriate values for any plot:
clr_mapper.to_rgba(2)
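The returned object is called through to_rgba, which is also the method offered by matplotlib's ScalarMappable. The sketch below assumes the mapper behaves like a ScalarMappable normalized over the range of the series. This is an assumption, since the implementation is not shown here, and the name sketch_color_mapper is hypothetical.

```python
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize


def sketch_color_mapper(series, cmap="tab10"):
    """Hypothetical stand-in for get_color_mapper, built on ScalarMappable."""
    # Map the smallest value of the series to the start of the cmap
    # and the largest value to its end
    norm = Normalize(vmin=min(series), vmax=max(series))
    return ScalarMappable(norm=norm, cmap=cmap)


mapper = sketch_color_mapper([1, 2, 3, 4], cmap="Paired")
rgba = mapper.to_rgba(2)  # an (r, g, b, a) tuple of floats
print(rgba)
```

Because the normalization is fixed when the mapper is created, a given value always returns the same color, whichever plot asks for it.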
Example
We have a dataset with several features.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
n_feats = 6
col_list = [f"col_{i}" for i in range(n_feats)]
X, y = make_blobs(n_samples=5_000, n_features=n_feats, centers=10, shuffle=True)
X = pd.DataFrame(X, columns=col_list)
X.head(3)
| | col_0 | col_1 | col_2 | col_3 | col_4 | col_5 |
|---|---|---|---|---|---|---|
| 0 | -4.377115 | -3.113744 | 4.417737 | 7.327412 | 7.366114 | 0.033885 |
| 1 | -10.319655 | -6.998589 | 1.126784 | 7.731522 | 4.524063 | -1.337312 |
| 2 | 1.542669 | 6.398550 | 8.267037 | -1.024028 | -0.697208 | 8.599691 |
1. Define a color mapper based on values and cmap
We cluster the data into 10 clusters and make a scatter plot of two of the features, displaying the 10 clusters using a cmap.
To ensure that we keep the same cluster-to-color mapping in other plots, we use clr_mapper to predefine how colors are mapped to each cluster: clr_mapper = get_color_mapper(cluster_ids, cmap=cmap).
clustering = KMeans(n_clusters=10)
clusters = clustering.fit_predict(X)
cluster_ids = np.unique(clusters)
cmap = 'tab10'
plt.figure(figsize=(6, 3))
plt.scatter(X.col_0, X.col_1, c=clusters, s=2, cmap=cmap)
plt.colorbar()
plt.title('two first features data points, colored by cluster value')
plt.show()
clr_mapper = get_color_mapper(cluster_ids, cmap=cmap)
2. Call the color mapper and use it in any plot
Use clr_mapper for another plot that shows the value of another feature for each sample, colored according to the sample's cluster.
featname = 'col_4'
plt.figure(figsize=(12, 3))
plt.plot(X[featname], c='grey', alpha=.25, lw=0.25)
plt.title(f'{featname}.')
plt.show()
plt.figure(figsize=(12, 3))
plt.plot(X[featname], c='grey', alpha=.25, lw=0.25)
for c in cluster_ids:
    mask = y == c
    X[f"{featname}_{c}"] = X.loc[:, featname]
    X.loc[~mask, f"{featname}_{c}"] = np.nan
    plt.plot(X[f"{featname}_{c}"], label=str(c), c=clr_mapper.to_rgba(c), lw=0, marker='o', markersize=1)
plt.title(f'{featname}. Data points colored according to the cluster it belongs to.')
plt.legend()
plt.show()
Advanced plots
plot_feature_scatter
plot_feature_scatter (X:numpy.ndarray, y:Optional[numpy.ndarray]=None, n_plots:int=2, axes_per_row:int=3, axes_size:int=5)
Plots n_plots scatter plots of randomly selected combinations of two features out of X
| | Type | Default | Details |
|---|---|---|---|
| X | np.ndarray | | input dataset. X.shape[1] is used to set the total number of features |
| y | Optional[np.ndarray] | None | target dataset |
| n_plots | int | 2 | number of feature-pair scatter plots to show |
| axes_per_row | int | 3 | number of axes per row. The number of rows is calculated accordingly |
| axes_size | int | 5 | size of one axes. figsize will be (ncols * axes_size, nrows * axes_size) |
X.shape
(5000, 16)
n_feats = 6
col_list = [f"col_{i}" for i in range(n_feats)]
X, y = make_blobs(n_samples=5_000, n_features=n_feats, centers=10, shuffle=True)
X = pd.DataFrame(X, columns=col_list)
plot_feature_scatter(X.values, y, n_plots=6, axes_per_row=3, axes_size=5)
When no value is available for y, it is set to 1 by default.
plot_feature_scatter(X.values, n_plots=4, axes_per_row=2, axes_size=2)
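The parameter descriptions for plot_feature_scatter imply a simple layout rule: pick n_plots random feature pairs, place axes_per_row axes on each row, and size the figure as (ncols * axes_size, nrows * axes_size). The sketch below reproduces that rule under stated assumptions. It is not the library's code: the name sketch_feature_scatter and the seed parameter are hypothetical, and drawing the pairs with itertools.combinations and random.sample is one possible way to select them.

```python
import math
import random
from itertools import combinations

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np


def sketch_feature_scatter(X, y=None, n_plots=2, axes_per_row=3, axes_size=5, seed=None):
    """Hypothetical re-creation of the documented behaviour."""
    if y is None:
        y = np.ones(X.shape[0])  # documented fallback: y is set to 1
    # Randomly select distinct pairs of feature indices (assumed selection method)
    all_pairs = list(combinations(range(X.shape[1]), 2))
    pairs = random.Random(seed).sample(all_pairs, k=min(n_plots, len(all_pairs)))
    # Grid layout: fixed number of columns, as many rows as needed
    ncols = axes_per_row
    nrows = math.ceil(len(pairs) / ncols)
    fig, axes = plt.subplots(
        nrows, ncols,
        figsize=(ncols * axes_size, nrows * axes_size),
        squeeze=False,
    )
    for ax, (i, j) in zip(axes.flat, pairs):
        ax.scatter(X[:, i], X[:, j], c=y, s=2)
        ax.set_xlabel(f"feature {i}")
        ax.set_ylabel(f"feature {j}")
    # Hide the axes left over when the last row is not full
    for ax in axes.flat[len(pairs):]:
        ax.set_visible(False)
    return fig, pairs


rng = np.random.default_rng(0)
fig, pairs = sketch_feature_scatter(rng.normal(size=(200, 6)), n_plots=4, axes_per_row=2, axes_size=2)
```

With n_plots=4 and axes_per_row=2 the grid has two rows of two axes, so the figure is 4 by 4 inches.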