Commit c4cad71

Merge pull request #1862 from Giskard-AI/task/gsk-2701-add-documentation-advanced-scan
Add documentation for advanced scan usage: custom metrics and min slice size
2 parents a093292 + c777f96 commit c4cad71

1 file changed: +96 -4 lines changed

docs/open_source/scan/advanced_scan/index.rst

@@ -13,7 +13,7 @@ Limiting to a specific group of detectors
 If you want to run only a specific detector (or a group of detectors), you can
 use the `only` argument. This argument accepts either a tag or a list of tags::

-    import giksard as gsk
+    import giskard as gsk

     report = gsk.scan(my_model, my_dataset, only="robustness")

@@ -28,7 +28,7 @@ Limiting to a selection of model features
 If your model has a great number of features and you want to limit the scan to
 a specific subset, you can use the `features` argument::

-    import giksard as gsk
+    import giskard as gsk

     report = gsk.scan(my_model, my_dataset, features=["feature_1", "feature_2"])

@@ -43,7 +43,7 @@ the `params` argument, which accepts a dictionary where the key is the
 identifier of the detector and the value is a dictionary of config options that
 will be passed to the detector upon initialization::

-    import giksard as gsk
+    import giskard as gsk

     params = {
         "performance_bias": dict(threshold=0.04, metrics=["accuracy", "f1"]),
@@ -86,4 +86,96 @@ This will limit the scan to 100,000 samples of your dataset. You can adjust this
 number to your needs.

 Note: for classification models, we will make sure that the sample is balanced
-between the different classes via stratified sampling.
+between the different classes via stratified sampling.
+
+
+How to specify the minimum slice size
+-------------------------------------
+
+By default, the minimum slice size is the larger of 1% of the dataset size and
+30 samples. You may want to customize this value, for example, when you expect
+only a small number of problematic samples in your dataset. In that case, pass
+the `min_slice_size` parameter to the detectors you are interested in, either
+as an integer to set a fixed minimum number of samples or as a float to set it
+as a fraction of the dataset size (for example, 0.01 for 1%)::
+
+    import giskard as gsk
+
+    params = {
+        "performance_bias": dict(min_slice_size=50),
+        "spurious_correlation": dict(min_slice_size=0.01),
+    }
+
+    report = gsk.scan(my_model, my_dataset, params=params)
+
+
+How to add a custom metric
+---------------------------
+
+If you want to add a custom metric to the scan, you can do so by creating a
+class that extends the `giskard.scanner.performance.metrics.PerformanceMetric`
+class and implementing the `__call__` method. This method should return an
+instance of `giskard.scanner.performance.metrics.MetricResult`::
+
+    from giskard.scanner.performance.metrics import PerformanceMetric, MetricResult
+
+    class MyCustomMetric(PerformanceMetric):
+        def __call__(self, model, dataset):
+            # your custom logic here
+            return MetricResult(
+                name="my_custom_metric",
+                value=0.42,
+                affected_counts=100,
+                binary_counts=[25, 75],
+            )
+
+You can also directly extend `giskard.scanner.performance.metrics.ClassificationPerformanceMetric`
+for classification models or `giskard.scanner.performance.metrics.RegressionPerformanceMetric`
+for regression models, implementing the method `_calculate_metric`.
+The following is an example of a custom classification metric that calculates the
+frequency-weighted accuracy::
+
+    from giskard.models.base import BaseModel
+    from giskard.scanner.performance.metrics import (
+        ClassificationPerformanceMetric,
+    )
+    import numpy as np
+    import sklearn.metrics
+
+    class FrequencyWeightedAccuracy(ClassificationPerformanceMetric):
+        name = "Frequency-Weighted Accuracy"
+        greater_is_better = True
+        has_binary_counts = False
+
+        def _calculate_metric(
+            self,
+            y_true: np.ndarray,
+            y_pred: np.ndarray,
+            model: BaseModel
+        ):
+            labels = model.meta.classification_labels
+            label_to_id = {label: i for i, label in enumerate(labels)}
+            y_true_ids = np.array([label_to_id[label] for label in y_true])
+            class_counts = np.bincount(y_true_ids, minlength=len(labels))
+            total_count = np.sum(class_counts)
+
+            weighted_sum = 0
+
+            for i in range(len(labels)):
+                class_mask = y_true_ids == i
+                if not np.any(class_mask):
+                    continue
+                label_acc = sklearn.metrics.accuracy_score(y_true[class_mask], y_pred[class_mask])
+                weighted_sum += (class_counts[i] / total_count) * label_acc
+            return weighted_sum
+
+Then, you can instantiate the metric and pass it to the `scan` function::
+
+    import giskard as gsk
+
+    frequency_weighted_accuracy = FrequencyWeightedAccuracy()
+
+    params = {
+        "performance_bias": {"metrics": ["accuracy", frequency_weighted_accuracy]}
+    }
+    report = gsk.scan(my_model, my_dataset, params=params)
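
Note on the "How to specify the minimum slice size" section in the diff: the documented default (the larger of 1% of the dataset size and 30 samples) and the integer-versus-float rule can be illustrated with a small standalone sketch. The helper names `resolve_min_slice_size` and `default_min_slice_size` are hypothetical and are not part of Giskard; the rounding behavior is also an assumption, since the documentation does not specify it.

```python
def resolve_min_slice_size(min_slice_size, dataset_size):
    """Convert a `min_slice_size` setting into an absolute sample count.

    Hypothetical helper for illustration only, not Giskard's implementation.
    An int is taken as a fixed number of samples; a float is taken as a
    fraction of the dataset size (rounding is an assumption).
    """
    if isinstance(min_slice_size, bool):
        raise TypeError("min_slice_size must be an int or a float")
    if isinstance(min_slice_size, int):
        return min_slice_size
    if isinstance(min_slice_size, float):
        return round(min_slice_size * dataset_size)
    raise TypeError("min_slice_size must be an int or a float")


def default_min_slice_size(dataset_size):
    """Documented default: the larger of 1% of the dataset and 30 samples."""
    return max(resolve_min_slice_size(0.01, dataset_size), 30)


large_default = default_min_slice_size(10_000)   # 1% of the data exceeds 30
small_default = default_min_slice_size(500)      # 1% is 5, so the floor of 30 applies
fixed = resolve_min_slice_size(50, 10_000)       # an int is used as-is
fraction = resolve_min_slice_size(0.01, 2_000)   # a float scales with the data
```

Under these assumptions, a small dataset falls back to the 30-sample floor, which is why lowering `min_slice_size` explicitly can matter when only a few problematic samples are expected.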

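Note on the `FrequencyWeightedAccuracy` example in the diff: the arithmetic inside `_calculate_metric` can be checked without Giskard installed. The following is a minimal pure-Python sketch of the same per-class weighting; the function name `frequency_weighted_accuracy` and the toy labels are illustrative assumptions, not part of the commit.

```python
from collections import Counter


def frequency_weighted_accuracy(y_true, y_pred):
    """Sum of per-class accuracies, each weighted by the class frequency.

    Standalone illustration of the weighting used in the documented example;
    it does not depend on Giskard, NumPy, or scikit-learn.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    class_counts = Counter(y_true)
    total = len(y_true)
    weighted_sum = 0.0
    for label, count in class_counts.items():
        # Accuracy restricted to the samples whose true label is `label`.
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        class_accuracy = correct / count
        weighted_sum += (count / total) * class_accuracy
    return weighted_sum


y_true = ["cat", "cat", "cat", "dog"]
y_pred = ["cat", "cat", "dog", "dog"]

score = frequency_weighted_accuracy(y_true, y_pred)
overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because each class accuracy is weighted by that class's share of the samples, the weighted sum equals the overall accuracy, so the documented metric is mainly useful as a template. Swapping in different weights (for example, equal weights per class) would give a metric that differs from plain accuracy.
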