@@ -13,7 +13,7 @@ Limiting to a specific group of detectors
If you want to run only a specific detector (or a group of detectors), you can
use the `only` argument. This argument accepts either a tag or a list of tags::

-    import giksard as gsk
+    import giskard as gsk

    report = gsk.scan(my_model, my_dataset, only="robustness")

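+ To run several groups at once, you can pass a list of tags instead (the tag
+ names shown here are illustrative; the available tags depend on the detectors
+ installed)::
+
+     report = gsk.scan(my_model, my_dataset, only=["robustness", "performance"])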
@@ -28,7 +28,7 @@ Limiting to a selection of model features
If your model has a great number of features and you want to limit the scan to
a specific subset, you can use the `features` argument::

-    import giksard as gsk
+    import giskard as gsk

    report = gsk.scan(my_model, my_dataset, features=["feature_1", "feature_2"])

@@ -43,7 +43,7 @@ the `params` argument, which accepts a dictionary where the key is the
identifier of the detector and the value is a dictionary of config options that
will be passed to the detector upon initialization::

-    import giksard as gsk
+    import giskard as gsk

    params = {
        "performance_bias": dict(threshold=0.04, metrics=["accuracy", "f1"]),
@@ -86,4 +86,96 @@ This will limit the scan to 100,000 samples of your dataset. You can adjust this
number to your needs.

Note: for classification models, we will make sure that the sample is balanced
- between the different classes via stratified sampling.
+ between the different classes via stratified sampling.
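+
+ As a rough sketch of the idea (illustrative only, not Giskard's internal
+ implementation; the helper and column names are hypothetical)::
+
+     import pandas as pd
+
+     def stratified_sample(df: pd.DataFrame, target_col: str, n: int) -> pd.DataFrame:
+         # Draw the same fraction from every class, so class proportions in
+         # the sample match those of the full dataset.
+         frac = min(1.0, n / len(df))
+         return df.groupby(target_col, group_keys=False).apply(
+             lambda g: g.sample(frac=frac, random_state=0)
+         )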
+
+
+ How to specify the minimum slice size
+ -------------------------------------
+
+ By default, the minimum slice size is set to the greater of 1% of the
+ dataset size and 30 samples. You may want to customize this value, for
+ example, when you expect only a small number of problematic samples in your
+ dataset. In that case, simply pass the `min_slice_size` parameter to the
+ detectors you are interested in, either as an integer to set the minimum
+ slice size to a fixed value or as a float to set it as a fraction of the
+ dataset size::
+
+     import giskard as gsk
+
+     params = {
+         "performance_bias": dict(min_slice_size=50),
+         "spurious_correlation": dict(min_slice_size=0.01),
+     }
+
+     report = gsk.scan(my_model, my_dataset, params=params)
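+
+ For instance, with a dataset of 20,000 samples, `min_slice_size=0.01` would
+ require each slice to contain at least 200 samples.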
+
+
+ How to add a custom metric
+ ---------------------------
+
+ If you want to add a custom metric to the scan, you can do so by creating a
+ class that extends the `giskard.scanner.performance.metrics.PerformanceMetric`
+ class and implementing the `__call__` method. This method should return an
+ instance of `giskard.scanner.performance.metrics.MetricResult`::
+
+     from giskard.scanner.performance.metrics import PerformanceMetric, MetricResult
+
+     class MyCustomMetric(PerformanceMetric):
+         def __call__(self, model, dataset):
+             # your custom logic here
+             return MetricResult(
+                 name="my_custom_metric",
+                 value=0.42,
+                 affected_counts=100,
+                 binary_counts=[25, 75],
+             )
+
+ You can also directly extend `giskard.scanner.performance.metrics.ClassificationPerformanceMetric`
+ for classification models or `giskard.scanner.performance.metrics.RegressionPerformanceMetric`
+ for regression models, implementing the method `_calculate_metric`.
+ The following is an example of a custom classification metric that calculates the
+ frequency-weighted accuracy::
+
+     from giskard.scanner.performance.metrics import (
+         ClassificationPerformanceMetric,
+         MetricResult,
+     )
+     from giskard.models.base import BaseModel
+     import numpy as np
+     import sklearn.metrics
+
+     class FrequencyWeightedAccuracy(ClassificationPerformanceMetric):
+         name = "Frequency-Weighted Accuracy"
+         greater_is_better = True
+         has_binary_counts = False
+
+         def _calculate_metric(
+             self,
+             y_true: np.ndarray,
+             y_pred: np.ndarray,
+             model: BaseModel,
+         ):
+             labels = model.meta.classification_labels
+             label_to_id = {label: i for i, label in enumerate(labels)}
+             y_true_ids = np.array([label_to_id[label] for label in y_true])
+             class_counts = np.bincount(y_true_ids, minlength=len(labels))
+             total_count = np.sum(class_counts)
+
+             weighted_sum = 0
+
+             for i in range(len(labels)):
+                 class_mask = y_true_ids == i
+                 if not np.any(class_mask):
+                     continue
+                 # Per-class accuracy, weighted by the class frequency in y_true.
+                 label_acc = sklearn.metrics.accuracy_score(y_true[class_mask], y_pred[class_mask])
+                 weighted_sum += (class_counts[i] / total_count) * label_acc
+             return weighted_sum
+
+ Then, you can instantiate the metric and pass it to the `scan` method::
+
+     import giskard as gsk
+
+     frequency_weighted_accuracy = FrequencyWeightedAccuracy()
+
+     params = {
+         "performance_bias": {"metrics": ["accuracy", frequency_weighted_accuracy]}
+     }
+
+     report = gsk.scan(my_model, my_dataset, params=params)