Estimation Algorithms

class moval.models.Model(estim_algorithm: str, mode: str, num_class: int, confidence_scores: str, class_specific: bool, extend_param: bool)

Bases: ABC

Base model for MOVAL experiments.

Features can be computed by calling the __call__() method. This class should not be instantiated directly; instead, use it as the base class for MOVAL models.

Args:

mode (str): The given task to estimate model performance.
num_class (int): The number of classes for the given task.
confidence_scores (str): The method to calculate confidence scores.
class_specific (bool): If True, the calculation will match class-wise confidence to class-wise accuracy/DSC.

Attributes:

estim_algorithm (str):: The chosen estimation algorithm for confidence calibration.
mode (str):: The given task to estimate model performance. This determines the criteria for parameter optimization.
num_class (int):: The number of classes for the given task. This determines the number of parameters to optimize and the optimization criteria.
confidence_scores (str):: The method to calculate confidence scores. This determines which method is chosen from confidences.
class_specific (bool):: If True, the calculation will match class-wise confidence to class-wise accuracy/DSC. This also affects the number of parameters to optimize and the optimization criteria.
conf (moval.models.Confidence):: The Confidence class to calculate the confidence scores.
max_value (float):: The maximum value to normalize the confidence score before calibration. This is only necessary when conf is unbounded.
min_value (float):: The minimum value to normalize the confidence score before calibration. This is only necessary when conf is unbounded.
param (np.ndarray):: The parameters to be optimized.
extend_param (bool):: If True, the model contains twice as many parameters (x2) and needs two-stage optimization. Currently, this is only applicable to ts-atc-model.
is_fitted (bool):: If True, the model has been fitted using validation data and is ready to use for estimating performance.
is_training (bool):: If True, the normalization parameters are updated using the input data.
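The interplay of min_value, max_value and is_training can be pictured with a minimal sketch. The class name ScoreNormalizer and the clipping to [0, 1] are assumptions made for illustration; they are not taken from the MOVAL source.

```python
class ScoreNormalizer:
    """Hypothetical helper (not part of MOVAL): min-max normalization of unbounded scores."""

    def __init__(self):
        self.min_value = 0.0
        self.max_value = 1.0
        self.is_training = True

    def __call__(self, scores):
        # In training mode the normalization range is refreshed from the input data.
        if self.is_training:
            self.min_value = min(scores)
            self.max_value = max(scores)
        span = self.max_value - self.min_value
        if span == 0:
            return [0.0 for _ in scores]
        # Assumption: values outside the stored range are clipped to [0, 1].
        return [min(1.0, max(0.0, (s - self.min_value) / span)) for s in scores]
```

In evaluation mode (is_training set to False) the stored range is reused, so test-time scores are mapped with the statistics seen during fitting.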

calculate_probability(inp: List[Iterable] | ndarray, midstage: bool = False, appr: bool = False, full: bool = False) → List[Iterable] | ndarray

Calculate the calibrated probability with parameters. For classification tasks, we estimate the pseudo-temperature; for segmentation tasks, we simplify the calculation using 1 - score.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). midstage: If True, return the first calibrated results. appr: If True, utilize the approximation version. full: If False, we just put the score in and fill the other domains with 0.
Returns:: calibrated_probability: The calibrated probability which would match the accuracy/DSC on validation data, of shape (n, d) or a list of n (d, H, W, (D)).

calculate_probability_appr(inp: List[Iterable] | ndarray, midstage: bool = False, full: bool = False) → List[Iterable] | ndarray

Calculate the calibrated probability with parameters, using an approximation based on 1 - score. To accelerate the optimization process, we calculate the probability of the predicted class and divide 1 - score equally among the other classes. This should be used for all segmentation tasks, as the number of pixels is always large.

Note:: The calibrated probability should have the same dimensions as the network outputs. This is identical to calculate_probability when d == 2, i.e. binary segmentation, which is the most common case, so the discrepancies are of little concern.
Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). midstage: If True, return the first calibrated results. full: If False, we just put the score in and fill the other domains with 0.
Returns:: calibrated_probability: The calibrated probability which would match the accuracy/DSC on validation data, of shape (n, d) or a list of n (d, H, W, (D)).
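A minimal sketch of the approximation for a single sample follows. The function name spread_remainder is hypothetical, and plain Python lists stand in for the arrays the library works with.

```python
def spread_remainder(logits, score):
    """Put `score` on the predicted class and share 1 - score equally among the rest.

    Hypothetical illustration; assumes d >= 2 and 0 <= score <= 1.
    """
    d = len(logits)
    pred = max(range(d), key=lambda k: logits[k])  # predicted class = argmax of logits
    rest = (1.0 - score) / (d - 1)
    return [score if k == pred else rest for k in range(d)]
```

For d == 2 the single remaining class receives exactly 1 - score, which is why the approximation coincides with the exact calculation in the binary case.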

calculate_probability_temperature(inp: List[Iterable] | ndarray, midstage: bool = False, full: bool = False) → List[Iterable] | ndarray

Calculate the calibrated probability with parameters, based on temperature scaling. The challenge is the calculation of the non-maximum probabilities. To achieve this, we calculate a pseudo-temperature for each sample such that the confidence score matches the maximum probability after temperature scaling. Then, we use the pseudo-temperature to scale the probabilities of the other classes.

Note:: The calibrated probability should have the same dimensions as the network outputs. The calculation is slow because we need to go through every sample. It might also be inaccurate for scores other than MCP, as they lie in the range [0, 1] instead of [1/n, 1].
Update:: This function is rarely used now because it is too slow.
Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). midstage: If True, return the first calibrated results. full: If False, we just put the score in and fill the other domains with 0.
Returns:: calibrated_probability: The calibrated probability which would match the accuracy/DSC on validation data, of shape (n, d) or a list of n (d, H, W, (D)).
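The pseudo-temperature idea can be sketched for one sample with a bisection search. This is an illustration under stated assumptions, not the library's implementation: the function names, the search bounds and the use of bisection are all hypothetical choices.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_temperature(logits, target, lo=1e-3, hi=1e3, iters=100):
    """Find T such that max(softmax(logits / T)) is approximately `target`.

    The maximum softmax probability decreases as T grows, so bisection applies.
    The target must lie strictly between 1/d and 1 to be reachable.
    """
    d = len(logits)
    if not (1.0 / d < target < 1.0):
        raise ValueError("target must lie strictly between 1/d and 1")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if max(softmax(logits, mid)) > target:
            lo = mid  # still too confident: a larger temperature is needed
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

The ValueError branch reflects the caveat in the note above: a score below 1/d cannot be matched by any temperature, and the per-sample search explains why the procedure is slow on large inputs.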

calibrate(inp: List[Iterable] | ndarray) → List[Iterable] | ndarray

Calibrate the confidence scores with parameters.

Different estimation algorithms adopt different strategies to calibrate the scores.

estimate_accuracy(inp: List[Iterable] | ndarray, midstage: bool = False, gt_guide: ndarray | None = None) → Tuple[float, ndarray]

Estimate the accuracy using network output.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). midstage: If True, return the first calibrated results. gt_guide: The corresponding annotation guide of shape (n, d) for segmentation. This is only required for segmentation tasks during optimization.

We use this to determine whether a label is present in the case. If not, we do not calculate the DSC or use the case for optimization, because we do not want to optimize parameters with a blank segmentation map. This should be bool; False means that there is no manual label of class d in this sample.
Returns:: estim_acc: A float scalar that represents the estimated accuracy for the given input data. estim_cls: Estimated class-wise accuracy of shape (d, ).
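One way to picture the two return values for a classification task is sketched below. It assumes the overall estimate is the mean calibrated score and the class-wise estimate is the mean over samples predicted as each class; the function name and this grouping rule are assumptions, not confirmed by the source.

```python
def estimate_accuracy_sketch(logits_batch, calibrated_scores):
    """Hypothetical illustration: returns (estim_acc, estim_cls)."""
    d = len(logits_batch[0])
    estim_acc = sum(calibrated_scores) / len(calibrated_scores)
    sums, counts = [0.0] * d, [0] * d
    for logits, score in zip(logits_batch, calibrated_scores):
        pred = max(range(d), key=lambda k: logits[k])  # predicted class
        sums[pred] += score
        counts[pred] += 1
    # Classes that are never predicted have no estimate.
    estim_cls = [sums[k] / counts[k] if counts[k] else float("nan") for k in range(d)]
    return estim_acc, estim_cls
```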

estimate_auc(probability: List[Iterable] | ndarray, gt_guide: ndarray | None = None, sel_cls: int | None = None) → Tuple[float, ndarray]

Estimate the AUC using network output.

Args:

probability: The calibrated probability of shape (n, d) or a list of n (d, H, W, (D)). gt_guide: The corresponding annotation guide of shape (n, d) for segmentation. This is only required for segmentation tasks during optimization.

We use this to determine whether a label is present in the case. If not, we do not calculate the DSC or use the case for optimization, because we do not want to optimize parameters with a blank segmentation map. This should be bool; False means that there is no manual label of class d in this sample.

sel_cls: The selected class for calculation. If it is None, return all classes.

Returns:

estim_AUC: Estimated class-wise AUC of shape (d, ), or (1, ) if sel_cls is given.
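The roles of gt_guide and sel_cls can be illustrated independently of the AUC computation itself. The sketch below averages a per-sample, per-class metric while skipping samples flagged as unlabelled; the function name and the use of NaN for empty classes are assumptions.

```python
def masked_classwise_mean(values, gt_guide=None, sel_cls=None):
    """Hypothetical helper: class-wise mean over samples where gt_guide is True.

    values:   n rows of d per-class metric values.
    gt_guide: n rows of d booleans, or None to keep every sample.
    sel_cls:  a single class index, or None for all classes.
    """
    n, d = len(values), len(values[0])
    classes = range(d) if sel_cls is None else [sel_cls]
    out = []
    for k in classes:
        kept = [values[i][k] for i in range(n) if gt_guide is None or gt_guide[i][k]]
        out.append(sum(kept) / len(kept) if kept else float("nan"))
    return out  # length d, or length 1 when sel_cls is given
```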

estimate_f1score(inp: List[Iterable] | ndarray, probability: List[Iterable] | ndarray, gt_guide: ndarray | None = None) → Tuple[float, ndarray]

Estimate the F1score using network output.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). probability: The calibrated probability of shape (n, d) or a list of n (d, H, W, (D)). gt_guide: The corresponding annotation guide of shape (n, d) for segmentation. This is only required for segmentation tasks during optimization.

We use this to determine whether a label is present in the case. If not, we do not calculate the DSC or use the case for optimization, because we do not want to optimize parameters with a blank segmentation map. This should be bool; False means that there is no manual label of class d in this sample.
Returns:: estim_F1score: Estimated class-wise F1score of shape (d, ).

estimate_precision(probability: List[Iterable] | ndarray, gt_guide: ndarray | None = None) → Tuple[float, ndarray]

Estimate the precision using network output.

Args:: probability: The calibrated probability of shape (n, d) or a list of n (d, H, W, (D)). gt_guide: The corresponding annotation guide of shape (n, d) for segmentation. This is only required for segmentation tasks during optimization.

We use this to determine whether a label is present in the case. If not, we do not calculate the DSC or use the case for optimization, because we do not want to optimize parameters with a blank segmentation map. This should be bool; False means that there is no manual label of class d in this sample.
Returns:: estim_precision: Estimated class-wise precision of shape (d, ).

estimate_sensitivity(inp: List[Iterable] | ndarray, probability: List[Iterable] | ndarray, gt_guide: ndarray | None = None) → Tuple[float, ndarray]

Estimate the sensitivity using network output.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). probability: The calibrated probability of shape (n, d) or a list of n (d, H, W, (D)). gt_guide: The corresponding annotation guide of shape (n, d) for segmentation. This is only required for segmentation tasks during optimization.

We use this to determine whether a label is present in the case. If not, we do not calculate the DSC or use the case for optimization, because we do not want to optimize parameters with a blank segmentation map. This should be bool; False means that there is no manual label of class d in this sample.
Note:: The user may wonder why inp is needed in this function. We use inp to determine the prediction results, which cannot always be recovered from probability.
Returns:: estim_sensitivity: Estimated class-wise sensitivity of shape (d, ).
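The reason both inp and probability are needed can be seen in a small sketch. It assumes that expected true positives for a class are accumulated from samples predicted as that class and expected false negatives from the remaining samples; this bookkeeping and the function name are assumptions for illustration, not the documented implementation.

```python
def estimate_sensitivity_sketch(logits_batch, probabilities):
    """Hypothetical illustration: class-wise sensitivity = TP / (TP + FN)."""
    d = len(logits_batch[0])
    tp, fn = [0.0] * d, [0.0] * d
    for logits, probs in zip(logits_batch, probabilities):
        pred = max(range(d), key=lambda k: logits[k])  # prediction comes from inp
        for k in range(d):
            if k == pred:
                tp[k] += probs[k]  # expected true positives for class k
            else:
                fn[k] += probs[k]  # expected missed cases of class k
    return [tp[k] / (tp[k] + fn[k]) if tp[k] + fn[k] > 0 else float("nan")
            for k in range(d)]
```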

eval(): Initialize the model into evaluation mode.

train(): Initialize the model into training mode.

class moval.models.acModel(mode: str, num_class: int, confidence_scores: str, class_specific: bool)

Bases: Model

MOVAL model with average confidence.

calibrate(inp: List[Iterable] | ndarray) → List[Iterable] | ndarray

Calibrate the confidence scores for average confidence.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)).
Returns:: calibrated_scores: The calibrated scores which would match the accuracy/DSC on validation data, of shape (n, ) or a list of n (H, W, (D)).
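As background, average confidence in its simplest form takes the mean of the per-sample confidence as the accuracy estimate. The sketch below uses the maximum softmax probability as the confidence score; how acModel parameterizes its calibration is not shown here, and the function names are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_confidence(logits_batch):
    """Mean of the maximum softmax probability over all samples."""
    confs = [max(softmax(logits)) for logits in logits_batch]
    return sum(confs) / len(confs)
```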

class moval.models.tsModel(mode: str, num_class: int, confidence_scores: str, class_specific: bool)

Bases: Model

MOVAL model with temperature scaling.

calibrate(inp: List[Iterable] | ndarray) → List[Iterable] | ndarray

Calibrate the confidence scores with temperature scaling.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)).
Returns:: calibrated_scores: The calibrated scores which would match the accuracy/DSC on validation data, of shape (n, ) or a list of n (H, W, (D)).
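Temperature scaling divides the logits by a scalar T before the softmax, so T > 1 lowers the confidence and T < 1 raises it. The sketch below fits T with a simple grid search so that the mean confidence matches a validation accuracy; the grid, the function names and the matching criterion are assumptions, not the optimizer MOVAL uses.

```python
import math


def ts_confidence(logits, temperature):
    """Maximum softmax probability after dividing the logits by `temperature`."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    return max(exps) / sum(exps)


def mean_conf(logits_batch, temperature):
    return sum(ts_confidence(l, temperature) for l in logits_batch) / len(logits_batch)


def fit_temperature(logits_batch, val_accuracy, grid=None):
    """Pick the grid temperature whose mean confidence is closest to val_accuracy."""
    grid = grid or [0.25 * k for k in range(1, 41)]  # 0.25, 0.5, ..., 10.0
    return min(grid, key=lambda t: abs(mean_conf(logits_batch, t) - val_accuracy))
```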

class moval.models.docModel(mode: str, num_class: int, confidence_scores: str, class_specific: bool)

Bases: Model

MOVAL model with difference of confidence.

calibrate(inp: List[Iterable] | ndarray) → List[Iterable] | ndarray

Calibrate the confidence scores based on difference of confidence.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)).
Returns:: calibrated_scores: The calibrated scores which would match the accuracy/DSC on validation data, of shape (n, ) or a list of n (H, W, (D)).
Note:: The implementation of class-specific DoC for segmentation is different from what we did in our MICCAI work. In MICCAI, we calculated the difference as the difference of soft DSC. Here, we modify the confidence score to match the DSC. The current version should be more compatible with other techniques.
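The idea of modifying the confidence score so that it matches the validation metric can be sketched as a fitted offset. The additive form, the clipping to [0, 1] and the function names are assumptions for illustration; the source does not spell out the exact parameterization.

```python
def fit_doc_offset(val_confidences, val_accuracy):
    """Offset between the mean validation confidence and the validation accuracy/DSC."""
    return sum(val_confidences) / len(val_confidences) - val_accuracy


def doc_calibrate(confidences, offset):
    """Shift each confidence by the fitted offset (assumption: clip to [0, 1])."""
    return [min(1.0, max(0.0, c - offset)) for c in confidences]
```

Applied to new data, the mean of the shifted confidences serves as the performance estimate.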

class moval.models.atcModel(mode: str, num_class: int, confidence_scores: str, class_specific: bool)

Bases: Model

MOVAL model with average thresholded confidence.

calibrate(inp: List[Iterable] | ndarray) → List[Iterable] | ndarray

Calibrate the confidence scores based on average thresholded confidence.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)).
Returns:: calibrated_scores: The calibrated scores which would match the accuracy/DSC on validation data, of shape (n, ) or a list of n (H, W, (D)).
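Average thresholded confidence can be sketched as follows: a threshold is chosen on validation data so that the fraction of samples at or above it matches the validation accuracy, and the same fraction on new data is the estimate. The function names and the candidate-search strategy are assumptions.

```python
def fraction_above(confidences, threshold):
    """Fraction of samples whose confidence reaches the threshold."""
    return sum(c >= threshold for c in confidences) / len(confidences)


def fit_atc_threshold(val_confidences, val_accuracy):
    """Pick the candidate threshold whose pass rate is closest to val_accuracy."""
    candidates = sorted(set(val_confidences)) + [float("inf")]
    return min(candidates,
               key=lambda t: abs(fraction_above(val_confidences, t) - val_accuracy))
```

A typical use is to fit the threshold once on validation confidences and then call fraction_above on the confidences of the shifted test set.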

class moval.models.tsatcModel(mode: str, num_class: int, confidence_scores: str, class_specific: bool)

Bases: Model

MOVAL model with average thresholded confidence after temperature scaling.

calibrate(inp: List[Iterable] | ndarray, midstage: bool = False) → List[Iterable] | ndarray

Calibrate the confidence scores based on average thresholded confidence after temperature scaling.

Args:: inp: The network output (logits) of shape (n, d) or a list of n (d, H, W, (D)). midstage: If True, return the first calibrated results.
Returns:: calibrated_scores: The calibrated scores which would match the accuracy/DSC on validation data, of shape (n, ) or a list of n (H, W, (D)).
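The two-stage structure, and what midstage exposes, can be sketched as temperature scaling followed by thresholding. Representing the second stage as a 0/1 indicator is an assumption made for illustration, as are the function name and the explicit temperature and threshold arguments.

```python
import math


def ts_atc_calibrate(logits_batch, temperature, threshold, midstage=False):
    """Hypothetical two-stage calibration: temperature scaling, then thresholding."""
    confs = []
    for logits in logits_batch:
        scaled = [z / temperature for z in logits]
        m = max(scaled)
        exps = [math.exp(s - m) for s in scaled]
        confs.append(max(exps) / sum(exps))  # stage 1: temperature-scaled confidence
    if midstage:
        return confs  # first calibrated results only
    # Stage 2: threshold the scaled confidence.
    return [1.0 if c >= threshold else 0.0 for c in confs]
```

The two parameters (temperature and threshold) correspond to the doubled parameter count and two-stage optimization described for extend_param.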