helikite.processing.post.fda
Attributes
Classes
FDAParameters – Parameters for Flag Detection Algorithm.
FDA – Flag Detection Algorithm (FDA) for identifying anomalies in time-series data.
Module Contents
- helikite.processing.post.fda.CONC_COLUMN_NAME = 'concentration'
- helikite.processing.post.fda.GRAD_COLUMN_NAME = 'gradient'
- helikite.processing.post.fda.FLAG_COLUMN_NAME = 'flag'
- helikite.processing.post.fda.COLOR_RED = '#d73027'
- helikite.processing.post.fda.logger
- class helikite.processing.post.fda.FDAParameters
Parameters for Flag Detection Algorithm.
- Parameters:
main_filter (Literal["power_law", "iqr"]) – Core detection method.
use_sparse_filter (bool)
sparse_thr (float | None)
upper_thr (float)
- inverse: bool
- avg_time: str | None = None
- main_filter: Literal['power_law', 'iqr'] = 'power_law'
- use_neighbor_filter: bool = False
- use_median_filter: bool = False
- use_sparse_filter: bool = False
- use_duration_filter: bool = False
- pl_a: float
- pl_m: float = 0
- iqr_window: str | None = None
- iqr_factor: float | None = None
- lower_thr: float
- upper_thr: float
- median_window: str | None = None
- median_factor: float | None = None
- sparse_window: str | None = None
- sparse_thr: float | None = None
- min_duration: str | None = None
- class helikite.processing.post.fda.FDA(df: pandas.DataFrame, conc_column_name: str, gt_flag_column_name: str | None, params: FDAParameters)
Flag Detection Algorithm (FDA) for identifying anomalies in time-series data.
Based on the algorithm described in “Automated identification of local contamination in remote atmospheric composition time series” by Ivo Beck et al. (2022), https://doi.org/10.5194/amt-15-4195-2022
- Parameters:
- _title
- _params
- _df_orig
- _conc_orig
- _df
- _filters: list[Callable] | None = None
- _intermediate_flags: list[pandas.Series] | None = None
- plot_data(use_time_index: bool = True, figsize=(18, 10), bins=100, fontsize=22, markersize=3, save_path: str | pathlib.Path | None = None)
Visualize concentration and gradient distributions. Generates a joint histogram and time-series plot to inspect raw signal behavior and threshold placement.
- Parameters:
use_time_index – Plot against timestamps instead of sample index.
figsize – Figure size.
bins – Histogram bin count.
fontsize – Axis label font size.
markersize – Marker size for time-series plot.
save_path – Optional output path for saving the figure.
- detect() → pandas.Series
Execute configured filters and produce a flag series.
Applies filters sequentially, storing intermediate results, and returns the final flag series aligned with the original index.
- Returns:
Series indicating detected flag events.
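The sequential pattern described above can be sketched in plain Python. This is an illustration only, not the library's implementation: `run_filters`, `above_10`, and `negative` are hypothetical names, and plain lists stand in for pandas.Series to keep the sketch dependency-free.

```python
def run_filters(values, filters):
    """Apply each filter to the previous flags and keep every intermediate result."""
    flags = [False] * len(values)
    intermediate = []
    for apply_filter in filters:
        flags = apply_filter(values, flags)
        intermediate.append(list(flags))  # snapshot after this stage
    return flags, intermediate


def above_10(values, flag_old):
    # Hypothetical first stage: flag large values, keep earlier flags.
    return [old or v > 10 for v, old in zip(values, flag_old)]


def negative(values, flag_old):
    # Hypothetical second stage: additionally flag negative values.
    return [old or v < 0 for v, old in zip(values, flag_old)]


final, stages = run_filters([1, -2, 50, 3, 4], [above_10, negative])
print(final)   # flags after the last stage
print(stages)  # one flag list per stage
```

Keeping one snapshot per stage is what makes it possible to plot each filter's contribution separately afterwards.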
- plot_detection(use_time_index: bool = True, figsize=None, fontsize=14, markersize=3, yscale='log', save_path: str | pathlib.Path | None = None, start_time: datetime.datetime | None = None, end_time: datetime.datetime | None = None)
Visualize intermediate filtering stages.
Displays concentration and gradient signals alongside each filter’s flag results.
- Parameters:
use_time_index – Plot against timestamps instead of sample index.
figsize – Figure size.
fontsize – Axis label font size.
markersize – Marker size.
save_path – Optional output path for saving the figure.
yscale – Y-axis scale for concentration.
start_time – Optional plot start time.
end_time – Optional plot end time.
- static power_law_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)
Flag anomalies using a power-law gradient threshold.
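A minimal sketch of a power-law gradient threshold, assuming the rule is "flag when |gradient| exceeds a · concentration^m", with `a` and `m` playing the roles of `pl_a` and `pl_m`. The exact rule used by the library is not stated here, and `power_law_flags` is a hypothetical helper operating on plain lists.

```python
def power_law_flags(conc, grad, a, m=0.0):
    """Flag samples whose absolute gradient exceeds a * conc**m.

    Assumes positive concentrations; with m == 0 the threshold is the constant a.
    """
    return [abs(g) > a * c ** m for c, g in zip(conc, grad)]


conc = [100.0, 100.0, 400.0, 100.0]
grad = [5.0, 30.0, 30.0, -50.0]

# Thresholds are 20, 20, 40, 20, so the second and fourth samples are flagged.
flags = power_law_flags(conc, grad, a=2.0, m=0.5)
print(flags)
```

Under this assumption, the same gradient of 30 is flagged at a concentration of 100 but not at 400, because the threshold grows with concentration.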
- static iqr_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)
Flag anomalies using rolling interquartile range thresholds.
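One common form of a rolling IQR rule is sketched below, assuming a centered window counted in samples and fences at Q1 − k·IQR and Q3 + k·IQR. How the library maps `iqr_window` (a string) and `iqr_factor` onto this rule is not documented here, so `iqr_flags`, `window`, and `factor` are illustrative names.

```python
import statistics


def iqr_flags(values, window, factor):
    """Flag values outside [Q1 - factor*IQR, Q3 + factor*IQR] of a centered window."""
    half = window // 2
    flags = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        q1, _, q3 = statistics.quantiles(values[lo:hi], n=4)
        iqr = q3 - q1
        flags.append(v < q1 - factor * iqr or v > q3 + factor * iqr)
    return flags


values = [1.0, 1.1, 0.9, 1.0, 9.0, 1.0, 1.1, 0.9, 1.0]
flags = iqr_flags(values, window=9, factor=1.5)
print(flags)  # only the spike at index 4 lies outside its window's fences
```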
- static neighbor_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)
Extend flags to neighboring samples.
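Extending flags to neighbors can be sketched as follows, assuming each flagged sample also flags `n` samples on either side. The neighborhood size the library uses is not stated, and `extend_to_neighbors` is a hypothetical helper.

```python
def extend_to_neighbors(flags, n=1):
    """Return a copy of flags where every flagged sample also flags n neighbors per side."""
    out = list(flags)
    for i, flagged in enumerate(flags):
        if flagged:
            for j in range(max(0, i - n), min(len(flags), i + n + 1)):
                out[j] = True
    return out


flags = [False, False, True, False, False, False]
print(extend_to_neighbors(flags))       # widens the single flag by one sample per side
print(extend_to_neighbors(flags, n=2))  # widens it by two samples per side
```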
- static median_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)
Flag samples exceeding a rolling median threshold.
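A rolling median threshold might look like the sketch below, assuming a sample is flagged when it exceeds `factor` times the median of a centered window counted in samples. The names `median_flags`, `window`, and `factor` are illustrative; the library's `median_window` is a string, and its exact interpretation is not documented here.

```python
import statistics


def median_flags(conc, window, factor):
    """Flag samples greater than factor * median of a centered window."""
    half = window // 2
    flags = []
    for i, c in enumerate(conc):
        lo, hi = max(0, i - half), min(len(conc), i + half + 1)
        flags.append(c > factor * statistics.median(conc[lo:hi]))
    return flags


conc = [10, 11, 10, 30, 10, 9, 10]
flags = median_flags(conc, window=5, factor=1.5)
print(flags)  # only the value 30 exceeds 1.5 times its local median
```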
- static sparse_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)
Flag regions with high density of anomalies.
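The idea of flagging dense regions can be illustrated with a sliding window: if a window already contains at least `threshold` flagged samples, the whole window is flagged. This is an assumption about the rule. Whether `sparse_thr` is a count or a fraction, and how `sparse_window` is interpreted, is not stated here, and `sparse_flags` is a hypothetical helper.

```python
def sparse_flags(flags, window, threshold):
    """Flag every sample of any window that already holds >= threshold flags."""
    out = list(flags)
    for start in range(len(flags) - window + 1):
        if sum(flags[start:start + window]) >= threshold:
            for j in range(start, start + window):
                out[j] = True
    return out


dense = [True, False, True, True, False, False, False, False]
isolated = [True, False, False, False, False, False, True, False]

print(sparse_flags(dense, window=4, threshold=3))     # the gap at index 1 is filled
print(sparse_flags(isolated, window=4, threshold=3))  # isolated flags are left unchanged
```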
- static duration_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)
Remove flagged events shorter than a minimum duration.
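Removing short events amounts to run-length filtering. The sketch below measures duration in samples (`min_len`), whereas the library's `min_duration` is a string; `drop_short_runs` is a hypothetical helper.

```python
from itertools import groupby


def drop_short_runs(flags, min_len):
    """Clear runs of consecutive True values that are shorter than min_len samples."""
    out = []
    for value, run in groupby(flags):
        length = len(list(run))
        keep = value and length >= min_len
        out.extend([keep] * length)
    return out


flags = [True, False, True, True, True, False, True, True]
print(drop_short_runs(flags, min_len=3))  # only the three-sample run survives
```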
- static evaluate(conc: pandas.Series, flag: pandas.Series, flag_manual: pandas.Series, verbose: bool = False)
Compute detection performance metrics when ground truth is available. Calculates precision, recall, and F1 score relative to reference flags.
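For reference, the three metrics can be computed from boolean flags as in the sketch below. It uses the standard definitions and plain lists; `precision_recall_f1` is a hypothetical helper, not the library's `evaluate`, whose return format is not documented here.

```python
def precision_recall_f1(flag, flag_manual):
    """Return (precision, recall, F1) of detected flags against reference flags."""
    pairs = list(zip(flag, flag_manual))
    tp = sum(p and t for p, t in pairs)      # flagged and truly an event
    fp = sum(p and not t for p, t in pairs)  # flagged but not an event
    fn = sum(not p and t for p, t in pairs)  # missed event
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


detected = [True, True, True, False]
reference = [True, False, False, False]
precision, recall, f1 = precision_recall_f1(detected, reference)
print(precision, recall, f1)  # one true positive, two false positives, no misses
```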
- helikite.processing.post.fda.FDA_PARAMS_POLLUTION: FDAParameters
- helikite.processing.post.fda.FDA_PARAMS_HOVERING
- helikite.processing.post.fda.FDA_PARAMS_CLOUD