helikite.processing.post.fda

Attributes

CONC_COLUMN_NAME

GRAD_COLUMN_NAME

FLAG_COLUMN_NAME

COLOR_RED

logger

FDA_PARAMS_POLLUTION

FDA_PARAMS_HOVERING

FDA_PARAMS_CLOUD

Classes

FDAParameters

Parameters for the Flag Detection Algorithm.

FDA

Flag Detection Algorithm (FDA) for identifying anomalies in time-series data.

Module Contents

helikite.processing.post.fda.CONC_COLUMN_NAME = 'concentration'
helikite.processing.post.fda.GRAD_COLUMN_NAME = 'gradient'
helikite.processing.post.fda.FLAG_COLUMN_NAME = 'flag'
helikite.processing.post.fda.COLOR_RED = '#d73027'
helikite.processing.post.fda.logger
class helikite.processing.post.fda.FDAParameters

Parameters for the Flag Detection Algorithm.

Parameters:
inverse: bool
avg_time: str | None = None
main_filter: Literal['power_law', 'iqr'] = 'power_law'
use_neighbor_filter: bool = False
use_median_filter: bool = False
use_sparse_filter: bool = False
use_duration_filter: bool = False
pl_a: float
pl_m: float = 0
iqr_window: str | None = None
iqr_factor: float | None = None
lower_thr: float
upper_thr: float
median_window: str | None = None
median_factor: float | None = None
sparse_window: str | None = None
sparse_thr: float | None = None
min_duration: str | None = None
class helikite.processing.post.fda.FDA(df: pandas.DataFrame, conc_column_name: str, gt_flag_column_name: str | None, params: FDAParameters)

Flag Detection Algorithm (FDA) for identifying anomalies in time-series data.

Based on the algorithm described in: “Automated identification of local contamination in remote atmospheric composition time series” by Ivo Beck et al. (2022), https://doi.org/10.5194/amt-15-4195-2022

Parameters:
  • df (pandas.DataFrame)

  • conc_column_name (str)

  • gt_flag_column_name (str | None)

  • params (FDAParameters)

_title
_params
_df_orig
_conc_orig
_df
_filters: list[Callable] | None = None
_intermediate_flags: list[pandas.Series] | None = None
plot_data(use_time_index: bool = True, figsize=(18, 10), bins=100, fontsize=22, markersize=3, save_path: str | pathlib.Path | None = None)

Visualize concentration and gradient distributions. Generates a joint histogram and time-series plot to inspect raw signal behavior and threshold placement.

Parameters:
  • use_time_index – Plot against timestamps instead of sample index.

  • figsize – Figure size.

  • bins – Histogram bin count.

  • fontsize – Axis label font size.

  • markersize – Marker size for time-series plot.

  • save_path – Optional output path for saving the figure.

detect() → pandas.Series

Execute configured filters and produce a flag series.

Applies filters sequentially, storing intermediate results, and returns the final flag series aligned with the original index.

Returns:

Series indicating detected flag events.
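
The sequential pattern described above can be sketched without the library. This is a minimal illustration, not the FDA implementation: the two toy filters, the `run_filters` helper, and the dictionary keys `grad_limit` and `conc_ceiling` are hypothetical names chosen for the example. Only the filter call shape `(conc, grad, flag_old, params)` is taken from the static filter methods documented on this page.

```python
# Hypothetical stand-in filters sharing the documented call shape
# (conc, grad, flag_old, params) -> new flags. Plain lists replace pandas.Series.

def gradient_limit_filter(conc, grad, flag_old, params):
    # Flag samples whose absolute gradient exceeds a fixed limit (illustrative rule).
    return [abs(g) > params["grad_limit"] for g in grad]

def ceiling_filter(conc, grad, flag_old, params):
    # Keep earlier flags and add samples above a concentration ceiling.
    return [old or c > params["conc_ceiling"] for old, c in zip(flag_old, conc)]

def run_filters(conc, grad, filters, params):
    # Apply each filter to the previous flags and keep every intermediate result.
    flag = [False] * len(conc)
    intermediate = []
    for apply_filter in filters:
        flag = apply_filter(conc, grad, flag, params)
        intermediate.append(list(flag))
    return flag, intermediate

conc = [10.0, 11.0, 50.0, 12.0, 120.0]
grad = [0.0, 1.0, 39.0, -38.0, 2.0]
params = {"grad_limit": 5.0, "conc_ceiling": 100.0}

final, stages = run_filters(conc, grad, [gradient_limit_filter, ceiling_filter], params)
print(final)   # flags after the last filter
print(stages)  # one list of flags per filter, in order
```

Keeping the per-stage results is what would let a plot such as `plot_detection` show each filter's contribution separately.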

plot_detection(use_time_index: bool = True, figsize=None, fontsize=14, markersize=3, yscale='log', save_path: str | pathlib.Path | None = None, start_time: datetime.datetime | None = None, end_time: datetime.datetime | None = None)

Visualize intermediate filtering stages.

Displays concentration and gradient signals alongside each filter’s flag results.

Parameters:
  • use_time_index – Plot against timestamps instead of sample index.

  • figsize – Figure size.

  • fontsize – Axis label font size.

  • markersize – Marker size.

  • save_path – Optional output path for saving the figure.

  • yscale – Y-axis scale for concentration.

  • start_time – Optional plot start time.

  • end_time – Optional plot end time.

static power_law_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)

Flag anomalies using a power-law gradient threshold.
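
As a rough illustration of a power-law gradient threshold, the sketch below flags a sample when the absolute gradient exceeds `a * conc ** m`. That functional form follows the general idea in the cited paper and is an assumption here; how `power_law_filter` actually combines `pl_a`, `pl_m`, `lower_thr`, and `upper_thr` is not documented on this page. The function name `power_law_flags` is hypothetical.

```python
def power_law_flags(conc, grad, a, m):
    """Flag samples whose absolute gradient exceeds a * conc**m (assumed form)."""
    return [abs(g) > a * c ** m for c, g in zip(conc, grad)]

conc = [100.0, 100.0, 400.0, 400.0]
grad = [5.0, 25.0, 25.0, -60.0]

# With a=2 and m=0.5 the thresholds are 20, 20, 40, 40.
flags = power_law_flags(conc, grad, a=2.0, m=0.5)
print(flags)  # [False, True, False, True]
```

Under this assumed form, `m = 0` reduces the rule to a constant gradient threshold equal to `a`.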

static iqr_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)

Flag anomalies using rolling interquartile range thresholds.

static neighbor_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)

Extend flags to neighboring samples.
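
One simple reading of "extend flags to neighboring samples" is a one-sample dilation, sketched below. The reach of one sample on each side and the helper name `extend_to_neighbors` are assumptions for illustration; the library may use a different neighborhood or rule.

```python
def extend_to_neighbors(flags):
    """Flag every sample that is flagged itself or sits next to a flagged sample."""
    n = len(flags)
    return [
        flags[i] or (i > 0 and flags[i - 1]) or (i < n - 1 and flags[i + 1])
        for i in range(n)
    ]

flags = [False, False, True, False, False, False]
extended = extend_to_neighbors(flags)
print(extended)  # [False, True, True, True, False, False]
```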

static median_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)

Flag samples exceeding a rolling median threshold.
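
A rolling median threshold can be sketched as follows. The example assumes a centered window measured in samples and flags values above `factor` times the local median; the documented `median_window` is a string (presumably a time span), so the library's windowing very likely differs. The name `median_flags` is hypothetical.

```python
from statistics import median

def median_flags(conc, window, factor):
    """Flag samples above factor * median of a centered window of `window` samples."""
    half = window // 2
    flags = []
    for i, value in enumerate(conc):
        lo, hi = max(0, i - half), min(len(conc), i + half + 1)
        flags.append(value > factor * median(conc[lo:hi]))
    return flags

conc = [10.0, 11.0, 10.0, 40.0, 11.0, 10.0, 12.0]
flags = median_flags(conc, window=5, factor=2.0)
print(flags)  # only the spike at index 3 is flagged
```

The median is robust to the spike itself, so a single outlier does not raise its own threshold the way a rolling mean would.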

static sparse_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)

Flag regions with a high density of anomalies.
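
The density idea can be illustrated with a count over a sliding window: a sample is flagged if it already was, or if enough of its neighbors are. The centered sample-count window, the `>=` comparison, and the name `density_flags` are assumptions for this sketch; the documented `sparse_window` and `sparse_thr` may be interpreted differently by the library.

```python
def density_flags(flags, window, thr):
    """Keep existing flags and add samples whose centered window holds >= thr flags."""
    half = window // 2
    n = len(flags)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(flags[i] or sum(flags[lo:hi]) >= thr)
    return out

flags = [False, True, False, True, True, False, False, False, True, False]
dense = density_flags(flags, window=5, thr=3)
print(dense)  # the gap at index 2 is filled; the isolated flag at index 8 is unchanged
```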

static duration_filter(conc: pandas.Series, grad: pandas.Series, flag_old: pandas.Series, params: FDAParameters)

Remove flagged events shorter than a minimum duration.
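
Removing short events amounts to run-length filtering, sketched here with `itertools.groupby`. Duration is counted in samples for simplicity, whereas the documented `min_duration` is a string (presumably a time span); `drop_short_events` is a hypothetical helper, not the library function.

```python
from itertools import groupby

def drop_short_events(flags, min_len):
    """Clear runs of consecutive True values that are shorter than min_len samples."""
    out = []
    for value, run in groupby(flags):
        run = list(run)
        if value and len(run) < min_len:
            out.extend([False] * len(run))  # event too short: unflag it
        else:
            out.extend(run)                 # keep clean stretches and long events
    return out

flags = [False, True, False, True, True, True, False, True, True]
cleaned = drop_short_events(flags, min_len=3)
print(cleaned)  # [False, False, False, True, True, True, False, False, False]
```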

static evaluate(conc: pandas.Series, flag: pandas.Series, flag_manual: pandas.Series, verbose: bool = False)

Compute detection performance metrics when ground truth is available. Calculates precision, recall, and F1 score relative to the reference flags.
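
For reference, the three metrics can be computed from two boolean sequences as shown below. This is a standalone sketch of the standard definitions; the helper name `precision_recall_f1` and the choice to return 0.0 when a denominator is zero are assumptions, and the return format of `evaluate` itself is not documented here.

```python
def precision_recall_f1(flag, flag_manual):
    """Return (precision, recall, F1) of `flag` measured against `flag_manual`."""
    pairs = list(zip(flag, flag_manual))
    tp = sum(1 for p, t in pairs if p and t)      # flagged and truly an event
    fp = sum(1 for p, t in pairs if p and not t)  # flagged but not in the reference
    fn = sum(1 for p, t in pairs if not p and t)  # missed reference event
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

flag = [True, True, False, False, True, False]
flag_manual = [True, False, False, True, True, False]

precision, recall, f1 = precision_recall_f1(flag, flag_manual)
print(precision, recall, f1)  # each is 2/3 for this data
```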

helikite.processing.post.fda.FDA_PARAMS_POLLUTION: FDAParameters
helikite.processing.post.fda.FDA_PARAMS_HOVERING
helikite.processing.post.fda.FDA_PARAMS_CLOUD