helikite
Package Contents
- class helikite.Cleaner(instruments: list[helikite.instruments.base.Instrument], reference_instrument: helikite.instruments.base.Instrument, input_folder: str | pathlib.Path, flight_date: datetime.date, flight: str | None = None, time_takeoff: datetime.datetime | None = None, time_landing: datetime.datetime | None = None, time_offset: datetime.time = datetime.time(0, 0))
Bases: helikite.classes.base.BaseProcessor
- _instruments: list[helikite.instruments.base.Instrument] = []
- input_folder: str
- flight = None
- flight_date: datetime.date
- time_takeoff: datetime.datetime | None = None
- time_landing: datetime.datetime | None = None
- time_offset: datetime.time
- pressure_column: str = 'pressure'
- master_df: pandas.DataFrame | None = None
- housekeeping_df: pandas.DataFrame | None = None
- reference_instrument: helikite.instruments.base.Instrument
- _data_state_info() → List[str]
- set_pressure_column(column_name_override: str | None = None) → None
Set the pressure column for each instrument’s dataframe
- set_time_as_index() → None
Set the time column as the index for each instrument dataframe
- data_corrections(start_altitude: float = None, start_pressure: float = None, start_temperature: float = None) → None
- plot_pressure() → None
Creates a plot with the pressure measurement of each instrument.
Assumes the pressure column has been set for each instrument.
- remove_duplicates() → None
Remove duplicate rows from each instrument based on time index, and clear repeated values in ‘msems_scan_’, ‘msems_inverted_’ columns, and specific ‘mcda_*’ columns, keeping only the first instance.
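The two steps described above can be sketched with plain pandas. This is only an illustration, not helikite's implementation: the helper name `drop_duplicate_times_and_repeats`, the sample column names, and the reading of "repeated values" as consecutive repeats are all assumptions.

```python
import numpy as np
import pandas as pd


def drop_duplicate_times_and_repeats(
    df: pd.DataFrame, prefixes: tuple[str, ...]
) -> pd.DataFrame:
    """Hypothetical helper: keep the first row per timestamp, then blank out
    consecutive repeated values in columns whose names start with `prefixes`."""
    # Keep only the first row for every duplicated index value.
    out = df[~df.index.duplicated(keep="first")].copy()
    for col in out.columns:
        if col.startswith(prefixes):
            # A value counts as a repeat when it equals the previous row's value.
            repeated = out[col].eq(out[col].shift())
            out.loc[repeated, col] = np.nan
    return out


index = pd.to_datetime(
    [
        "2024-01-01 10:00:00",
        "2024-01-01 10:00:00",  # duplicated timestamp
        "2024-01-01 10:00:01",
        "2024-01-01 10:00:02",
    ]
)
df = pd.DataFrame(
    {
        "pressure": [950.0, 951.0, 949.5, 949.0],
        "msems_scan_bin1": [12.0, 12.0, 12.0, 15.0],  # made-up column name
    },
    index=index,
)
cleaned = drop_duplicate_times_and_repeats(
    df, prefixes=("msems_scan_", "msems_inverted_")
)
```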
- merge_instruments(tolerance_seconds: int = 0, remove_duplicates: bool = True) → None
Merges all the dataframes from the instruments into one dataframe.
All columns from all instruments are included in the merged dataframe, with unique prefixes to avoid column name collisions.
- Parameters:
tolerance_seconds (int) – The tolerance in seconds for merging dataframes.
remove_duplicates (bool) – If True, removes duplicate times and keeps the first result.
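To illustrate what a tolerance-based merge of time-indexed dataframes with prefixed columns can look like, here is a small sketch using `pandas.merge_asof`. Whether `merge_instruments` uses this function internally is not stated here; the prefixes `ref_` and `other_` and the sample values are made up.

```python
import pandas as pd

ref = pd.DataFrame(
    {"pressure": [950.0, 949.0, 948.0]},
    index=pd.to_datetime(
        [
            "2024-01-01 10:00:00.000",
            "2024-01-01 10:00:03.000",
            "2024-01-01 10:00:06.000",
        ]
    ),
)
other = pd.DataFrame(
    {"pressure": [950.4, 948.3]},
    index=pd.to_datetime(
        ["2024-01-01 10:00:00.500", "2024-01-01 10:00:06.200"]
    ),
)

# Prefix every column with a (hypothetical) instrument name to avoid collisions.
ref = ref.add_prefix("ref_")
other = other.add_prefix("other_")

# Match each reference timestamp to the nearest row of the other instrument,
# but only if that row lies within the tolerance; otherwise the result is NaN.
merged = pd.merge_asof(
    ref,
    other,
    left_index=True,
    right_index=True,
    direction="nearest",
    tolerance=pd.Timedelta(seconds=1),
)
```

In this sample the middle reference row has no partner within one second, so its `other_pressure` value stays missing.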
- export_data(filename: str | None = None) → None
Export all data columns from all instruments to local files
The function will export a CSV and a Parquet file with all columns from all instruments. The files will be saved in the current working directory unless a filename is provided.
The Parquet file will include the metadata from the class.
- _apply_rolling_window_to_pressure(instrument, window_size: int = 20)
Apply rolling window to the pressure measurements of instrument
Then plot the pressure measurements with the rolling window applied
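As a rough picture of what a rolling window does to a pressure series, the sketch below smooths a short made-up series with a rolling mean. The choice of the mean as the aggregation and the window of 3 samples are assumptions for illustration; the method's own default is `window_size=20`.

```python
import pandas as pd

# Made-up pressure readings at 1-second spacing, with one noisy spike (953.0).
pressure = pd.Series(
    [950.0, 949.0, 953.0, 948.0, 947.0, 946.0],
    index=pd.date_range("2024-01-01 10:00:00", periods=6, freq="s"),
)

# Average each sample with the two samples before it; min_periods=1 keeps
# the first rows instead of turning them into NaN.
smoothed = pressure.rolling(window=3, min_periods=1).mean()
```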
- define_flight_times()
Creates a plot to select the start and end of the flight
Uses the pressure measurements of the reference instrument to select the start and end of the flight. The user can click on the plot to select the points.
- correct_time_and_pressure(max_lag=180, walk_time_seconds: int | None = None, apply_rolling_window_to: list[helikite.instruments.base.Instrument] = [], rolling_window_size: int = constants.ROLLING_WINDOW_DEFAULT_SIZE, reference_pressure_thresholds: tuple[float, float] | None = None, detrend_pressure_on: list[helikite.instruments.base.Instrument] = [], offsets: list[tuple[helikite.instruments.base.Instrument, int]] = [], match_adjustment_with: list[tuple[helikite.instruments.base.Instrument, helikite.instruments.base.Instrument]] = [])
Correct time and pressure for each instrument based on time lag.
- Parameters:
max_lag (int) – The maximum time lag to consider for cross-correlation.
walk_time_seconds (int) – The time in seconds to walk the pressure data to match the reference instrument.
apply_rolling_window_to (list[Instrument]) – A list of instruments to apply a rolling window to the pressure data.
rolling_window_size (int) – The size of the rolling window to apply to the pressure data.
reference_pressure_thresholds (tuple[float, float]) – A tuple with two values (low, high) to apply a threshold to the reference instrument’s pressure data.
detrend_pressure_on (list[Instrument]) – A list of instruments to detrend the pressure data.
offsets (list[tuple[Instrument, int]]) – A list of tuples with an instrument and an offset in seconds to apply to the time index.
match_adjustment_with (list[tuple[Instrument, Instrument]]) – A list of tuples of two instruments that should share the same time adjustment. This can be used, for example, when an instrument has no pressure column and therefore needs to reuse the time adjustment from another instrument. The first instrument in each tuple is the one whose index adjustment is computed, and the second is the one that will be adjusted to match.
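The idea of finding a time lag by cross-correlation, bounded by `max_lag`, can be sketched as follows. This is a generic illustration under stated assumptions, not the library's algorithm: `estimate_lag` is a hypothetical helper, both series are assumed to be sampled once per second, and a Pearson correlation over the overlapping samples is used as the score.

```python
import numpy as np


def estimate_lag(reference: np.ndarray, other: np.ndarray, max_lag: int) -> int:
    """Hypothetical helper: return the shift in samples, within +/- max_lag,
    that best aligns `other` with `reference`. A positive result means that
    features appear later in `other` than in `reference`."""
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Select the overlapping parts of both series for this candidate lag.
        if lag >= 0:
            a, b = reference[: len(reference) - lag], other[lag:]
        else:
            a, b = reference[-lag:], other[: len(other) + lag]
        # Pearson correlation of the overlap; NaN scores are never selected.
        score = np.corrcoef(a, b)[0, 1]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic "flight": pressure dips as the platform climbs, then recovers.
t = np.arange(600)
reference_pressure = 1000.0 - 50.0 * np.exp(-(((t - 300) / 60.0) ** 2))
# A second instrument whose clock is 7 seconds behind the reference.
other_pressure = np.roll(reference_pressure, 7)

lag = estimate_lag(reference_pressure, other_pressure, max_lag=180)
```

Once a lag is known, the lagging instrument's time index could be moved back by that many seconds before merging.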
- shift_msems_columns_by_90s(df: pandas.DataFrame) → pandas.DataFrame
Shift all ‘msems_inverted_’ and ‘msems_scan_’ columns by 90 seconds in time.
- Parameters:
df (pd.DataFrame) – The DataFrame containing the time-indexed data to shift.
- Returns:
The DataFrame with specified columns time-shifted by 90 seconds.
- Return type:
pd.DataFrame
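A time shift of selected columns on a time-indexed DataFrame can be sketched like this. The helper `shift_columns_in_time` is hypothetical, the sample column name is made up, and the direction of the shift (values moved to later timestamps) is an assumption, since the description above does not state the sign.

```python
import pandas as pd


def shift_columns_in_time(
    df: pd.DataFrame, prefixes: tuple[str, ...], seconds: int
) -> pd.DataFrame:
    """Hypothetical helper: move the values of matching columns `seconds`
    later in time, leaving every other column untouched."""
    out = df.copy()
    cols = [c for c in out.columns if c.startswith(prefixes)]
    if cols:
        # shift(freq=...) moves the timestamps rather than the row positions;
        # reindexing onto the original index leaves NaN where nothing lands.
        shifted = out[cols].shift(freq=pd.Timedelta(seconds=seconds))
        out[cols] = shifted.reindex(out.index)
    return out


df = pd.DataFrame(
    {
        "pressure": [950.0, 949.0, 948.0, 947.0],
        "msems_scan_bin1": [1.0, 2.0, 3.0, 4.0],  # made-up column name
    },
    index=pd.date_range("2024-01-01 10:00:00", periods=4, freq="30s"),
)
shifted_df = shift_columns_in_time(
    df, prefixes=("msems_inverted_", "msems_scan_"), seconds=90
)
```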
- fill_missing_timestamps(df: pandas.DataFrame, freq: str = '1S', fill_method: str | None = None) → pandas.DataFrame
Reindex the DataFrame to fill in missing timestamps at the specified frequency. Optionally forward- or backward-fill missing values. Prints the number of timestamps added.
- Parameters:
df (pd.DataFrame) – The input DataFrame with a DateTimeIndex.
freq (str) – The desired frequency for the DateTimeIndex (e.g., “1S” for 1 second).
fill_method (str or None) – Method to fill missing values: “ffill”, “bfill”, or None (default: None).
- Returns:
A DataFrame with missing timestamps added and values optionally filled.
- Return type:
pd.DataFrame
- helikite.__version__ = '1.1.3'
- helikite.__appname__ = 'helikite-data-processing'
- helikite.__description__ = 'Library to generate quicklooks and data quality checks on Helikite campaigns'