flightrisk.utils

class flightrisk.utils.paths.ProjectPaths(root, data_raw, data_interim, data_features, mlruns, reports, configs)[source]

Bases: object

Container with all resolved project paths.

Parameters:
  • root (Path) – Repository root.

  • data_raw (Path) – Immutable raw datasets, DVC-tracked.

  • data_interim (Path) – Cleaned, time-aligned intermediate tables.

  • data_features (Path) – Final feature parquet partitions.

  • mlruns (Path) – MLflow tracking store.

  • reports (Path) – Generated figures and HTML reports.

  • configs (Path) – Reference YAML configuration tree (see configs/README.md).

root: Path
data_raw: Path
data_interim: Path
data_features: Path
mlruns: Path
reports: Path
configs: Path
ensure()[source]

Create every directory that does not yet exist.

Returns:

The same instance, for fluent chaining.

Return type:

ProjectPaths
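
The fluent ensure() pattern can be illustrated with a small stand-in. This is a minimal sketch, not the project's implementation: ProjectPathsSketch is a hypothetical class with a reduced field set, and it assumes the container is a frozen dataclass whose fields are all directories.

```python
import tempfile
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class ProjectPathsSketch:
    """Hypothetical stand-in for ProjectPaths with only three fields."""

    root: Path
    data_raw: Path
    reports: Path

    def ensure(self) -> "ProjectPathsSketch":
        # Create each directory; exist_ok makes repeated calls harmless.
        for field in fields(self):
            getattr(self, field.name).mkdir(parents=True, exist_ok=True)
        # Returning self is what allows get-then-ensure chaining.
        return self


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "repo"
    paths = ProjectPathsSketch(root, root / "data" / "raw", root / "reports").ensure()
    created = sorted(p.name for p in root.iterdir())
```

With the real API, the equivalent call would presumably be get_paths().ensure().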

flightrisk.utils.paths.get_paths()[source]

Resolve project paths from the environment or defaults.

Honours FLIGHTRISK_ROOT if set; otherwise walks up the directory tree from this module's location.

Returns:

A frozen ProjectPaths instance.

Return type:

ProjectPaths
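
One way such a lookup could work is sketched below. The function name resolve_root_sketch, the pyproject.toml marker file, and the explicit env parameter are all assumptions made for illustration; the source does not say how get_paths() recognises the repository root.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def resolve_root_sketch(
    start: Path,
    env: Mapping[str, str] = os.environ,
    marker: str = "pyproject.toml",  # assumed marker file, not confirmed by the docs
) -> Path:
    """Return FLIGHTRISK_ROOT if set, else the nearest ancestor containing `marker`."""
    override = env.get("FLIGHTRISK_ROOT")
    if override:
        return Path(override).expanduser().resolve()
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    # No marker found anywhere above: fall back to the starting directory.
    return start


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / "pyproject.toml").touch()
    nested = base / "src" / "pkg"
    nested.mkdir(parents=True)
    # Without the variable, the walk finds the directory holding the marker.
    found_by_walk = resolve_root_sketch(nested, env={}) == base
    # With the variable, the override wins regardless of the start directory.
    found_by_env = resolve_root_sketch(base, env={"FLIGHTRISK_ROOT": str(nested)}) == nested
```

Passing env explicitly keeps the sketch free of global state, which makes the precedence rule easy to test.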

flightrisk.utils.logging.get_logger(name)[source]

Return a configured logger named name.

Reads FLIGHTRISK_LOG_LEVEL (default INFO). Loggers are cached by name, so repeated calls with the same name return the same logger and never attach duplicate handlers.

Parameters:

name (str) – Module-qualified logger name, typically __name__.

Returns:

A logging.Logger writing to stderr.

Return type:

Logger
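
The caching behaviour described above could be achieved roughly as follows. This is a sketch under stated assumptions: get_logger_sketch is a hypothetical name, and the use of functools.lru_cache and the log format string are illustrative choices, not details taken from the source.

```python
import logging
import os
import sys
from functools import lru_cache


@lru_cache(maxsize=None)
def get_logger_sketch(name: str) -> logging.Logger:
    """Hypothetical equivalent of get_logger: configure once per name."""
    level = os.environ.get("FLIGHTRISK_LOG_LEVEL", "INFO").upper()
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    # Avoid double output if the root logger also has handlers.
    logger.propagate = False
    return logger


first = get_logger_sketch("flightrisk.demo")
second = get_logger_sketch("flightrisk.demo")  # served from the cache
```

Because the body runs only once per name, the handler is attached exactly once even if many modules request the same logger.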

flightrisk.utils.seed.seed_everything(seed)[source]

Seed Python, NumPy, and downstream ML libraries.

Sets PYTHONHASHSEED so that child processes inherit the seed. If torch is importable, it is seeded as well, including the CUDA generators when available.

Parameters:

seed (int) – Non-negative integer seed.

Returns:

The seed that was applied, for logging.

Raises:

ValueError – If seed is negative.

Return type:

int
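
A minimal sketch of what such a function might look like is given below. The name seed_everything_sketch is hypothetical, and treating NumPy as an optional import is a choice made here so the example runs without it; the real function may import it unconditionally.

```python
import os
import random


def seed_everything_sketch(seed: int) -> int:
    """Hypothetical equivalent of seed_everything."""
    if seed < 0:
        raise ValueError(f"seed must be non-negative, got {seed}")
    # Affects hash randomisation in child processes only; the current
    # interpreter fixed its hash seed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        pass
    else:
        np.random.seed(seed)
    try:
        import torch
    except ImportError:
        pass
    else:
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    return seed


seed_everything_sketch(1337)
a = random.random()
seed_everything_sketch(1337)
b = random.random()  # same draw as `a`, since the generator was re-seeded
```

Returning the seed lets callers log it in a single expression when recording run metadata.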

flightrisk.utils.mlflow_helpers.configure_mlflow(experiment=None)[source]

Configure MLflow tracking and ensure the target experiment exists.

Parameters:

experiment (str | None) – Optional experiment name; defaults to the project setting.

Returns:

The resolved experiment name.

Return type:

str

flightrisk.utils.mlflow_helpers.start_run(run_name, tags=None)[source]

Context manager that starts a tagged MLflow run.

Parameters:
  • run_name (str) – Human-readable run name.

  • tags (Mapping[str, str] | None) – Optional tag mapping (e.g. data hashes, git SHAs).

Yields:

The active MLflow run, for direct mlflow.log_* calls.

Return type:

Iterator[ActiveRun]

class flightrisk.config.Settings(_case_sensitive=None, _nested_model_default_partial_update=None, _env_prefix=None, _env_prefix_target=None, _env_file=PosixPath('.'), _env_file_encoding=None, _env_ignore_empty=None, _env_nested_delimiter=None, _env_nested_max_split=None, _env_parse_none_str=None, _env_parse_enums=None, _cli_prog_name=None, _cli_parse_args=None, _cli_settings_source=None, _cli_parse_none_str=None, _cli_hide_none_type=None, _cli_avoid_json=None, _cli_enforce_required=None, _cli_use_class_docs_for_groups=None, _cli_exit_on_error=None, _cli_prefix=None, _cli_flag_prefix_char=None, _cli_implicit_flags=None, _cli_ignore_unknown_args=None, _cli_kebab_case=None, _cli_shortcuts=None, _secrets_dir=None, _build_sources=None, *, mlflow_tracking_uri=None, mlflow_experiment='flightrisk', random_seed=1337, kaggle_username=None, kaggle_key=None)[source]

Bases: BaseSettings

Process-wide settings resolved from environment variables.

Parameters:
  • mlflow_tracking_uri (str | None) – MLflow tracking URI; defaults to a local store.

  • mlflow_experiment (str) – Default experiment name for runs.

  • random_seed (int) – Global seed for reproducibility.

  • kaggle_username (str | None) – Kaggle username, used only by data ingestion.

  • kaggle_key (str | None) – Kaggle API key, used only by data ingestion.

  • _case_sensitive, _nested_model_default_partial_update, _env_*, _cli_*, _secrets_dir, _build_sources – Initializer options inherited from pydantic_settings.BaseSettings; most override the matching model_config entry for a single instance.

model_config = {'arbitrary_types_allowed': True, 'case_sensitive': False, 'cli_avoid_json': False, 'cli_enforce_required': False, 'cli_exit_on_error': True, 'cli_flag_prefix_char': '-', 'cli_hide_none_type': False, 'cli_ignore_unknown_args': False, 'cli_implicit_flags': False, 'cli_kebab_case': False, 'cli_parse_args': None, 'cli_parse_none_str': None, 'cli_prefix': '', 'cli_prog_name': None, 'cli_shortcuts': None, 'cli_use_class_docs_for_groups': False, 'enable_decoding': True, 'env_file': '.env', 'env_file_encoding': 'utf-8', 'env_ignore_empty': False, 'env_nested_delimiter': None, 'env_nested_max_split': None, 'env_parse_enums': None, 'env_parse_none_str': None, 'env_prefix': 'FLIGHTRISK_', 'env_prefix_target': 'variable', 'extra': 'ignore', 'json_file': None, 'json_file_encoding': None, 'nested_model_default_partial_update': False, 'protected_namespaces': ('model_validate', 'model_dump', 'settings_customise_sources'), 'secrets_dir': None, 'toml_file': None, 'validate_default': True, 'yaml_config_section': None, 'yaml_file': None, 'yaml_file_encoding': None}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

mlflow_tracking_uri: str | None
mlflow_experiment: str
random_seed: int
kaggle_username: str | None
kaggle_key: str | None
resolved_tracking_uri()[source]

Return a concrete MLflow tracking URI.

Falls back to a file:// URI rooted at mlruns/ inside the repo.

Returns:

A URI string MLflow can consume.

Return type:

str
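
The fallback can be sketched with pathlib alone. The function below is a hypothetical free-function version: it takes the configured URI and the repository root as arguments, whereas the real method presumably reads them from the Settings instance and the project paths.

```python
from pathlib import Path
from typing import Optional


def resolved_tracking_uri_sketch(tracking_uri: Optional[str], repo_root: Path) -> str:
    """Return the configured URI, or a file:// URI for <repo_root>/mlruns."""
    if tracking_uri:
        return tracking_uri
    # as_uri() requires an absolute path, hence the resolve() call.
    return (repo_root / "mlruns").resolve().as_uri()


explicit = resolved_tracking_uri_sketch("http://localhost:5000", Path.cwd())
fallback = resolved_tracking_uri_sketch(None, Path.cwd())
```

Here explicit is returned unchanged, while fallback is a file:// URI ending in /mlruns.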

flightrisk.config.get_settings()[source]

Build a fresh Settings instance from the environment.

Returns:

Settings populated from env vars and .env if present.

Return type:

Settings
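
Given the env_prefix of 'FLIGHTRISK_' and case_sensitive set to False in model_config, each field would normally be read from an environment variable formed by prefixing its name. The sketch below imitates that mapping with the standard library only, for two of the fields; it is not how pydantic-settings is implemented, read_settings_sketch is a hypothetical helper, and any field aliases in the real class would change the variable names.

```python
from typing import Any, Dict, Mapping

ENV_PREFIX = "FLIGHTRISK_"
# Defaults copied from the Settings signature above.
DEFAULTS: Dict[str, Any] = {"mlflow_experiment": "flightrisk", "random_seed": 1337}


def read_settings_sketch(env: Mapping[str, str]) -> Dict[str, Any]:
    """Resolve each field from PREFIX + field name, ignoring case."""
    lowered = {key.lower(): value for key, value in env.items()}
    resolved: Dict[str, Any] = {}
    for field, default in DEFAULTS.items():
        raw = lowered.get((ENV_PREFIX + field).lower())
        # Coerce the raw string to the type of the default (str or int here).
        resolved[field] = default if raw is None else type(default)(raw)
    return resolved


from_defaults = read_settings_sketch({})
from_env = read_settings_sketch({"FLIGHTRISK_RANDOM_SEED": "7"})
```

In this sketch from_env keeps the default experiment name and picks up random_seed as the integer 7.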