Helper Functions

Various generic helper functions used by the SCR and supporting classes.

allocation_matrix(df, column, calc_type)[source]

Create a fast allocation matrix showing which elements are in each combination.

Returns a DataFrame with combinations as index and unique elements as columns, containing 1 if element is in combination, 0 otherwise.

Parameters:
  • df (DataFrame)

  • column (str)

  • calc_type (Literal['diversification', 'individual', 'overall'])

Return type:

DataFrame
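The shape of the returned matrix can be illustrated with a minimal sketch. The helper `sketch_allocation_matrix` below is hypothetical and is not the library's implementation; it assumes the combinations are already available as frozensets and only shows the 1/0 layout described above.

```python
import pandas as pd


def sketch_allocation_matrix(combos, elements):
    """Hypothetical stand-in: one row per combination, one column per element."""
    # 1 where the element belongs to the combination, 0 otherwise
    rows = [[1 if element in combo else 0 for element in elements] for combo in combos]
    return pd.DataFrame(rows, index=pd.Index(combos, dtype=object), columns=elements)


combos = [frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})]
matrix = sketch_allocation_matrix(combos, ["A", "B"])
print(matrix)
```

Each row of `matrix` corresponds to one combination, so the last row (the combination of A and B) holds 1 in both columns.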

combins_df_col(df, column, calc_type)[source]

Generate all possible combinations of unique values from a DataFrame column.

Args:

df: Input DataFrame

column: Column name to extract combinations from

calc_type: Type of combinations to generate:

  • ‘diversification’: All possible combinations (1 to n items)

  • ‘individual’: Single items and all items combined

  • ‘overall’: Only the combination of all items

Returns:

List of frozensets containing all requested combinations

Raises:

ValueError: If calc_type is not one of the valid options

Parameters:
  • df (DataFrame)

  • column (str)

  • calc_type (Literal['diversification', 'individual', 'overall'])

Return type:

list
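The three calc_type options can be sketched with `itertools.combinations`. This is an illustration of the documented behaviour under stated assumptions, not the actual source of combins_df_col: the function name `sketch_combinations`, the sorting of unique values, and the ordering of the result are all hypothetical.

```python
from itertools import combinations

import pandas as pd


def sketch_combinations(df, column, calc_type):
    """Hypothetical stand-in illustrating the three calc_type options."""
    items = sorted(df[column].unique())
    n = len(items)
    if calc_type == "diversification":
        sizes = range(1, n + 1)  # every size from 1 to n
    elif calc_type == "individual":
        sizes = [1, n]  # single items plus the full set
    elif calc_type == "overall":
        sizes = [n]  # only the full set
    else:
        raise ValueError(f"Unknown calc_type: {calc_type!r}")

    result = []
    for size in sizes:
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if candidate not in result:  # avoids a duplicate when n == 1
                result.append(candidate)
    return result


df = pd.DataFrame({"lob": ["A", "B", "C", "A"]})
all_combos = sketch_combinations(df, "lob", "diversification")
single_and_total = sketch_combinations(df, "lob", "individual")
total_only = sketch_combinations(df, "lob", "overall")
```

With three unique values, the 'diversification' option yields seven combinations, 'individual' yields four, and 'overall' yields one.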

f_accumulate_figures_vectorized(dest_df, source_df, dest_col, source_col, dest_index_col='index', dest_index=True, source_index_col='index', source_index=True, agg_func='sum')[source]

Highly optimized vectorized version using pandas operations. Best performance for large datasets.

This function takes tuples/sets in dest_df index and aggregates corresponding values from source_df for each element in those tuples.

Args:

dest_df: Target dataframe with tuple/set indices

source_df: Source dataframe with individual element indices

dest_col: Column in dest_df to update with aggregated values

source_col: Column in source_df to aggregate from

dest_index: If True, use dest_df.index; if False, use dest_index_col

source_index: If True, use source_df.index; if False, use source_index_col

agg_func: Aggregation function ('sum', 'mean', 'max', etc.)

Example:

dest_df.index = [('A', 'B'), ('B', 'C')]  # tuples to explode
source_df.index = ['A', 'B', 'C']  # individual elements
# Will sum source values for A+B, and B+C respectively

Parameters:
  • dest_df (DataFrame)

  • source_df (DataFrame)

  • dest_col (str)

  • source_col (str)

  • dest_index_col (str)

  • dest_index (bool)

  • source_index_col (str)

  • source_index (bool)

  • agg_func (str | Callable)

Return type:

None
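The explode-and-aggregate idea described above can be sketched with plain pandas calls. This is one possible approach and not necessarily how the function is implemented internally; the column names `value` and `total` and the sample figures are made up for illustration.

```python
import pandas as pd

# Source: one value per individual element (illustrative data)
source_df = pd.DataFrame({"value": [10.0, 20.0, 30.0]}, index=["A", "B", "C"])

# Destination: one row per tuple of elements
combos = [("A", "B"), ("B", "C")]
dest_df = pd.DataFrame(
    {"total": [0.0, 0.0]},
    index=pd.Index(combos, tupleize_cols=False),  # keep tuples as labels
)

# Explode: one row per (tuple position, element)
exploded = pd.Series(combos).explode()

# Look up each element's value in the source column
values = exploded.map(source_df["value"]).astype(float)

# Aggregate back to one figure per tuple position and write it to dest_df
dest_df["total"] = values.groupby(level=0).sum().to_numpy()
```

Here the row for ('A', 'B') receives 10 + 20 and the row for ('B', 'C') receives 20 + 30. Swapping `.sum()` for another groupby aggregation mirrors the role of agg_func.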

f_best_join(left_df, right_df, dest_field, source_field)[source]

f_best_match_new(x, join_list)[source]

Function not used; replace with f_new_match_element.

f_fast_join(left_df, right_df, dest_field, source_field)[source]

f_fast_match_element(x, right_list)[source]

f_get_total_row(df)[source]

Returns the index with the longest tuple.
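As a rough illustration of "the index with the longest tuple", the sketch below picks the longest label from a tuple-valued index. The helper name `sketch_total_row` is hypothetical, and the handling of ties (first longest label wins) is an assumption rather than documented behaviour.

```python
import pandas as pd


def sketch_total_row(df):
    """Hypothetical stand-in: return the index label with the most elements."""
    return max(df.index, key=len)


labels = [("A",), ("A", "B"), ("A", "B", "C")]
df = pd.DataFrame(
    {"scr": [1.0, 2.0, 3.0]},
    index=pd.Index(labels, tupleize_cols=False),  # keep tuples as labels
)
total_label = sketch_total_row(df)
```

In this example `total_label` is the three-element tuple, which would correspond to the row combining all elements.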

f_new_match_element(x, right_list)[source]

f_new_match_idx(left_list, right_list, both=False)[source]

Parameters:
  • left_list (list)

  • right_list (list)

  • both (bool)

Return type:

Series | DataFrame

log_decorator(func)[source]

A decorator that logs the runtime of the decorated function and appends it to the output_runtimes attribute of the first argument, provided that argument has an scr or output_runtimes attribute. The decorated function retains its original name and docstring.

Parameters:

func (function) – The function to be decorated.

Returns:

The wrapped function with added logging functionality.

Return type:

function

The decorator measures the time taken by the function to execute and appends the runtime information to the output_runtimes list of the first argument if it has the scr or output_runtimes attribute.

Example:

>>> @log_decorator
... def create_prem_res():
...     return PremRes(sam_scr, True)
>>>
>>> prem_res = create_prem_res()
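The general pattern behind such a decorator can be sketched with `functools.wraps` and `time.perf_counter`. This is a minimal sketch, not the library's code: the name `sketch_log_decorator`, the `Holder` class, the `(name, seconds)` record format, and the assumption that an scr attribute exposes its own output_runtimes list are all hypothetical.

```python
import functools
import time


def sketch_log_decorator(func):
    """Hypothetical stand-in: time a call and record it on the first argument."""

    @functools.wraps(func)  # keeps the original __name__ and __doc__
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        if args:
            # Assumption: if the first argument has `scr`, the list lives there
            target = getattr(args[0], "scr", args[0])
            runtimes = getattr(target, "output_runtimes", None)
            if isinstance(runtimes, list):
                runtimes.append((func.__name__, elapsed))
        return result

    return wrapper


class Holder:
    """Illustrative object carrying an output_runtimes list."""

    def __init__(self):
        self.output_runtimes = []

    @sketch_log_decorator
    def work(self):
        """Do some work."""
        return 42


holder = Holder()
answer = holder.work()
```

After the call, `holder.output_runtimes` holds one entry for `work`, and `Holder.work.__doc__` is still the original docstring because of `functools.wraps`.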