Loads and processes data from a comprehensive panel database containing economic, financial, and development indicators. The function handles data filtering, frequency adjustments, aggregation, and seasonal adjustments. Memory usage is optimized by loading data on demand (approximately 15-50MB during execution).

Usage

wp_data(
  ISO,
  formula,
  variable = NULL,
  years,
  adjust_seasonal = FALSE,
  window_seasadj = NULL,
  matching_yq = "Q2Y",
  interpolation_method = "Linear",
  aggregate_iso = NULL,
  aggregate_period = NULL,
  quartile = FALSE,
  na.rm = TRUE,
  reference = TRUE,
  country_names = FALSE,
  clean = TRUE,
  verbose = TRUE,
  debug = FALSE
)

Arguments

ISO

character vector; ISO 3-letter country codes or category names:

  • Individual countries (e.g., "USA", "CHN")

  • Categories (e.g., "CTR_LDR", "BRICS", "AFRICA")

  • Category exclusions using a hyphen (e.g., "CTR_LDR - USA")

See wp_get_category() for available categories.
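How a category exclusion such as "CTR_LDR - USA" could be resolved is sketched here in base R. The `categories` list and the `resolve_iso()` helper are hypothetical and are not part of the package. The membership shown for CTR_LDR is illustrative, chosen to be consistent with the example output further down this page.

```r
# Hypothetical category table; membership is illustrative only.
categories <- list(
  CTR_LDR = c("DEU", "FRA", "GBR", "ITA", "JPN", "USA")
)

# Resolve a "CATEGORY - ISO" style string into a vector of ISO codes.
# (Hypothetical helper, not the package's internal implementation.)
resolve_iso <- function(spec, categories) {
  # Split on the hyphen and trim surrounding spaces
  parts <- trimws(strsplit(spec, "-", fixed = TRUE)[[1]])
  # Expand a category name into its members; leave plain ISO codes as-is
  expand <- function(code) {
    if (code %in% names(categories)) categories[[code]] else code
  }
  included <- expand(parts[1])
  excluded <- unlist(lapply(parts[-1], expand))
  setdiff(included, excluded)
}

resolve_iso("CTR_LDR - USA", categories)
```

For the actual category definitions, wp_get_category() remains the authoritative source.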

formula

character vector; mathematical expressions using Symbol codes:

  • Simple variables (e.g., "GDP_C")

  • Calculations with basic operators (e.g., "100*CU_C/GDP_C")

  • Multiple formulas, supplied as a vector

  • Supported operators: + - * / ( )

Division by zero is not handled; use with caution.
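To illustrate what evaluating a formula string involves, and why division by zero needs care, here is a minimal base R sketch. It assumes nothing about the package internals: the data frame is made up, and only the column names imitate Symbol codes.

```r
# Toy data; column names mimic Symbol codes but the values are made up.
df <- data.frame(
  CU_C  = c(-50, 20, 10),
  GDP_C = c(1000, 2000, 0)
)

formula_txt <- "100*CU_C/GDP_C"
expr <- parse(text = formula_txt)

# Symbols referenced by the formula
symbols <- all.vars(expr)

# Evaluate the expression using the data frame columns as variables
result <- eval(expr, envir = df)

# In R, a non-zero value divided by zero gives Inf (or -Inf) rather than
# an error, so such rows pass through silently unless checked explicitly.
bad_rows <- !is.finite(result)
```

Checking `is.finite()` on the computed values is one simple way to spot rows affected by a zero denominator.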

variable

character vector or NULL; names for formula outputs (Column Variable):

  • Must match length of formula if provided

  • Used for output labeling and plotting

Default is NULL, using formula as names.

years

numeric vector of length 2 giving the year range, c(start_year, end_year). Data availability varies by country and indicator.

adjust_seasonal

logical (TRUE/FALSE); apply seasonal adjustment:

  • TRUE: adjust quarterly data using STL decomposition

  • Only affects quarterly data (no effect on annual)

Default is FALSE.

window_seasadj

numeric or NULL; window for seasonal adjustment:

  • Controls smoothing in STL decomposition

  • Larger values = more smoothing

  • Only used if adjust_seasonal = TRUE

Default is NULL, which is treated as 7.
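The following sketch shows what an STL-based seasonal adjustment of a quarterly series looks like with stats::stl(). It is an illustration under stated assumptions, not the package's code: the series is simulated, and mapping window_seasadj to the s.window argument of stl() is an assumption.

```r
set.seed(1)

# Simulated quarterly series: linear trend + fixed seasonal pattern + noise
n_years <- 6
seasonal_pattern <- rep(c(5, -2, -4, 1), times = n_years)
trend <- seq(100, by = 0.5, length.out = 4 * n_years)
x <- ts(trend + seasonal_pattern + rnorm(4 * n_years, sd = 0.3),
        start = c(2015, 1), frequency = 4)

# This page states that at least 16 observations are required
stopifnot(length(x) >= 16)

# STL decomposition; s.window = 7 is assumed to correspond to window_seasadj
fit <- stl(x, s.window = 7)

# Seasonally adjusted series = original series minus the seasonal component
x_sa <- x - fit$time.series[, "seasonal"]
```

Larger s.window values let the seasonal component change more slowly over time, which matches the "more smoothing" description above.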

matching_yq

character; method for handling mixed frequencies:

  • "Q2Y": convert quarterly to yearly

  • "Y2Q": convert yearly to quarterly

Default is "Q2Y".

interpolation_method

character; method for Y2Q conversion:

  • "None": repeat yearly value

  • "Linear": linear interpolation

  • "Linear-Scale": scaled linear interpolation

Only used if matching_yq = "Y2Q". Default is "Linear".
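The difference between the "None" and "Linear" options can be seen in a small base R sketch. Two details are assumptions made for illustration: each yearly value is anchored at the first quarter of its year, and the last year is extended flat. "Linear-Scale" is not sketched because the scaling rule is not described here.

```r
# Made-up yearly values
years  <- c(2020, 2021)
values <- c(100, 140)

# Quarterly time grid: 2020.00, 2020.25, ..., 2021.75
quarters <- rep(years, each = 4) + rep((0:3) / 4, times = length(years))

# "None": repeat each yearly value for its four quarters
y2q_none <- rep(values, each = 4)

# "Linear": interpolate between yearly values (assumed anchored at Q1);
# rule = 2 carries the last yearly value forward beyond the final anchor
y2q_linear <- approx(x = years, y = values, xout = quarters, rule = 2)$y
```

With these inputs, "None" produces a step from 100 to 140 at 2021Q1, while "Linear" rises in equal increments across 2020 before reaching 140.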

aggregate_iso

character or NULL; method for country aggregation:

  • "Sum": sum values across countries

  • "Mean": average values across countries

  • "Median": median values across countries

  • NULL: no aggregation

aggregate_period

character or NULL; method for time aggregation:

  • "Sum": sum over period

  • "Mean"/"Median": central tendency

  • "SD": standard deviation

  • "Growth": period-over-period growth

  • "CAGR": compound annual growth rate

  • "GeoMean": geometric mean

  • NULL: no aggregation
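For reference, the textbook definitions behind "Growth", "CAGR", and "GeoMean" are sketched below in base R. These are the standard formulas and are assumed, not confirmed, to match what the function computes. The values are made up.

```r
x <- c(100, 110, 121)          # made-up yearly values
n_periods <- length(x) - 1     # number of yearly steps between first and last

# Period-over-period growth: each value relative to the previous one
growth_pop <- x[-1] / x[-length(x)] - 1

# Compound annual growth rate between the first and last observation
cagr <- (x[length(x)] / x[1])^(1 / n_periods) - 1

# Geometric mean via logs; only defined for strictly positive values
geo_mean <- exp(mean(log(x)))

# With a negative value the log is undefined and the result is NaN,
# which is why growth-type aggregations are unsuitable for such series
geo_mean_neg <- suppressWarnings(exp(mean(log(c(-1, 2)))))
```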

quartile

logical; include quartile calculations:

  • TRUE: add first/third quartiles to aggregations

  • Only used if aggregate_iso or aggregate_period is specified

Default is FALSE.

na.rm

logical; handle missing values in aggregations:

  • TRUE: exclude NA values

  • FALSE: return NA if any value is NA

Default is TRUE.
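The interaction between cross-country aggregation, quartiles, and na.rm can be illustrated with a small base R sketch. The panel below is made up, and the code only mirrors the documented behaviour; it is not the package's implementation.

```r
# Made-up long-format panel with one missing value
panel <- data.frame(
  ISO   = rep(c("CHN", "JPN", "KOR"), times = 2),
  Date  = rep(c("2020", "2021"), each = 3),
  Value = c(1, 2, 3, 4, NA, 8)
)

# Group values by date, i.e. aggregate across countries
by_date <- split(panel$Value, panel$Date)

# Mean across countries, excluding NAs (like na.rm = TRUE)
mean_rm <- sapply(by_date, mean, na.rm = TRUE)

# Mean across countries, keeping NAs (like na.rm = FALSE):
# any date with a missing value yields NA
mean_keep <- sapply(by_date, mean, na.rm = FALSE)

# First and third quartiles across countries (like quartile = TRUE)
q1 <- sapply(by_date, quantile, probs = 0.25, na.rm = TRUE, names = FALSE)
q3 <- sapply(by_date, quantile, probs = 0.75, na.rm = TRUE, names = FALSE)
```

Note that stats::quantile() raises an error on missing values when na.rm = FALSE, so the quartiles here are computed with NAs excluded.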

reference

logical; if TRUE, adds a Reference column with data source citations to the output. Default is TRUE.

country_names

logical; if TRUE, adds a Country column with full country names derived from the ISO codes. Default is FALSE.

clean

logical; if TRUE, removes rows with NA values from the final output. Default is TRUE.

verbose

logical; if TRUE, prints progress information and warnings during processing. Default is TRUE.

debug

logical; if TRUE, prints detailed technical debugging information. Default is FALSE.

Value

A data.frame containing:

  • ISO: 3-letter country codes

  • Date: time period (YYYY or YYYYQN format)

  • Variable: indicator names from formula/variable

  • Value: calculated values

  • Reference: data sources (if reference=TRUE)

  • Country: country names (if country_names=TRUE)
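Because the Date column mixes yearly and quarterly labels, downstream code often needs to split it into a year and a quarter. A minimal base R sketch follows; it assumes the YYYYQN format means literal strings such as "2020Q1", and the example values are made up.

```r
# Made-up Date values in the two documented formats
dates <- c("2020", "2020Q1", "2021Q4")

# The first four characters are always the year
year <- as.integer(substr(dates, 1, 4))

# Quarterly labels have six characters; the sixth one is the quarter number.
# Yearly labels get NA for the quarter.
is_quarterly <- nchar(dates) == 6
quarter <- rep(NA_integer_, length(dates))
quarter[is_quarterly] <- as.integer(substr(dates[is_quarterly], 6, 6))
```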

Details

The function processes data in several steps:

  1. Validates inputs and resolves country categories

  2. Loads required data (quarterly/yearly) on demand

  3. Extracts symbols from formulas and filters data

  4. Handles frequency mismatches (Q2Y or Y2Q conversion)

  5. Evaluates formulas for each country

  6. Performs any requested aggregations

  7. Applies seasonal adjustments if specified

  8. Cleans and formats output

Memory usage is optimized by:

  • Loading data only when needed

  • Filtering to required columns early

  • Processing one formula at a time

  • Clearing intermediate objects

Data Validation and Error Handling:

  • Missing data warnings by country/variable

  • Automatic date range adjustment if requested years unavailable

  • Minimum observations check for seasonal adjustment (16 required)

  • Warnings for inappropriate aggregation requests (e.g., growth rates with negative values)

Variable Types and Units:

  • Stock variables: measured at a point in time (e.g., reserves, debt)

  • Flow variables: measured over a period (e.g., GDP, trade)

  • Index variables: base year representations (various base years available)

  • Percentage variables: bounded ratios

  • For IMF Balance of Payments data, quarterly values are multiplied by 4 to represent annualized flows, ensuring consistency with yearly data
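The arithmetic behind the annualization of quarterly flows is shown below with made-up numbers. Multiplying each quarter by 4 puts it on a yearly scale, and the average of the four annualized quarters equals the plain yearly total. That averaging is the appropriate way to convert such series to yearly values is an inference from this arithmetic, not a documented detail of the Q2Y conversion.

```r
# Made-up quarterly flows for one year
quarterly_flow <- c(10, 12, 11, 15)

# Annualized flows: each quarter scaled to a yearly rate
annualized <- 4 * quarterly_flow

# Yearly total of the raw flows
yearly_total <- sum(quarterly_flow)

# Averaging the annualized quarters recovers the yearly total,
# whereas summing them would overstate it by a factor of 4
yearly_from_annualized <- mean(annualized)
```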

Common Data Patterns:

World Bank (WB) indicators:

  • _ZS suffix: ratios expressed as percentages

  • _CD suffix: current US dollars

  • _KD suffix: constant dollars

  • _XD suffix: indices

  • _PC suffix: per capita values

  • _FE/_MA suffixes: female/male specific indicators

IMF Balance of Payments (BOP):

  • _C suffix: current prices

  • _R suffix: real/constant prices (with base year)

  • o/i prefixes: outward/inward flows

  • DI/POR/OI suffixes: direct investment/portfolio/other investment

  • E/D suffixes: equity/debt components

Natural Disasters (EMDAT):

  • DIS prefix: disaster-related indicators

  • _AFF/_DEATH/_DMG suffixes: affected people/fatalities/economic damage

  • BIO/CLIM/GEO/HYDRO prefixes: biological/climatic/geological/hydrological disasters

Financial Market Data (BIS):

  • CRED prefix: credit-related indicators

  • _ALL/_BANK prefixes: all sectors/banking sector

  • _CD/_KN/_ZS suffixes: USD/local currency/percentage of GDP

Other specialized databases (JST, KOF, etc.) have their own consistent naming patterns that are documented in their respective sources.

Note

Variables in formulas refer to Symbol codes in the underlying database. Users should understand the economic/financial meaning of variables and their units before performing calculations.

See also

  • wp_plot_series for plotting time series

  • wp_plot_scatter for scatter plots

  • wp_plot_bar for bar plots

  • wp_get_category for available country categories

Examples

# Basic usage - current account as a percentage of GDP for one country
data <- wp_data(
  ISO = "USA",
  formula = "100*CU_C/GDP_C",
  variable = "Current Account (% GDP)",
  years = c(2000, 2023)
)
#>  [Step 1] Input Validation. 
#>  [Step 2] Data Filtering (ISO codes, Symbols, and Years). 
#>  [LOADING] Loading quarterly data (first use) 
#>  [Steps 3 to 7] Loop through each formula. 
#>   ---  Step 3-4: Formula: 100*CU_C/GDP_C  --  Symbols: CU_C (Q)   GDP_C (Q)   
#>   ---  Step 5: Adjust frequencies. 
#>   ---  Step 6-7: Data Processing (Loop for each country). 
#>   ---   
#>  [Step 10] Clean database [remove NAs in output] - clean is TRUE. 

# Multiple countries and indicators with aggregation
data <- wp_data(
  ISO = c("CHN", "JPN", "KOR"),
  formula = c("EXg_C/GDP_C", "IMg_C/GDP_C"),
  variable = c("Exports", "Imports"),
  years = c(2010, 2023),
  adjust_seasonal = TRUE,
  aggregate_iso = "Mean"
)
#>  [Step 1] Input Validation. 
#>  [Step 2] Data Filtering (ISO codes, Symbols, and Years). 
#>  [Steps 3 to 7] Loop through each formula. 
#>   ---  Step 3-4: Formula: EXg_C/GDP_C  --  Symbols: EXg_C (Q)   GDP_C (Q)   
#>   ---  Step 5: Adjust frequencies. 
#>   ---  Step 6-7: Data Processing (Loop for each country). 
#>   ---   
#>   ---  Step 3-4: Formula: IMg_C/GDP_C  --  Symbols: IMg_C (Q)   GDP_C (Q)   
#>   ---  Step 5: Adjust frequencies. 
#>   ---  Step 6-7: Data Processing (Loop for each country). 
#>   ---   
#>  [Step 8] Aggregate values (group of ISO codes) - Method: Mean | Quartile: FALSE | na.rm: TRUE 
#>  [Step 9] Adjust for seasonal variations [only for quarterly data] - adjust_seasonal is TRUE 
#>  [INFO] Seasonal adjustments for: Exports   Imports 
#>  [Step 10] Clean database [remove NAs in output] - clean is TRUE. 

# Using categories with exclusions
data <- wp_data(
  ISO = "CTR_LDR - USA",
  formula = "FA_C/GDP_C",
  years = c(1990, 2023),
  adjust_seasonal = TRUE,
  aggregate_period = "Growth"
)
#>  [Step 0] Get ISO codes for ISO codes. 
#>   ---  ISO3(5): DEU FRA GBR ITA JPN 
#>  [Step 1] Input Validation. 
#>  [Step 2] Data Filtering (ISO codes, Symbols, and Years). 
#>  [Steps 3 to 7] Loop through each formula. 
#>   ---  Step 3-4: Formula: FA_C/GDP_C  --  Symbols: FA_C (Q)   GDP_C (Q)   
#>   ---  No data for FA_C: JPN[1990-95] 
#>   ---  Step 5: Adjust frequencies. 
#>   ---  Step 6-7: Data Processing (Loop for each country). 
#>   ---   
#>  [Step 8] Aggregate values (time periods) - Method: Growth | Quartile: FALSE | na.rm: TRUE 
#> Warning: NAs introduced by coercion
#> /!\  Negative values detected for FA_C/GDP_C in ISO: DEU . Growth calculation not suitable for negative values. Returning NA. 
#> /!\  Negative values detected for FA_C/GDP_C in ISO: FRA . Growth calculation not suitable for negative values. Returning NA. 
#> /!\  Negative values detected for FA_C/GDP_C in ISO: GBR . Growth calculation not suitable for negative values. Returning NA. 
#> /!\  Negative values detected for FA_C/GDP_C in ISO: ITA . Growth calculation not suitable for negative values. Returning NA. 
#> /!\  Negative values detected for FA_C/GDP_C in ISO: JPN . Growth calculation not suitable for negative values. Returning NA. 
#>  [Step 10] Clean database [remove NAs in output] - clean is TRUE. 

# Mixed frequency handling
data <- wp_data(
  ISO = "DEU",
  formula = c("GDP_R_2015_Y", "CU_C"),
  years = c(2015, 2023),
  matching_yq = "Y2Q",
  interpolation_method = "Linear"
)
#>  [Step 0] Get ISO codes for ISO codes. 
#>   ---  ISO3(1): DEU 
#>  [Step 1] Input Validation. 
#>  [Step 2] Data Filtering (ISO codes, Symbols, and Years). 
#>  [LOADING] Loading yearly data (first use) 
#>  [Steps 3 to 7] Loop through each formula. 
#>   ---  Step 3-4: Formula: GDP_R_2015_Y  --  Symbols: GDP_R_2015_Y (Y)   
#>   ---  Step 5: Adjust frequencies. 
#>   ---  Step 6-7: Data Processing (Loop for each country). 
#>   ---   
#>   ---  Step 3-4: Formula: CU_C  --  Symbols: CU_C (Q)   
#>   ---  Step 5: Adjust frequencies. 
#>   ---  Step 6-7: Data Processing (Loop for each country). 
#>   ---   
#>  [Step 10] Clean database [remove NAs in output] - clean is TRUE.