Creates scatter plots for comparing two or more variables across countries, with support for various visual customizations including color grouping, ISO label placement, trend lines, and statistical annotations. The function automatically determines the most appropriate plot type based on the number of variables and specified parameters.

Usage

wp_plot_scatter(
  data,
  y_axis = NULL,
  x_axis = NULL,
  color = "Subregion",
  ISO = "Both",
  interpolation = FALSE,
  spline_bw = NULL,
  r_squared = 0,
  r_squared_pos = "flexible",
  label_nudge = 0.005,
  highlight_zero = FALSE,
  filename = NULL,
  print = TRUE,
  legend = TRUE,
  reference = TRUE,
  title = NULL,
  subtitle = NULL,
  subfig_title = FALSE,
  verbose = TRUE,
  debug = FALSE,
  no_other = TRUE,
  size = 4,
  base_size = 16,
  common_var = TRUE,
  bg = "transparent"
)

Arguments

data

A data.frame that must contain the following columns:

  • ISO: ISO 3-letter country codes (required)

  • Variable: indicator names (required)

  • Value: numeric values (required)

  • Country: country names (optional, auto-generated from ISO if missing)

  • Reference: source citations (required if reference = TRUE)
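Data often arrives in wide form, with one column per indicator, and has to be reshaped into the long ISO / Variable / Value layout before it can be passed as `data`. The following is a minimal base-R sketch of that reshaping; the table name `wide` is hypothetical, and the values simply reuse those from the Examples section.

```r
# Hypothetical wide table: one row per country, one column per indicator
wide <- data.frame(
  ISO = c("FRA", "DEU", "ITA"),
  GDP = c(2.1, 1.8, 1.5),
  Inflation = c(3.2, 2.8, 2.5),
  stringsAsFactors = FALSE
)

indicators <- c("GDP", "Inflation")

# Stack the indicator columns: all GDP rows first, then all Inflation rows
long <- data.frame(
  ISO = rep(wide$ISO, times = length(indicators)),
  Variable = rep(indicators, each = nrow(wide)),
  Value = unlist(wide[indicators], use.names = FALSE),
  Reference = "World Bank",
  stringsAsFactors = FALSE
)

print(long)
```

The resulting `long` table has one row per country and indicator, which is the shape the `data` argument describes.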

y_axis

A single string, a vector of strings, TRUE, FALSE, or NULL:

  • Single string: common y-axis label for all plots

  • Vector of strings: individual labels for each plot

  • TRUE: use Variable names as labels

  • FALSE/NULL: no axis labels

For multi-variable plots, the length of a label vector must match the number of plot panels.

x_axis

Similar to y_axis, controls x-axis labeling:

  • Single string: common x-axis label for all plots

  • Vector of strings: individual labels for each plot

  • TRUE: use Variable names as labels

  • FALSE/NULL: no axis labels

color

Controls the point coloring scheme:

  • "Subregion": colors by geographical subregion

  • "Region": colors by continent/major region

  • "Center-Periphery": colors by economic classification

  • "Hydrocarbon": colors by hydrocarbon exporter/importer status

  • list: custom grouping with named groups of countries

  • FALSE/NULL: no color grouping (all points gray)

Default is "Subregion".

ISO

Controls country label display:

  • TRUE: replaces points with ISO codes

  • "Both": shows both points and repelled ISO labels

  • FALSE/NULL: shows only points

Default is "Both".

interpolation

Controls trend line fitting:

  • TRUE/"Linear": linear regression line

  • "Square": quadratic regression curve

  • "Cubic": cubic regression curve

  • "Spline": smoothed spline curve

  • FALSE: no trend line

Default is FALSE.
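To make the difference between the polynomial options concrete, the sketch below fits linear, quadratic, and cubic regressions with base R's `lm()` on simulated data. It is only an illustration of what these curve types mean; this page does not say how the package fits them internally, so the use of `lm()` and `poly()` here is an assumption.

```r
# Simulated data for illustration only (not from any real source)
set.seed(42)
x <- seq(1, 10, length.out = 20)
y <- 0.5 * x^2 - 2 * x + rnorm(20, sd = 2)

fit_linear <- lm(y ~ x)                       # comparable to "Linear"
fit_square <- lm(y ~ poly(x, 2, raw = TRUE))  # comparable to "Square"
fit_cubic  <- lm(y ~ poly(x, 3, raw = TRUE))  # comparable to "Cubic"

# Adding polynomial terms can only keep or raise the in-sample R-squared
r2 <- c(
  linear = summary(fit_linear)$r.squared,
  square = summary(fit_square)$r.squared,
  cubic  = summary(fit_cubic)$r.squared
)
print(round(r2, 3))
```

Because the models are nested, a higher-order fit never has a lower in-sample R-squared, which is one reason to compare adjusted R-squared when choosing among them.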

spline_bw

Numeric value for the spline smoothing bandwidth:

  • Only used when interpolation = "Spline"

  • Lower values create more flexible curves

  • NULL for automatic bandwidth selection

Default is NULL.
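The effect of a smoothing parameter can be illustrated with base R's `smooth.spline()`, where a lower `spar` yields a more flexible curve and omitting it triggers automatic selection. Treating `spline_bw` as analogous to `spar` is an assumption made for this sketch; the page does not state which smoother the package uses.

```r
# Simulated data for illustration only
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.3)

# Lower smoothing parameter: the curve follows the points more closely
fit_flexible <- smooth.spline(x, y, spar = 0.2)

# Higher smoothing parameter: the curve is stiffer
fit_stiff <- smooth.spline(x, y, spar = 1.0)

# No parameter given: smooth.spline() picks one by cross-validation
fit_auto <- smooth.spline(x, y)

# Equivalent degrees of freedom: larger means a more flexible fit
dfs <- c(flexible = fit_flexible$df, stiff = fit_stiff$df, auto = fit_auto$df)
print(round(dfs, 2))
```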

r_squared

Controls R-squared statistic display:

  • 0: no R-squared display

  • 1: shows R-squared value

  • 2: shows R-squared and adjusted R-squared

  • 3: shows equation and both R-squared values

Default is 0.
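For reference, the quantities named in these levels (R-squared, adjusted R-squared, and the fitted equation) can be computed in base R as sketched below. The data is simulated, and the formatting of the equation string is an illustrative choice, not the package's annotation format.

```r
# Simulated data for illustration only
set.seed(7)
x <- c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3, 7.7, 8.4)
y <- 1.5 + 0.8 * x + rnorm(length(x), sd = 0.5)

fit <- lm(y ~ x)
s <- summary(fit)

r2 <- s$r.squared          # level 1
adj_r2 <- s$adj.r.squared  # level 2 adds this

# Adjusted R-squared by hand: penalizes the number of predictors p
n <- length(y)
p <- 1
adj_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Level 3 also shows the fitted equation
eq <- sprintf("y = %.2f + %.2f x", coef(fit)[1], coef(fit)[2])

cat(eq, "\n")
cat(sprintf("R2 = %.3f, adjusted R2 = %.3f\n", r2, adj_r2))
```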

r_squared_pos

Controls R-squared text position:

  • "flexible": automatically chooses best position

  • "topleft": fixed top-left position

  • "topright": fixed top-right position

  • "bottomleft": fixed bottom-left position

  • "bottomright": fixed bottom-right position

Default is "flexible".

label_nudge

Numeric value for ISO label positioning:

  • Controls distance between point and label

  • Only used when ISO = "Both"

  • Larger values increase separation

Default is 0.005.

highlight_zero

Logical; whether to add reference lines at x = 0 and y = 0. TRUE adds both reference lines; FALSE adds none. Default is FALSE.

filename

Character string for saving plots. Specify the name without an extension; plots are saved as both PNG and PDF in the 'img/' directory. Use NULL or FALSE for no file output. Default is NULL.

print

Logical; controls plot display. TRUE displays the plot; FALSE creates the plot without displaying it. Default is TRUE.

legend

Logical; controls legend display. TRUE includes a legend; FALSE omits it. Default is TRUE.

reference

Logical; controls the reference panel. TRUE includes a panel of reference citations (requires the Reference column); FALSE omits the panel. Default is TRUE.

title

Character string for the main plot title. Use NULL or FALSE for no title.

subtitle

Character string for the plot subtitle. Use NULL or FALSE for no subtitle.

subfig_title

Controls subplot titles in multi-variable plots:

  • TRUE: uses default titles (pairs of variables)

  • FALSE: no subplot titles

  • Character vector: custom titles (length must match number of panels)

Ignored for two-variable plots. Default is FALSE.

verbose

Logical; controls information messages. TRUE prints processing information; FALSE suppresses it. Default is TRUE.

debug

Logical; controls debugging output. TRUE prints detailed debugging information; FALSE suppresses it. Default is FALSE.

no_other

Logical; controls the treatment of unclassified countries. TRUE excludes countries that are not in any color group; FALSE includes them, drawn in gray. Default is TRUE.
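The interaction between a custom `color` list and `no_other` can be pictured with the following sketch. The helper `assign_group()` and the "Other" label are hypothetical names invented for this illustration; they are not part of the package, and the package's own classification logic may differ.

```r
# Hypothetical helper: label each ISO code with its group, or "Other"
assign_group <- function(iso, groups, no_other = TRUE) {
  # Build a lookup from ISO code to group name
  lookup <- setNames(
    rep(names(groups), lengths(groups)),
    unlist(groups, use.names = FALSE)
  )
  # If a code appears in several groups, keep its first group only
  lookup <- lookup[!duplicated(names(lookup))]

  group <- unname(lookup[iso])
  group[is.na(group)] <- "Other"

  out <- data.frame(ISO = iso, Group = group, stringsAsFactors = FALSE)
  # no_other = TRUE drops countries that belong to no group
  if (no_other) out <- out[out$Group != "Other", , drop = FALSE]
  out
}

color_groups <- list(
  "Core" = c("DEU", "FRA"),
  "Periphery" = c("ITA", "ESP")
)
iso <- c("FRA", "DEU", "ITA", "USA")

kept <- assign_group(iso, color_groups, no_other = TRUE)
all_rows <- assign_group(iso, color_groups, no_other = FALSE)
print(kept)
print(all_rows)
```

In this sketch "USA" belongs to no group, so it is dropped when `no_other = TRUE` and kept under the "Other" label when `no_other = FALSE`.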

size

Numeric or NULL; controls plot dimensions:

  • 1: small (10x7 inches)

  • 2: medium (15x7 inches)

  • 3: large (15x10 inches)

  • 4: extra large (20x10 inches)

  • NULL: auto-sizes based on data

Default is 4.

base_size

Numeric; base font size in points. Controls text size throughout the plot and must be a positive number. Default is 16.

common_var

Controls variable pairing in multi-variable plots:

  • TRUE: uses first variable as common x-axis

  • Character: specified variable used as common x-axis

  • FALSE: sequential pairing of variables

Default is TRUE.
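The common x-axis case can be sketched as pairing one variable against every other variable, which matches the "Inflation vs GDP" and "Debt vs GDP" panels logged in the Examples. The helper `pair_with_common()` is hypothetical and covers only the TRUE and character cases; the sequential pairing used for FALSE is not specified in enough detail on this page to sketch.

```r
# Hypothetical helper: pair a common x variable with every other variable
pair_with_common <- function(variables, common_var = TRUE) {
  # TRUE means "use the first variable"; a string names the variable directly
  x_var <- if (isTRUE(common_var)) variables[1] else common_var
  stopifnot(is.character(x_var), length(x_var) == 1, x_var %in% variables)

  data.frame(
    x = x_var,
    y = setdiff(variables, x_var),
    stringsAsFactors = FALSE
  )
}

variables <- c("GDP", "Inflation", "Debt")
pairs_named <- pair_with_common(variables, common_var = "GDP")
pairs_first <- pair_with_common(c("Debt", "GDP", "Inflation"), common_var = TRUE)
print(pairs_named)
print(pairs_first)
```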

bg

Character; controls the plot background color. "transparent" gives a transparent background (default); any other valid color string sets the background to that color.

Value

A ggplot2 object containing the scatter plot(s)

Details

The function automatically selects the appropriate plotting implementation based on the number of variables:

  • Two variables:

    • Simple scatter plot with optional trend lines

    • Color grouping by region or custom groups

    • Flexible ISO code labeling options

  • Multiple variables:

    • Matrix of scatter plots

    • Option for common x-axis variable

    • Consistent scales and formatting across panels
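The selection described above can be sketched as a count of distinct indicators. The two implementation names are taken from the log output in the Examples; the helper `choose_scatter_impl()` and the error raised for fewer than two variables are assumptions made for this illustration, not documented package behavior.

```r
# Hypothetical helper: choose an implementation name from the indicator count
choose_scatter_impl <- function(data) {
  n_indicators <- length(unique(data$Variable))
  if (n_indicators < 2) {
    # Assumption: a scatter plot needs at least two variables
    stop("At least two variables are needed for a scatter plot")
  }
  if (n_indicators == 2) {
    "in_plot_scatter_two_vars"
  } else {
    "in_plot_scatter_multi_vars"
  }
}

two_vars <- data.frame(
  ISO = rep(c("FRA", "DEU", "ITA"), 2),
  Variable = rep(c("GDP", "Inflation"), each = 3),
  Value = c(2.1, 1.8, 1.5, 3.2, 2.8, 2.5)
)
three_vars <- data.frame(
  ISO = rep(c("FRA", "DEU", "ITA"), 3),
  Variable = rep(c("GDP", "Inflation", "Debt"), each = 3),
  Value = seq(0.5, 4.5, by = 0.5)  # placeholder values
)

impl_two <- choose_scatter_impl(two_vars)
impl_three <- choose_scatter_impl(three_vars)
print(c(impl_two, impl_three))
```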

The function includes multiple features for customization:

  • Automatic axis scaling and formatting

  • Flexible color grouping options

  • Statistical annotations (R-squared, trend lines)

  • Reference panel for data sources

  • Consistent theme across all plot types

See also

wp_plot_series for time series plots

wp_plot_bar for bar plots

wp_from_iso for ISO code to country name conversion

Other plotting functions: wp_plot_bar(), wp_plot_series()

Examples

# Basic scatter plot with two variables
data <- data.frame(
  ISO = c("FRA", "DEU", "ITA"),
  Variable = rep(c("GDP", "Inflation"), each = 3),
  Value = c(2.1, 1.8, 1.5, 3.2, 2.8, 2.5),
  Reference = "World Bank"
)
wp_plot_scatter(data, y_axis = "Inflation Rate (%)")
#>  [COUNTRY] Column Country added to data 
#>  [CONFIG] n_countries = 3 | n_indicators = 2 | plot_type = scatter 
#>  [SUBPLOT] call 'in_plot_scatter_two_vars' 
#>  [ARGS] Unused arguments for function 'plot_func': subfig_title, bg, common_var, dimensions 
#>  [SCATTER] Starting two-variable scatter plot 
#>  [DATA] Preparing data for scatter plot 
#>  [DATA] Variables: GDP vs Inflation 
#>  [LABELS] Labels created: WESTERN_EUROPE Western Europe 
#>  [LEGEND] Legend items (display): Western Europe 
#>  [LEGEND] plot_type = 'scatter' | legend_cols = 1' | legend_height = 1.33333333333333' | n_items = 1 
#>  [ADD_REF] References: World Bank 

# Scatter plot with color groups and trend line
wp_plot_scatter(data,
               color = "Region",
               interpolation = "Linear",
               r_squared = 2)
#>  [COUNTRY] Column Country added to data 
#>  [CONFIG] n_countries = 3 | n_indicators = 2 | plot_type = scatter 
#>  [SUBPLOT] call 'in_plot_scatter_two_vars' 
#>  [ARGS] Unused arguments for function 'plot_func': subfig_title, bg, common_var, dimensions 
#>  [SCATTER] Starting two-variable scatter plot 
#>  [DATA] Preparing data for scatter plot 
#>  [DATA] Variables: GDP vs Inflation 
#>  [LABELS] Labels created: EUROPE Europe 
#>  [LEGEND] Legend items (display): Europe 
#>  [LEGEND] plot_type = 'scatter' | legend_cols = 1' | legend_height = 1.33333333333333' | n_items = 1 
#>  [ADD_REF] References: World Bank 

# Multiple variables with common x-axis
data_multi <- data.frame(
  ISO = rep(c("FRA", "DEU", "ITA"), 3),
  Variable = rep(c("GDP", "Inflation", "Debt"), each = 3),
  Value = runif(9, 0, 5),
  Reference = "World Bank"
)
wp_plot_scatter(data_multi,
               common_var = "GDP",
               color = "Subregion",
               ISO = "Both")
#>  [COUNTRY] Column Country added to data 
#>  [CONFIG] n_countries = 3 | n_indicators = 3 | plot_type = scatter 
#>  [SUBPLOT] call 'in_plot_scatter_multi_vars' 
#>  [ARGS] Unused arguments for function 'plot_func': bg, r_squared_pos, no_other, dimensions 
#> Using variables from data: GDP, Inflation, Debt
#>  [SCATTER] Starting two-variable scatter plot 
#>  [DATA] Preparing data for scatter plot 
#>  [DATA] Variables: Inflation vs GDP 
#>  [LABELS] Labels created: WESTERN_EUROPE Western Europe 
#>  [SCATTER] Starting two-variable scatter plot 
#>  [DATA] Preparing data for scatter plot 
#>  [DATA] Variables: Debt vs GDP 
#>  [LABELS] Labels created: WESTERN_EUROPE Western Europe 
#>  [LEGEND] Legend items (display): Western Europe 
#>  [LEGEND] plot_type = 'scatter' | legend_cols = 1' | legend_height = 1.33333333333333' | n_items = 1 
#>  [ADD_REF] References: World Bank 

# Custom color grouping
color_groups <- list(
  "Core" = c("DEU", "FRA"),
  "Periphery" = c("ITA", "ESP")
)
wp_plot_scatter(data,
               color = color_groups,
               highlight_zero = TRUE)
#>  [COUNTRY] Column Country added to data 
#>  [CONFIG] n_countries = 3 | n_indicators = 2 | plot_type = scatter 
#>  [COLOR] Validating color list 
#> /!\  Countries not found in data for group 'Core': DEU, FRA 
#> /!\  Countries not found in data for group 'Periphery': ITA, ESP 
#>  [SUBPLOT] call 'in_plot_scatter_two_vars' 
#>  [ARGS] Unused arguments for function 'plot_func': subfig_title, bg, common_var, dimensions 
#>  [SCATTER] Starting two-variable scatter plot 
#>  [DATA] Preparing data for scatter plot 
#>  [DATA] Variables: GDP vs Inflation 
#>  [LABELS] Labels created: Group A Core; Group B Periphery 
#>  [LEGEND] Legend items (display): Core, Periphery 
#>  [LEGEND] plot_type = 'scatter' | legend_cols = 2' | legend_height = 1.33333333333333' | n_items = 2 
#>  [ADD_REF] References: World Bank