Package 'countrycode' reference manual

Title:	Convert Country Names and Country Codes
Description:	Standardize country names, convert them into one of 40 different coding schemes, convert between coding schemes, and assign region descriptors.
Authors:	Vincent Arel-Bundock [aut, cre] , CJ Yetman [ctb] , Nils Enevoldsen [ctb] , Etienne Bacher [ctb] , Samuel Meichtry [ctb]
Maintainer:	Vincent Arel-Bundock <[email protected]>
License:	GPL-3
Version:	1.6.0.9000
Built:	2024-11-06 06:13:45 UTC
Source:	https://github.com/vincentarelbundock/countrycode

List of CLDR country name codes and associated examples

Description

Code: CLDR code
Example: French Southern Territories in different languages

Usage

cldr_examples
cldr_examples

Format

data frame

Country Code Translation Data Frame (Cross-Sectional)

Description

A data frame used internally by the countrycode() function. countrycode can use any valid code as destination, but only some codes can be used as origin.

Format

A data frame with codes as columns.

Details

Origin and Destination

cctld: IANA country code top-level domain
country.name: country name (English)
country.name.de: country name (German)
country.name.fr: country name (French)
country.name.it: country name (Italian)
cowc: Correlates of War character
cown: Correlates of War numeric
dhs: Demographic and Health Surveys Program
ecb: European Central Bank
eurostat: Eurostat
fao: Food and Agriculture Organization of the United Nations numerical code
fips: FIPS 10-4 (Federal Information Processing Standard)
gaul: Global Administrative Unit Layers
genc2c: GENC 2-letter code
genc3c: GENC 3-letter code
genc3n: GENC numeric code
gwc: Gleditsch & Ward character
gwn: Gleditsch & Ward numeric
imf: International Monetary Fund
ioc: International Olympic Committee
iso2c: ISO-2 character
iso3c: ISO-3 character
iso3n: ISO-3 numeric
p5n: Polity V numeric country code
p5c: Polity V character country code
p4n: Polity IV numeric country code
p4c: Polity IV character country code
un: United Nations M49 numeric codes
unicode.symbol: Region subtag (often displayed as emoji flag)
unhcr: United Nations High Commissioner for Refugees
unpd: United Nations Procurement Division
vdem: Varieties of Democracy (V-Dem version 8, April 2018)
wb: World Bank (very similar but not identical to iso3c)
wvs: World Values Survey numeric code

Destination only

⁠cldr.*⁠: 600+ country name variants from the UNICODE CLDR project (e.g., "cldr.short.en"). Inspect the cldr_examples data.frame for a full list of available country names and examples.
ar5: IPCC's regional mapping used both in the Fifth Assessment Report (AR5) and for the Reference Concentration Pathways (RCP)
continent: Continent as defined in the World Bank Development Indicators
cow.name: Correlates of War country name
currency: ISO 4217 currency name
eurocontrol_pru: European Organisation for the Safety of Air Navigation
eurocontrol_statfor: European Organisation for the Safety of Air Navigation
eu28: Member states of the European Union (as of December 2015), without special territories
icao.region: International Civil Aviation Organization region
iso.name.en: ISO English short name
iso.name.fr: ISO French short name
iso4217c: ISO 4217 currency alphabetic code
iso4217n: ISO 4217 currency numeric code
p4.name: Polity IV country name
region: 7 Regions as defined in the World Bank Development Indicators
region23: 23 Regions as used to be in the World Bank Development Indicators (legacy)
un.name.ar: United Nations Arabic country name
un.name.en: United Nations English country name
un.name.es: United Nations Spanish country name
un.name.fr: United Nations French country name
un.name.ru: United Nations Russian country name
un.name.zh: United Nations Chinese country name
un.region.name: United Nations region name
un.region.code: United Nations region code
un.regionintermediate.name: United Nations intermediate region name
un.regionintermediate.code: United Nations intermediate region code
un.regionsub.name: United Nations sub-region name
un.regionsub.code: United Nations sub-region code
unhcr.region: United Nations High Commissioner for Refugees region name
wvs.name: World Values Survey numeric code country name

Note

The Correlates of War (cow) and Polity 4 (p4) project produce codes in country year format. Some countries go through political transitions that justify changing codes over time. When building a purely cross-sectional conversion dictionary, this forces us to make arbitrary choices with respect to some entities (e.g., Western Germany, Vietnam, Serbia). countrycode includes a reconciled dataset in panel format, codelist_panel. Instead of converting code, we recommend that users dealing with panel data "left-merge" their data into this panel dictionary.

Country Code Translation Data Frame (Country-Year Panel)

Description

A panel of country-year observations with various codes

Usage

codelist_panel
codelist_panel

Format

data frame with codes as columns

Convert Country Codes

Description

Converts long country names into one of many different coding schemes. Translates from one scheme to another. Converts country name or coding scheme to the official short English country name. Creates a new variable with the name of the continent or region to which each country belongs.

Usage

countrycode(
  sourcevar,
  origin,
  destination,
  warn = TRUE,
  nomatch = NA,
  custom_dict = NULL,
  custom_match = NULL,
  origin_regex = NULL
)
countrycode(
  sourcevar,
  origin,
  destination,
  warn = TRUE,
  nomatch = NA,
  custom_dict = NULL,
  custom_match = NULL,
  origin_regex = NULL
)

Arguments

`sourcevar`	Vector which contains the codes or country names to be converted (character or factor)
`origin`	A string which identifies the coding scheme of origin (e.g., `"iso3c"`). See `codelist` for a list of available codes.
`destination`	A string or vector of strings which identify the coding scheme of destination (e.g., `"iso3c"` or `c("cowc", "iso3c")`). See `codelist` for a list of available codes. When users supply a vector of destination codes, they are used sequentially to fill in missing values not covered by the previous destination code in the vector.
`warn`	Prints unique elements from sourcevar for which no match was found
`nomatch`	When countrycode fails to find a match for the code of origin, it fills-in the destination vector with `nomatch`. The default behavior is to fill non-matching codes with `NA`. If `nomatch = NULL`, countrycode tries to use the origin vector to fill-in missing values in the destination vector. `nomatch` must be either `NULL`, of length 1, or of the same length as `sourcevar`.
`custom_dict`	A data frame which supplies a new dictionary to replace the built-in country code dictionary. Each column contains a different code and must include no duplicates. The data frame format should resemble `codelist`. Users can pre-assign attributes to this custom dictionary to affect behavior (see Examples section): "origin.regex" attribute: a character vector with the names of columns containing regular expressions. "origin.valid" attribute: a character vector with the names of columns that are accepted as valid origin codes.
`custom_match`	A named vector which supplies custom origin and destination matches that will supercede any matching default result. The name of each element will be used as the origin code, and the value of each element will be used as the destination code.
`origin_regex`	NULL or Logical: When using a custom dictionary, if TRUE then the origin codes will be matched as regex, if FALSE they will be matched exactly. When NULL, `countrycode` will behave as TRUE if the origin name is in the `custom_dictionary`'s `origin_regex` attribute, and FALSE otherwise. See examples section below.

Note

For a complete description of available country codes and languages, please see the documentation for the codelist conversion dictionary.

Panel data (i.e., country-year) can pose particular problems when converting codes. For instance, some countries like Vietnam or Serbia go through political transitions that justify changing codes over time. This can pose problems when using codes from organizations like CoW or Polity IV, which produce codes in country-year format. Instead of converting codes using countrycode(), we recommend that users use the codelist_panel data.frame as a base into which they can merge their other data. This data.frame includes most relevant code, and is already "reconciled" to ensure that each political unit is only represented by one row in any given year. From there, it is just a matter of using merge() to combine different datasets which use different codes.

Examples

library(countrycode)

# ISO to Correlates of War
countrycode(c('USA', 'DZA'), origin = 'iso3c', destination = 'cown')

# English to ISO
countrycode('Albania', origin = 'country.name', destination = 'iso3c')

# German to French
countrycode('Albanien', origin = 'country.name.de', destination = 'iso.name.fr')

# Using custom_match to supercede default codes
countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c')
countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c',
            custom_match = c('Algeria' = 'ALG'))

x <- c("canada", "antarctica")
countryname(x)
countryname(x, destination = "cowc", warn = FALSE)
countryname(x, destination = "cowc", warn = FALSE, nomatch = x)

library(countrycode)

# ISO to Correlates of War
countrycode(c('USA', 'DZA'), origin = 'iso3c', destination = 'cown')

# English to ISO
countrycode('Albania', origin = 'country.name', destination = 'iso3c')

# German to French
countrycode('Albanien', origin = 'country.name.de', destination = 'iso.name.fr')

# Using custom_match to supercede default codes
countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c')
countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c',
            custom_match = c('Algeria' = 'ALG'))

x <- c("canada", "antarctica")
countryname(x)
countryname(x, destination = "cowc", warn = FALSE)
countryname(x, destination = "cowc", warn = FALSE, nomatch = x)

Convert country names in any language to another name or code

Description

Converts long country names in any language to one of many different country code schemes or country names. countryname does 2 passes on the data. First, it tries to detect variations of country names in many languages extracted from the Unicode Common Locale Data Repository. Second, it applies countrycode's English regexes to try to match the remaining cases. Because it does two passes, countryname can sometimes produce ambiguous results, e.g., Saint Martin vs. Saint Martin (French Part). Users who need a "safer" option can use: countrycode(x, "country.name", "country.name") Note that the function works with non-ASCII characters. Please see the Github page for examples.

Usage

countryname(
  sourcevar,
  destination = "country.name.en",
  nomatch = NA,
  warn = TRUE
)
countryname(
  sourcevar,
  destination = "country.name.en",
  nomatch = NA,
  warn = TRUE
)

Arguments

`sourcevar`	Vector which contains the codes or country names to be converted (character or factor)
`destination`	Coding scheme of destination (string such as "iso3c" enclosed in quotes ""): type `?codelist` for a list of available codes.
`nomatch`	When countrycode fails to find a match for the code of origin, it fills-in the destination vector with `nomatch`. The default behavior is to fill non-matching codes with `NA`. If `nomatch = NULL`, countrycode tries to use the origin vector to fill-in missing values in the destination vector. `nomatch` must be either `NULL`, of length 1, or of the same length as `sourcevar`.
`warn`	Prints unique elements from sourcevar for which no match was found

Examples

## Not run: 
x <- c('Afaganisitani', 'Barbadas', 'Sverige', 'UK')
countryname(x)
countryname(x, destination = 'iso3c')

## End(Not run)

## Not run: 
x <- c('Afaganisitani', 'Barbadas', 'Sverige', 'UK')
countryname(x)
countryname(x, destination = 'iso3c')

## End(Not run)

A dataframe of alternative country names in many languages. Used internally by the `countryname` function.

Description

A dataframe of alternative country names in many languages. Used internally by the countryname function.

Format

dataframe

Get Custom Dictionaries

Description

Download a custom dictionary to use in the custom_dict argument of countrycode()

Usage

get_dictionary(dictionary = NULL)
get_dictionary(dictionary = NULL)

Arguments

dictionary

A character string that specifies the dictionary to be retrieved. It must be one of "global_burden_of_disease", "ch_cantons", "us_states", "exiobase3", "gtap10". If NULL, the function will print the list of available dictionaries. Default is NULL.

Value

If a valid dictionary is specified, the function will return that dictionary as a data.frame. If an invalid dictionary or no dictionary is specified, the function will stop and throw an error message.

Examples

## Not run: 
cd <- get_dictionary("us_states")
countrycode::countrycode(c("MO", "MN"), origin = "state.abb", "state.name", custom_dict = cd)

## End(Not run)
## Not run: 
cd <- get_dictionary("us_states")
countrycode::countrycode(c("MO", "MN"), origin = "state.abb", "state.name", custom_dict = cd)

## End(Not run)

Guess the code/name of a vector

Description

Users sometimes do not know what kind of code or field their data contain. This function tries to guess by comparing the similarity between a user-supplied vector and all the codes included in the countrycode dictionary.

Usage

guess_field(codes, min_similarity = 80)
guess_field(codes, min_similarity = 80)

Arguments

`codes`	a vector of country codes or country names
`min_similarity`	the function returns all field names where over than `min_similarity`% of codes are shared between the supplied vector and the `countrycode` dictionary.

Examples

# Guess ISO codes
guess_field(c('DZA', 'CAN', 'DEU'))

# Guess country names
guess_field(c('Guinea','Iran','Russia','North Korea',rep('Ivory Coast',50),'Scotland'))
# Guess ISO codes
guess_field(c('DZA', 'CAN', 'DEU'))

# Guess country names
guess_field(c('Guinea','Iran','Russia','North Korea',rep('Ivory Coast',50),'Scotland'))

Package 'countrycode'

Help Index

List of CLDR country name codes and associated examples

Description

Usage

Format

Country Code Translation Data Frame (Cross-Sectional)

Description

Format

Details

Origin and Destination

Destination only

Note

Country Code Translation Data Frame (Country-Year Panel)

Description

Usage

Format

Convert Country Codes

Description

Usage

Arguments

Note

Examples

Convert country names in any language to another name or code

Description

Usage

Arguments

Examples

A dataframe of alternative country names in many languages. Used internally by the countryname function.

Description

Format

Get Custom Dictionaries

Description

Usage

Arguments

Value

Examples

Guess the code/name of a vector

Description

Usage

Arguments

Examples

A dataframe of alternative country names in many languages. Used internally by the `countryname` function.