Country Concordances: Difference between revisions

From Pardee Wiki
Jump to navigation Jump to search
 
(11 intermediate revisions by the same user not shown)
Line 1: Line 1:
=== Summary ===
== Summary ==
[https://github.com/n1shamin/Country-Concordance Country concordance] is an important aspect of a data technician's job to maintain the International Futures database. Country concordance refers to the differences between the IFs Country list and another organization's country list and merging them to ensure the IFs country list in IFs system. For example, in some organization "Türkiye" is the name displayed, but for IFs we use the name "Turkey". Therefore IFs data technicians need to change the name to "Turkey" for the system to process it. '''Below lists the organizations IFs draw their data from and the common changes needed as well as a link to a GitHub repository that contains the corresponding country list.''' 
[https://github.com/n1shamin/Country-Concordance Country concordance] is an important aspect of a data technician's job to maintain the International Futures database. Country concordance refers to the differences between the IFs Country list and another organization's country list and merging them to ensure the IFs country list in IFs system. For example, in some organization "Türkiye" is the name displayed, but for IFs we use the name "Turkey". Therefore IFs data technicians need to change the name to "Turkey" for the system to process it. To do this there are multiple steps that can be taken and they are shown below: 


==== Proxies ====
== GitHub Repositories ==
Proxies are used when a certain country does not have data points and estimates are needed. Proxies should be used on a case by case basis and for certain series. Proxies essentially use a similar country to the country being approximated (in terms of population, GDP, etc.) and calculates the scale of what the approximated countries value should be.
There are two repositories essential to this process; [https://github.com/quciet/datagator_flask_mongo a mini app that can handle concordance for files less than 15mb] and [https://github.com/n1shamin/Country-Concordance a centralized GitHub repository for all country concordance tables].  


An example from IHME's series is Kosovo and Albania. Albania is similar and geographically close to Kosovo making it an ideal for a proxy. For a series that deals with death such as SocietalViolenceDeathsTotal use the following equation: 
The app allows users to upload datasets from other organizations and utilizes fuzzy matching against standardized International Futures territory names. This is to make it easier for data technicians to ensure all country data points are being pulled correctly. You can clone the repository but a link to the application can be found [https://datagator-lite.onrender.com/ here].  
 
* Kosovo’s number of deaths = Albania’s number of deaths * (Kosovo’s Population / Albania’s Population)
 
==== Disaggregation ====
A lot of organizations have values for dissolved states such as USSR, Yugoslavia, Czechoslovakia, and more. This then leads to gaps in values for the newly formed such as Serbia, Slovakia, Czechia, etc. Therefore, disaggregation is a great tool to fill in these gaps. Below is the disaggregation steps for common groups:
 
# Czechoslovakia = Slovakia and Czech Republic
# Yugoslav SFR = Slovenia, Croatia, Bosnia and Herzegovina, North Macedonia, Serbia, and Montenegro
# Serbia and Montenegro = Serbia and Montenegro
# Sudan (former) = Sudan and South Sudan
# USSR = Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine, and Uzbekistan
 
For disaggregation you are adding up the values for the year after the state disbanded, using the data when the state existed, have the values for the new states after the entity disbanded, and then extrapolate (find the percentage of each state in the data from the OG state and then multiply it to get the right amount) data for the new states in the previous years.
 
# USSR add up the values for 1992 (year it disbanded); have totals for 1961-1991; have the values for 15 states for 1992; extrapolate data for 15 states 1961-1991
# Czechoslovakia add up the values for for 1993 (year it disbanded); have totals for 1961-1992; have the values for 2 states for 1993; extrapolate data for 2 states 1961-1992
# Sudan (former) add up the values for 2012 (year it disbanded); have totals for 1961-2011; have the values for 2 states for 2012; extrapolate data for 2 states 1961-2011
# Yugoslav SFR add up the values for 1992 (year it disbanded); have totals for 1961-1991; have the values for 6 states for 1992; extrapolate for 6 states 1961-1991
# Serbia and Montenegro add up the values for 2006 (year it disbanded); have totals for 1992-2005; have the values for 2 states for 2006; extrapolate data for 2 states 1992-2005
An example of this disaggregation can be found in [[FAOSTAT Land Use]].


== Organizations ==
The centralized GitHub is a growing project thank contains excel sheets for mapping tables between an organization's country list and International Futures country names.


=== [https://github.com/shangkexin/The_Federal_Institute_for_Geosciences_and_Natural_Resources_BGR/blob/main/Country.xlsx BGR] ===
The format for these excels are formatted as:  
In BGR there are German and English versions and usually the latest are in German. Below are the differing versions. As usual the link leads to a full country list. 
{| class="wikitable"
{| class="wikitable"
!IFs Countries
|+
!BGR Countries
!IFs Country
|-
!ex: ITU Country
|Bosnia and  Herzegovina
|Bosnia &  herzegovin
|-
|Bosnia and Herzegovina
|Bosnia & Herzegovina
|-
|Bosnia and Herzegovina
|Bosnien & Herzegowina
|-
|Congo
|Congo, Rep.
|-
|Congo
|Congo, rep.
|-
|Congo
|Congo, Republic
|-
|Congo
|Kongo, Rep.
|-
|Congo, Dem. Republic of the
|Congo, Democratic
|-
|Congo, Dem. Republic of the
|Congo, DR
|-
|Congo, Dem. Republic of the
|Congo, dr
|-
|Congo, Dem. Republic of the
|Kongo, DR
|-
|Cote D'Ivoire
|Côte d‘Ivoire
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Cote D'Ivoire
|Côte d'ivoire
|-
|Dominican Republic
|Dominican Rep.
|-
|Dominican Republic
|Dominican rep.
|-
|Dominican Republic
|Dominikahe Rep.
|-
|Dominican Republic
|Dominikanische Rep.
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Guinea Bissau
|Guinea-bissau
|-
|Guinea Bissau
|Guinea–bissau
|-
|Indonesia
|Indoa
|-
|Indonesia
|Indoen
|-
|Indonesia
|Indonesien
|-
|Korea, Dem. People's Republic
|Korea, DPR
|-
|Korea, Dem. People's Republic
|Korea, dpr
|-
|Korea, Dem. People's Republic
|Korea, DVR
|-
|Korea, Republic of
|Korea, Rep.
|-
|Korea, Republic of
|Korea, rep.
|-
|Korea, Republic of
|Korea, Republic
|-
|Lao People's Dem. Republic
|Lao People‘s Democratic R
|-
|Lao People's Dem. Republic
|Lao, People‘s Demo
|-
|Lao People's Dem. Republic
|Laos
|-
|Luxembourg
|Luxemburg
|-
|Moldova, Republic of
|Moldau, Republik
|-
|Moldova, Republic of
|Moldova, Republic
|-
|Moldova, Republic of
|Moldova, republic
|-
|Palestine
|Palästina
|-
|Palestine
|Palästinensische Gebiete
|-
|Palestine
|Palestina
|-
|Palestine
|Palestinian territories
|-
|Sahrawi Arab Democratic Republic
|Westerhara
|-
|Sahrawi Arab Democratic Republic
|Western Sahara
|-
|Sahrawi Arab Democratic Republic
|Westsahara
|-
|Sao Tome and Principe
|Sao Tome and Prici
|-
|Sao Tome and Principe
|Sao Tome and Prin
|-
|Sao Tome and Principe
|São tomé and prínci
|-
|Sao Tome and Principe
|São Tomé and Príncipe
|-
|Sao Tome and Principe
|São Tomé und Príncipe
|-
|Sudan South
|South Sudan
|-
|Sudan South
|South sudan
|-
|Sudan South
|Südsudan
|-
|Sudan South
|South sudan, repub
|-
|Tunisia
|Tua
|-
|Tunisia
|Tuen
|-
|Tunisia
|Tunesien
|-
|United States of America
|United States
|-
|United States of America
|USA
|-
|United States of America
|Usa
|-
|United States of America
|Vereinigte Staaten
|-
|Venezuela, Bolivarian Republic
|Venezuela
|-
|Venezuela, Bolivarian Republic
|Venezuela, boliva
|-
|Venezuela, Bolivarian Republic
|Venezuela, Bolivar
|-
|Venezuela, Bolivarian Republic
|Venezuela, Bolivaria
|-
|Venezuela, Bolivarian Republic
|Venezuela, bolivaria
|-
|Venezuela, Bolivarian Republic
|Venezuela, bolivarian
|}
 
=== [https://github.com/shangkexin/CIA_World_Factbook_Ethnic_groups/blob/main/Country.xlsx CIA] ===
{| class="wikitable"
!IFs Countries
!CIA Countries
|-
|-
|Afghanistan
|Afghanistan
|
|Afghanistan
|-
|Bahamas
|Bahamas, The
|-
|Brunei Darussalam
|Brunei
|-
|Burundi
|
|-
|Comoros
|
|-
|Congo
|Congo, Republic of the
|-
|Congo, Dem. Republic of the
|
|-
|Czech Republic
|Czechia
|-
|Eswatini
|
|-
|France
|
|-
|Gambia
|Gambia, The
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Iran
|
|-
|Italy
|
|-
|Korea, Dem. People's Republic
|
|-
|Korea, Republic of
|
|-
|Lao People's Dem. Republic
|Laos
|-
|Macedonia, North
|North Macedonia
|-
|Madagascar
|
|-
|Maldives
|
|-
|Malta
|
|-
|Mauritius
|
|-
|Micronesia
|Micronesia, Federated States of
|-
|Moldova, Republic of
|Moldova
|-
|Myanmar
|Burma
|-
|Oman
|
|-
|Palestine
|
|-
|Papua New Guinea
|
|-
|Russian Federation
|Russia
|-
|Rwanda
|
|-
|Sahrawi Arab Democratic Republic
|
|-
|Sao Tome and Principe
|
|-
|Seychelles
|
|-
|Somalia
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Syrian Arab Republic
|Syria
|-
|Timor-Leste
|
|-
|Turkey
|Turkey (Turkiye)
|-
|United States of America
|United States
|-
|Venezuela, Bolivarian Republic
|
|-
|-
|Viet Nam
|Albania
|Vietnam
|Albania
|-
|-
|Yemen
| colspan="2" |and so on...
|
|}
|}


=== [https://github.com/n1shamin/Country-Concordance/blob/main/FAO%20Country.xlsx Food Agricultural Organization (FAO)] ===
=== Process to Create Country Concordance ===
{| class="wikitable"
Use the app to import your dataset. Once you choose a file, and click upload and select the worksheet if needed: 
|+
[[File:Step 1- Upload Excel-CSV File - sheet.png|none|thumb|600x600px]]The page will go to Data Preview. Select the row or column that contains the country names. For this example it is the column "country_name"[[File:Step 2- Data Preview.png|none|thumb|600x600px]]Select Fuzzy Match and at least 80%. You can go higher but high percentages mean that names can be missed if they fall short. [[File:Step 3- Name-matching.png|none|thumb|600x600px]]Once you click match names you will be led to this page: [[File:Review Name Matches (Page 1 of 1).png|none|thumb|600x600px]]In the bottom of the page there are two important options; "Save Mapping" and "Submit & Download File.
!IFs Countries
!FAO Countries
|-
|Bolivia
|Bolivia (Plurinational State of)
|-
|China
|China, mainland
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Czech Republic
|Czechia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|China, Hong Kong SAR
|-
|Iran
|Iran (Islamic Republic of)
|-
|Korea, Dem. People's Republic
|Democratic People's Republic of Korea
|-
|Korea, Republic of
|Republic of Korea
|-
|Kosovo
|
|-
|Lao People's Dem. Republic
|Lao People's Democratic Republic
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|Micronesia (Federated States of)
|-
|Moldova, Republic of
|Republic of Moldova
|-
|Netherlands
|Netherlands (Kingdom of the)
|-
|Sahrawi Arab Democratic Republic
|Western Sahara
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Taiwan
|China, Taiwan Province of
|-
|Tanzania
|United Republic of Tanzania
|-
|Turkey
|Türkiye
|-
|United Kingdom
|United Kingdom of Great Britain and Northern Ireland
|-
|Venezuela, Bolivarian Republic
|Venezuela (Bolivarian Republic of)
|}
FAO does not have values for Kosovo.


=== [https://github.com/n1shamin/Country-Concordance/blob/main/GCP%20Country.xlsx Global Carbon Project GCP] ===
"Submit & Download File" changes the name in the original file to the IFs country names so the data technician can use the IFs interface to import into a db file. '''Before doing this make sure that all the country names have been changed correctly'''. Sometimes Turkey can be missed due to a less percentage in FuzzyMatch.
{| class="wikitable"
!IFs Countries
!GCP Countries
|-
|Cabo Verde
|Cape Verde
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Czech Republic
|Czechia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Korea, Dem. People's Republic
|North Korea
|-
|Korea, Republic of
|South Korea
|-
|Lao People's Dem. Republic
|Laos
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|Micronesia (Federated States of)
|-
|Moldova, Republic of
|Moldova
|-
|Palestine
|State of Palestine
|-
|Puerto Rico
|
|-
|Russian Federation
|Russia
|-
|Sahrawi Arab Democratic Republic
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|Sudan
|-
|Syrian Arab Republic
|Syria
|-
|Turkey
|Türkiye
|-
|United States of America
|USA
|-
|Venezuela, Bolivarian Republic
|Venezuela
|}
Puerto Rico and Sahrawi Arab Democratic Republic are not present in dataset.  


=== [https://github.com/n1shamin/Country-Concordance/blob/main/GWNO%20Country.xlsx GNWO Codes]: Used in [[Historical GDP/GDPPC/Population|Historical GDP]] ===
"Save Mapping" will download a file that shows the organization's country list compared to IFs country list.
Numerical values are used instead of IDs and or names. Click the links to head to the GitHub repo or wiki.  
[[File:Mapping Table.png|none|thumb|1157x1157px]]
Before inputting the excel file into this [https://github.com/n1shamin/Country-Concordance GitHub] change "original_name" to Country or IFs Country and "matched_name" to the organization's name. For example this is from World Bank's Poverty and Inequality Platform so the name would PIP Country.


=== [https://github.com/n1shamin/Country-Concordance/blob/main/CountryIHME.xlsx Institute for Health Metrics and Evaluation (IHME)] ===
If you are using a code to import directly into a db file, here is an example using Python and panadas:  
There are different versions of country lists used by the IHME. For example interpersonal violence has a different country list than others. All of them are below:  
{| class="wikitable"
!IFs Countries
!IHME Countries
|-
|
|
|-
|
|
|-
|
|
|}{| class="wikitable"
|+
!IFs Countries
!IHME Countries
|-
|
|
|-
|
|
|-
|
|
|}
Interpersonal Violence (usually)
{| class="wikitable"
!IFs Countries
!IHME Countries
|-
|
|
|-
|
|
|-
|
|
|}


=== [https://github.com/n1shamin/Country-Concordance/blob/main/ILO%20Country.xlsx International Labor Organization] ===
country = pd.read_excel('ITU Country.xlsx') #get this excel from the repo listed above
{| class="wikitable"
!IFs Countries
!ILO Countries
|-
|Bolivia
|Bolivia  (Plurinational State of)
|-
|Congo, Dem. Republic of the
|Congo, Democratic Republic of the
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Czech Republic
|Czechia
|-
|Grenada
|
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|Hong Kong, China
|-
|Iran
|Iran (Islamic Republic of)
|-
|Kiribati
|
|-
|Korea, Dem. People's Republic
|Korea (the Democratic People's Republic  of)
|-
|Korea, Republic of
|Republic of Korea
|-
|Kosovo
|
|-
|Lao People's Dem. Republic
|Lao People's Democratic Republic
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|
|-
|Moldova, Republic of
|Republic of Moldova
|-
|Palestine
|Occupied Palestinian Territory
|-
|Sahrawi Arab Democratic Republic
|Western Sahara
|-
|Seychelles
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Taiwan
|Taiwan, China
|-
|Tanzania
|Tanzania, United Republic of
|-
|Turkey
|Türkiye
|-
|United Kingdom
|United Kingdom of Great Britain and  Northern Ireland
|-
|Venezuela, Bolivarian Republic
|Venezuela (Bolivarian Republic of)
|}
Grenada, Kiribati, Kosovo, Micronesia, and Seychelles are not present in ILO.


=== International Monetary Funds (IMF) ===
dt = country.merge(df2, on='entityName', how='left') #entityName is what the list is called in ITU


====== IMF World Economics Outlook ======
dt = dt.drop(columns='entityName') #after merging you must drop this column before importing
{| class="wikitable"
|+
!IFs Countries
!IMF Countries
|-
|Bahamas
|The Bahamas
|-
|Congo
|Republic of Congo
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Cuba
|
|-
|Gambia
|The Gambia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|Hong Kong SAR
|-
|Iran
|Islamic Republic of Iran
|-
|Korea, Dem. People's Republic
|
|-
|Korea, Republic of
|Korea
|-
|Kyrgyzstan
|Kyrgyz Republic
|-
|Lao People's Dem. Republic
|Lao P.D.R.
|-
|Macedonia, North
|North Macedonia
|-
|Moldova, Republic of
|Moldova
|-
|Palestine
|West Bank and Gaza
|-
|Russian Federation
|Russia
|-
|Sao Tome and Principe
|São Tomé and Príncipe
|-
|Slovakia
|Slovak Republic
|-
|Sudan South
|South Sudan
|-
|Syrian Arab Republic
|Syria
|-
|Taiwan
|Taiwan Province of China
|-
|Turkey
|Türkiye
|-
|United States of America
|United States
|-
|Venezuela, Bolivarian Republic
|Venezuela
|-
|Viet Nam
|Vietnam
|}
Cuba, Korea and Dem. People's Republic do not exist in this database.


====== [https://github.com/n1shamin/Country-Concordance/blob/main/IMFGFS%20Country.xlsx IMF Government Financial Statistics] ======
dt = dt.rename(columns={'IFs Country':'Country'}) #this must be changed before importing as the column name in IFs is Country
{| class="wikitable"
|+
!IF Countries
!IMFGFS Countries
|-
|Afghanistan
|Afghanistan, Islamic  Rep. of
|-
|Armenia
|Armenia, Rep. of
|-
|Azerbaijan
|Azerbaijan, Rep. of
|-
|Belarus
|Belarus, Rep. of
|-
|Hong Kong
|China, P.R.: Hong Kong
|-
|China
|China, P.R.: Mainland
|-
|Croatia
|Croatia, Rep. of
|-
|Czech Republic
|Czech Rep.
|-
|Dominican Republic
|Dominican Rep.
|-
|Egypt
|Egypt, Arab Rep. of
|-
|Estonia
|Estonia, Rep. of
|-
|Iran
|Iran, Islamic Rep. of
|-
|Kazakhstan
|Kazakhstan, Rep. of
|-
|Korea, Dem. People's Republic
|Korea, Rep. of
|-
|Kosovo
|Kosovo, Rep. of
|-
|Kyrgyzstan
|Kyrgyz Rep.
|-
|Moldova, Republic of
|Moldova, Rep. of
|-
|Netherlands
|Netherlands, The
|-
|Poland
|Poland, Rep. of
|-
|Serbia
|Serbia, Rep. of
|-
|Slovenia
|Slovenia, Rep. of
|-
|Tajikistan
|Tajikistan, Rep. of
|-
|Timor-Leste
|Timor-Leste, Dem. Rep. of
|-
|Turkey
|Türkiye, Rep of
|-
|United States of America
|United States
|-
|Uzbekistan
|Uzbekistan, Rep. of
|-
|Yemen
|Yemen, Rep. of
|}
GFS is limited and many countries are missing. View the GitHub repo.


=== [https://github.com/n1shamin/Country-Concordance/blob/main/ITU%20Country.xlsx International Telecommunication Union] ===
== Special Cases ==
{| class="wikitable"
Each organization might have special cases due to the nature of their country list. Therefore in the "Notes"/"Data Notes" section of each data source page (example: [[FAOSTAT Land Use#Notes]]) contains information necessary to complete the data pulling tasks (example: FAOSTAT Land Use needs disaggregation, notes for Ethiopia, and notes for Kosovo and Serbia).  
!IFs Countries
!ITU Countries
|-
|Bolivia
|Bolivia  (Plurinational State of)
|-
|Central African Republic
|Central African Rep.
|-
|Congo
|Congo (Rep. of the)
|-
|Congo, Dem. Republic of the
|Dem. Rep. of the Congo
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Dominican Republic
|Dominican Rep.
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|Hong Kong, China
|-
|Iran
|Iran (Islamic Republic of)
|-
|Korea, Dem. People's Republic
|
|-
|Korea, Republic of
|Korea (Rep. of)
|-
|Kosovo
|
|-
|Lao People's Dem. Republic
|Lao P.D.R.
|-
|Macedonia, North
|North Macedonia
|-
|Moldova, Republic of
|Moldova
|-
|Nepal
|Nepal (Republic of)
|-
|Palestine
|State of Palestine
|-
|Sahrawi Arab Democratic Republic
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Taiwan
|Taiwan, Province of China
|-
|Turkey
|Türkiye
|-
|United States of America
|United States
|-
|Venezuela, Bolivarian Republic
|Venezuela
|}
Korea, Dem. People's Republic, Kosovo, and Sahrawi Arab Democratic Republic are not present in the database.  


=== [https://github.com/n1shamin/Country-Concordance/blob/main/JMP%20WHO%20UNICEF%20Country.xlsx Joint Monitoring Programme WHO UNICEF] ===
=== Proxies ===
{| class="wikitable"
Proxies are used when a certain country does not have data points and estimates are needed. Proxies should be used on a case by case basis and for certain series. Proxies essentially use a similar country to the country being approximated (in terms of population, GDP, etc.) and calculates the scale of what the approximated countries value should be.
!IFs Countries
!JMP Countries
|-
|Bolivia
|Bolivia (Plurinational State of)
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d’Ivoire
|-
|Czech Republic
|Czechia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|China, Hong Kong SAR
|-
|Iran
|Iran (Islamic Republic of)
|-
|Korea, Dem. People's Republic
|Democratic People's Republic of Korea
|-
|Korea, Republic of
|Republic of Korea
|-
|Kosovo
|
|-
|Lao People's Dem. Republic
|Lao People's Democratic Republic
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|Micronesia (Federated States of)
|-
|Moldova, Republic of
|Republic of Moldova
|-
|Netherlands
|Netherlands (Kingdom of the)
|-
|Palestine
|State of Palestine
|-
|Sahrawi Arab Democratic Republic
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Taiwan
|
|-
|Tanzania
|United Republic of Tanzania
|-
|Turkey
|Türkiye
|-
|United Kingdom
|United Kingdom of Great Britain and Northern Ireland
|-
|Venezuela, Bolivarian Republic
|Venezuela (Bolivarian Republic of)
|}
Kosovo, Sahrawi Arab Democratic Republic, and Taiwan are not present in dataset.  


=== [https://github.com/shangkexin/The_Organisation_for_Economic_Cooperation_and_Development_OECD/blob/main/Country.xlsx OECD] ===
An example from IHME's series is Kosovo and Albania. Albania is similar and geographically close to Kosovo making it an ideal for a proxy. For a series that deals with death such as SocietalViolenceDeathsTotal use the following equation: 
In sheet in "OECD" in excel. There are many missing countries; there are also sheets specific to the table (ie Donor) that might have more countries not included in the base country tab. '''Make sure to check before pulling'''.
{| class="wikitable"
|+
!IFs Countries
!OECD Countries
|-
|Australia
|
|-
|Austria
|
|-
|Belgium
|
|-
|Bulgaria
|
|-
|Canada
|
|-
|China
|China (People’s Republic of)
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d’Ivoire
|-
|Czech Republic
|
|-
|Denmark
|
|-
|Estonia
|
|-
|Finland
|
|-
|France
|
|-
|Germany
|
|-
|Greece
|
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|Hong Kong (China)
|-
|Hungary
|
|-
|Iceland
|
|-
|Ireland
|
|-
|Italy
|
|-
|Japan
|
|-
|Korea, Dem. People's Republic
|Democratic People’s Republic of Korea
|-
|Korea, Republic of
|Korea
|-
|Lao People's Dem. Republic
|Lao People’s Democratic Republic
|-
|Latvia
|
|-
|Lithuania
|
|-
|Luxembourg
|
|-
|Macedonia, North
|North Macedonia
|-
|Moldova, Republic of
|Moldova
|-
|Netherlands
|
|-
|New Zealand
|
|-
|Norway
|
|-
|Palestine
|Palestinian Authority or West Bank and Gaza Strip
|-
|Poland
|
|-
|Portugal
|
|-
|Puerto Rico
|
|-
|Romania
|
|-
|Russian Federation
|
|-
|Sahrawi Arab Democratic Republic
|
|-
|Slovakia
|
|-
|Spain
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Sweden
|
|-
|Switzerland
|
|-
|Taiwan
|Chinese Taipei
|-
|Turkey
|Türkiye
|-
|United Kingdom
|
|-
|United States of America
|
|-
|Venezuela, Bolivarian Republic
|Venezuela
|}


=== [https://github.com/n1shamin/Country-Concordance/blob/main/Transparency%20International%20Country.xlsx Transparency International] ===
* Kosovo’s number of deaths = Albania’s number of deaths * (Kosovo’s Population / Albania’s Population)
{| class="wikitable"
!IFs Countries
!TI Countries
|-
|Belize
|
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Czech Republic
|Czechia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Kiribati
|
|-
|Korea, Dem. People's Republic
|North Korea
|-
|Korea, Republic of
|South Korea
|-
|Lao People's Dem. Republic
|Laos
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|
|-
|Moldova, Republic of
|Moldova
|-
|Palestine
|
|-
|Russian Federation
|Russia
|-
|Sahrawi Arab Democratic Republic
|
|-
|Samoa
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Syrian Arab Republic
|Syria
|-
|Tonga
|
|-
|Venezuela, Bolivarian Republic
|Venezuela
|-
|Viet Nam
|Vietnam
|}
Belize, Kiribati, Micronesia, Palestine, Sahrawi Arab Democratic Republic, Samoa, and Tonga are not present in the dataset.


=== [https://github.com/n1shamin/Country-Concordance/blob/main/UIS%20UNESCO%20Country.xlsx UIS UNESCO] ===
=== Disaggregation ===
{| class="wikitable"
A lot of organizations have values for dissolved states such as USSR, Yugoslavia, Czechoslovakia, and more. This then leads to gaps in values for the newly formed such as Serbia, Slovakia, Czechia, etc. Therefore, disaggregation is a great tool to fill in these gaps. Below is the disaggregation steps for common groups:
|+
!FIPS_CODE
!IFs Country
!COUNTRY_ID UIS
|-
|ZAR
|Congo, Dem. Republic of the
|COD
|-
|KSV
|Kosovo
|
|-
|MON
|Mongolia
|MNG
|-
|WBG
|Palestine
|PSE
|-
|ROM
|Romania
|ROU
|-
|YUG
|Serbia
|SRB
|-
|TWN
|Taiwan
|
|-
|TMP
|Timor-Leste
|TLS
|}
Kosovo and Taiwan are not present in this dataset. This dataset is also present in IDs and not names.


=== [https://github.com/n1shamin/Country-Concordance/blob/main/UNDP%20Country.xlsx United Nation Development Programme (UNDP)] ===
# Czechoslovakia = Slovakia and Czech Republic
{| class="wikitable"
# Yugoslav SFR = Slovenia, Croatia, Bosnia and Herzegovina, North Macedonia, Serbia, and Montenegro
|+
# Serbia and Montenegro = Serbia and Montenegro
!IFs Countries
# Sudan (former) = Sudan and South Sudan
!UNDP Countries
# USSR = Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine, and Uzbekistan
|-
|Bolivia
|Bolivia (Plurinational State of)
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Cote D'Ivoire
|Côte d'Ivoire
|-
|Czech Republic
|Czechia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|China, Hong Kong SAR
|-
|Iran
|Iran (Islamic Republic of)
|-
|Korea, Dem. People's Republic
|Dem. People's Republic of Korea
|-
|Korea, Republic of
|Republic of Korea
|-
|Kosovo
|Kosovo (under UNSC res. 1244)
|-
|Lao People's Dem. Republic
|Lao People's Democratic Republic
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|Micronesia (Fed. States of)
|-
|Moldova, Republic of
|Republic of Moldova
|-
|Palestine
|State of Palestine
|-
|Sahrawi Arab Democratic Republic
|Western Sahara
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Taiwan
|China, Taiwan Province of China
|-
|Tanzania
|United Republic of Tanzania
|-
|Turkey
|Türkiye
|-
|Venezuela, Bolivarian Republic
|Venezuela (Bolivarian Republic of)
|}


=== World Bank ===
For disaggregation you are adding up the values for the year after the state disbanded, using the data when the state existed, have the values for the new states after the entity disbanded, and then extrapolate (find the percentage of each state in the data from the OG state and then multiply it to get the right amount) data for the new states in the previous years.


====== [https://github.com/shangkexin/World_Development_Indicators_WDI/blob/main/WDI_Country.xlsx World Development Indicators] ======
# USSR add up the values for 1992 (year it disbanded); have totals for 1961-1991; have the values for 15 states for 1992; extrapolate data for 15 states 1961-1991
{| class="wikitable"
# Czechoslovakia add up the values for for 1993 (year it disbanded); have totals for 1961-1992; have the values for 2 states for 1993; extrapolate data for 2 states 1961-1992
|+
# Sudan (former) add up the values for 2012 (year it disbanded); have totals for 1961-2011; have the values for 2 states for 2012; extrapolate data for 2 states 1961-2011
!IFs Countries
# Yugoslav SFR add up the values for 1992 (year it disbanded); have totals for 1961-1991; have the values for 6 states for 1992; extrapolate for 6 states 1961-1991
!World Bank Countries
# Serbia and Montenegro add up the values for 2006 (year it disbanded); have totals for 1992-2005; have the values for 2 states for 2006; extrapolate data for 2 states 1992-2005
|-
An example of this disaggregation can be found in [[FAOSTAT Land Use]].
|Bahamas
|Bahamas, The
|-
|Congo
|Congo, Rep.
|-
|Congo, Dem. Republic of the
|Congo, Dem. Rep.
|-
|Czech Republic
|Czechia
|-
|Egypt
|Egypt, Arab Rep.
|-
|Gambia
|Gambia, The
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|Hong Kong SAR, China
|-
|Iran
|Iran, Islamic Rep.
|-
|Korea, Dem. People's Republic
|Korea, Dem. People's Rep.
|-
|Korea, Republic of
|Korea, Rep.
|-
|Kyrgyzstan
|Kyrgyz Republic
|-
|Lao People's Dem. Republic
|Lao PDR
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|Micronesia, Fed. Sts.
|-
|Moldova, Republic of
|Moldova
|-
|Palestine
|West Bank and Gaza
|-
|Sahrawi Arab Democratic Republic
|
|-
|Slovakia
|Slovak Republic
|-
|Sudan South
|South Sudan
|-
|Taiwan
|
|-
|Turkey
|Turkiye
|-
|United States of America
|United States
|-
|Venezuela, Bolivarian Republic
|Venezuela, RB
|-
|Yemen
|Yemen, Rep.
|}
Sahrawi Arab Democratic Republic is not present in this dataset. 
 
For some World Bank indicators, Taiwan, China is present but in WDI is not.
 
=== World Health Organization (WHO) ===
 
====== [https://github.com/n1shamin/Country-Concordance/blob/main/Global%20Health%20Observatory%20(GHO)%2C%20WHO.csv Global Health Observatory (GHO)] ======
{| class="wikitable"
|+
!IFs Countries
!GHO Countries
|-
|Bolivia
|Bolivia  (Plurinational State of)
|-
|Congo, Dem. Republic of the
|Democratic Republic of the Congo
|-
|Czech Republic
|Czechia
|-
|Guinea Bissau
|Guinea-Bissau
|-
|Hong Kong
|
|-
|Iran
|Iran (Islamic Republic of)
|-
|Korea, Dem. People's Republic
|Democratic People's Republic of Korea
|-
|Korea, Republic of
|Republic of Korea
|-
|Kosovo
|
|-
|Lao People's Dem. Republic
|Lao People's Democratic Republic
|-
|Macedonia, North
|North Macedonia
|-
|Micronesia
|Micronesia (Federated States of)
|-
|Moldova, Republic of
|Republic of Moldova
|-
|Netherlands
|Netherlands (Kingdom of the)
|-
|Palestine
|
|-
|Sahrawi Arab Democratic Republic
|
|-
|St. Lucia
|Saint Lucia
|-
|St. Vincent and the Grenadines
|Saint Vincent and the Grenadines
|-
|Sudan South
|South Sudan
|-
|Taiwan
|
|-
|Tanzania
|United Republic of Tanzania
|-
|Turkey
|Türkiye
|-
|United Kingdom
|United Kingdom of Great Britain and  Northern Ireland
|-
|Venezuela, Bolivarian Republic
|Venezuela (Bolivarian Republic of)
|}
Hong Kong, Kosovo, Palestine, Sahrawi Arab Democratic Republic, and Taiwan are not present in database.

Latest revision as of 18:01, 3 October 2025

Summary

Country concordance is an important aspect of a data technician's job to maintain the International Futures database. Country concordance refers to the differences between the IFs Country list and another organization's country list and merging them to ensure the IFs country list in IFs system. For example, in some organization "Türkiye" is the name displayed, but for IFs we use the name "Turkey". Therefore IFs data technicians need to change the name to "Turkey" for the system to process it. To do this there are multiple steps that can be taken and they are shown below:

GitHub Repositories

There are two repositories essential to this process; a mini app that can handle concordance for files less than 15mb and a centralized GitHub repository for all country concordance tables.

The app allows users to upload datasets from other organizations and utilizes fuzzy matching against standardized International Futures territory names. This is to make it easier for data technicians to ensure all country data points are being pulled correctly. You can clone the repository but a link to the application can be found here.

The centralized GitHub is a growing project thank contains excel sheets for mapping tables between an organization's country list and International Futures country names.

The format for these excels are formatted as:

IFs Country ex: ITU Country
Afghanistan Afghanistan
Albania Albania
and so on...

Process to Create Country Concordance

Use the app to import your dataset. Once you choose a file, and click upload and select the worksheet if needed:

Step 1- Upload Excel-CSV File - sheet.png

The page will go to Data Preview. Select the row or column that contains the country names. For this example it is the column "country_name"

Step 2- Data Preview.png

Select Fuzzy Match and at least 80%. You can go higher but high percentages mean that names can be missed if they fall short.

Step 3- Name-matching.png

Once you click match names you will be led to this page:

Review Name Matches (Page 1 of 1).png

In the bottom of the page there are two important options; "Save Mapping" and "Submit & Download File."

"Submit & Download File" changes the name in the original file to the IFs country names so the data technician can use the IFs interface to import into a db file. Before doing this make sure that all the country names have been changed correctly. Sometimes Turkey can be missed due to a less percentage in FuzzyMatch.

"Save Mapping" will download a file that shows the organization's country list compared to IFs country list.

Mapping Table.png

Before inputting the excel file into this GitHub change "original_name" to Country or IFs Country and "matched_name" to the organization's name. For example this is from World Bank's Poverty and Inequality Platform so the name would PIP Country.

If you are using a code to import directly into a db file, here is an example using Python and panadas:

country = pd.read_excel('ITU Country.xlsx') #get this excel from the repo listed above

dt = country.merge(df2, on='entityName', how='left') #entityName is what the list is called in ITU

dt = dt.drop(columns='entityName') #after merging you must drop this column before importing

dt = dt.rename(columns={'IFs Country':'Country'}) #this must be changed before importing as the column name in IFs is Country

Special Cases

Each organization might have special cases due to the nature of their country list. Therefore in the "Notes"/"Data Notes" section of each data source page (example: FAOSTAT Land Use#Notes) contains information necessary to complete the data pulling tasks (example: FAOSTAT Land Use needs disaggregation, notes for Ethiopia, and notes for Kosovo and Serbia).

Proxies

Proxies are used when a certain country does not have data points and estimates are needed. Proxies should be used on a case by case basis and for certain series. Proxies essentially use a similar country to the country being approximated (in terms of population, GDP, etc.) and calculates the scale of what the approximated countries value should be.

An example from IHME's series is Kosovo and Albania. Albania is similar and geographically close to Kosovo making it an ideal for a proxy. For a series that deals with death such as SocietalViolenceDeathsTotal use the following equation:

  • Kosovo’s number of deaths = Albania’s number of deaths * (Kosovo’s Population / Albania’s Population)

Disaggregation

A lot of organizations have values for dissolved states such as USSR, Yugoslavia, Czechoslovakia, and more. This then leads to gaps in values for the newly formed such as Serbia, Slovakia, Czechia, etc. Therefore, disaggregation is a great tool to fill in these gaps. Below is the disaggregation steps for common groups:

  1. Czechoslovakia = Slovakia and Czech Republic
  2. Yugoslav SFR = Slovenia, Croatia, Bosnia and Herzegovina, North Macedonia, Serbia, and Montenegro
  3. Serbia and Montenegro = Serbia and Montenegro
  4. Sudan (former) = Sudan and South Sudan
  5. USSR = Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine, and Uzbekistan

For disaggregation you are adding up the values for the year after the state disbanded, using the data when the state existed, have the values for the new states after the entity disbanded, and then extrapolate (find the percentage of each state in the data from the OG state and then multiply it to get the right amount) data for the new states in the previous years.

  1. USSR add up the values for 1992 (year it disbanded); have totals for 1961-1991; have the values for 15 states for 1992; extrapolate data for 15 states 1961-1991
  2. Czechoslovakia add up the values for for 1993 (year it disbanded); have totals for 1961-1992; have the values for 2 states for 1993; extrapolate data for 2 states 1961-1992
  3. Sudan (former) add up the values for 2012 (year it disbanded); have totals for 1961-2011; have the values for 2 states for 2012; extrapolate data for 2 states 1961-2011
  4. Yugoslav SFR add up the values for 1992 (year it disbanded); have totals for 1961-1991; have the values for 6 states for 1992; extrapolate for 6 states 1961-1991
  5. Serbia and Montenegro add up the values for 2006 (year it disbanded); have totals for 1992-2005; have the values for 2 states for 2006; extrapolate data for 2 states 1992-2005

An example of this disaggregation can be found in FAOSTAT Land Use.