SubRegionalization Handbook: Difference between revisions
AltheaDitter (talk | contribs) No edit summary |
AltheaDitter (talk | contribs) No edit summary |
||
Line 268: | Line 268: | ||
=== International Futures === | === International Futures === | ||
From the main menu, go to '''Extended Features '''''> ''''' | From the main menu, go to '''Extended Features '''''>'''''Change Country Sub-Regionalization'''''> '''''A''''''dd or Delete SubRegions.''' | ||
If there are no sub-regions underneath the “Countries with Sub-Regions in the system” heading, skip ahead to the Adding Sub-regions heading. Otherwise, you will need to delete the sub-regional breakdown in the model. Highlight and select delete in the bottom right corner of the menu: | If there are no sub-regions underneath the “Countries with Sub-Regions in the system” heading, skip ahead to the Adding Sub-regions heading. Otherwise, you will need to delete the sub-regional breakdown in the model. Highlight and select delete in the bottom right corner of the menu: | ||
Line 278: | Line 278: | ||
This removes the provincial model and returns IFs to the 186 country-level model. Just a warning: this is an intensive process. Your computer might say that IFs is not responding, but it is probably working just fine. The same goes for most of this process – do not take a not responding message as a cause for concern. | This removes the provincial model and returns IFs to the 186 country-level model. Just a warning: this is an intensive process. Your computer might say that IFs is not responding, but it is probably working just fine. The same goes for most of this process – do not take a not responding message as a cause for concern. | ||
<div>When this process is complete, you will see a message on the bottom left. Now select exit and restart the model entirely.</div> | <div>When this process is complete, you will see a message on the bottom left. Now select exit and restart the model entirely.</div> | ||
=== Adding SubRegions === | === Adding SubRegions === | ||
Revision as of 21:25, 1 May 2017
Foundational Elements
A Note on Terminology
Sub-regional units go by many names internationally. When we refer to a sub-region in this document, we may use a variety of names to refer to the same entity. In IFs, there is a bias toward the use of “province” to refer to a sub-regional unit, but it is functionally equivalent to states, departments, territories, or any other name for a sub-regional unit.
Much of the following discussion will involve variables included in the model, such as GDP or POPULATION. These may be referred to as variable or series. The two terms are used interchangeably here.
Database Modification
IFsHistSeriesXXX.mdb
Before a country in IFs can be divided into states/provinces/departments, a new data file called IFsHistSeriesXXX.mdb — for example IFsHistSeriesChina.mdb — must be created. That file is a variation of the file called IFsHistSeries.mdb, which contains data tables for all countries across time, for all variables.
For each data table in IFsHistSeriesXXX.mdb, rows correspond to all states/provinces/departments in the specified country. The file includes a table called DataDict (analogous to the file DataDict.mdb described below) that provides information on names, sources, and specialized procedures followed to prepare the state/province/department data.
To subregionalize countries that do not already have a IFsHistSeriesXXX.mdb file, users will need to create one. While this can be done manually, it is faster and simpler to just add the file with a feature in IFs. Before creating the table, however, make sure that a file called _ifsHistSeriesxxx.mdb exists in C:\Users\Public\IFs\DATA. After that is done, follow these steps:
- Click the path: Extended Features > Change Country Subregionalization > Add New Historical Series File
- Select the country that is intended to be subregionalized and click OK
This does a couple things. First, it creates a new IFsHistSeries.mdb for the country you selected in C:\Users\Public\IFs\DATA. The file will include tables for the 4 essential series and a datadict. It also adds the country to the Provinces tables in Provinces.mdb.
The entire process of sub-regionalization affects five files: IFs.mdb, IFsHistSeries.mdb, IFsCoVatra.mdb, IFsCoVatraSeries.mdb, andIFsWVSCohort.mdb. For each of the files, the process looks in the IFs\Data folder for an equivalent file with the name of the country at the end — again, IFsHistSeriesChina.mdb is an example — then creates a copy in the IFs\Runfiles directory. Modification is done in the copies on the \Runfiles directory as well to the original files in the \Data directory. Data for individual sub-regional units and the country as a whole are preserved, and can be copied back and forth between \Data and \Runfiles directories. This allows the user to switch between model runs with different sub-regionalizations, and reset everything to the starting point without any states/provinces/departments sub-regionalized.
DataDict.mdb
Path: C:/ > Users > Public > IFs > DATA > DataDict.mdb
DataDict.mdb documents the variable names, sources, units of measurement, years of data coverage, and aggregation rules for all historical data series in the model.
The IFsHistSeriesXXX.mdb file has a copy of the DataDict table, and all series in IFsHistSeriesXXX.mdb must have a corresponding DataDict row in both places; otherwise, the model will not recognize them. If the fields Years andAggregation are not correctly filled in the series may not read correctly.
Provinces.mdb
Path: C:/ > Users > Public > IFs > DATA > Provinces.mdb
In addition to the creation of IFsHistSeriesXXX.mdb, initiating the sub-regionalization process requires modifications to tables that exist in the file Provinces.mdb. The alterations to specific tables in Provinces.mdb are outlined below:
All Provinces. The name and FIPS code of the target country should be added to a line in this table (those names should be identical to those used for the country in IFs.mdb and IFsHistSeries.mdb files; and province break-out should only be done when the Full version of IFs is being used with each country represented as a separate region). The AllProvinces field should normally be checked. The only circumstances under which it is not checked is when the user is selecting one, or a few, provinces for use in the model in addition to the entire country — see the discussion of the Regionalization table below.
Provinces. In this table all of the provinces or states should be identified. The first column is the target country (the same for all provinces). The second is a unique FIPS Code created by the user (since country-specific codes are three letters and cannot be duplicated, two-letter codes normally work well for these). Finally, the name of the sub-regional unit should be specified.
The addition of province or state names to this table can be done manually, but it is also automated with the "Add New Historical Series File” option, which creates IFsHistSeriesXXX.mdb with only the basic tables needed. Users can access this automated option from the main menu of IFs by following the path:
Extended Features > Change Country Sub-Regionalization > Add New Historical Series File
The province/state names need to coincide exactly with the ones that are used in the tables of the IFsHistSeriesXXX.mdb file for variables holding data. The province/state names also need to coincide with the province/state names that are used on the global map file to display the breakdown of countries into provinces/states. The exact names used in the map file can be found in IFs_Province.mdb using a query of the form: N.B: select Province from provinces where Country='Republic of India.'
Regionalization. This table repeats the target Country and Full Province Name as in the Provinces table. The new feature is the addition of a Region name. Each province could be a separate Region, in which case the names should be less than 10-12 letters so as to allow easy display in IFs tables and graphs. Most often the provinces will be grouped into sub-regions of the target Country, in which case it is recommended that the Region name be the target country name or some abbreviated part of it followed by the sub-region, again a total of no more than 10-12 letters. Leading with the Country name facilitates the use of alphabetic listings of IFs.
If the user is only adding a single province (or a very small number) to the model and not splitting a country into regions based on all states/provinces, the lines added to the Regionalization table will be only for the added states/provinces (e.g. Hawaii could be added in addition to the US by adding a line to this table for it only, not other US states, and then not checking AllProvinces in the AllProvinces table). Even if all provinces/states are in the data set, as with India, this process allows a single state of India to be pulled out and added to IFs. Warning: if a user only adds a state/province/department rather than dividing a country into sub-regions based on all of them, there will be double counting of the country and the added unit(s) in the model; only selected variables in IFs, such as WPOP and variables of the agricultural model have been set up for correction of such double counting in the run of the model.
The process of creation of regions can be done within IFs, under:
Extended Features > Change Country Sub-Regionalization > Change Sub-Regionalization
IFsxxx.mdb
This file holds many tables, but only a few are of interest for sub-regionalization:
PopAgeCohortCountryFemale andPopAgeCohortCountryMale. These files store data on the age-sex cohort breakdown of a population. If possible, use 2012 data because we populate the country-level tables with 2012 UNDP data. The first column, Cohort0, is reserved for infants (age <1). Cohort1 is for ages 1-4. Cohort2 and onward are 5-year age groups (e.g. Cohort2 is ages 5-9, etc.).Data accuracy is important because if there are large jumps between cohorts, errors can occur when running the model through 2100. If these data are not available, these tables read from the country-level data.
HealthDetailedDeathsCtry. These files store data on mortality rates by sex of a population. Raw mortality data must be ascribed to the 15 mortality sub-types in IFs to match this file structure. These categories are as follows:
- Other communicable diseases
- Malignant neoplasms
- Cardiac
- Digestive
- Respiratory
- Other noncommunicable diseases
- Road injuries
- Other unintentional injuries
- Intentional injuries
- Diabetes
- HIV
- Diarrhea
- Malaria
- Respiratory infections
- Mental health
If these data are not available, this file will read from the country-level data that already exist in IFs.
PopMortalityCohortCountryFemale andPopMortalityCohortCountryMale. This table stores the survivor tables of a population by sex. These data series are calculated based on the mortality data from the table HealthDetailedDeathsCtry. This is calculated by the likelihood that a person at any given age will survive to the next cohort, given the prevailing trends in mortality. If these data are not available, this file will read from the country-level data that already exist in IFs.
Users may need to modify table EconBaseSector. Unlike other base files in IFs.mdb, rows are not automatically added to this table when additional countries (in this case sub-regions) are added. If too many sub-regions are added, this can cause problems when rebuilding the base. EconBaseSector has six rows for each country, so for example, if the user wants to break India into 36 sub-regions, a total of 1326 rows will be needed (221 countries*6). As of July 2014, the default number of rows for this table is 1304. Additional rows must be added to the table before adding sub-regions to ensure they are copied to \RUNFILES.
IFsXXX.DAT
C: > User > Public > IFs > Data > IFsXXX.DAT
When it is determined that tuning is necessary, the user needs to build the suitable scenario and save the scenario file (.sce extension). Then, open up the .sce file (C:\Users\Public\IFs\Scenario) and copy and paste the scenario code into a IFsXXX.DAT file. If a .DAT file does not exist for your country, simply copy and paste one available for another country (South Africa, United States), change the country name, and delete the contents. Three very important notes:
- .DAT files most likely will not be able to be accessed without changing the extension. Simply add “.txt” at the end of the file name and click OK when Windows warns you about modifying the file extension. Once this is done, you will be able to open it with wordpad or notepad.
- When copying the code from the .sce to the .DAT file, be sure NOT to include, “CUSTOM,” “COMMENT,” and “START.” See the following example. The .DAT file should only include the highlighted script:
- Once the IFsXXX.DAT file is done, users will have to delete the broken-out model and re-subregionalize. The process will incorporate the newly created .DAT file into the base case.
Provincial Data Processing
The model executes 4 processes when provinces are added (see section 3. ‘Add or Delete Sub-Regions’ for procedural steps):
- Checks for essential series in IFsHistSeriesXXX.mdb
- Normalizes and fills holes in essential series
- Normalizes base-year values in preprocessor variables present in IFsHistSeriesXXX.mdb when provinces are added
- Estimating values for preprocessor series NOT present in IFsHistSeriesXXX.mdb when provinces are added
The following sections will cover these processes in detail.
Checking the Essential Series
As described in section 1, data for sub-regions in IFs are stored in IFsHistSeriesXXX.mdb. In order for sub-regions to be added, data for the essential series is required. The 4 variables are described below.
Variable | Definition |
GDP2011 |
Gross Domestic Product in 2011 Constant Dollars |
GDP2011PCPPP | Gross Domestic Product per Capita in 2011 Constant Purchasing Power Parity International Dollars |
Population | Population in Millions |
LandArea | Land Area |
When sub-regions are added, the model checks that each sub-region has at least one year of data for each of these required variables. If this is not true, then (1) the specific state/province/department cannot be used, (2) it is excluded from the process, and (3) the system will give a message indicating the problem. With these four series, the shell model is established. The user can add incrementally build upon this by adding more data. Also note, sub-regional data must be for years that are available in the country-level data. For example, if only 2011 population data is available for sub-regions, national-level population data must include 2011 as well.
Normalization and Hole-Filling in Essential Series
Due to the special importance of these series, the model normalizes and fills in data across all years for each sub-region. Normalization means it modifies sub-regional values with respect to country-level data. To do this, the model checks the base year (2010) for all sub-regions. If base-year data is not available, it uses the closest available year. The aggregation rules for GDP2011, Population, and Land Area are all SUM, so the model simply multiplies the country base-year value by each sub regional contribution:
Since GDP2011PCPPP is defined as a rate, is filled out a little differently. Here, the sub regional values are multiplied by their respective weights for Population, then summed for a weighted average. Then a multiplier is computed by dividing the country-level GDP2011PCPPP value by this weighted average. The final normalized data are the product of the multiplier and the sub regional values.
Where POP weight is:
These normalization procedures are important for a couple reasons. First and foremost, they allow the model to produce reasonable forecasts. They also allow the user to use any type of unit for the essential series. This may be quite helpful because sub regional data is often harder to find than national figures, especially PPP data.
Normalization of Preprocessor Variables Present in IFsHistSeriesXXX.mdb When Provinces are Added
For preprocessor (non-essential) variables included in IFsHistSeriesXXX.mdb before adding provinces, base year values are normalized. Variables used in the preprocessor are marked in the field ‘UsedInPreProcessor’ (formerly ‘UseforProvince’) in the datadict.mdb. Non-preprocessor variables will not be normalized. In general, normalization is important because the preprocessor draws upon base year (2010) data in order to initialize values of the model run. Simply put, provincial values must be consistent with the values for the total country. The process is similar to what we saw for the essential series. The main difference is that the model only normalizes the base year value. Values for other years will not be changed and no hole-filling occurs. First, the model computes a multiplier for provincial data by dividing the country value in the base year by the weighted average, simple average, or sum (designated by aggregation field) of the provinces. If base year data are not available for the country or provinces, the model looks to the Most Recent column for provinces and country data. Note that this could mean the multiplier is computed with data from different years. Multipliers are calculated as:
Where weight is based on the aggregation rule of the variable (generally GDP or POP):
The computed multiplier is stored in the multiplier field for the series row in the datadict table of IFsHistSeriesXXX.mdb in \RUNFILES.
If provincial data differs significantly from country data, likely due to different units, users can check the ApplyMultAll field of the datadict table in IFsHistSeriesXXX. This signals the model to apply the multiplier to all previous years of provincial data. This should be done before sub-regions are added.
Estimating Values for Preprocessor Series Not Present in IFsHistSeriesXXX.mdb When Provinces Are Added
For variables used in preprocessor not included in IFsHistSeriesXXX.mdb before adding sub regions, the model jumps to IFsHistSeries.mdb and estimates values across all years data is available at the country level. This can happen 2 ways:
- If the aggregation rule (as seen in datadict.mdb, not the datadict table in IFsHistSeriesXXX.mdb) is SUM then we use the disaggregation rule, usually either GDP or POP, to formulate weighted estimates for sub-regions. Thus, the sum of the estimates will always equal the county value in a given year.
- If the aggregation rule is not SUM, the value at the national level is simply applied to all sub-regions for a given year. These should always be rate variables like total fertility rate and life expectancy.
Experienced users of IFs, or other models, may be familiar with other commonly used estimation techniques, like cross-sectional regression. These techniques are applied when rebuilding the base, not during the process of adding sub-regions. In any case, suppose that fertility data were not available for a particular sub-region, but we had GDP per capita figures for the region. The model would fill in the expected values for a country, based on the sub-region’s GDP per capita. Cross-sectional relationships such as TFR-GDP derive from IFs historical data for all countries. A list of all computed functions used within the model can be found under extended features. All variables can be estimated through a combination of such cross-sectional relationships and national level data.
Data Gathering and Prioritization
With the four essential series imported and the shell model in operation, the user can begin exploring available data sources to fill other preprocessor variables. While estimate-driven forecasts (from the shell model) are a starting point, they offer limited value to policy-makers. Each time a new series is added to IFsHistSeriesXXX.mdb, the quality of our sub-regional forecasts improve, becoming anchored less on estimates and more on empirics.
Some series are more important than others to the IFs preprocessor algorithms. Understanding the IFs initialization procedure can help inform the prioritization of data gathering efforts. For each module there are several series that contribute especially heavily to the robustness of the model.
Procedural Elements
The following sections present a guided walkthrough of some of the procedural elements of sub-regionalization in IFs. As such, “the user” will be referred to as “you” for the remainder of the document.
Add or Delete Subregions
To add sub-regions, choose Extended Features > Change Country Sub-Regionalization > Add or Delete Subregions. The popup window will show several sub-regions that already exist in IFs. Click on a country to see its available sub-regions.
Choose a region name and click Add. Click OK on popup windows to proceed.
That would initiate processes which include 1) checking the availability of four required data series for the states and provinces being created in IFsHistSeriesXXX.mdb, 2) filling holes in the four required data series for the base year and other years, 3) processing additional available data series from the province files, and 4) filling holes for series that are needed for further model processing.
Now select Add. You will be prompted with the following:
Choose Yes. This process begins rebuilding the model with the sub-regional breakout. Depending on the processing power of your computer, the first portion of this Add Provinces/Subregions process could take around 10 minutes. When the processes are complete, you will see the form for Rebuilding the Base Data Load, where IFs cleans and reconciles data series and also aggregates states/provinces/departments into sub-regions or individual ones.
The next process is initiated from the Rebuild Base form. Rebuilding the Base Case will place all initial conditions and parameters into the Base Case, but all computed values will be set at 0 until the model is re-run. All boxes on the Rebuild Base form should be checked, then select Rebuild Base.
Finally, run the model to 2100. Select Start Run.
Once the model finishes running to 2100, you will get two messages. First:
Select OK to this message and the next message. Then click Run Successful – Click to Continue.
This process completes a new Base Case run with the selected sub-regionalization.
It is also possible to switch between different model runs (i.e. between the national level and a sub-regional model). For instance, you have just completed a sub-regionalization that divides China into single provinces. Now, you would like to use data for the whole country as a Base Case again. To remove all sub-regions, open the window for Add or Delete Subregions. Click on Switch Provinces and choose No State/Province from the right side box of Available Runs. Click Switch to restore the model to pre-sub-regionalization status.
At this point you will be asked if you want to save your current configuration. It is important to select Yes in order switch back to the preferred sub-regionalization the next time you want to revert back to the sub-regional version.
To process newly imported data for the preferred sub-regionalization, the existing sub-region setup (used for switching) needs to be deleted first. The option to delete a sub-region setup is in the same window as to add sub-regions. Select a country from the box in the bottom (countries with sub-regions in the system) and click Delete to delete the sub-region setup. Deleting a sub-region setup will remove all files created in the process of adding the sub-region setup, and the corresponding model run will no longer be available from the Switch Provinces option.
Importing Data and Updating Data in IFs
Updating provincial data is a multi-step process. This walk-through will begin with the assumption that you have found the data and you are ready to introduce it into the model. Do not open your IFs model until you’ve made your data changes. With the model open, your changes will not properly register.
Importing Function for Provincial Data in IFs
IFs version 7.05 introduced a function that allows the user to import provincial data into the model. The process is nearly identical to importing data in the standard 186 country format. You do NOT need to have the model broken out into sub-regions to use this feature. Below are the steps for using this feature.
- Open IFs and go to: Extended Features > manage country data > Historical data processing > Import data from XLS file provincesThis brings you to the following window:
This brings you to the following window:
- Select ‘Open Excel File’ and double click the file that has the provincial data you want to import
- Note, while the menu path from above says “XLS file”, this feature can use both XLS and XLSX files. To use xlsx files, you need to use the drop down box located in the bottom right of the “Open Excel File” window:
- With your file selected, you will be prompted to select the correct Excel Sheet. To see your selected Excel file, click on View Excel. The file will be in read-only mode, so you cannot make changes to the file from here. Make sure to close the Excel file before importing the data; having it open can interfere with the importing process
- On the right side (Choose Excel Source Data Format), user must choose how the data is formatted. The bottom option is initially selected by default since most data sets follow this format – State/Province names in rows in rows (vertical) and years in columns. You will notice that the graph below the formatting options changes to visually represent the specific data orientation that you choose
- Next, the user needs to provide the location of state/province names and years in the Excel file. The contents of the selected cell will appear in the blank box to the right of the dropdown menus. Choose the column and row of the state/province name that appears first in your data (e.g. Andaman and Nicobar Islands) and the column/row of the earliest listed year (e.g. 1960)
- Use the dropdown menus to select the appropriate country (e.g. India) and concordance table (concordance tables covered in next section)
- Under “Variable (or table) Name”, select the correct variable name for the data (e.g. GDP2011PCPPP). For a new variable (does not exist in the list of tables in the IFsHistSeriesXXX.mdb file), select ‘new’. If the variable is new and unit conversion is required (e.g. thousands to millions), user will need to specify an operator and conversion factor. These will automatically be filled in if an existing variable is selected.
- After completing previous steps and closing the Excel file if it is open, click Import. This will take you to the following window:
- Click ‘view excluded’ to see any states/provinces in the source excel file that were not imported. For example, if one of the listed states/provinces is India, this will be included in the list. It is a common error that state/province names are spelled incorrectly on the source file; this button will alert the user that they need to correct the Excel file and reimport the data.
- Clicking Accept will bring you to the data dictionary (datadict) window:
- Again, if an existing variable (not ‘new’) was selected, a lot of this will be filled in. Users will be prompted to fill in required fields after pressing OK if they are blank.
- Imported data can be found here:
- C: > User > Public > IFs > Data > IFsDataImportXXX.mdb
Concordance Tables
Concordance tables are lists that translate the country (or state/province) names found in data source materials to IFs names. For example, if data was being imported from an Excel document from the Indian Planning Commission, the model will take in data under the sub-region A & N islands, and categorize it under Andaman and Nicobar Islands, the sub-region name in IFs. This saves the user from having to manually translating sub-region names on the source document every time they import new data.
Sub-regional translation tables are located in IFs\Data\Provinces.mdb. To add a list for a new source:
- Open the translation table for the appropriate country
- Click ‘View’, located in upper left corner of window, to modify the table design
- In the first blank cell under Field Name, type the name of your new source and select ‘Short Text’ as the data type:
- Click ‘View’ again to return to translation lists. Notice the new source now has a column.
- Add the exact sub-region names used in the new source to cells that correlate to the IFs countries. If the new source does not have data for a particular sub-region, leave the cell blank.
- When all sub-regions are entered, save and close Access. Restart IFs and your new concordance list should now be an option in the province data import window.
Manual Import
Users can also manually add provincial data, instead of using the feature within the model. Users can manually import data with any version of IFs, 7.05 is not needed. The data structure of International Futures uses Microsoft Access. For the South African provincial data file, go to:
C: > User > Public > IFs > Data > IFsHistSeries South Africa.mdb
This file is the home to all of the historical provincial data that has been gathered for the provinces (in this case, of South Africa). For the purposes of this tutorial, we will use the example of population.
From the South African 2011 Census, we can now add data for the population total in 2011.
Depending on the data source that you may have, you could want to either change existing data or add a new year of data. Complete the data import process in Microsoft Access before opening IFs.
International Futures
From the main menu, go to Extended Features >Change Country Sub-Regionalization> A'dd or Delete SubRegions.'
If there are no sub-regions underneath the “Countries with Sub-Regions in the system” heading, skip ahead to the Adding Sub-regions heading. Otherwise, you will need to delete the sub-regional breakdown in the model. Highlight and select delete in the bottom right corner of the menu:
You will be asked:
Select No only if you are sure that you will get more data and that you will need to reload the sub-regional setup (otherwise it's better to say yes and save the current setup to be able to use the switch). When you see the second prompt, select OK.
This removes the provincial model and returns IFs to the 186 country-level model. Just a warning: this is an intensive process. Your computer might say that IFs is not responding, but it is probably working just fine. The same goes for most of this process – do not take a not responding message as a cause for concern.
Adding SubRegions
With IFs restarted, go to 'Extended Features > Change Country Sub-Regionalization > 'Add or Delete SubRegions. With this menu open, select South Africa, and have SOUTHAFRICA2 selected in the middle window (or whichever sub-national model you’re concerned with).
Now select Add. You will be prompted with the following:
Choose Yes, only if you had made changes to your Base Case since the last time you saved it in this addition process. This process begins rebuilding the model with the sub-regional breakout.
Depending on the processing power of your computer, the first portion of this Add Sub-regions process could take around 4 hours.
When this is complete, you will be prompted with a string of messages to select OK to. The next process is initiated from the Rebuild Base form. Rebuilding the Base Case will place all initial conditions and parameters into the Base Case, but all computed values will be set at 0 until the model is re-run. All boxes on the Rebuild Base form should be checked, then select Rebuild Base.
Finally, run the model to 2100. Select Start Run.
Once the model finishes running to 2100, you will get two messages. First:
Select OK to this message and the next message. Then click Run Successful – Click to Continue.
Appendix
In this example, we will look at how normalized estimates are calculated for Female and Male Life Expectancy (LifExpectMale and LifExpectFemale). Below are abridged datadict entries for the two variables. Since there are values for the Multiplier field, we know this was taken from RUNFILES after sub regions were added. Also, the aggregation rule for the male series was changed from null to GDP for the sake of demonstration. The next steps will detail the steps for calculation the multipliers and the estimated base-year values.
Variable | Units | Multiplier | ApplyMultAll | Aggregation | UsedinPreprocessor | UseProvData |
LifeExpectFemale | Years | 0.98 | No | Yes | No | |
LifeExpectMale | Years | 0.97 | No | GDP | Yes | No |
We will start with the female series. Below is a sample table for what was included in IFsHistSeriesIndia.mdb, along with the 4 essential series (earlier years are left out to save space).
SeriesLifExpectFemale | ||||||
Country | FIPS_CODE | 2006 | 2010 | 2011 | Earliest | MostRecent |
Andaman and Nicobar Islands | AN | |||||
Andhra Pradesh | AP | 65.5 | 70.9 | 59.8 | 70.9 | |
Arunachal Pradesh | AR | |||||
Assam | AS | 59.3 | 64.8 | 51.9 | 64.8 | |
Bihar | BR | 60.4 | 68.7 | 51.5 | 68.7 | |
Chandigarh | CH | |||||
Chhatisgarh | CT | |||||
Dadra and Nagar Haveli | DN | |||||
Daman and Diu | DD | |||||
Delhi | DL | |||||
Goa | GA | |||||
Gujarat | GJ | 65.2 | 72.5 | 59.3 | 72.5 | |
Haryana | HR | 67.3 | 59 | |||
Himachal Pradesh | HP | 66.3 | 71.3 | 71.3 | ||
Jammu & Kashmir | JK | |||||
Jharkhand | JH | |||||
Karnataka | KA | 67.1 | 72.3 | 62 | 72.3 | |
Kerala | KL | 76.3 | 77.6 | 71.5 | 77.6 | |
Lakshadweep | LD | |||||
Madhya Pradesh | MP | 56.9 | 51.9 | |||
Maharashtra | MH | 57.9 | 65.3 | 62.1 | 65.3 | |
Manipur | MN | |||||
Meghalaya | ML | 68.4 | 72.5 | 72.5 | ||
Mizoram | MZ | |||||
Nagaland | NL | |||||
Orissa | OR | 59.6 | 67.3 | 53 | 67.3 | |
Pondicherry | PY | 70.4 | 72.8 | 72.8 | ||
Punjab | PB | 69.6 | 63.6 | |||
Rajasthan | RJ | 62.3 | 70.7 | 53.8 | 70.7 | |
Sikkim | SK | |||||
Tamil Nadu | TN | 67.4 | 71.8 | 57.4 | 71.8 | |
Tripura | TR | |||||
Uttar Pradesh | UP | 59.5 | 66.9 | 48.5 | 66.9 | |
Uttaranchal | UL | |||||
West Bengal | WB | 65 | 58 |
Below is the same table but from RUNFILES after the process of adding sub regions is complete (C:\Users\Public\IFs\RUNFILES). Note two things: 2010 estimates have been added and MostRecent values have been added where appropriate, as highlighted in orange.
SeriesLifExpectFemale | ||||||
Country | FIPS_CODE | 2006 | 2010 | 2011 | Earliest | MostRecent |
Andaman and Nicobar Islands | AN | |||||
Andhra Pradesh | AP | 65.5 | 69.65 | 70.9 | 59.8 | 70.9 |
Arunachal Pradesh | AR | |||||
Assam | AS | 59.3 | 63.65 | 64.8 | 51.9 | 64.8 |
Bihar | BR | 60.4 | 67.49 | 68.7 | 51.5 | 68.7 |
Chandigarh | CH | |||||
Chhatisgarh | CT | |||||
Dadra and Nagar Haveli | DN | |||||
Daman and Diu | DD | |||||
Delhi | DL | |||||
Goa | GA | |||||
Gujarat | GJ | 65.2 | 71.22 | 72.5 | 59.3 | 72.5 |
Haryana | HR | 67.3 | 66.11 | 59 | 67.3 | |
Himachal Pradesh | HP | 66.3 | 70.04 | 71.3 | 71.3 | |
Jammu & Kashmir | JK | |||||
Jharkhand | JH | |||||
Karnataka | KA | 67.1 | 71.02 | 72.3 | 62 | 72.3 |
Kerala | KL | 76.3 | 76.23 | 77.6 | 71.5 | 77.6 |
Lakshadweep | LD | |||||
Madhya Pradesh | MP | 56.9 | 55.89 | 51.9 | 56.9 | |
Maharashtra | MH | 57.9 | 64.15 | 65.3 | 62.1 | 65.3 |
Manipur | MN | |||||
Meghalaya | ML | 68.4 | 71.22 | 72.5 | 72.5 | |
Mizoram | MZ | |||||
Nagaland | NL | |||||
Orissa | OR | 59.6 | 66.11 | 67.3 | 53 | 67.3 |
Pondicherry | PY | 70.4 | 71.51 | 72.8 | 72.8 | |
Punjab | PB | 69.6 | 68.37 | 63.6 | 69.6 | |
Rajasthan | RJ | 62.3 | 69.45 | 70.7 | 53.8 | 70.7 |
Sikkim | SK | |||||
Tamil Nadu | TN | 67.4 | 70.53 | 71.8 | 57.4 | 71.8 |
Tripura | TR | |||||
Uttar Pradesh | UP | 59.5 | 65.72 | 66.9 | 48.5 | 66.9 |
Uttaranchal | UL | |||||
West Bengal | WB | 65 | 63.85 | 58 | 65 |
Since no 2010 values were available when the process to add sub regions was initiated, the model looked to the MostRecent column. Next, the model needs to fill in null values, which it does by simply using the national-level base-year value from IFsHistSeries.mdb. In this case, the value for India is 66.675 years, as highlight in orange. Once each state has a usable value, the model reads the aggregation rule from the datadict table in IFsHistSeriesIndia.mdb, as seen above. Since the rule is null, it defaults to a simple average (arithmetic).