Population Data
AltheaDitter (talk | contribs) No edit summary |
[[File:ChinaPopulation v728.jpg]]
Revision as of 15:01, 17 April 2017
Population
The provincial population series for China runs from 1995 through 2015. The original data lacked observations for Chongqing in 1995 and 1996 because Chongqing did not gain its status as a municipality until 1997. Before 1997 Chongqing was part of Sichuan province, so Sichuan's figures for 1995 and 1996 included Chongqing, and Sichuan's reported population dropped substantially in 1997. To remedy this, Chongqing's share of the combined Sichuan and Chongqing population in 1997 was used to estimate Chongqing's population in 1995 and 1996, and that estimate was subtracted from Sichuan's population in the same years. The adjusted data for Chongqing and Sichuan province in 1995 and 1996 provide a smoother and more accurate historical series.
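The ratio-based split described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script that was actually used: the function name `split_combined_series` is hypothetical, and the numbers in the example call are placeholders, not yearbook values.

```python
def split_combined_series(combined, chongqing_ref, sichuan_ref):
    """Split a combined Sichuan+Chongqing series using a reference-year share.

    combined      -- dict of year -> combined population (any consistent unit)
    chongqing_ref -- Chongqing population in the reference year (1997 in the text)
    sichuan_ref   -- Sichuan population in the same reference year
    Returns a dict of year -> (chongqing_estimate, sichuan_adjusted).
    """
    # Chongqing's share of the combined total in the reference year
    share = chongqing_ref / (chongqing_ref + sichuan_ref)
    result = {}
    for year, total in combined.items():
        chongqing_est = share * total
        # Subtract the estimate so the two parts still sum to the published total
        result[year] = (chongqing_est, total - chongqing_est)
    return result


# Placeholder values for illustration only (NOT actual yearbook data)
example = split_combined_series(
    {1995: 1000.0, 1996: 1010.0}, chongqing_ref=250.0, sichuan_ref=750.0
)
```

Because the estimate is subtracted from the published combined figure, the two adjusted series always sum back to the original Sichuan value for each year.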
Despite this estimation, there are still some unexplained spikes and drops in the population data around 2000. Guangdong has the most noticeable shift: its population is 73.9 million in 1999 and jumps to 86.9 million in 2000. These shifts are present in the source data and are not human errors (at least not on our end). The data source was contacted regarding this and other potential data issues found while pulling data, but they have yet to respond.
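One simple way to surface shifts like this is to flag large year-over-year changes. The sketch below is an assumption about how such a check could be written, not a description of how the issues were originally found; the function name `flag_jumps` and the 10% threshold are hypothetical choices. Only the two Guangdong values come from the text above.

```python
def flag_jumps(series, threshold=0.10):
    """Return (year, relative_change) for each year whose change from the
    previous available year exceeds the threshold in absolute value."""
    years = sorted(series)
    flagged = []
    for prev, curr in zip(years, years[1:]):
        change = (series[curr] - series[prev]) / series[prev]
        if abs(change) > threshold:
            flagged.append((curr, change))
    return flagged


# Guangdong population in millions, as quoted in the text
guangdong = {1999: 73.9, 2000: 86.9}
jumps = flag_jumps(guangdong)  # flags 2000, an increase of roughly 17.6%
```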
The provincial population series for China's subregional model came from the China Statistical Yearbooks for 1996-2016. These yearbooks are available online on the National Bureau of Statistics of China website. Other sources of provincial population data for China are described at Alternative Population Data Sources. The yearbooks publish population in units of tens of thousands, rather than the millions used in the full 186 version of IFs. Rather than converting the data into millions, the series was imported as published, in tens of thousands, and ApplyMultAll was selected in the data dictionary. This normalizes the historical data to the population of China in the full 186 version of IFs. Normalizing rather than changing units is meant to decrease the likelihood of human error and to simplify future data updates.
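The idea behind normalizing instead of converting units can be illustrated as follows. This is a conceptual sketch under stated assumptions, not the IFs implementation of ApplyMultAll: it assumes normalization means scaling every provincial value by a single multiplier so that the provinces sum to the national total. The function name `normalize_to_total`, the province labels, and all numbers are hypothetical.

```python
def normalize_to_total(provinces, national_total):
    """Scale provincial values by one common multiplier so they sum to
    national_total; the input units drop out because only ratios matter."""
    factor = national_total / sum(provinces.values())
    return {name: value * factor for name, value in provinces.items()}


# Placeholder inputs in tens of thousands (NOT actual yearbook data),
# normalized to a placeholder national total expressed in millions
normalized = normalize_to_total({"A": 6000.0, "B": 4000.0}, national_total=100.0)
```

Under this assumption the published figures never need to be hand-converted: each province keeps its share of the total, and the result comes out in whatever unit the national total uses.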