Template

From Pardee Wiki
Revision as of 16:10, 15 August 2025 by Norah.Shamin (talk | contribs)

Summary

This section should cover the following (preferably in this order):

- What is this data? (Specifically, what site is it from? What information does the dataset tell us? And why is it significant?)

- How often is this series updated? (Most data files will have "annual", "biannual", "quarterly", etc. listed in the description; if the frequency isn't listed, use your own discretion and leave a comment noting that it wasn't specified)

- How do we use this series? (Provide a quick rundown of when/where we would use this)

The XYZ dataset is from xyz.com. It is a geography-based dataset that focuses on X, Y, and Z from some industry. Because it is pulled from XYZ and ZYX, the puller must pay attention to 1.), 2.), and 3.). These points are essential to ensuring that the calculation is accurate. The XYZ dataset is updated on a biannual basis: the source collects data during some time period and posts it during some time period.

In particular, the data team uses this dataset to calculate a couple of indicators:

- for x, y, and z

Summary Example

The Global Footprint Network dataset is from the National Footprint and Biocapacity Database. It is a geography-based dataset that focuses on the carbon, mineral, and ecological output for each country between the years 1961 and 2019. Because it is pulled from a free data source, the puller must pay attention to the year of release, who the source uploader is, and whether or not the license is from the same year. These points are essential to ensuring that the calculation of each country's output is accurate. The Global Footprint Network dataset is updated on a biannual basis: the source collects data from the previous year and compares it to a running average of the years that already exist in the database.

In particular, the data team uses this dataset to calculate a couple of indicators:

  • Agriculture
  • Exports
  • Industrial Output

Tables In IFs

Variable | Definition | Name In Source | UsedInPreprocessor | UsedInPreprocessorFileName

- You can add more columns if you like.

Data Pulling Instructions


Step 1.) Navigate to xyz.com and look for the data icon located in the dashboard.

xyz caption

Step 2.) In the dashboard select a subseries of XYZ and 123 years. This will ensure that the series is downloaded with interpolated data from 123 to 123.

xyz caption

Step 3.) This is an example of paragraph text where a longer explanation is given in addition to the writing the steps necessary to allow the reader to navigate to the correct panel. This example will mimic paragraph formatting by adding additional lines of plain text. add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text add text

xyz caption

Data Notes

Add any extra information the data puller needs. This can include aggregation/disaggregation, blending information, scripts, and more.

Example from UNAIDS:

  1. Blending will be imperative, as data for some countries was not included in the latest update.
  2. "UNAIDS changes their data because they incorporate new surveillance and program data into their models each year"
  3. If the data is downloaded as an XLS file, the sheet will be formatted with LowEst, MidEst, and HighEst in the same cells. Visit this GitHub repository for a Python script that separates the data into their own sheets: https://github.com/fr3nch1/Automations/blob/main/%23%20Data%20Separation%20Script.py
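
To illustrate the blending mentioned in note 1, here is a minimal sketch in Python. It is not the data team's actual procedure: the function name `blend_releases`, the (country, year) key layout, and the sample values are all hypothetical and only show the general idea of letting the latest release take priority while falling back to an older release where the latest has no value.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key layout: (country name, year)
Key = Tuple[str, int]


def blend_releases(
    latest: Dict[Key, Optional[float]],
    previous: Dict[Key, Optional[float]],
) -> Dict[Key, float]:
    """Combine two releases of a series, preferring the latest release.

    A value from the previous release is kept only where the latest
    release has no entry (or an empty entry) for that country and year.
    """
    # Start from the older release, skipping empty cells.
    blended = {key: value for key, value in previous.items() if value is not None}
    # Overwrite with every non-empty value from the latest release.
    for key, value in latest.items():
        if value is not None:
            blended[key] = value
    return blended


# Made-up sample values, for illustration only.
latest = {("Kenya", 2022): 1.4, ("Chad", 2022): None}
previous = {("Kenya", 2022): 1.5, ("Chad", 2022): 1.1, ("Mali", 2021): 0.8}

blended = blend_releases(latest, previous)
```

In this sample, Kenya takes the value from the latest release, while Chad and Mali fall back to the previous release because the latest release has no value for them. When blending this way, note in the Data Notes which countries were filled from an older release.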