# Synthetic & Simulated Population (SysPop)

_The above animations show. Top: where people travel to from the SA2 area 241800 (Silverstream, Upper Hutt) between 7am and 9am. Bottom-left: the reasons why people visit the SA2 area 241800 (Silverstream, Upper Hutt). Bottom-right: where people go to work in one day (3000 sampled synthetic citizens across the Auckland region)._

To protect individuals' privacy, most census and survey data are distributed in aggregated form. However, microsimulation models, such as agent-based models, typically require data at the individual level. **Syspop** is a package designed to streamline and modularize the process of synthesizing individual-level populations from aggregated data.

> For example: in New Zealand, while individual and unit data are accessible through the Integrated Data Infrastructure (IDI), the limited computational resources and the constraints associated with third-party libraries pose significant challenges for running advanced models within the IDI environment. To use the data outside the IDI, it must first undergo confidentiality protocols and be aggregated. Once the data is released by the IDI, **Syspop** can work with this aggregated dataset to "regenerate" synthetic individual data, allowing it to be linked with other data sources.

The outputs generated by **Syspop** are independent of any downstream applications, such as agent-based models, making them versatile inputs for various modeling needs or standalone products.

**Syspop** supports both _Python_ and _R_.

**The documentation for the package can be found at [SysPop Wiki](https://github.com/jzanetti/Syspop/wiki).**
If you have any questions, please raise an issue [here](https://github.com/jzanetti/Syspop/issues).

### Contents:

* [Installation](https://github.com/jzanetti/Syspop#installation)
* [Usage](https://github.com/jzanetti/Syspop#usage)
* [Output](https://github.com/jzanetti/Syspop#output)

## Installation

It is recommended to install the package via a local conda environment:

```
make env
```

The above will create a working environment (_Python_) for **Syspop**. Alternatively, if _R_ is used, the working environment can be created with:

```
renv::restore()
```

## Usage

A synthetic population can be created using:

#### Using Python:

```
from syspop.python.input import new_zealand
from syspop.start import create as syspop_create

output_dir = "/tmp/syspop"

# Load the New Zealand input data
nz_data = new_zealand()

# Area codes to create the synthetic population for
syn_areas = [241300, 241800]

syspop_create(
    syn_areas,
    output_dir,
    population={"structure": nz_data["population_structure"]},
    geography={
        "hierarchy": nz_data["geography_hierarchy"],
        "location": nz_data["geography_location"],
        "address": nz_data["geography_address"],
    },
    household={"composition": nz_data["household_composition"]},
    work={
        "employee": nz_data["work_employee"],
        "employer": nz_data["work_employer"],
    },
    commute={
        "travel_to_work": nz_data["commute_travel_to_work"],
        "travel_to_school": nz_data["commute_travel_to_school"],
    },
    education={
        "school": nz_data["school"],
        "kindergarten": nz_data["kindergarten"],
    },
    shared_space={
        "hospital": nz_data["hospital"],
        "bakery": nz_data["shared_space_bakery"],
        "cafe": nz_data["shared_space_cafe"],
        "department_store": nz_data["shared_space_department_store"],
        "fast_food": nz_data["shared_space_fast_food"],
        "park": nz_data["shared_space_park"],
        "pub": nz_data["shared_space_pub"],
        "restaurant": nz_data["shared_space_restaurant"],
        "supermarket": nz_data["shared_space_supermarket"],
        "wholesale": nz_data["shared_space_wholesale"],
    },
)
```

#### Using R:

```
source("syspop/r/global_vars.R")
source("syspop/r/input.R")
source("syspop/start.R")

library(dplyr)
library(tidyr)
library(lubridate)
library(purrr)
library(arrow)
library(uuid)
library(tibble)
library(fs)
library(data.table)

nz_data <- new_zealand()
output_dir <- "/tmp/syspop"
syn_areas <- c(241300, 241800)

# Create the synthetic population
# (create() is assumed to be defined by the sourced syspop/start.R)
create(
  syn_areas = syn_areas,
  output_dir = output_dir,
  population = list(structure = nz_data$population_structure),
  geography = list(
    hierarchy = nz_data$geography_hierarchy,
    location = nz_data$geography_location,
    address = nz_data$geography_address
  ),
  household = list(composition = nz_data$household_composition),
  work = list(
    employee = nz_data$work_employee,
    employer = nz_data$work_employer
  ),
  commute = list(
    travel_to_work = nz_data$commute_travel_to_work,
    travel_to_school = nz_data$commute_travel_to_school
  ),
  education = list(
    school = nz_data$school,
    kindergarten = nz_data$kindergarten
  ),
  shared_space = list(
    hospital = nz_data$hospital,
    bakery = nz_data$shared_space_bakery,
    cafe = nz_data$shared_space_cafe,
    department_store = nz_data$shared_space_department_store,
    fast_food = nz_data$shared_space_fast_food,
    park = nz_data$shared_space_park,
    pub = nz_data$shared_space_pub,
    restaurant = nz_data$shared_space_restaurant,
    supermarket = nz_data$shared_space_supermarket,
    wholesale = nz_data$shared_space_wholesale
  )
)
```

Detailed descriptions of the input data for each argument can be found in [Input data](https://github.com/jzanetti/Syspop/wiki/Input-data).

It's important to note that all arguments in the `create` function (imported as `syspop_create` in the Python example above) are optional; whether an argument is required depends on the specific synthetic information that needs to be generated.
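Because the arguments are optional, a run can be limited to a subset of modules. The following is a minimal sketch of one way to assemble such a subset, not part of Syspop itself: `MODULE_KEYS` and `select_inputs` are hypothetical names introduced here for illustration, the key names are copied from the Python example above (the `shared_space` group is left out for brevity), and a plain dictionary stands in for the result of `new_zealand()` so that the snippet runs without the package installed. Which combinations are actually valid depends on the interdependencies described in the wiki.

```python
from typing import Any, Dict, Iterable

# Hypothetical lookup table: keyword argument -> {sub-key: input data key}.
# The names are copied from the usage example; this table is not part of Syspop.
MODULE_KEYS: Dict[str, Dict[str, str]] = {
    "population": {"structure": "population_structure"},
    "geography": {
        "hierarchy": "geography_hierarchy",
        "location": "geography_location",
        "address": "geography_address",
    },
    "household": {"composition": "household_composition"},
    "work": {"employee": "work_employee", "employer": "work_employer"},
    "commute": {
        "travel_to_work": "commute_travel_to_work",
        "travel_to_school": "commute_travel_to_school",
    },
    "education": {"school": "school", "kindergarten": "kindergarten"},
}


def select_inputs(
    data: Dict[str, Any], modules: Iterable[str]
) -> Dict[str, Dict[str, Any]]:
    """Build keyword arguments for the requested modules only (hypothetical helper)."""
    modules = list(modules)
    unknown = [m for m in modules if m not in MODULE_KEYS]
    if unknown:
        raise KeyError(f"Unknown module(s): {unknown}")
    return {
        module: {name: data[key] for name, key in MODULE_KEYS[module].items()}
        for module in modules
    }


# Stand-in for new_zealand(): each input is replaced by a placeholder string.
demo_data = {
    key: f"<{key}>" for keys in MODULE_KEYS.values() for key in keys.values()
}

kwargs = select_inputs(demo_data, ["population", "geography"])
print(sorted(kwargs))

# With the real package and data, the result would be passed on as:
# syspop_create(syn_areas, output_dir, **kwargs)
```

Keeping the mapping in one table makes it easy to see which inputs a reduced run depends on, and an unknown module name fails early with a `KeyError` instead of being silently ignored.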
To understand the interdependencies between different synthetic information, refer to the documentation available [here](https://github.com/jzanetti/Syspop/wiki/Synthetic-population).

There are many example scripts located in `etc/examples`:

- `run_nz.py`/`run_nz.R`: Creating the New Zealand synthetic population
- `run_diary.py`: Creating the agents' diaries
- `run_validation.py`: Creating a validation report for the synthetic population
- `run_vis.py`: Creating visualisations for the synthetic population

## Output

The output from **Syspop** is a comprehensive table that contains the information for each synthetic individual, which can be used for any downstream analytics and modeling work. The list of attributes for each synthetic individual can be found [here](https://github.com/jzanetti/Syspop/wiki/Some-basic-attributes-for-Syspop).

| area | age | gender | ethnicity | household | ... | school | primary_hospital | secondary_hospital | supermarket | restaurant |
| ------ | --- | ------ | --------- | ------------ | --- | -------------------- | ------------------- | -------------------- | ------------------------------- | ------------------------------------------------- |
| 236300 | 0 | female | European | 236300_4_438 | ... | NaN | 237800_hospital_2_0 | 235800_hospital_28_0 | supermarket_284,supermarket_283 | restaurant_1407,restaurant_2551,restaurant_287... |
| 236300 | 13 | male | Maori | 236300_8_189 | ... | school_23134_primary | 237800_hospital_2_0 | 235800_hospital_28_0 | supermarket_284,supermarket_283 | restaurant_1407,restaurant_2551,restaurant_287... |
| 236300 | 0 | female | Maori | 236300_4_638 | ... | NaN | 237800_hospital_2_0 | 235800_hospital_28_0 | supermarket_284,supermarket_283 | restaurant_1407,restaurant_2551,restaurant_287... |
| 236300 | 0 | male | Maori | 236300_3_220 | ... | NaN | 237800_hospital_2_0 | 235800_hospital_28_0 | supermarket_284,supermarket_283 | restaurant_1407,restaurant_2551,restaurant_287... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

## R Development

The R version of this package is under development. The following steps can be used to test the package with R:

- Start a new R env: run `renv::init()` to initialize renv for your project
- Record R env: run `renv::snapshot()`
- Reload R env: run `renv::restore()` to load the renv environment for your project
- Install R packages: run `renv::install(c("x", "y", "z"))`
