# load the package
library(tidyverse)
# load the dataset
in_ds <- read.csv('YOUR FILE')
# some help for the structure in dplyr
summary <- in_ds %>%
  filter() %>%
  group_by() %>%
  summarise()
Building the Carbon Bookkeeping Model
For this lab, we ask you to submit an .html file to Moodle, which you generate by knitting your document (knitr) in RStudio. Make sure that all exercises are part of the .html file. The final lab is due on June 7th, 2026, at 23:59.
Introduction
This is the first half of the assignment associated with the carbon bookkeeping model (CBKM). At first, it probably appears challenging, as we want you to actually develop code. We encourage you to work in groups and help each other. An obvious help is a Large Language Model (LLM), such as ChatGPT or Mistral Le Chat, and we explicitly do not prohibit the use of these tools. In fact, in preparation for this lab we tested ChatGPT for generating code, and it worked quite well. However, review the code at every step and make sure that you understand what you actually write.
The general idea of a bookkeeping model is that it tracks carbon stocks (pools) and carbon flows over time. The key idea of this type of model is that carbon is redistributed between pools when land-use changes occur, and is then released gradually over time, at a rate that depends on the pool.
Part I: Load the table with the land-cover change estimates and summarize the land-cover estimates.
Download the file with the land-cover change estimates from the Chaco for the period 1986-2023. Load the file into R. Specifically, focus on the different conversion types, the years, and the ecoregions. In a first step, consolidate the data so that you summarize the different land-cover conversions per year for the dry and the wet Chaco. You can do that using the functions in the tidyverse package.
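As a rough orientation, the summary step could look like the sketch below. The toy table and all of its column names (`year`, `ecoregion`, `conversion`, `area_ha`) are invented for illustration; the real file will most likely use different names and units, so check its structure first.

```r
library(dplyr)

# Toy stand-in for the real file; all column names and values are invented
in_ds <- data.frame(
  year       = c(1986, 1986, 1986, 1987, 1987),
  ecoregion  = c("Dry Chaco", "Dry Chaco", "Wet Chaco", "Dry Chaco", "Wet Chaco"),
  conversion = c("forest_to_cropland", "forest_to_cropland", "forest_to_pasture",
                 "forest_to_pasture", "forest_to_cropland"),
  area_ha    = c(120, 80, 40, 200, 10)
)

# One row per year x ecoregion x conversion type, with the summed area
lcc_summary <- in_ds %>%
  group_by(year, ecoregion, conversion) %>%
  summarise(area_ha = sum(area_ha), .groups = "drop")

print(lcc_summary)
```

The grouping variables decide the level of detail: dropping `conversion` from `group_by()` would give one total per year and ecoregion instead.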
Part II: Build a static model: carbon loss and allocation
In this part, we are asking: what actually happens during and after a land-use change? When a forest is converted into cropland, the carbon does not disappear; it is redistributed into different pathways. Put differently, the forest carbon stock is a budget that gets split across different destinations. Following this logic, we can break the problem into two questions:
1. How much carbon is affected?
This depends on (a) the carbon stock per hectare, which one can get from the literature or through field work, and (b) the total area converted.
2. Where does that carbon go?
This is problem (research-question) specific and needs to be defined a priori by the researcher. In our concrete example, the carbon is subdivided into four processes, resulting in the same number of pools: (a) burned → immediate emission, (b) slash → short-term pool, (c) wood products → long-term pool, (d) soil/mineral → very slow pool. The calculation of the pools is rather straightforward. Below you find one worked example for the total carbon that is affected:
# Parameters (from paper)
veg_forest <- 60.8 # Mg C / ha (Carbon stored in 1ha of vegetation)
area_conv <- 100 # ha (example for the amount of forest that is lost)
# Step 1: total carbon affected
loss_total <- veg_forest * area_conv
# Parameters (fractions of where the carbon goes, from paper)
p_burn <- NA
p_slash <- NA
p_wood <- NA
p_ret <- NA
burn <- loss_total * p_burn
# Complete the other pathways
slash <- ...
wood <- ...
ret <- ...
Once you have implemented the calculations, reflect on them. For example, why should
sum(c(p_burn, p_slash, p_wood, p_ret))
be equal to 1? What would it mean if the calculation below is not zero?
loss_total - (burn + slash + wood + ret)
Part III: Incorporating time
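Before time enters the picture, it can help to confirm numerically that the static allocation from Part II closes its mass balance. The sketch below uses made-up fractions that are deliberately not the values from the paper; the vector `p` and the names `alloc` and `residual` are introduced here for illustration only.

```r
# Values from the worked example in Part II
veg_forest <- 60.8   # Mg C / ha
area_conv  <- 100    # ha
loss_total <- veg_forest * area_conv

# Made-up fractions for illustration only (NOT the values from the paper)
p <- c(burn = 0.4, slash = 0.3, wood = 0.2, ret = 0.1)

# The fractions must describe the whole budget (compare with a tolerance,
# because decimal fractions are not stored exactly)
stopifnot(isTRUE(all.equal(sum(p), 1)))

# Allocate the budget to the four pathways in one step
alloc <- loss_total * p

# Mass balance: the residual should be zero up to floating-point rounding
residual <- loss_total - sum(alloc)
print(alloc)
print(residual)
```

Comparing with `all.equal()` rather than `==` avoids false alarms caused by rounding in floating-point arithmetic.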
Now that we understand what a basic model looks like, we can incorporate time, because ultimately this will allow us to trace emissions. It is important to note that bookkeeping models are about the timing of the carbon release and not solely about the total amount of carbon lost. We now start to build the actual table that we will fill with data in the end. Below you find some code chunks you will need for your script. Following the information in the paper, complete the code:
# FROM DATA/PAPER
start_year <- NA
end_year <- NA
period <- seq(start_year, end_year, by = 1)
df <- data.frame(year = period)
# Add columns for pools and emissions
df$slash_pool <- 0
# Complete
df$wood_pool <- ...
df$ret_pool <- ...
Again, after implementing these code lines, ask yourself a few questions, such as: Does the timeline include the full decay of slow pools? How would the results change if we stopped the simulation earlier?
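One way to approach the timeline question is to estimate how many years a pool needs to empty. The sketch below assumes that a pool loses a constant fraction `k` of its content each year (the form the slash example takes later in this lab); the years, the rate `k_slow`, and the 1 % threshold are hypothetical choices, not values from the paper.

```r
# Hypothetical values for illustration only (NOT taken from the paper)
start_year <- 2000
end_year   <- 2060
k_slow     <- 0.1    # assumed yearly decay rate of a slow pool

df <- data.frame(year = seq(start_year, end_year, by = 1))

# If a fraction k is emitted each year, a single input shrinks by the
# factor (1 - k) per year. Years until less than 1 % of the input is left:
years_needed <- ceiling(log(0.01) / log(1 - k_slow))

# Does the timeline leave enough room after the first year?
covers_decay <- (nrow(df) - 1) >= years_needed
print(years_needed)
print(covers_decay)
```

Note that this only covers an input in the first year; carbon that enters a pool in a later year needs the same number of years after that year.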
Part IV: Linking land-use change to the model
In the next step, we bring the mapped land-use changes into our data table. For us, land-use change comes in the form of a time series of yearly deforestation events. This means that each year new carbon enters the system (i.e., the different pools) as slash, wood, etc.
What would happen to the pools if we had one year in the time series without deforestation?
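When the simulation period and the land-use change table do not cover exactly the same years, the areas have to be aligned with the rows of the simulation table first. A minimal sketch, assuming a hypothetical table `conv` with the columns `year` and `area` (your summarized table from Part I may be named and structured differently):

```r
# Hypothetical inputs: a simulation table and a yearly conversion table
# with no record for 2002 and nothing after 2003
df   <- data.frame(year = 2000:2005)
conv <- data.frame(year = c(2000, 2001, 2003), area = c(100, 50, 75))

# Look up each simulation year in conv; years without a record come back as NA
df$area <- conv$area[match(df$year, conv$year)]

# No recorded conversion means no new carbon input in that year
df$area[is.na(df$area)] <- 0

print(df)
```

With the areas stored in `df` itself, a loop over the rows of `df` can read the area of year `t` without relying on both tables having the same row order.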
We implement this into our table df by iterating over the years of our time series. Below is a basic implementation of a for-loop that updates the pools based on the land-use change of each year. Feel free to take this code structure, implement it into your script, and add the missing information.
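# burn_emis must exist as a column before it can be filled row by row
df$burn_emis <- 0
for (t in seq_len(nrow(df))) {
  # assumption: conv holds the converted area per year, with one row per
  # year of df and in the same order
  area <- conv$area[t]
  loss <- veg_forest * area
  df$burn_emis[t] <- loss * p_burn
  # ➤ Complete:
  df$slash_pool[t] <- ...
  df$wood_pool[t] <- ...
}
Part V: Adding dynamics: pools vs. fluxes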
Conceptually, this is the critical step. So far, we have worked in a static way, allocating each year's carbon to the different pools. In this step, we make the system dynamic by incorporating fluxes. In contrast to our Vensim notation, in the bookkeeping world we do not talk about stocks and flows but about pools and fluxes:
| CBKM-Term | Meaning | Comparable Vensim notation |
|---|---|---|
| Pool | Stored carbon | stock |
| Flux | Carbon leaving the pool | flow |
Below is again the basic R notation that you can use for your script. Make sure you understand how the dynamic updating happens by checking the index [t]:
# slash example (fast decay)
# note: [t-1] does not exist for t = 1, so run this for t >= 2
df$slash_emis[t] <- k_slash * df$slash_pool[t-1]
# ➤ Do the same for the other emissions
df$wood_emis[t] <- ...
df$ret_emis[t] <- ...
# Update the pools. Key equation:
# pool_today = pool_yesterday + new_input - emission
# (df$slash_pool[t] on the right-hand side still holds the new input from Part IV)
df$slash_pool[t] <- df$slash_pool[t] +
  df$slash_pool[t-1] -
  df$slash_emis[t]
# ➤ Complete the other pools in the same way
df$wood_pool[t] <- ...
df$ret_pool[t] <- ...
Why do emissions peak after deforestation?
Why does wood produce a long emission tail?
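To see the updating rule at work, it can help to run it on a single pool with one input and nothing else. The sketch below does this for the slash pool; the decay rate and the input series are hypothetical and not taken from the paper.

```r
# Hypothetical values for illustration only (NOT taken from the paper)
k_slash <- 0.5                      # assumed fraction of the pool emitted per year
input   <- c(100, 0, 0, 0, 0, 0)    # Mg C entering the slash pool each year

# slash_pool starts out holding the yearly inputs, as after Part IV
df <- data.frame(year = 1:6, slash_pool = input, slash_emis = 0)

# Year 1 has no previous year, so the loop starts at t = 2
for (t in 2:nrow(df)) {
  # flux: a fixed share of last year's pool is emitted
  df$slash_emis[t] <- k_slash * df$slash_pool[t - 1]
  # pool_today = pool_yesterday + new_input - emission
  df$slash_pool[t] <- df$slash_pool[t] + df$slash_pool[t - 1] - df$slash_emis[t]
}

print(df)
```

A useful check at this stage is that the emissions summed over all years plus the carbon still in the pool at the end equal the total input.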
Part VI: Aggregate emissions and visualize
This is it! We have implemented a CBKM and were able to calculate emissions from one land-use change. What is left now is to produce outputs by aggregating the emissions from the different pools and by visualizing the emissions over time. Your task: (a) calculate the total emissions (i.e., the sum of all emissions) and the cumulative emissions over time, and (b) visualize the yearly emissions and the cumulative emissions over the full period.
Should cumulative emissions always increase?
Why might annual emissions decline over time?
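The aggregation and the two plots could be sketched as follows. The emission values are invented, and the column names `total_emis` and `cum_emis` are hypothetical; base R graphics are used here only to keep the example free of extra packages.

```r
# Hypothetical emission columns for illustration (Mg C per year)
df <- data.frame(
  year       = 2000:2005,
  burn_emis  = c(40, 0, 0, 0, 0, 0),
  slash_emis = c(0, 16, 8, 4, 2, 1),
  wood_emis  = c(0, 2, 2, 1, 1, 1)
)

# Total yearly emissions: sum across all emission columns
emis_cols <- c("burn_emis", "slash_emis", "wood_emis")
df$total_emis <- rowSums(df[, emis_cols])

# Cumulative emissions: running sum over the years
df$cum_emis <- cumsum(df$total_emis)

# Two stacked panels: yearly emissions on top, cumulative emissions below
par(mfrow = c(2, 1))
plot(df$year, df$total_emis, type = "h", xlab = "Year", ylab = "Mg C / yr",
     main = "Yearly emissions")
plot(df$year, df$cum_emis, type = "l", xlab = "Year", ylab = "Mg C",
     main = "Cumulative emissions")
```

Selecting the emission columns by name keeps the pool columns out of the sum, which matters because pools are stored carbon and not emissions.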
Part VII: Reflection and model extension
Now that you have calculated the emissions over time by coding it yourself, you probably already got a feeling for the structure of the model. Together with your fellow students, think about the following questions:
Which parameter matters most (sensitivity)?
What controls how long emissions last (temporal dynamics)?
What happens if all carbon is burned immediately (structural assumptions)?
With this in mind, start playing around with the model, for example by changing the decay rates
k_slash <- k_slash * 0.5
and discuss whether this makes sense ecologically or from a land-use perspective.
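A small experiment along these lines might compare how quickly a single input is emitted under the original and the halved rate. The helper `pulse_emissions()` is a hypothetical function written for this sketch, and the baseline rate is made up, not taken from the paper.

```r
# Hypothetical helper (not part of the lab material): yearly emissions from a
# single input in year 1 into one pool with decay rate k
pulse_emissions <- function(k, n_years = 10, input = 100) {
  pool <- numeric(n_years)
  emis <- numeric(n_years)
  pool[1] <- input
  for (t in 2:n_years) {
    emis[t] <- k * pool[t - 1]          # flux out of last year's pool
    pool[t] <- pool[t - 1] - emis[t]    # no new input after year 1
  }
  emis
}

k_slash <- 0.4                          # made-up baseline rate, not from the paper
base    <- pulse_emissions(k_slash)
halved  <- pulse_emissions(k_slash * 0.5)

# Share of the input emitted within the first 5 years under each rate
share_base   <- sum(base[1:5]) / 100
share_halved <- sum(halved[1:5]) / 100
print(c(base = share_base, halved = share_halved))
```

Comparing shares at a fixed horizon is only one possible summary; plotting both emission series against time would show the same shift in timing.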
Part VIII:
Now use your code to estimate carbon emissions from forest-to-pasture conversions or forest-to-cropland conversions (or both, if you like), in both cases focusing solely on the aboveground component (and hence leaving the soil component out for the moment). If you read the paper by Baumann et al. (2017), you should be able to understand which parameters to use.
Part IX:
Lastly, here are some additional questions I want you to think about:
We assumed that all vegetation is removed from the site, but according to the paper, some of it stays. How would you account for this in your model?
For the nerds among us: can you take the code you produced and convert it into a function that only needs some parameters and calculates the emission table?
Overall, what did you implement conceptually? What assumptions did you make?