In the getting started vignette we introduced a variety of predicate functions for interrogating the contents of a parsed R Markdown document. The family of rmd_has_* functions gives a convenient way of checking explicitly for elements of an Rmd. Often, however, we want to check for multiple elements at the same time and provide consistent output for any detected discrepancies.
To support this type of workflow, parsermd implements the concept of an Rmd template. These are tibble-based representations of Rmd elements which can easily be compared with a parsed document.
Example - hw01
Imagine that a homework assignment has been distributed to students in the form of an Rmd document named hw01.Rmd. This document describes all of the necessary tasks for the student to complete and also includes scaffolding in the form of Rmd chunks and markdown that indicates where the students are expected to include their solutions.
(rmd = parse_rmd(system.file("examples/hw01.Rmd", package = "parsermd")))
#> ├── YAML [2 fields]
#> ├── Heading [h3] - Load packages
#> │   └── Chunk [r, 2 lines] - load-packages
#> ├── Heading [h3] - Exercise 1
#> │   ├── Markdown [1 line]
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [1 line]
#> ├── Heading [h3] - Exercise 2
#> │   ├── Markdown [1 line]
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [3 lines]
#> │       ├── Chunk [r, 5 lines] - plot-dino
#> │       ├── Markdown [1 line]
#> │       └── Chunk [r, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     ├── Markdown [1 line]
#>     └── Heading [h4] - Solution
#>         ├── Markdown [3 lines]
#>         ├── Chunk [r, 1 line] - plot-star
#>         ├── Markdown [1 line]
#>         └── Chunk [r, 1 line] - cor-star
 
We can see an example of this scaffolding by extracting the contents of the markdown in the Exercise 1 > Solution section.
rmd_select(
  rmd, 
  by_section(c("Exercise 1", "Solution")) & 
  has_type("rmd_markdown")
) |>
  as_document()
#> [1] "---"                                                                              
#> [2] "title: Homework 01 - Hello R"                                                     
#> [3] "output: html_document"                                                            
#> [4] "---"                                                                              
#> [5] ""                                                                                 
#> [6] "(Type your answer to Exercise 1 here. This exercise does not require any R code.)"
#> [7] ""                                                                                 
#> [8] ""
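
The same selection pattern can be pointed at other parts of the scaffolding. The following is a minimal sketch, using only the helpers shown above, that pulls out the code chunks under Exercise 2 > Solution instead of the markdown; the name ex2_chunks is introduced here purely for illustration.

```r
library(parsermd)

# Parse the example assignment shipped with the package
rmd = parse_rmd(system.file("examples/hw01.Rmd", package = "parsermd"))

# Keep only the code chunks under Exercise 2 > Solution
# (ex2_chunks is an illustrative name, not part of the package)
ex2_chunks = rmd_select(
  rmd,
  by_section(c("Exercise 2", "Solution")) &
  has_type("rmd_chunk")
)

# Render the selection back to text to read the scaffolded code
as_document(ex2_chunks)
```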
 
When a student completes this assignment we want to be able to check that they have included solutions in the appropriate sections. At a minimum this means that we need to check that these sections still exist, and secondarily we might also want to check that the provided content in the solution differs from the provided scaffolding.
Constructing a template
We will begin by subsetting the original parsed document to select only the elements that will contain the student’s answers. This assumes the other sections and elements are extraneous and contain things like background, instructions, and question text. Below we use rmd_select to select all of the elements of the original document contained within a section matching “Exercise *” and “Solution”, which should cover the answers for all three exercises.
(rmd_sols = rmd_select(rmd, by_section(c("Exercise *", "Solution"))))
#> ├── YAML [2 fields]
#> ├── Heading [h3] - Exercise 1
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [1 line]
#> ├── Heading [h3] - Exercise 2
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [3 lines]
#> │       ├── Chunk [r, 5 lines] - plot-dino
#> │       ├── Markdown [1 line]
#> │       └── Chunk [r, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     └── Heading [h4] - Solution
#>         ├── Markdown [3 lines]
#>         ├── Chunk [r, 1 line] - plot-star
#>         ├── Markdown [1 line]
#>         └── Chunk [r, 1 line] - cor-star
 
Once we have this more limited set of elements we use the rmd_template function to generate our template. Here we have included keep_content = TRUE in order to keep the scaffolded content for each answer, which will then be compared to the student’s answers.
(rmd_tmpl = rmd_template(rmd_sols, keep_content = TRUE))
#> # A tibble: 9 × 5
#>   sec_h3     sec_h4   type         label     content  
#>   <chr>      <chr>    <chr>        <chr>     <list>   
#> 1 Exercise 1 Solution rmd_markdown <NA>      <chr [1]>
#> 2 Exercise 2 Solution rmd_markdown <NA>      <chr [3]>
#> 3 Exercise 2 Solution rmd_chunk    plot-dino <chr [5]>
#> 4 Exercise 2 Solution rmd_markdown <NA>      <chr [1]>
#> 5 Exercise 2 Solution rmd_chunk    cor-dino  <chr [2]>
#> 6 Exercise 3 Solution rmd_markdown <NA>      <chr [3]>
#> 7 Exercise 3 Solution rmd_chunk    plot-star <chr [1]>
#> 8 Exercise 3 Solution rmd_markdown <NA>      <chr [1]>
#> 9 Exercise 3 Solution rmd_chunk    cor-star  <chr [1]>
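
Because the template prints as an ordinary tibble, the scaffolding kept by keep_content = TRUE can be inspected with standard list-column access. This is a rough sketch that assumes the column names match the printed output above (in particular content); it rebuilds rmd_tmpl so that it runs on its own.

```r
library(parsermd)

# Rebuild the template from the example assignment
rmd = parse_rmd(system.file("examples/hw01.Rmd", package = "parsermd"))
rmd_tmpl = rmd_select(rmd, by_section(c("Exercise *", "Solution"))) |>
  rmd_template(keep_content = TRUE)

# Number of stored lines per template row; these should correspond to the
# <chr [n]> sizes shown in the printed tibble
lengths(rmd_tmpl$content)

# The stored scaffolding for the first row (Exercise 1 > Solution markdown)
rmd_tmpl$content[[1]]
```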
 
Using a template
Once the template is constructed we can compare it with a new Rmd document via the rmd_check_template function. Note that we can pass in an rmd_ast or rmd_tibble object directly, or the path to an Rmd file, which will then be parsed and compared.
file = system.file("examples/hw01-student.Rmd", package = "parsermd")
rmd_check_template(file, rmd_tmpl)
#> ✖ The following required elements were missing in the document:
#>   • Section "Exercise 3" > "Solution" is missing required "markdown text".
#>   • Section "Exercise 3" > "Solution" is missing required "markdown text".
#> ✖ The following document elements were unmodified from the template:
#>   • Section "Exercise 2" > "Solution" has a "code chunk" named "plot-dino"
#>     which has not been modified.
#>   • Section "Exercise 2" > "Solution" has "markdown text" which has not been
#>     modified.
#>   • Section "Exercise 2" > "Solution" has a "code chunk" named "cor-dino"
#>     which has not been modified.
 
From the output we can see that there are several issues with the document submitted by the student. They are missing the two expected markdown text entries for Exercise 3, and it appears that they have not entered anything new for the chunks or markdown in Exercise 2.
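
Since rmd_check_template accepts a file path, the same template could be applied to a set of submissions in a loop. The sketch below is only illustrative: it uses the single example file shipped with the package in place of a real submissions directory (the list.files call in the comment is hypothetical), and it assumes the report is emitted as a side effect of the call, as the output above suggests.

```r
library(parsermd)

# Rebuild the template from the example assignment
rmd = parse_rmd(system.file("examples/hw01.Rmd", package = "parsermd"))
rmd_tmpl = rmd_select(rmd, by_section(c("Exercise *", "Solution"))) |>
  rmd_template(keep_content = TRUE)

# Hypothetical: with real submissions this might instead be
# list.files("submissions", pattern = "\\.Rmd$", full.names = TRUE)
files = system.file("examples/hw01-student.Rmd", package = "parsermd")

for (f in files) {
  # Label each report with the file it belongs to
  cat("\n== ", basename(f), " ==\n", sep = "")
  rmd_check_template(f, rmd_tmpl)
}
```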
Revising a template
Let us assume that our original template was a bit too strict, and we would like to revise the feedback it is giving to students.
If we were to decide that for Exercise 3 the markdown text was not actually necessary, we can remove this requirement by filtering those elements from rmd_sols or from rmd_tmpl. (Generally, the former is the suggested workflow and will always work, while the latter approach is likely to be somewhat fragile to any changes made to the template format in future releases.) Here we use rmd_select with the ! operator to remove these specific markdown elements.
rmd_sols |>
  rmd_select( 
    !(by_section(c("Exercise 3", "Solution")) & 
    has_type("rmd_markdown")) 
  )
#> ├── YAML [2 fields]
#> ├── Heading [h3] - Exercise 1
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [1 line]
#> ├── Heading [h3] - Exercise 2
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [3 lines]
#> │       ├── Chunk [r, 5 lines] - plot-dino
#> │       ├── Markdown [1 line]
#> │       └── Chunk [r, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     └── Heading [h4] - Solution
#>         ├── Chunk [r, 1 line] - plot-star
#>         └── Chunk [r, 1 line] - cor-star
 
This new AST can then be passed to rmd_template and rmd_check_template to provide the revised feedback:
file = system.file("examples/hw01-student.Rmd", package = "parsermd")
rmd_sols |>
  rmd_select( 
    !(by_section(c("Exercise 3", "Solution")) & 
    has_type("rmd_markdown")) 
  ) |>
  rmd_template(keep_content = TRUE) |>
  rmd_check_template(file, template = _)
#> ✖ The following document elements were unmodified from the template:
#>   • Section "Exercise 2" > "Solution" has a "code chunk" named "plot-dino"
#>     which has not been modified.
#>   • Section "Exercise 2" > "Solution" has "markdown text" which has not been
#>     modified.
#>   • Section "Exercise 2" > "Solution" has a "code chunk" named "cor-dino"
#>     which has not been modified.
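
For comparison, the alternative mentioned above, filtering rmd_tmpl directly, might look roughly like the following. This is a sketch under assumptions: it relies on the column names sec_h3 and type from the printed template, which is exactly the kind of detail that may change between releases, and the names keep and rmd_tmpl_rev are illustrative only.

```r
library(parsermd)

# Rebuild the template from the example assignment
rmd = parse_rmd(system.file("examples/hw01.Rmd", package = "parsermd"))
rmd_tmpl = rmd_select(rmd, by_section(c("Exercise *", "Solution"))) |>
  rmd_template(keep_content = TRUE)

# Assumption: the template has sec_h3 and type columns, as printed earlier.
# Flag every row except the markdown entries under Exercise 3.
keep = !(rmd_tmpl$sec_h3 == "Exercise 3" & rmd_tmpl$type == "rmd_markdown")

# Drop the flagged rows with ordinary tibble subsetting
rmd_tmpl_rev = rmd_tmpl[keep, ]
rmd_tmpl_rev
```

The filtered template would then stand in for rmd_tmpl in the call to rmd_check_template. Filtering rmd_sols, as shown above, remains the more robust route.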