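Quarto is an open-source scientific and technical publishing system built on Pandoc. Many people consider it the “next generation” of RMarkdown, but it is more general: it supports multiple languages and is not limited to R.
For me, one of the most common use cases is creating parameterized reports. A parameterized report is a .qmd file that declares parameters whose values can be passed in at render time, so that a single file can produce different versions of the output.
As an aside, Meghan Hall has written an excellent blog post on this topic that goes into detail on customizing the output. It is well worth a read!
As a professor, one way I use parameterized reports is to give students individual feedback on their assignments. With parameters such as student_name, grade, and feedback, I can write one .qmd file and then generate a unique report for each student in which those parameters are replaced with that student’s information.
RMarkdown and Quarto use almost the same interface for rendering parameterized reports, so most of this post applies directly to RMarkdown as well. But since Quarto is more general and newer, I focus on Quarto here.
One more note: these examples only apply when knitr is the rendering engine (you can also use parameters with Jupyter, but the syntax is different).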
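Parameter basics
Rendering parameterized output in Quarto takes two steps:
- Add parameters to the .qmd file
- Pass parameter values at render time
Adding parameters to a .qmd file
In the YAML, you can define any parameters you want under params. For example, if I want a report with a name parameter that is replaced with a person’s name at render time, I would add the following to the YAML: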
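---
params:
  name: "John"
---
The value "John" is the default for this parameter and is used if no value is passed in. The parameter can now be referenced anywhere in the .qmd file as params$name, which is replaced with the actual parameter value. Note that inside a code chunk you can use params$name directly, but if you want to use it inline (for example, within a sentence), you need an inline R command, like so:
`r params$name`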
You can include as many parameters as you need; just add them under params. For example, here is how to add both a name and a grade parameter:
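---
params:
  name: "John"
  grade: "100%"
---
One advantage of using parameters is that you can preview the output using the default values, so you can make sure everything looks the way you expect before creating the different versions of the document.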
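Inside a code chunk, params behaves like an ordinary named list. The sketch below mimics that outside of Quarto by building a stand-in params list by hand. In a real .qmd file knitr creates this list from the YAML, so the assignment here is only an assumption for illustration, and greeting is a hypothetical variable name.

```r
# Stand-in for the list that knitr builds from the YAML `params` block.
# In a real .qmd file you would NOT create this yourself.
params <- list(name = "John", grade = "100%")

# Inside a chunk, parameters are accessed like any other list element
greeting <- sprintf("Hi %s, your grade is %s.", params$name, params$grade)
print(greeting)
```

Running the chunk with the defaults is a quick way to check the wording before rendering any other version.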
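I tend to save these files with a name like “template.qmd”, since the file is a template that I will use to render many different versions.
Passing parameters while rendering
Once your “template.qmd” file with parameters is ready, you can pass new parameter values to it at render time. If you prefer working in the terminal, you can pass parameters in the quarto render command, for example: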
quarto render template.qmd -P name:'Paul' -P grade:'98%'
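If you are more comfortable working in R than in the terminal (like me), you can use the {quarto} R package to render .qmd files. The main function is quarto::quarto_render(), which takes an input argument giving the path to the “template.qmd” file. To pass parameters, use the execute_params argument, which must be a list of parameters. For example, to render the same output as in the terminal example above, you could use: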
quarto::quarto_render(
  input = "template.qmd",
  execute_params = list(
    name = "Paul",
    grade = "98%"
  )
)
Iterative rendering
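I often need to pass many sets of parameters to a “template.qmd” file (for example, when I need a report for every student in a class). In that case, I call quarto::quarto_render() inside a loop.
For example, say I have a “grades.csv” file with a name column and a grade column for each student. I can read in this data file and then iteratively render the “template.qmd” file for each student. Here I need to take care to provide an output_file argument so that each report gets a unique file name. My code would look something like this: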
df <- readr::read_csv("grades.csv")
# seq_len() avoids looping over c(1, 0) if the file has no rows
for (i in seq_len(nrow(df))) {
  student <- df[i, ] # Each row is a unique student
  quarto::quarto_render(
    input = "template.qmd",
    output_file = paste0("feedback-", student$name, ".pdf"),
    execute_params = list(
      name = student$name,
      grade = student$grade
    )
  )
}
If I run this code, I end up with a lot of PDF files in my directory, each named “feedback-{name}.pdf”, where “{name}” is replaced with each student’s name (e.g. “feedback-John.pdf”).
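Because the file name is built from the student’s name, spaces, punctuation, or two students sharing a name could produce awkward or clashing file names. Here is a minimal base R sketch of one way to guard against that. The helper make_output_files() is hypothetical (it is not part of {quarto} or the demo repo), and it assumes ASCII-only file names are acceptable, so accented letters are replaced too.

```r
# Hypothetical helper: turn student names into safe, unique PDF file names
make_output_files <- function(names, prefix = "feedback-", ext = ".pdf") {
  # Replace any run of characters other than ASCII letters, digits,
  # or hyphens with a single hyphen
  slugs <- gsub("[^A-Za-z0-9-]+", "-", trimws(names))
  # Drop leading and trailing hyphens
  slugs <- gsub("^-+|-+$", "", slugs)
  # Add a numeric suffix to repeated names so no report is overwritten
  slugs <- make.unique(slugs, sep = "-")
  paste0(prefix, slugs, ext)
}

output_files <- make_output_files(c("John Smith", "Ana", "Ana"))
print(output_files)
```

The resulting vector could then be indexed inside the loop, for example output_file = output_files[i].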
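A note for {purrr} users: yes, I know there are other ways to iterate, but for this particular purpose I find a loop makes it easier to pass parameters (especially when there are several).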
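Examples
At a recent in-person GW Coders meetup (a video is available), I demonstrated how to use parameterized Quarto files with two simple examples: grade reports and wedding cards. The code for these demos is available at https://github.com/jhelvy/quarto-pdf-demo.
The grades example is similar to the one I used in this post to create multiple student reports. The wedding cards example demonstrates how I use two different templates and render the appropriate one based on a condition (in this case, whether the gift was cash, to generate “thank you” cards with different messages).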
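The conditional-template idea can be sketched in a few lines of base R. Everything named here is an assumption for illustration: choose_template(), the gift_is_cash column, and the two template file names are placeholders, not the names used in the demo repo.

```r
# Hypothetical helper: choose a template based on the type of gift.
# The file names are placeholders, not the ones in the demo repo.
choose_template <- function(gift_is_cash) {
  if (isTRUE(gift_is_cash)) "template-cash.qmd" else "template-gift.qmd"
}

# Toy guest list standing in for the real data
guests <- data.frame(
  name = c("Paul", "Ana"),
  gift_is_cash = c(TRUE, FALSE)
)

for (i in seq_len(nrow(guests))) {
  guest <- guests[i, ]
  template <- choose_template(guest$gift_is_cash)
  # A real script would call quarto::quarto_render(input = template, ...) here
  message(guest$name, " -> ", template)
}
```

Keeping the choice in a small function means the loop itself stays the same as in the single-template case.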
In each example, I have a “template.qmd” file that defines the content of the parameterized output PDF, and a “make_pdfs.R” file that contains the R code to iteratively render each PDF. I encourage you to download the files and play with them yourself to see how each example works. They are by no means the only (or even best) way to do this, but they provide a working starting point to build upon.
Some challenges
In the demo repo, I have included a third example called “data-frames” that demonstrates some fixes for two challenges I have run into when rendering parameterized reports in Quarto. Those are:
- Passing a data frame object as a parameter.
- Rendering the output to a different directory.
It is worth mentioning that neither of these is an issue when using RMarkdown. They may be addressed more elegantly in the future, but for now here are my workarounds.
Passing data frames as parameters
Since Quarto is a separate program from R, it doesn’t know what a data frame is, so if you pass a data frame object as a parameter in execute_params, it will convert it to a list. This issue was raised on the Posit Community forum.
After posting about the issue in the Fediverse, both Mickaël Canouil and Garrick Aden-Buie suggested using the {jsonlite} package to serialize the data frame to pass it as a parameter and then un-serialize it back to a data frame inside the .qmd file. Turns out this worked perfectly!
The specific functions I use to handle the job are jsonlite::toJSON() and jsonlite::fromJSON(). In the quarto::quarto_render() command, I have to serialize the data frame inside the parameter list like so:
quarto::quarto_render(
  input = "template.qmd",
  execute_params = list(
    df = jsonlite::toJSON(df), # Serialize the data frame
    month = month
  )
)
Then inside my “template.qmd” file I un-serialize it back to a data frame inside a code chunk with the following line:
df <- jsonlite::fromJSON(params$df)
From there on I can use the df object anywhere in my “template.qmd” file as a data frame. The reason this isn’t an issue when using RMarkdown is that RMarkdown runs inside R, so it “knows” what a data frame is throughout the whole process.
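To see what the round trip does without rendering anything, here is a small sketch that runs entirely in R. The toy data frame and the names df_json and df_restored are assumptions for illustration; only jsonlite::toJSON() and jsonlite::fromJSON() come from the workflow above.

```r
library(jsonlite)

# Toy data frame standing in for the real summary table
df <- data.frame(
  airline = c("AA", "UA"),
  mean_delay = c(8.5, 12.25),
  stringsAsFactors = FALSE
)

# What gets passed in execute_params: a JSON string
df_json <- toJSON(df)

# What the template does with params$df: rebuild the data frame
df_restored <- fromJSON(df_json)

str(df_restored)
```

One caveat worth checking on your own data: JSON only knows strings, numbers, and booleans, so columns such as factors or dates may come back as plain character vectors, and toJSON() rounds numbers to 4 decimal digits unless you change its digits argument.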
In the “data-frames” example, I create monthly summary tables of flight departure and arrival delays by airline using the {nycflights13} package.
In this specific example, an easier approach would be to simply pass the “month” as a parameter to the “template.qmd” file and then compute the summary table there (this is in fact my recommended approach if possible). But that requires that the data be accessible from inside the “template.qmd” file (e.g. saved to disk), and that the summary calculations be relatively fast. If, for example, reading in and summarizing the data is computationally expensive, then it may be easier to do what I have done in this example, which is to first read in and summarize all the data, then pass the summary along to the “template.qmd” file as a serialized data frame.
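For comparison, the simpler “pass only the month” approach might look like the following inside a template chunk. This is a sketch under stated assumptions: the params list is created by hand as a stand-in for what knitr provides, and the tiny flights table with its column names is made up so that the example runs on its own.

```r
# Stand-in for the `params` list that knitr provides inside the template
params <- list(month = 1)

# Toy data standing in for a flights table read from disk
flights <- data.frame(
  month     = c(1, 1, 1, 2),
  carrier   = c("AA", "AA", "UA", "UA"),
  dep_delay = c(10, 20, 5, 30)
)

# Keep only the requested month, then average the delay by airline
flights_month <- flights[flights$month == params$month, ]
summary_table <- aggregate(dep_delay ~ carrier, data = flights_month, FUN = mean)
print(summary_table)
```

Only a single number crosses the boundary between R and Quarto here, so no serialization is needed.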
Rendering to a different directory
Unfortunately, at least for now, quarto::quarto_render() does not appear to be able to render an output file to any location other than the current directory. I noted this in the quarto-cli discussion forums. The best solution for now seems to be to simply render the output and then copy it over to a desired output directory.
In practice, this is a bit cumbersome, since several edge cases make the copying less simple than it sounds. My solution was to write a custom function that wraps quarto::quarto_render() and accepts an optional output_dir, the folder the output file is moved to after rendering.
I have put this function inside my personal R package {jph}, which you can install if you wish to use it yourself. I named the function quarto_render_move(), which renders and then optionally moves the file to a desired location. The function source is available in the {jph} package.
In practice, it works as a drop-in replacement for quarto::quarto_render(). Here is an example:
jph::quarto_render_move(
  input = "template.qmd",
  output_file = "feedback-student.pdf",
  output_dir = "output_folder",
  execute_params = list(
    name = "Paul",
    grade = "98%"
  )
)
Using this code, the output file would be placed inside a folder called “output_folder”.
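The general “render, then move” idea can be sketched with base R file functions. To be clear, this is not the source of jph::quarto_render_move(); move_output() is a hypothetical helper, and the demonstration uses a placeholder text file rather than a real rendered PDF.

```r
# Hypothetical helper: move an already-rendered file into `output_dir`.
# This is NOT the source of jph::quarto_render_move(), only the general idea.
move_output <- function(output_file, output_dir) {
  # Create the target folder if it does not exist yet
  if (!dir.exists(output_dir)) {
    dir.create(output_dir, recursive = TRUE)
  }
  destination <- file.path(output_dir, basename(output_file))
  # Copy then delete; file.rename() can fail across file systems
  copied <- file.copy(output_file, destination, overwrite = TRUE)
  if (!copied) stop("Could not copy ", output_file, " to ", destination)
  file.remove(output_file)
  invisible(destination)
}

# Demonstration with a placeholder file instead of a real rendered PDF
tmp <- tempfile("demo-")
dir.create(tmp)
rendered <- file.path(tmp, "feedback-student.pdf")
writeLines("placeholder", rendered)
moved <- move_output(rendered, file.path(tmp, "output_folder"))
file.exists(moved)
```

In a rendering loop, a call like this would come right after quarto::quarto_render() for each file.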
Wrap up
Quarto is still quite new, and the user base is still growing. Without a doubt, I expect that most current Quarto users are coming from RMarkdown, which has for years just seemed like total wizardry with how seamlessly it works.
Coming from RMarkdown myself, I find that Quarto has a lot of very nice features that build on the best of what RMarkdown has to offer. But it’s not perfect, and the fact that it is totally separate from R (i.e. it’s not an R package) has meant giving up some of the conveniences I have enjoyed, like passing data frames around with reckless abandon. Hopefully the tricks posted here will work for you too if you try to use them. However your Quarto journey goes, let me know with a comment!
Cheers, JP