Chapter 3 {officer} for Word

3.1 Add contents

To add paragraphs of text, tables or images to a Word document, use one of the body_add_* functions:

  • body_add_blocks
  • body_add_break
  • body_add_caption
  • body_add_docx
  • body_add_fpar
  • body_add_gg
  • body_add_img
  • body_add_par
  • body_add_plot
  • body_add_table
  • body_add_toc

They all share the same first argument and the same return value: the R object representing the Word document. Each function takes the document to be filled as its first argument and returns that document, augmented with the new content.

x <- body_add_par(x, "Level 1 title", style = "heading 1")

These functions all create one or more top-level elements, either paragraphs or tables.
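
Because each call returns the document, the result can simply be reassigned step by step. The following is a minimal sketch, assuming the default template returned by read_docx(); the paragraph texts are made up for illustration and docx_summary() is only used here to inspect the result.

```r
library(officer)

# start from the default template shipped with {officer}
x <- read_docx()

# each body_add_* call returns the augmented document
x <- body_add_par(x, "Level 1 title", style = "heading 1")
x <- body_add_par(x, "Some introductory text.", style = "Normal")

# the result is still an rdocx object
class(x)

# docx_summary() lists the top-level elements as a data.frame
doc_content <- docx_summary(x)
doc_content[, c("content_type", "style_name", "text")]
```

Inspecting the summary before printing the document is a cheap way to check that each element was added with the intended style.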

3.1.1 Tables

Tabular reporting is handled in {officer} by the body_add_table() function. It renders a data.frame as a Word table with few formatting options; for more advanced formatting needs, the {flextable} package is recommended.

The body_add_table() function adds a data.frame as a Word table whose formatting is defined by a table style in the document template. A table style is a group of settings that can be applied to a table; the settings include formatting for the overall table, rows, columns, etc.

You can activate the “conditional formatting” instructions of the table style, i.e. the formatting of the first or last row, of the first or last column, and of the banded rows or columns.

  • first_row: apply or remove formatting from the first row in the table.
  • first_column: apply or remove formatting from the first column in the table.
  • last_row: apply or remove formatting from the last row in the table.
  • last_column: apply or remove formatting from the last column in the table.
  • no_hband: don’t apply the banded formatting of odd and even rows.
  • no_vband: don’t apply the banded formatting of odd and even columns.

dat <- cbind(data.frame(cars = row.names(mtcars)),
             mtcars)
head(dat)

doc_table <- read_docx(path = "templates/template_demo.docx") %>% 
  body_add_table(head(dat, n = 20), style = "Table") %>% 
  body_add_break() %>% 
  body_add_table(head(dat, n = 20), style = "Table", first_column = TRUE)

print(doc_table, target = "reports/example_table.docx")
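
The table style name must exist in the template being used. As a hedged sketch, the code below lists the table styles of the default template with styles_info() and then applies one of them with some conditional formatting switches. It assumes that the default template contains a table style named "table_template"; with another template, the names will differ.

```r
library(officer)

doc <- read_docx()

# list the table styles defined in the template
all_styles <- styles_info(doc)
table_styles <- all_styles[all_styles$style_type == "table", "style_name"]
table_styles

# "table_template" is assumed to exist in the default template
doc <- body_add_table(
  doc, value = head(mtcars),
  style = "table_template",
  first_row = TRUE,     # format the header row
  first_column = TRUE,  # format the first column
  no_hband = TRUE       # no odd/even row banding
)
```

Whether these switches have a visible effect depends on what the chosen table style defines for each of them.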

3.1.2 Paragraphs

The package makes it very easy to use paragraph styles. You can incrementally add text associated with a paragraph style.

To add a text paragraph, use the body_add_par() function. It takes three arguments: the target document, the text of the new paragraph, and the paragraph style to apply.

The function is not vectorized; vectorization is planned for a future version.

read_docx() %>% 
  body_add_par(value = "Hello World!", style = "Normal") %>% 
  body_add_par(value = "Salut Bretons!", style = "centered") %>% 
  print(target = "reports/example_par.docx")
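
Since the function adds one paragraph per call, several paragraphs can be added with a simple loop. This is only a sketch: the texts are made up for illustration, and the "Normal" style is assumed to exist in the template (it does in the examples above).

```r
library(officer)

# illustrative texts, one paragraph each
texts <- c("First paragraph.", "Second paragraph.", "Third paragraph.")

doc_pars <- read_docx()
for (txt in texts) {
  # one call per paragraph; the document is reassigned each time
  doc_pars <- body_add_par(doc_pars, value = txt, style = "Normal")
}

# check what has been added
added_texts <- docx_summary(doc_pars)$text
```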

3.1.3 Titles

A title is a paragraph. To add a title, use body_add_par() with the style argument set to the corresponding title style.

read_docx() %>% 
  body_add_par(value = "This is a title 1", style = "heading 1") %>% 
  body_add_par(value = "This is a title 2", style = "heading 2") %>% 
  body_add_par(value = "This is a title 3", style = "heading 3") %>% 
  print(target = "reports/example_titles.docx")

3.1.4 Tables of contents

A TOC (Table of Contents) is a Word computed field: the table of contents is built by Word. The TOC field collects its entries from heading styles or from another specified style.

Note: you have to update the fields in the Word application for the TOC to show the correct page numbers. See “update the fields”.

Use function body_add_toc() to insert a TOC inside a Word document.

doc_toc <- read_docx(path = "templates/word_example.docx") %>%
  body_add_par("Table of Contents", style = "heading 1") %>% 
  body_add_toc(level = 2) %>% 
  body_add_par("Table of figures", style = "heading 1") %>% 
  body_add_toc(style = "Image Caption") %>% 
  body_add_par("Table of tables", style = "heading 1") %>% 
  body_add_toc(style = "Table Caption")

print(doc_toc, target = "reports/example_toc.docx")

3.1.5 Images

Images are specific because they are part of a paragraph: an image is always rendered inside a paragraph, which means you can mix text and images in the same paragraph. The function body_add_img() is a sugar function that wraps an image in a paragraph. It accepts various image formats: png, jpeg or emf.

img.file <- file.path( R.home("doc"), "html", "logo.jpg" )

read_docx() %>% 
  body_add_img(src = img.file, height = 1.06, width = 1.39, style = "centered") %>% 
  print(target = "reports/example_image.docx")
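
To mix text and an image in the same paragraph, one possible approach is to build a formatted paragraph with fpar() and add it with body_add_fpar(). The sketch below reuses the R logo from the example above; the text chunk and the use of default text properties (fp_text()) are choices made for illustration.

```r
library(officer)

img.file <- file.path(R.home("doc"), "html", "logo.jpg")

# a formatted paragraph mixing a text chunk and an inline image
mixed_par <- fpar(
  ftext("The R logo: ", prop = fp_text()),
  external_img(src = img.file, width = 1.39, height = 1.06)
)

doc_mixed <- read_docx()
doc_mixed <- body_add_fpar(doc_mixed, value = mixed_par, style = "Normal")
```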

3.1.6 Plots from {ggplot2}

The function body_add_gg() is also a sugar function; it wraps an image generated from a ggplot object in a paragraph.

library(ggplot2)

gg <- ggplot(data = iris, aes(Sepal.Length, Petal.Length)) + 
  geom_point()

doc_gg <- read_docx()
doc_gg <- body_add_gg(x = doc_gg, value = gg, style = "centered")

The page dimensions of the Word document can be used to maximize the size of the graphic to be produced.

word_size <- docx_dim(doc_gg)
word_size
#> $page
#>     width    height 
#>  8.263889 11.694444 
#> 
#> $landscape
#> [1] FALSE
#> 
#> $margins
#>       top    bottom      left     right    header    footer 
#> 0.9840278 0.9840278 0.9840278 0.9840278 0.4916667 0.4916667

# usable area = page size minus margins (in inches)
width <- word_size$page['width'] - word_size$margins['left'] - word_size$margins['right']
height <- word_size$page['height'] - word_size$margins['top'] - word_size$margins['bottom']

doc_gg <- body_add_gg(x = doc_gg, value = gg, 
                      width = width, height = height, 
                      style = "centered")

print(doc_gg, target = "reports/example_gg.docx")

3.1.7 Base plot

To add a base R graphic, use the body_add_plot() function together with plot_instr(), which wraps the graphic instructions to be executed to produce a single graphic.

doc <- read_docx()
doc <- body_add_plot(doc,
  width = width, height = height,
  value = plot_instr(
    code = {barplot(1:5, col = 2:6)}),
  style = "centered")
print(doc, target = "reports/example_word_plot_instr.docx")

3.1.8 Microsoft charts

The {mschart} package allows you to create native Office charts that can be used with {officer}. Use the body_add_chart() function to add an Office chart to a Word document.

library(mschart)

my_barchart <- ms_barchart(data = browser_data,
  x = "browser", y = "value", group = "serie")
my_barchart <- chart_settings( x = my_barchart,
  dir="vertical", grouping="clustered", gap_width = 50 )

read_docx() %>% 
  body_add_chart(chart = my_barchart, style = "centered") %>% 
  print(target = "reports/example_word_chart.docx")

3.1.9 Page breaks

Page breaks are handy for formatting a Word document. They allow you to control where your document should move to the next page, such as at the end of a chapter or section.

Use function body_add_break() to add a page break in the Word document.

library(ggplot2)
library(flextable)

gg <- ggplot(data = iris, aes(Sepal.Length, Petal.Length)) + 
  geom_point()

ft <- flextable(head(iris, n = 10))
ft <- set_table_properties(ft, layout = "autofit")

read_docx() %>% 
  body_add_par(value = "dataset iris", style = "heading 2") %>% 
  body_add_flextable(value = ft ) %>% 
  
  body_add_break() %>% 

  body_add_par(value = "plot examples", style = "heading 2") %>% 
  body_add_gg(value = gg, style = "centered") %>% 
  
  print(target = "reports/example_break.docx")

3.1.10 External documents

Inserting a document allows you to integrate a previously created Word document into another document. This can be useful when certain parts of a document need to be written manually but automatically integrated into a final document. The document to be inserted must be in docx format.

This can be done by using function body_add_docx().

read_docx() %>% 
  body_add_par(value = "An external document", style = "heading 1") %>% 
  body_add_docx(src = "reports/example_break.docx") %>% 
  print(target = "reports/example_add_docx.docx")

This can be advantageous when you are generating huge documents and the generation becomes slower and slower.

In that case, generate smaller documents and design a main script that inserts them into a main Word document. The following script illustrates the strategy:

library(uuid)

ft <- flextable(iris)
ft <- set_table_properties(ft, layout = "autofit")

gg_plot <- ggplot(data = iris ) +
  geom_point(mapping = aes(Sepal.Length, Petal.Length))

tmpdir <- tempfile()
dir.create(tmpdir, showWarnings = FALSE, recursive = TRUE)
tempfiles <- file.path(tmpdir, paste0(UUIDgenerate(n = 10), ".docx") )

for(i in seq_along(tempfiles)) {
  doc <- read_docx()
  doc <- body_add_par(doc, value = "", style = "Normal")
  doc <- body_add_gg(doc, value = gg_plot, style = "centered")
  doc <- body_add_par(doc, value = "", style = "Normal")
  doc <- body_add_flextable(doc, value = ft)
  print(doc, target = tempfiles[i])
}

# tempfiles contains all generated docx paths

main_doc <- read_docx()
for(docx_file in tempfiles){
  main_doc <- body_add_docx(main_doc, src = docx_file)
}
print(main_doc, target = "reports/example_huge.docx")

3.2 Add Sections

A section affects preceding paragraphs or tables (see Word Sections).

Usually, starting with a continuous section and ending with the section you defined is enough.

To format your content in a section, you should use the body_end_block_section function. First you need to define the section with the block_section function, which takes an object returned by the prop_section function. It is prop_section() that allows you to define the properties of your section.

Let’s first create a document and add a graphic:

library(ggplot2)

gg <- ggplot(data = iris, aes(Sepal.Length, Petal.Length)) + 
  geom_point()

doc_section_1 <- read_docx()
doc_section_1 <- body_add_gg(
  x = doc_section_1, value = gg, 
  width = 9, height = 6,
  style = "centered")

Now, let’s add a section that will display the previously added graphic on a landscape-oriented page.

ps <- prop_section(
  page_size = page_size(orient = "landscape"),
  type = "oddPage")

doc_section_1 <- body_end_block_section(
  x = doc_section_1, 
  value = block_section(property = ps))

That’s it. Let’s add the graphic again to see it displayed at the end of the document, in the default section:

doc_section_1 <- body_add_gg(
  x = doc_section_1, value = gg,
  width = 6.29, height = 9.72,
  style = "centered"
)

print(doc_section_1, target = "reports/example_landscape_gg.docx")

3.2.1 Supported features

Most of the properties of Word sections are available with the {officer} package: page size, page margins, section type (oddPage, continuous, nextColumn) and columns. The ability to link a header or footer to a section is not (yet) implemented.

Section properties are defined with function prop_section with arguments:

  • page_size: page dimensions defined with function page_size().
  • page_margins: page margins defined with function page_mar().
  • type: section type (“continuous”, “evenPage”, “oddPage”, …).
  • section_columns: section columns defined with function section_columns().
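
As an illustration, the four arguments can be combined in a single call. This is a sketch under stated assumptions: the dimensions (in inches) are arbitrary values chosen so that the two columns fit between the margins, and the text is made up.

```r
library(officer)

# section properties combining the four arguments listed above
ps_full <- prop_section(
  page_size = page_size(width = 8.3, height = 11.7, orient = "portrait"),
  page_margins = page_mar(top = 1, bottom = 1, left = 1, right = 1),
  type = "continuous",
  section_columns = section_columns(widths = c(3, 3), space = 0.3, sep = TRUE)
)

doc_ps <- read_docx()
doc_ps <- body_add_par(doc_ps, paste(rep("Some text.", 50), collapse = " "),
                       style = "Normal")

# the section applies to the content added before this call
doc_ps <- body_end_block_section(doc_ps, value = block_section(property = ps_full))
```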

3.2.2 How to manage sections

The body_end_block_section function is usually called twice: a first time to close the previous section (and thus start the new one), and a second time to close the new section. All content between the end of the first section and the end of the second section will be arranged according to the rules defined for the second section.

Let’s illustrate the principle with a graphic that needs to be on a landscape-oriented page.

  1. Add a paragraph.
  2. Add an end of section (a continuous section is used for this) so that the first paragraph stays in the default section.
  3. Add the graphic.
  4. Add an end of section that applies to the graphic (reusing the landscape section properties defined earlier).

doc_section_2 <- read_docx() %>% 
  body_add_par("This is a dummy text. It is in a continuous section") %>% 
  body_end_block_section(block_section(prop_section(type = "continuous"))) %>% 
  body_add_gg(value = gg, width = 7, height = 5, style = "centered") %>% 
  body_end_block_section(block_section(property = ps))

print(doc_section_2, target = "reports/example_landscape_gg2.docx")

Note that if you add a section break at the end of the document with a different orientation than the default, it generates a last page that is empty. This is a behavior of Word and there is only one solution: using a template where the default orientation is the same as the last section break. For example, a default landscape orientation if you insert a landscape oriented section at the end of the document.

Now, let’s illustrate a complex layout. We are going to add two sections oriented in landscape. The first will contain a table, the second will contain long text separated into two columns. The final result will be a landscape oriented page containing a table and then text spread over two columns (and of course this famous extra blank page).

landscape_one_column <- block_section(
  prop_section(
    page_size = page_size(orient = "landscape"), type = "continuous"
  )
)
landscape_two_columns <- block_section(
  prop_section(
    page_size = page_size(orient = "landscape"), type = "continuous",
    section_columns = section_columns(widths = c(4, 4))
  )
)

doc_section_3 <- read_docx() %>%
  body_add_table(value = head(mtcars), style = "table_template") %>%
  body_end_block_section(value = landscape_one_column) %>% 
  body_add_par(value = paste(rep(letters, 60), collapse = " ")) %>%
  body_end_block_section(value = landscape_two_columns)

print(doc_section_3, target = "reports/example_complex_section.docx")

3.2.3 Sugar functions

In addition to the generic function body_end_block_section, some utility functions are available to be used as shortcuts:

  • body_end_section_landscape()
  • body_end_section_portrait()
  • body_end_section_columns()
  • body_end_section_columns_landscape()
  • body_end_section_continuous()
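
As a hedged sketch of how these shortcuts can replace explicit prop_section() calls, the example below closes a first continuous section and then a landscape section, relying on the default dimensions of the shortcut functions. The texts are made up for illustration.

```r
library(officer)

doc_sugar <- read_docx()

# content of the first section, closed as a continuous section
doc_sugar <- body_add_par(doc_sugar, "Text in a portrait section.", style = "Normal")
doc_sugar <- body_end_section_continuous(doc_sugar)

# content of the second section, closed as a landscape section
doc_sugar <- body_add_par(doc_sugar, "Text in a landscape section.", style = "Normal")
doc_sugar <- body_end_section_landscape(doc_sugar)
```

The shortcuts cover the common cases; body_end_block_section() remains the option to use when margins, page size and columns all need to be set together.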

3.3 Remove content

The function body_remove() lets you remove content from a Word document. It is often used together with a cursor_* function.

For illustration purposes, we will reuse the document produced in section 3.1.9 (reports/example_break.docx) as the initial document and remove its last three paragraphs.

my_doc <- read_docx(path = "reports/example_break.docx")

my_doc <- body_remove(my_doc) %>% cursor_end()
my_doc <- body_remove(my_doc) %>% cursor_end()
my_doc <- body_remove(my_doc) %>% cursor_end()

print(my_doc, target = "reports/example_remove.docx")
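
To remove a specific paragraph rather than the last one, the cursor can first be moved with cursor_reach(), which looks for a paragraph matching a keyword. A minimal sketch, with made-up paragraph texts:

```r
library(officer)

doc_rm <- read_docx()
doc_rm <- body_add_par(doc_rm, "Keep this paragraph", style = "Normal")
doc_rm <- body_add_par(doc_rm, "Remove this paragraph", style = "Normal")
doc_rm <- body_add_par(doc_rm, "Keep this one too", style = "Normal")

# move the cursor to the paragraph matching the keyword, then remove it
doc_rm <- cursor_reach(doc_rm, keyword = "Remove this")
doc_rm <- body_remove(doc_rm)

remaining <- docx_summary(doc_rm)$text
remaining
```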