Chapter 2 Office documents generation

You can generate Word and PowerPoint reports in two ways.

  • With the {officer} package; in this case, you have a set of R functions that let you send R-generated content into a Word document or a PowerPoint presentation.
library(officer)
library(magrittr)

doc_1 <- read_docx() %>%
  body_add_par("Hello world!", style = "heading 1") %>%
  body_add_par("", style = "Normal") %>%
  body_add_table(airquality, style = "table_template")

print(doc_1, target = "reports/example_1.docx")

  • With the {officedown} package; in this case, you use the syntax offered by the {rmarkdown} package to define the content of the Word document or the PowerPoint presentation.
---
date: "2020-01-15"
author: "Your Name"
title: "officedown template"
output: officedown::rdocx_document
---

## A level 2 title

Some blah blah blah blah blah blah blah blah blah blah blah blah blah 
blah blah blah blah blah blah blah blah blah blah blah blah blah blah.

2.1 Templates

These features rely on Word or PowerPoint templates. One thing to keep in mind when getting started is that a template document is always used to produce the final document. In simple cases, users will not notice this because the default template is used.

So, what is the purpose of this document template? Document styles, list definitions, table styles, margin sizes and footers are some of the properties that will be reused in the produced document.

Taking full advantage of the packages’ capabilities requires the use of document templates in which various styles and formatting parameters are stored.

You can create or re-use a Microsoft Word or PowerPoint document that will be used as a template (e.g. “template.docx” or “template.pptx”).
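One possible way to get a starting point for such a template is to save the default document shipped with {officer} and restyle it in Word. The sketch below is only an illustration; the name template_path is hypothetical and a temporary file is used so that the example is self-contained.

```r
library(officer)

# Hypothetical location for the starter template (a temporary file here).
template_path <- tempfile(fileext = ".docx")

# read_docx() without `path` loads the default template shipped with {officer};
# print() writes it to disk, where it can be opened and restyled in Word.
print(read_docx(), target = template_path)

# The saved (and possibly edited) file can then be used as the template.
doc_from_template <- read_docx(path = template_path)
```

The same approach should work for PowerPoint with read_pptx() and a ".pptx" target.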

For the production of Word documents, it is recommended to learn how “Word styles” work for paragraphs, tables and lists if you have never really used them (go to your favorite search engine and type “word custom style”). For the production of PowerPoint presentations, it is recommended to learn how layouts work (go to your favorite search engine and type “PowerPoint Masters and Layouts”).

2.1.1 Template usage with officedown

The argument reference_docx (see ?rmarkdown::word_document or ?officedown::rdocx_document) lets you use styles and settings from a template, which is a Word document (ending with *.docx).

The R Markdown documentation on this topic is available via:
https://bookdown.org/yihui/rmarkdown/word-document.html

---
date: "2020-01-15"
author: "Your Name"
title: "officedown template"
output: 
  officedown::rdocx_document:
    reference_docx: path/to/your/template.docx
---

## A level 2 title

Some blah blah blah blah blah blah blah blah blah blah blah blah blah 
blah blah blah blah blah blah blah blah blah blah blah blah blah blah.

The argument reference_doc (see ?rmarkdown::powerpoint_presentation or ?officedown::rpptx_document) lets you use settings from a template, which is a PowerPoint document (ending with *.pptx).

The R Markdown documentation on this topic is available via:
https://bookdown.org/yihui/rmarkdown/powerpoint-presentation.html

---
date: "2020-01-15"
author: "Your Name"
title: "officedown template"
output: 
  officedown::rpptx_document:
    reference_doc: path/to/your/template.pptx
---

## A level 2 title

Some blah blah blah blah blah blah blah blah blah blah blah blah blah 
blah blah blah blah blah blah blah blah blah blah blah blah blah blah.

2.1.2 Template usage with officer

To specify a template, use the path argument, which is the filename of the Word or PowerPoint template.

The following example illustrates our point with a template containing page numbers, a logo and a diagonal banner on the body of the document.

library(officer)
library(magrittr)

doc_2 <- read_docx(path = "templates/template_demo.docx") %>%
  body_add_par("Hello world!", style = "Normal") %>%
  body_add_table(head(mtcars), style = "Table")

print(doc_2, target = "reports/example_2.docx")

Formats and styles are defined in the initial file. The content of the original document is preserved.

pres_2 <- read_pptx(path = "templates/template_demo.pptx") %>%
  add_slide() %>% 
  ph_with(value = "Hello world", location = ph_location_type(type = "title")) %>% 
  ph_with(value = head(iris), location = ph_location_type(type = "body")) 

print(pres_2, target = "reports/pptx_example_2.pptx")

Formats and available slide layouts will be those available in the template file. The content of the original document is also preserved (but can be manipulated, e.g. by deleting a slide).
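As an illustration of manipulating existing content, the following sketch removes a slide with remove_slide(). It assumes the default {officer} template, whose layout "Title and Content" and master "Office Theme" are named explicitly; with your own template these names will probably differ.

```r
library(officer)

# Default template shipped with {officer} (assumption: it provides the
# "Title and Content" layout in the "Office Theme" master).
pres <- read_pptx()
pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")
pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")

# length() gives the number of slides in the presentation.
n_before <- length(pres)

# Delete the first slide; the remaining slides are kept.
pres <- remove_slide(pres, index = 1)
n_after <- length(pres)
```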

2.2 PowerPoint presentation properties

2.2.1 Slideshow dimensions

The size of slides can be read with function slide_size. Size is in inches.

slide_size(pres_2)
#> $width
#> [1] 10
#> 
#> $height
#> [1] 7.5

There is no function to modify these values; they are read-only. If you want to use other dimensions, use a template that has the dimensions you want.

2.2.2 List layouts names

From R, function officer::layout_summary() will return a data.frame listing layouts and masters in a presentation.

layout_summary(pres_2)

This is the information expected for the layout and master argument values of the add_slide() function.
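A minimal sketch of this workflow, assuming the default {officer} template: the layout and master names are taken from the first row of layout_summary() rather than typed by hand. Picking the first row is an arbitrary choice made for the example.

```r
library(officer)

pres <- read_pptx()

# One row per layout, with the columns `layout` and `master`.
layouts <- layout_summary(pres)

# Use the first listed layout together with its master (arbitrary choice).
chosen_layout <- layouts$layout[1]
chosen_master <- layouts$master[1]

pres <- add_slide(pres, layout = chosen_layout, master = chosen_master)
```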

2.2.3 Layouts properties

When adding content to a slide, you may need more information, for example the identifier of a placeholder, its position, its width and its height.

All this information can be read with function layout_properties().

z <- layout_properties(pres_2, layout = "Title and Content")
z

The ph_label column is particularly useful because it lets you choose a placeholder by its identifier.
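For instance, a label read from layout_properties() can be passed to ph_location_label(). The sketch below assumes the default {officer} template and looks up the label of the first "body" placeholder instead of hard-coding it, since labels depend on the template.

```r
library(officer)

pres <- read_pptx()
props <- layout_properties(pres, layout = "Title and Content",
                           master = "Office Theme")

# Label of the first placeholder of type "body" in this layout.
body_label <- props$ph_label[props$type == "body"][1]

pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")

# Send the text to the placeholder identified by its label.
pres <- ph_with(pres, value = "Hello world",
                location = ph_location_label(ph_label = body_label))

# One row per shape on the current slide.
shapes <- slide_summary(pres)
```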

It’s easy to plot this information and see how placeholders are arranged.

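library(ggplot2)

# One rectangle per placeholder; the y axis is flipped so that the plot
# matches the slide orientation (offy is measured from the top).
ggplot(z, aes(xmin = offx, ymin = -offy, xmax = offx + cx, ymax = -offy - cy)) + 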
  geom_rect(fill = "pink") + 
  geom_text(aes(x = offx, y = -offy - cy/2, label = ph_label), 
            color = "black", size = 3.5, hjust = 0) +
  theme_void()

2.3 Word Document styles

When you add content to a Word document, its formatting is determined by styles whose properties are defined in the template. These styles can be listed from R and, of course, from Word.

From Word, you have to open the menu “Quick Styles gallery” or the menu “Styles task pane” or the menu “Style”.

From R, call officer::styles_info() to get a data.frame listing the styles available in a document. The function lists not only paragraph styles but also character styles, table styles and list styles.

styles_info(doc_1)
  • column style_type gives the type of style; it can be ‘paragraph’, ‘character’, ‘table’ or ‘list’.
  • column style_id gives the unique identifier of the style; users should not have to worry about it.
  • column style_name gives the (unique) name of the style. This is the value users will most often use to specify which style to apply when adding content (‘heading 1’ for example).
  • columns is_custom and is_default give information about the style (whether it is a custom style, whether it is a default style).

These results should be used to list the styles you want to associate with content. For example, to add a ‘level 1’ title, use ‘heading 1’.

Be careful: style names are specific to your configuration. The ‘level 1’ title style may well not be called ‘heading 1’; in that case, it is up to you to recognize it (in French for example, it can be called ‘Titre 1’).
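One way to check which names are available before using them is to filter the result of styles_info(). This sketch assumes the default {officer} template; with another template the names will differ.

```r
library(officer)

doc <- read_docx()
styles <- styles_info(doc)

# Keep only the names of paragraph styles.
par_styles <- styles$style_name[styles$style_type == "paragraph"]

# Check that the style we intend to use exists in this template.
has_heading_1 <- "heading 1" %in% par_styles
```

If has_heading_1 is FALSE, look through par_styles for the equivalent name in your template.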

2.3.1 Usage with rmarkdown

Being able to apply styles to paragraphs or text chunks is not a feature of {officedown} but of pandoc.

From https://pandoc.org/MANUAL.html#custom-styles:

If you define a paragraph or text chunk with the attribute custom-style, pandoc will apply your specified style to the contained elements (with the exception of elements whose function depends on a style, like headings, code blocks, block quotes, or links).

In the example below, the paragraph has style “centered” and a chunk of text has character/run style “strong”.

---
output: 
  officedown::rdocx_document:
    reference_docx: path/to/your/template.docx
---


::: {custom-style="centered"}

blah blah blah [strong blah]{custom-style="strong"}.

:::

2.3.2 Usage with officer

The paragraph is the main top-level container for content within a Word document. Note that tables are also top-level containers; they are at the same level as paragraphs. body_add_* functions are designed to add content as a top-level container: text as an entire paragraph, table, image, page break…

The {officer} package provides function officer::body_add_par() with an argument named style. The expected value is one of the available paragraph style names. This allows you to define a paragraph associated with a style. Note that a title is a paragraph.

library(officer)
library(magrittr)

doc_3 <- read_docx(path = "templates/template_demo.docx") %>%
  body_add_par("Level 1 title", style = "heading 1") %>%
  body_add_par("Hello world!", style = "Normal")

print(doc_3, target = "reports/example_3.docx")
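To check from R which style each paragraph received, docx_summary() can be used. The sketch below assumes the default {officer} template rather than templates/template_demo.docx, so that it runs without external files.

```r
library(officer)

doc <- read_docx()
doc <- body_add_par(doc, "Level 1 title", style = "heading 1")
doc <- body_add_par(doc, "Hello world!", style = "Normal")

# One row per top-level element; `style_name` holds the paragraph style
# and `text` holds the paragraph text.
content <- docx_summary(doc)
title_style <- content$style_name[content$text %in% "Level 1 title"]
```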