Chapter 2 About Word and PowerPoint documents

Even if you consider yourself familiar with Word or PowerPoint, we suggest taking the time to read this chapter. It covers concepts worth knowing before you start with officer or officedown, and it gives a clearer picture of what these formats actually contain.

With officer and officedown, you can produce Word documents built from paragraphs, tables, headers and footers, images, sections, styles and bookmarks, and PowerPoint presentations made of slides that draw on layouts, placeholders, text, images, tables and charts. The rest of this chapter introduces the vocabulary and the structure of both formats.

2.1 About Word Documents

2.1.1 Vocabulary

  1. Paragraphs: Paragraphs are the main blocks of text in a Word document. They can contain plain text, images and equations.

  2. Runs: A run is a contiguous piece of text inside a paragraph that shares the same character formatting (font, size, colour, bold, italic, etc.). A paragraph is typically made of several runs when the formatting changes mid-sentence. officer operates at this granularity via fpar() and the run_*() family of functions; docx_summary(x, detailed = TRUE) also exposes the document content at the run level.

  3. Tables: A table is an arrangement of cells organized in rows and columns, used to present data in a structured manner. Each cell can contain paragraphs (with text, images and equations).

  4. Sections: A Word document is typically divided into sections. Each section can have its own page layout (size, orientation, margins, columns), its own headers and footers, and its own page numbering.

  5. Headers and Footers: Headers and footers are special elements related to sections located at the top and bottom of each page in the document. They can contain information such as the document title, page numbers, logos, etc. Headers and footers can vary from section to section.

  6. Fields: Fields are dynamic elements in a Word document. They instruct Word to insert values that are automatically updated, such as dates, page numbers, or references to other parts of the document. A table of contents is typically generated using fields that reference the heading styles and page numbers.

  7. Paragraph and Character Styles: Paragraph and character styles are named sets of formatting attributes that can be applied to text in a Word document. Paragraph styles define the appearance of whole paragraphs (font, size, alignment, indentation, spacing, etc.), while character styles are applied to specific runs inside a paragraph to highlight them or give them a specific format.

  8. Bookmarks: Bookmarks are named markers that identify specific locations in a Word document. They can be used for targeted replacements or as targets for hyperlinks to specific parts of the document.

By using these structural elements, you can create well-organized Word documents with distinct sections, customized headers and footers, structured tables, calculated fields for page numbers and tables of contents, as well as consistent styles for efficient formatting.
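As an illustration of this vocabulary, here is a minimal sketch that builds a small document with officer and then inspects it. It assumes a recent version of officer and the default template returned by read_docx(), which provides the "heading 1", "Normal" and "table_template" styles; the text, the bookmark name "status" and the mtcars extract are arbitrary choices made for the example.

```r
library(officer)

# Start from the default template shipped with officer.
doc <- read_docx()

# Two sets of character formatting, one per kind of run.
base_fmt <- fp_text(font.size = 11)
strong_fmt <- fp_text(font.size = 11, bold = TRUE, color = "red")

# A paragraph made of three runs: the formatting changes mid-sentence.
par_runs <- fpar(
  ftext("A paragraph with ", prop = base_fmt),
  ftext("one emphasised run", prop = strong_fmt),
  ftext(" in the middle.", prop = base_fmt)
)

doc <- body_add_par(doc, "A first heading", style = "heading 1")
doc <- body_add_fpar(doc, par_runs)

# A bookmarked paragraph, later used for a targeted replacement.
doc <- body_add_par(doc, "Status: draft", style = "Normal")
doc <- body_bookmark(doc, "status")
doc <- body_replace_text_at_bkm(doc, "status", "Status: final")

# A table; each cell holds a paragraph.
doc <- body_add_table(doc, head(mtcars[, 1:3], n = 3), style = "table_template")

# One row per paragraph or table cell in the body.
content <- docx_summary(doc)
print(unique(content$content_type))

# Write the document; the target path is arbitrary.
out <- print(doc, target = tempfile(fileext = ".docx"))
```

docx_summary() is a convenient way to check that the structure you intended (paragraphs, styles, table cells) is the structure that was actually written.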

2.1.2 The Structure of a Word Document

A Word document consists of several key components that define its underlying structure and content organization. Understanding these components is essential for effectively working with Word documents programmatically. The main parts of a Word document are the body, headers, and footers.

  1. Body: The body is the main content area of a Word document. It contains paragraphs, tables, images, and other elements that make up the textual and visual content. The body is where the majority of the document’s content resides and is typically the primary focus when working with the document’s structure.

  2. Headers: Headers are areas located at the top of each page in the document. They can contain information such as document titles, logos, author names, or page numbers. Within a section, the header can differ between odd and even pages and on the first page; otherwise the same header is repeated on every page of the section.

  3. Footers: Footers are areas located at the bottom of each page in the document. Similar to headers, footers can contain information such as page numbers, document properties, or copyright notices. Like headers, footers can have different content for odd and even pages or for the first page of a section.

Headers and footers are often used to provide additional context, page numbering, or branding elements to the document. They can be customized separately for each section of the document, allowing for distinct headers and footers based on the section’s requirements.
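The relationship between sections, headers and footers can be sketched with officer as follows. This is a minimal example, assuming a recent version of officer in which prop_section() accepts header_default and footer_default; the header text is made up for the illustration.

```r
library(officer)

doc <- read_docx()
doc <- body_add_par(doc, "Body content lives here.", style = "Normal")

plain <- fp_text(font.size = 10)

# Header and footer content is a list of blocks (here, one paragraph each).
header_blocks <- block_list(
  fpar(ftext("Quarterly report", prop = fp_text(font.size = 10, bold = TRUE)))
)
footer_blocks <- block_list(
  fpar(
    ftext("Page ", prop = plain),
    run_word_field(field = "PAGE"),  # a field: Word computes the page number
    fp_p = fp_par(text.align = "center")
  )
)

# Headers and footers are properties of a section, not of the body.
default_section <- prop_section(
  page_size = page_size(orient = "portrait"),
  header_default = header_blocks,
  footer_default = footer_blocks
)
doc <- body_set_default_section(doc, default_section)

out <- print(doc, target = tempfile(fileext = ".docx"))
```

The header and footer are attached to the section definition, which is why they can change from one section to the next while the body content stays a single flow.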

2.1.3 Styles

In Microsoft Word, styles allow you to apply formatting to text, paragraphs, headings, and other elements in your document. Styles define the appearance of various elements and provide a quick and efficient way to apply consistent formatting throughout your document.

When you apply a style to a portion of text or a paragraph, it automatically applies a predefined set of formatting attributes to that text or paragraph. These attributes can include font, font size, color, indentation, line spacing, alignment, and more. By using styles, you can easily change the formatting of an entire document by modifying the style definition.

Word comes with a set of built-in styles, such as “Heading 1,” “Heading 2,” “Normal,” “Title,” and “Subtitle.” These styles serve as a starting point, but you can also create custom styles to suit your specific needs. Custom styles allow you to define your own formatting preferences and apply them consistently across your document.

Styles are normally applied manually from Word’s Styles gallery. officer lets you apply existing styles and create new ones programmatically (see Chapter 4).

The main benefit of styles is consistency: when headings, subheadings, and body text each use a single style, they look the same throughout the document, and a change to the style definition propagates everywhere it is used.

Styles also provide a way to create a table of contents, manage headings and subheadings, and facilitate document navigation. Word can generate a table of contents automatically based on the styles applied to headings in your document.
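To make this concrete, the sketch below lists the styles available in a template and applies some of them by name. It assumes the default officer template, whose heading styles are named "heading 1" and "heading 2"; in your own template the names may differ, so check them with styles_info() first.

```r
library(officer)

doc <- read_docx()

# One row per style defined in the template.
styles <- styles_info(doc)
paragraph_styles <- styles$style_name[styles$style_type == "paragraph"]
print(paragraph_styles)

# The table of contents is a field built from the heading styles.
doc <- body_add_toc(doc, level = 2)

# Styles are applied by name when content is added.
doc <- body_add_par(doc, "Introduction", style = "heading 1")
doc <- body_add_par(doc, "Some body text.", style = "Normal")
doc <- body_add_par(doc, "Details", style = "heading 2")

# Check which style each paragraph received.
summary_df <- docx_summary(doc)
print(summary_df[, c("style_name", "text")])
```

Because the table of contents is a field, its entries are computed by Word when the document is opened and the fields are updated, not by officer.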

2.2 About PowerPoint Presentations

2.2.1 Vocabulary

A PowerPoint presentation consists of various components that define its structure and content organization. Here’s an overview of these components:

  1. Slides: Slides are the fundamental units of a PowerPoint presentation. Each slide represents a single page or screen and can contain different types of content, such as text, images, tables, charts, and multimedia elements.

  2. Text Blocks/Paragraphs: Text blocks or paragraphs are used to add textual content to a slide. You can create multiple text blocks within a slide and format them individually. Each text block can include headings, bullet points, or plain text.

  3. Tables: Tables provide a structured way to organize and present tabular data on a slide. Cells can only contain text.

  4. Images: Images can be inserted onto slides to add visual elements to the presentation. You can place images in specific locations on a slide, resize them, and apply formatting options.

  5. Placeholders: Placeholders are predefined areas on a slide layout that hold a specific type of content, such as a title, body text, an image, a table or a chart. A generic “content” placeholder can accept several content types (text, image, table, chart), but only one at a time: filling a placeholder with new content replaces whatever was in it.

  6. Slide Layout: A slide layout is a template for a category of slide (title slide, title and content, two content, comparison, etc.). It defines the position, size and type of the placeholders, as well as default formatting (background, fonts, colours). When you add a new slide, you choose the layout that matches the slide’s purpose.

  7. Slide Master: The slide master is the parent of every slide layout in the presentation. It defines the common visual identity (theme, fonts, colours, background, default placeholders) that the layouts inherit from. A presentation can contain several slide masters, each with its own set of layouts.
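Layouts, masters and placeholders can be inspected directly from R. The following sketch assumes the default template loaded by read_pptx(), in which the master is named "Office Theme" and one of the layouts is named "Title and Content"; a custom template will expose different names.

```r
library(officer)

pres <- read_pptx()

# One row per layout, with the slide master it belongs to.
layouts <- layout_summary(pres)
print(layouts)

# Placeholders of one layout: type, label, position and size.
props <- layout_properties(
  pres,
  layout = "Title and Content",
  master = "Office Theme"
)
print(props[, c("type", "ph_label", "offx", "offy", "cx", "cy")])
```

The type and ph_label columns are the values you later use to tell officer which placeholder should receive a given piece of content.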

2.2.2 The Structure of a PowerPoint Presentation

A PowerPoint presentation consists of a set of slides that organize and present information in a structured manner. Each slide is based on a slide layout, and each slide layout belongs to a slide master.

The slide master gathers the fonts, colors, and graphical elements shared by its layouts, and therefore by the slides built on them. Its purpose is to ensure visual consistency throughout the presentation.

Every officer PowerPoint operation is driven by a reference presentation: a .pptx template that provides the slide masters, the layouts and the named placeholders that add_slide() and ph_with() target. Called without an argument, read_pptx() loads a default template shipped with the package; for a corporate look-and-feel, pass the path of your own template to read_pptx().
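A minimal sketch of this workflow is shown below. It uses the default template, so the layout and master names ("Title and Content", "Office Theme") are those of that template; the commented path to a custom template is hypothetical, and the slide content is arbitrary.

```r
library(officer)

# Default template; for your own, use e.g.
# read_pptx("path/to/your-template.pptx") (hypothetical path).
pres <- read_pptx()

# Add a slide based on a layout of the reference presentation.
pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")

# Fill placeholders, addressed here by their type.
pres <- ph_with(
  pres,
  value = "Fuel consumption",
  location = ph_location_type(type = "title")
)
pres <- ph_with(
  pres,
  value = head(mtcars[, 1:4]),  # a data.frame is rendered as a table
  location = ph_location_type(type = "body")
)

# Inspect what the presentation now contains.
slide_content <- pptx_summary(pres)
print(unique(slide_content$content_type))

# Write the presentation; the target path is arbitrary.
out <- print(pres, target = tempfile(fileext = ".pptx"))
```

The position, size and default formatting of the title and the table come from the layout's placeholders, not from the R code, which is what makes a template reusable across presentations.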

Specific layouts are associated with a slide master. Commonly used layouts include:

  • A slide with a title at the top and content below: This layout is used to present a main idea or key point followed by additional information.
  • A two-column slide with an image on one side and text on the other: This layout is ideal for comparing or contrasting two elements or presenting complementary information.
  • A slide with a large image as the background and overlaid text: This layout is effective for highlighting an impactful image while adding descriptive or explanatory text.