Chapter 9 Transform objects into flextable

The function as_flextable() is a generic that converts objects into flextable objects.

Methods exist for model objects such as lm, glm, mgcv::gam or htest, and also for xtable objects [@R-xtable].
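To check which classes are covered in a given installation, one option is to list the S3 methods registered for the generic. This is a minimal sketch; the variable name `available` is only illustrative, and the list depends on the flextable version installed.

```r
library(flextable)

# List the S3 methods registered for the as_flextable() generic.
# The result varies with the installed flextable version.
available <- methods("as_flextable")
print(available)

# Check whether a given class is covered, for example "lm".
"as_flextable.lm" %in% as.character(available)
```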

9.1 Groups as row titles

The flextable package does not transform data. The dataset must therefore be tidy and already grouped before the flextable object is created.

The function as_grouped_data() restructures the data so that grouped data are easy to display. Each run of consecutive repeated values in the group columns defines a group, and a title row is added at the start of each group.

Let’s look at an example with aggregated data from the CO2 dataset:

library(data.table)
library(flextable)
library(magrittr) # provides the %>% pipe used below
data_CO2 <- dcast(as.data.table(CO2), 
  Treatment + conc ~ Type, value.var = "uptake", fun.aggregate = mean)
head(data_CO2)
# Key: <Treatment, conc>
#     Treatment  conc   Quebec Mississippi
#        <fctr> <num>    <num>       <num>
# 1: nonchilled    95 15.26667    11.30000
# 2: nonchilled   175 30.03333    20.20000
# 3: nonchilled   250 37.40000    27.53333
# 4: nonchilled   350 40.36667    29.90000
# 5: nonchilled   500 39.60000    30.60000
# 6: nonchilled   675 41.50000    30.53333
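
Since the aggregation has to happen before flextable is involved, any tool can be used for it. As a rough sketch, the same wide table could also be built with base R only; the object names `co2`, `agg` and `wide` are illustrative and not part of the original example.

```r
# Base R sketch of the same aggregation: mean uptake by Treatment, conc and Type.
co2 <- as.data.frame(CO2)
agg <- aggregate(uptake ~ Treatment + conc + Type, data = co2, FUN = mean)

# Spread Type into columns (one column per level of Type).
wide <- reshape(agg, idvar = c("Treatment", "conc"),
                timevar = "Type", direction = "wide")

# reshape() prefixes the new columns with "uptake."; drop that prefix.
names(wide) <- sub("^uptake\\.", "", names(wide))

# Sort so that rows of the same Treatment are consecutive.
wide <- wide[order(wide$Treatment, wide$conc), ]
rownames(wide) <- NULL
head(wide)
```

The final sort matters because group titles are derived from consecutive repeated values of the group column.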

as_grouped_data will restructure the dataset:

data_CO2 <- as_grouped_data(x = data_CO2, groups = c("Treatment"))
head(data_CO2)
#    Treatment conc   Quebec Mississippi
# 1 nonchilled   NA       NA          NA
# 2       <NA>   95 15.26667    11.30000
# 3       <NA>  175 30.03333    20.20000
# 4       <NA>  250 37.40000    27.53333
# 5       <NA>  350 40.36667    29.90000
# 6       <NA>  500 39.60000    30.60000
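
The restructuring is easier to see on a very small input. The following sketch uses a hypothetical toy data frame (`toy`, with columns `grp` and `val`, none of which come from the original example) and counts the title rows that as_grouped_data() adds.

```r
library(flextable)

# Hypothetical toy data: "grp" has two runs of consecutive values.
toy <- data.frame(
  grp = c("a", "a", "b"),
  val = c(1, 2, 3)
)

grouped <- as_grouped_data(x = toy, groups = "grp")
grouped

# Title rows carry the group value; data rows have NA in "grp".
sum(!is.na(grouped$grp))
```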

The restructured data_CO2 can be passed to as_flextable(). Calling it and applying a few formatting operations produces the following result:

zz <- flextable::as_flextable(data_CO2) %>%
  bold(j = 1, i = ~ !is.na(Treatment), bold = TRUE, part = "body") %>%
  bold(part = "header", bold = TRUE) %>%
  colformat_double(i = ~ is.na(Treatment), j = "conc", digits = 0, big.mark = "") %>%
  autofit()
zz

conc    Quebec      Mississippi
Treatment: nonchilled
95      15.26667    11.30000
175     30.03333    20.20000
250     37.40000    27.53333
350     40.36667    29.90000
500     39.60000    30.60000
675     41.50000    30.53333
1000    43.16667    31.60000
Treatment: chilled
95      12.86667    9.60000
175     24.13333    14.76667
250     34.46667    16.10000
350     35.80000    16.60000
500     36.66667    16.63333
675     37.50000    18.26667
1000    40.83333    18.73333

Now let’s add mini bars in front of the displayed values:

zz <- compose(
  zz,
  i = ~ is.na(Treatment), j = "Quebec",
  value = as_paragraph(minibar(Quebec, height = 0.1), " ", as_chunk(Quebec))
) %>%
  compose(
    i = ~ is.na(Treatment), j = "Mississippi",
    value = as_paragraph(minibar(Mississippi, height = 0.1), " ", as_chunk(Mississippi))
  ) %>%
  align(j = 2:3, align = "left") %>%
  width(width = c(.5, 1.5, 1.5))
zz

conc    Quebec      Mississippi
Treatment: nonchilled
95      15.27       11.30
175     30.03       20.20
250     37.40       27.53
350     40.37       29.90
500     39.60       30.60
675     41.50       30.53
1000    43.17       31.60
Treatment: chilled
95      12.87       9.60
175     24.13       14.77
250     34.47       16.10
350     35.80       16.60
500     36.67       16.63
675     37.50       18.27
1000    40.83       18.73

And finally, add a footnote in the footer part:

add_footer_lines(zz, "dataset CO2 has been used for this flextable") 

conc    Quebec      Mississippi
Treatment: nonchilled
95      15.27       11.30
175     30.03       20.20
250     37.40       27.53
350     40.37       29.90
500     39.60       30.60
675     41.50       30.53
1000    43.17       31.60
Treatment: chilled
95      12.87       9.60
175     24.13       14.77
250     34.47       16.10
350     35.80       16.60
500     36.67       16.63
675     37.50       18.27
1000    40.83       18.73

dataset CO2 has been used for this flextable

9.2 Dataset summary

Create a dataset summary with the functions summarizor() and as_flextable(). summarizor() is a convenience function that can be used to create clinical “Demographic Tables”, and more generally to summarise a dataset when there is a grouping variable.

An option is available to add an Overall category at the end of the produced table.

library(dplyr)
library(flextable)
z <- palmerpenguins::penguins %>% 
  select(-contains("length")) %>% 
  summarizor(
    by = "species",
    overall_label = "Overall")

ft <- as_flextable(z, spread_first_col = TRUE) %>% 
  style(i = ~!is.na(variable), pr_t = fp_text_default(bold = TRUE),
        pr_p = officer::fp_par(text.align = "left", padding = 5, line_spacing = 1.5)) %>% 
  prepend_chunks(i = ~is.na(variable), j = 1, as_chunk("\t")) %>% 
  autofit(add_w = .01)
ft

               Adelie (N=152)       Chinstrap (N=68)     Gentoo (N=124)       Overall (N=344)
island
  Biscoe       44 (28.9%)           0 (0.0%)             124 (100.0%)         168 (48.8%)
  Dream        56 (36.8%)           68 (100.0%)          0 (0.0%)             124 (36.0%)
  Torgersen    52 (34.2%)           0 (0.0%)             0 (0.0%)             52 (15.1%)
bill_depth_mm
  Mean (SD)    18.35 (1.22)         18.42 (1.14)         14.98 (0.98)         17.15 (1.97)
  Median (IQR) 18.40 (1.50)         18.45 (1.90)         15.00 (1.50)         17.30 (3.10)
  Range        15.50 - 21.50        16.40 - 20.80        13.10 - 17.30        13.10 - 21.50
  Missing      1 (0.7%)                                  1 (0.8%)             2 (0.6%)
body_mass_g
  Mean (SD)    3 700.66 (458.57)    3 733.09 (384.34)    5 076.02 (504.12)    4 201.75 (801.95)
  Median (IQR) 3 700.00 (650.00)    3 700.00 (462.50)    5 000.00 (800.00)    4 050.00 (1 200.00)
  Range        2 850.00 - 4 775.00  2 700.00 - 4 800.00  3 950.00 - 6 300.00  2 700.00 - 6 300.00
  Missing      1 (0.7%)                                  1 (0.8%)             2 (0.6%)
sex
  female       73 (48.0%)           34 (50.0%)           58 (46.8%)           165 (48.0%)
  male         73 (48.0%)           34 (50.0%)           61 (49.2%)           168 (48.8%)
  Missing      6 (3.9%)                                  5 (4.0%)             11 (3.2%)
year
  Mean (SD)    2 008.01 (0.82)      2 007.97 (0.86)      2 008.08 (0.79)      2 008.03 (0.82)
  Median (IQR) 2 008.00 (2.00)      2 008.00 (2.00)      2 008.00 (2.00)      2 008.00 (2.00)
  Range        2 007.00 - 2 009.00  2 007.00 - 2 009.00  2 007.00 - 2 009.00  2 007.00 - 2 009.00

9.3 Models and tests

Objects of class glm, lm, gam, htest, kmeans and pam can be converted to flextable with as_flextable():

9.3.1 GLM example

clotting <- data.frame(
    u = c(5,10,15,20,30,40,60,80,100),
    lot1 = c(118,58,42,35,27,25,21,19,18),
    lot2 = c(69,35,26,21,18,16,13,12,12))
as_flextable(glm(lot1 ~ log(u), data = clotting, family = Gamma))

              Estimate   Standard Error   z value   Pr(>|z|)
(Intercept)   -0.017     0.001            -17.847   0.0000     ***
log(u)        0.015      0.000            36.975    0.0000     ***

Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

(Dispersion parameter for Gamma family taken to be 0.002446059)

Null deviance: 3.513 on 8 degrees of freedom

Residual deviance: 0.01673 on 7 degrees of freedom

9.3.2 LM example

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm(weight ~ group) %>% as_flextable()

              Estimate   Standard Error   t value   Pr(>|t|)
(Intercept)   5.032      0.220            22.850    0.0000     ***
groupTrt      -0.371     0.311            -1.191    0.2490

Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

Residual standard error: 0.6964 on 18 degrees of freedom

Multiple R-squared: 0.07308, Adjusted R-squared: 0.02158

F-statistic: 1.419 on 18 and 1 DF, p-value: 0.2490

9.3.3 GAM example

library(mgcv)
set.seed(2)

dat <- gamSim(1, n = 400, 
  dist = "normal", scale = 2, verbose = FALSE)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), 
  data = dat)

ft <- as_flextable(b)
ft

Component                    Term          Estimate   Std Error   t-value   p-value
A. parametric coefficients   (Intercept)   7.833      0.099       79.303    0.0000    ***

Component                    Term          edf        Ref. df     F-value   p-value
B. smooth terms              s(x0)         2.500      3.115       6.921     0.0001    ***
                             s(x1)         2.401      2.984       81.858    0.0000    ***
                             s(x2)         7.698      8.564       88.158    0.0000    ***
                             s(x3)         1.000      1.000       4.343     0.0378    *

Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

Adjusted R-squared: 0.715, Deviance explained 0.725

GCV : 4.051, Scale est: 3.903, N: 400

9.3.4 Hypothesis testing example

x <- rnorm(50)
y <- runif(30)
ks.test(x, y) %>% as_flextable()

statistic   p.value     method                                     alternative
0.57        0.0000***   Exact two-sample Kolmogorov-Smirnov test   two-sided

Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

9.3.5 kmeans

cl <- kmeans(scale(mtcars[1:7]), 5)
ft <- as_flextable(cl)
ft

variable   1         2         3         4         5
withinss   2.15      5.03      33.38     4.07      9.58
size       3         5         14        5         5
mpg*       2.0147    0.9108    -0.8281   -0.0582   0.2571
cyl*       -1.2249   -1.2249   1.0149    -0.1050   -0.7769
disp*      -1.2551   -1.0170   0.9874    -0.5703   -0.4244
hp*        -1.2498   -0.7626   0.9120    -0.2696   -0.7714
drat*      1.5214    0.8443    -0.6869   0.4777    -0.3115
wt*        -1.3633   -1.1034   0.7992    -0.1924   -0.1239
qsec*      0.8103    0.0522    -0.6025   -0.3429   1.4915

(*) Centers

Total sum of squares: 217.00

Total within-cluster sum of squares: 54.20

Total between-cluster sum of squares: 162.80

BSS/TSS ratio: 75.02%

Number of iterations: 3

9.3.6 pam

library(cluster)
dat <- as.data.frame(scale(mtcars[1:7]))
cl <- pam(dat, 3)
ft <- as_flextable(cl)
ft

variable     1         2         3
size         6         11        15
max.diss     2.22      2.76      3.16
avg.diss     1.06      1.42      1.59
diameter     2.81      3.99      5.06
separation   0.99      1.48      0.99
avg.width    0.41      0.24      0.29
mpg*         0.1509    1.1962    -0.6124
cyl*         -0.1050   -1.2249   1.0149
disp*        -0.5706   -1.2242   0.3637
hp*          -0.5351   -1.1768   0.4859
drat*        0.5675    0.9042    -0.9848
wt*          -0.3498   -1.3105   0.8715
qsec*        -0.4638   0.5883    -0.2511

(*) Centers

The average silhouette width is 0.2959

9.4 xtable

xtable objects can be converted into flextable objects with the function as_flextable().

x <- as.integer(cumsum(1 + round(rnorm(100), 0)))
temp.ts <- ts(x,
  start = c(1954, 7), frequency = 12)
ft <- as_flextable(x = xtable::xtable(temp.ts, digits = 0))
ft

      Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
1954  <na>  <na>  <na>  <na>  <na>  <na>  1     1     2     3     5     5
1955  6     7     9     10    12    11    12    13    14    15    16    17
1956  18    20    19    19    21    23    25    26    27    29    31    32
1957  33    35    37    39    41    43    45    46    46    48    50    50
1958  53    54    54    55    55    54    56    59    61    62    64    63
1959  64    65    65    64    64    65    68    70    73    75    76    78
1960  80    80    80    80    80    82    82    83    83    84    86    86
1961  87    88    88    89    92    92    91    92    93    95    97    99
1962  100   103   105   104   105   106   109   112   113   112   <na>  <na>