Chapter 9 Transform objects into flextable

Function as_flextable() is a generic function that converts objects into flextable objects.

Methods exist for model objects such as lm, glm, mgcv::gam and htest, and also for xtable objects [@R-xtable].
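Because as_flextable() is a generic, R selects the conversion method from the class of the object it receives. The following is a minimal sketch of that dispatch, assuming the flextable package is installed; the simulated data and the variable names are illustrative only.

```r
library(flextable)

# List the as_flextable() methods registered in the current session
methods("as_flextable")

# Build an htest object; the seed only makes the simulated data reproducible
set.seed(1)
kt <- ks.test(rnorm(50), runif(30))
class(kt)  # "htest"

# The call is dispatched to the htest method and returns a flextable
ft <- as_flextable(kt)
inherits(ft, "flextable")
```

The set of methods listed depends on the installed flextable version and on which packages are loaded.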

9.1 Groups as row titles

Package flextable does not transform data itself. The dataset must therefore already be tidy and grouped before the flextable object is created.

Function as_grouped_data() restructures the data so that grouped data is easy to display. Each run of repeated consecutive values in the group columns defines a group, and that value is inserted as a title row above the rows of the group.
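Before the larger example, the restructuring can be inspected on a toy data frame. This is a minimal sketch, assuming the flextable package is installed; the data frame `toy` and its columns `grp` and `val` are hypothetical names used only for illustration.

```r
library(flextable)

# Toy data (hypothetical): two groups of two consecutive rows each
toy <- data.frame(
  grp = c("a", "a", "b", "b"),
  val = c(1, 2, 3, 4)
)

g <- as_grouped_data(x = toy, groups = "grp")
g

# Title rows carry the group value; data rows hold NA in the group column
is_title <- !is.na(g$grp)
sum(is_title)  # number of title rows
nrow(g)        # original rows plus title rows
```

The NA pattern is what makes row selectors such as `i = ~ !is.na(grp)` pick out the title rows later on.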

Let’s look at an example with aggregated data from dataset CO2:

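library(data.table)
library(flextable) # provides as_grouped_data() and as_flextable()
library(magrittr)  # provides the %>% pipe used below
data_CO2 <- dcast(as.data.table(CO2),
  Treatment + conc ~ Type, value.var = "uptake", fun.aggregate = mean)
head(data_CO2)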

# Key: <Treatment, conc>
#     Treatment  conc   Quebec Mississippi
#        <fctr> <num>    <num>       <num>
# 1: nonchilled    95 15.26667    11.30000
# 2: nonchilled   175 30.03333    20.20000
# 3: nonchilled   250 37.40000    27.53333
# 4: nonchilled   350 40.36667    29.90000
# 5: nonchilled   500 39.60000    30.60000
# 6: nonchilled   675 41.50000    30.53333

as_grouped_data() restructures the dataset:

data_CO2 <- as_grouped_data(x = data_CO2, groups = c("Treatment"))
head(data_CO2)

#    Treatment conc   Quebec Mississippi
# 1 nonchilled   NA       NA          NA
# 2       <NA>   95 15.26667    11.30000
# 3       <NA>  175 30.03333    20.20000
# 4       <NA>  250 37.40000    27.53333
# 5       <NA>  350 40.36667    29.90000
# 6       <NA>  500 39.60000    30.60000

The result is suitable for method as_flextable(). A call to this function and a few formatting operations produce the following result:

zz <- flextable::as_flextable(data_CO2) %>%
  bold(j = 1, i = ~ !is.na(Treatment), bold = TRUE, part = "body") %>%
  bold(part = "header", bold = TRUE) %>%
  colformat_double(i = ~ is.na(Treatment), j = "conc", digits = 0, big.mark = "") %>%
  autofit()
zz

conc	Quebec	Mississippi
Treatment: nonchilled
95	15.26667	11.30000
175	30.03333	20.20000
250	37.40000	27.53333
350	40.36667	29.90000
500	39.60000	30.60000
675	41.50000	30.53333
1000	43.16667	31.60000
Treatment: chilled
95	12.86667	9.60000
175	24.13333	14.76667
250	34.46667	16.10000
350	35.80000	16.60000
500	36.66667	16.63333
675	37.50000	18.26667
1000	40.83333	18.73333

Now let’s add small bars in front of the displayed numbers:

zz <- compose(
  zz,
  i = ~ is.na(Treatment), j = "Quebec",
  value = as_paragraph(minibar(Quebec, height = 0.1), " ", as_chunk(Quebec))
) %>%
  compose(
    i = ~ is.na(Treatment), j = "Mississippi",
    value = as_paragraph(minibar(Mississippi, height = 0.1), " ", as_chunk(Mississippi))
  ) %>%
  align(j = 2:3, align = "left") %>%
  width(width = c(.5, 1.5, 1.5))
zz

conc	Quebec	Mississippi
Treatment: nonchilled
95	15.27	11.30
175	30.03	20.20
250	37.40	27.53
350	40.37	29.90
500	39.60	30.60
675	41.50	30.53
1000	43.17	31.60
Treatment: chilled
95	12.87	9.60
175	24.13	14.77
250	34.47	16.10
350	35.80	16.60
500	36.67	16.63
675	37.50	18.27
1000	40.83	18.73

Finally, add a line of text in the footer part:

add_footer_lines(zz, "dataset CO2 has been used for this flextable")

conc	Quebec	Mississippi
Treatment: nonchilled
95	15.27	11.30
175	30.03	20.20
250	37.40	27.53
350	40.37	29.90
500	39.60	30.60
675	41.50	30.53
1000	43.17	31.60
Treatment: chilled
95	12.87	9.60
175	24.13	14.77
250	34.47	16.10
350	35.80	16.60
500	36.67	16.63
675	37.50	18.27
1000	40.83	18.73
dataset CO2 has been used for this flextable

9.2 Dataset summary

Create a dataset summary with functions summarizor() and as_flextable(). summarizor() is a convenience function that can be used to create clinical “Demographic Tables”, and it can also present a summary of any dataset that has a grouping variable.

The overall_label argument adds an Overall category at the end of the produced table.

library(dplyr)
library(flextable)
z <- palmerpenguins::penguins %>% 
  select(-contains("length")) %>% 
  summarizor(
    by = "species",
    overall_label = "Overall")

ft <- as_flextable(z, spread_first_col = TRUE) %>% 
  style(i = ~!is.na(variable), pr_t = fp_text_default(bold = TRUE),
        pr_p = officer::fp_par(text.align = "left", padding = 5, line_spacing = 1.5)) %>% 
  prepend_chunks(i = ~is.na(variable), j = 1, as_chunk("\t")) %>% 
  autofit(add_w = .01)
ft

	Adelie (N=152)	Chinstrap (N=68)	Gentoo (N=124)	Overall (N=344)
island
Biscoe	44 (28.9%)	0 (0.0%)	124 (100.0%)	168 (48.8%)
Dream	56 (36.8%)	68 (100.0%)	0 (0.0%)	124 (36.0%)
Torgersen	52 (34.2%)	0 (0.0%)	0 (0.0%)	52 (15.1%)
bill_depth_mm
Mean (SD)	18.35 (1.22)	18.42 (1.14)	14.98 (0.98)	17.15 (1.97)
Median (IQR)	18.40 (1.50)	18.45 (1.90)	15.00 (1.50)	17.30 (3.10)
Range	15.50 - 21.50	16.40 - 20.80	13.10 - 17.30	13.10 - 21.50
Missing	1 (0.7%)		1 (0.8%)	2 (0.6%)
body_mass_g
Mean (SD)	3 700.66 (458.57)	3 733.09 (384.34)	5 076.02 (504.12)	4 201.75 (801.95)
Median (IQR)	3 700.00 (650.00)	3 700.00 (462.50)	5 000.00 (800.00)	4 050.00 (1 200.00)
Range	2 850.00 - 4 775.00	2 700.00 - 4 800.00	3 950.00 - 6 300.00	2 700.00 - 6 300.00
Missing	1 (0.7%)		1 (0.8%)	2 (0.6%)
sex
female	73 (48.0%)	34 (50.0%)	58 (46.8%)	165 (48.0%)
male	73 (48.0%)	34 (50.0%)	61 (49.2%)	168 (48.8%)
Missing	6 (3.9%)		5 (4.0%)	11 (3.2%)
year
Mean (SD)	2 008.01 (0.82)	2 007.97 (0.86)	2 008.08 (0.79)	2 008.03 (0.82)
Median (IQR)	2 008.00 (2.00)	2 008.00 (2.00)	2 008.00 (2.00)	2 008.00 (2.00)
Range	2 007.00 - 2 009.00	2 007.00 - 2 009.00	2 007.00 - 2 009.00	2 007.00 - 2 009.00
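The same pattern can be tried without the palmerpenguins package. The following is a minimal sketch, assuming only that flextable is installed; it swaps in the built-in iris dataset, with Species as the grouping variable, and leaves out the styling steps.

```r
library(flextable)

# Summarise every column of iris by species and add an overall column
z <- summarizor(iris, by = "Species", overall_label = "Overall")

# Convert the summary to a flextable; variable names become row titles
ft <- as_flextable(z, spread_first_col = TRUE)
ft
```

Since iris has no missing values and only numeric measurements, the resulting table is simpler than the penguins one.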

9.3 Models and tests

glm, lm, gam, htest, kmeans and pam objects can be easily converted to flextable:

9.3.1 GLM example

clotting <- data.frame(
    u = c(5,10,15,20,30,40,60,80,100),
    lot1 = c(118,58,42,35,27,25,21,19,18),
    lot2 = c(69,35,26,21,18,16,13,12,12))
as_flextable(glm(lot1 ~ log(u), data = clotting, family = Gamma))

	Estimate	Standard Error	z value	Pr(>|z|)
(Intercept)	-0.017	0.001	-17.847	0.0000	***
log(u)	0.015	0.000	36.975	0.0000	***
Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

(Dispersion parameter for Gamma family taken to be 0.002446059)
Null deviance: 3.513 on 8 degrees of freedom
Residual deviance: 0.01673 on 7 degrees of freedom

9.3.2 LM example

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm(weight ~ group) %>% as_flextable()

	Estimate	Standard Error	t value	Pr(>|t|)
(Intercept)	5.032	0.220	22.850	0.0000	***
groupTrt	-0.371	0.311	-1.191	0.2490
Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

Residual standard error: 0.6964 on 18 degrees of freedom
Multiple R-squared: 0.07308, Adjusted R-squared: 0.02158
F-statistic: 1.419 on 18 and 1 DF, p-value: 0.2490

9.3.3 GAM example

library(mgcv)
set.seed(2)

dat <- gamSim(1, n = 400, 
  dist = "normal", scale = 2, verbose = FALSE)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), 
  data = dat)

ft <- as_flextable(b)
ft

Component	Term	Estimate	Std Error	t-value	p-value
A. parametric coefficients	(Intercept)	7.833	0.099	79.303	0.0000	***
Component	Term	edf	Ref. df	F-value	p-value
B. smooth terms	s(x0)	2.500	3.115	6.921	0.0001	***
	s(x1)	2.401	2.984	81.858	0.0000	***
	s(x2)	7.698	8.564	88.158	0.0000	***
	s(x3)	1.000	1.000	4.343	0.0378	*
Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

Adjusted R-squared: 0.715, Deviance explained 0.725
GCV : 4.051, Scale est: 3.903, N: 400

9.3.4 Hypothesis testing example

x <- rnorm(50)
y <- runif(30)
ks.test(x, y) %>% as_flextable()

statistic	p.value	method	alternative
0.57	0.0000***	Exact two-sample Kolmogorov-Smirnov test	two-sided
Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

9.3.5 kmeans

cl <- kmeans(scale(mtcars[1:7]), 5)
ft <- as_flextable(cl)
ft

variable	1	2	3	4	5
withinss	2.15	5.03	33.38	4.07	9.58
size	3	5	14	5	5
mpg*	2.0147	0.9108	-0.8281	-0.0582	0.2571
cyl*	-1.2249	-1.2249	1.0149	-0.1050	-0.7769
disp*	-1.2551	-1.0170	0.9874	-0.5703	-0.4244
hp*	-1.2498	-0.7626	0.9120	-0.2696	-0.7714
drat*	1.5214	0.8443	-0.6869	0.4777	-0.3115
wt*	-1.3633	-1.1034	0.7992	-0.1924	-0.1239
qsec*	0.8103	0.0522	-0.6025	-0.3429	1.4915
(*) Centers
Total sum of squares: 217.00
Total within-cluster sum of squares: 54.20
Total between-cluster sum of squares: 162.80
BSS/TSS ratio: 75.02%
Number of iterations: 3

9.3.6 pam

library(cluster)
dat <- as.data.frame(scale(mtcars[1:7]))
cl <- pam(dat, 3)
ft <- as_flextable(cl)
ft

variable	1	2	3
size	6	11	15
max.diss	2.22	2.76	3.16
avg.diss	1.06	1.42	1.59
diameter	2.81	3.99	5.06
separation	0.99	1.48	0.99
avg.width	0.41	0.24	0.29
mpg*	0.1509	1.1962	-0.6124
cyl*	-0.1050	-1.2249	1.0149
disp*	-0.5706	-1.2242	0.3637
hp*	-0.5351	-1.1768	0.4859
drat*	0.5675	0.9042	-0.9848
wt*	-0.3498	-1.3105	0.8715
qsec*	-0.4638	0.5883	-0.2511
(*) Centers
The average silhouette width is 0.2959

9.4 xtable

xtable objects can be converted to flextable objects with function as_flextable().

x <- as.integer(cumsum(1 + round(rnorm(100), 0)))
temp.ts <- ts(x,
  start = c(1954, 7), frequency = 12)
ft <- as_flextable(x = xtable::xtable(temp.ts, digits = 0))
ft

	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec
1954	<na>	<na>	<na>	<na>	<na>	<na>	1	1	2	3	5	5
1955	6	7	9	10	12	11	12	13	14	15	16	17
1956	18	20	19	19	21	23	25	26	27	29	31	32
1957	33	35	37	39	41	43	45	46	46	48	50	50
1958	53	54	54	55	55	54	56	59	61	62	64	63
1959	64	65	65	64	64	65	68	70	73	75	76	78
1960	80	80	80	80	80	82	82	83	83	84	86	86
1961	87	88	88	89	92	92	91	92	93	95	97	99
1962	100	103	105	104	105	106	109	112	113	112	<na>	<na>