Returns a tidy data frame of Oscar Wilde's 7 completed, presented plays. The data frame has two columns: text, which contains the text of the plays divided into elements of up to about 70 characters each, and play, a factor containing the titles of the plays, with levels in order of publication.

wilde_plays()

Value

A data frame with two columns: text and play

Details

Users should be aware that there are some differences in usage between the plays as made available by Project Gutenberg. For example, "anything" vs. "any thing", "Mr" vs. "Mr.", and using underscores vs. all caps to indicate italics/emphasis.
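If these differences matter for an analysis, one option is to normalise the text column before counting or tokenising. The following is only a minimal sketch in base R: the toy data frame stands in for the output of wilde_plays(), its rows are invented for illustration, and normalise_text() is a hypothetical helper that is not part of this package.

```r
# Toy stand-in for the output of wilde_plays(); these rows are invented
# for illustration and are not taken from the actual texts.
plays <- data.frame(
  text = c("Mr Worthing has any thing he wants.",
           "It is _perfectly_ ABSURD, Mr. Worthing."),
  play = factor(c("Play A", "Play B")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: harmonise a few of the variants mentioned above.
normalise_text <- function(x) {
  x <- gsub("\\bMr\\b\\.?", "Mr.", x, perl = TRUE)          # "Mr" or "Mr." becomes "Mr."
  x <- gsub("\\bany thing\\b", "anything", x, perl = TRUE)  # join "any thing"
  x <- gsub("_([^_]+)_", "\\1", x, perl = TRUE)             # drop underscore emphasis markers
  x
}

plays$text_clean <- normalise_text(plays$text)
print(plays$text_clean)
```

Emphasis written in all caps is left untouched here, since lowercasing it safely depends on the analysis. The same helper could be applied to the text column returned by wilde_plays(), for example inside dplyr::mutate().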

Examples

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
wilde_plays() %>% group_by(play) %>% summarise(total_lines = n())
#> # A tibble: 7 x 2
#>   play                            total_lines
#>   <fct>                                 <int>
#> 1 Vera; Or, The Nihilists                2835
#> 2 Salome                                 2517
#> 3 The Dutchess of Padua                  5470
#> 4 Lady Windermere's Fan                  2912
#> 5 A Woman of No Importance               3274
#> 6 An Ideal Husband                       4464
#> 7 The Importance of Being Earnest        3091