Rbills helps you scrape information from PDF files. It is designed to scrape only Rocky Mountain Power and Summit Energy documents.
The Rbills package provides three functions:
get_pdf: get all files in a directory, or only the files that you choose in a directory.
read_pdf_seg: read PDF files from Summit Energy documents.
read_pdf_rmp: read PDF files from Rocky Mountain Power documents.
These functions are intended for use by the data team only.
You can install the development version from GitHub.
There are two main ways to use the get_pdf function:
choose.file = FALSE is the default and gets all files in the directory.
choose.file = TRUE lets you choose which files in the directory to get.
The read_pdf_seg function only works for Summit Energy documents. It returns a table of data that fits the template of a given file. The following code is an example.
path <- system.file("data-raw", package = "Rbills", mustWork = TRUE)
# Choose 'example_gasbill' file.
x <- get_pdf(path, choose.file = TRUE)
gas_table <- read_pdf_seg(path, x)
head(gas_table)
#> meter_id billing_month invoice_date gas_supp_price gas_supp_mmbtu
#> 1 24800033 May 2013 June 5, 2013 2.28 1056
#> gas_supp_ext trans_fuel_price trans_fuel_mmbtu trans_fuel_ext
#> 1 2407.68 2.1 37 77.7
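
For the choose.file = FALSE mode of get_pdf, which is described as getting all files in a directory, the idea can be sketched in base R. This is only an illustration under assumptions: list_pdfs is a hypothetical helper, not part of Rbills, and the real get_pdf may return something different (for example full paths rather than file names). The demo files are empty placeholders created in a temporary directory.

```r
# Hypothetical stand-in for get_pdf(path, choose.file = FALSE).
# Assumption: "get all files" means listing every PDF in the directory.
list_pdfs <- function(path) {
  # Keep only names ending in ".pdf", ignoring case.
  list.files(path, pattern = "\\.pdf$", ignore.case = TRUE)
}

# Build a small demo directory with placeholder files (not real bills).
demo_dir <- file.path(tempdir(), "rbills_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("bill_a.pdf", "bill_b.PDF", "notes.txt"))))

# Only the two PDF names are returned; notes.txt is skipped.
pdfs <- list_pdfs(demo_dir)
print(pdfs)
```

With the package installed, the equivalent call would presumably be get_pdf(path, choose.file = FALSE), or simply get_pdf(path), since FALSE is the default.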