tidyquandl
to Quandl
vignettes/tidyquandl-vs-quandl.Rmd
tidyquandl-vs-quandl.Rmd
This vignette will demonstrate some differences between the tidquandl
package and the earlier Quandl-developed Quandl
package that tidyquandl
aims to supercede.
In Quandl
, API keys are both set and retrieved through the Quandl.api_key()
function. It won’t look anywhere for your key, so you always have to pass it in.
library(Quandl)
#> Loading required package: xts
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
Quandl.api_key(Sys.getenv("QUANDL_API_KEY"))
(notice that Quandl
directly attaches packages to the current workspace, which is generally considered poor practice.)
tidyquandl
replaces this with quandl_key_set()
and quandl_key_get()
(the latter is currently an internal-only function), and quandl_key_get()
will look for a QUANDL_API_KEY
environment variable if no key is passed in as an argument.
library(tidyquandl)
quandl_key_set()
Both functions store the key in the same options
entry (Quandl.api_key
) for the duration of the R session, so if you need to use both packages, you only need to set your key once and can do so with either package.
We’ve replaced the Quandl.datatable()
function with quandl_datatable
. For many queries, the arguments will be identical and the only difference in output will be that tidyquandl
returns a tibble (tbl_df
) while Quandl
returns a plain data.frame
:
Quandl.datatable("WIKI/PRICES", ticker = "AAPL", date.gte = "2018-01-01", date.lt = "2018-01-08", qopts.columns = c("ticker", "date", "close", "volume"))
#> ticker date close volume
#> 1 AAPL 2018-01-02 172.26 25048048
#> 2 AAPL 2018-01-03 172.23 28819653
#> 3 AAPL 2018-01-04 173.03 22211345
#> 4 AAPL 2018-01-05 175.00 23016177
quandl_datatable("WIKI/PRICES", ticker = "AAPL", date.gte = "2018-01-01", date.lt = "2018-01-08", qopts.columns = c("ticker", "date", "close", "volume"))
#> # A tibble: 4 x 4
#> ticker date close volume
#> <chr> <date> <dbl> <dbl>
#> 1 AAPL 2018-01-02 172. 25048048
#> 2 AAPL 2018-01-03 172. 28819653
#> 3 AAPL 2018-01-04 173. 22211345
#> 4 AAPL 2018-01-05 175 23016177
Calling tibble::as_tibble()
on the Quandl
output would make it identical to the tidyquandl
output in this case.
One of the motivating reasons for making tidyquandl
was to handle the case where we want to get information on hundreds of tickers at once.
# 500!
tickers <- c(
"AAPL", "ABC", "AA", "ABBV", "ACE", "AGN", "ADP", "ADSK", "ABT",
"AEE", "ACN", "AEP", "ADBE", "AET", "ADI", "ADM", "ADS", "AIV",
"ADT", "ALL", "AES", "ALTR", "ALXN", "AFL", "AMAT", "AIG", "AMGN",
"AIZ", "AMT", "AKAM", "ALLE", "AME", "AMP", "AMZN", "A", "AN",
"AON", "APD", "APH", "AVB", "AVP", "APA", "APC", "AXP", "AZO",
"BA", "ARG", "BAC", "ATI", "AVY", "BCR", "BEAM", "BEN", "BAX",
"BBBY", "BBT", "BIIB", "BBY", "BLL", "BDX", "BMS", "BRCM", "BF_B",
"BHI", "BSX", "BK", "BWA", "BLK", "BXP", "CA", "BMY", "BRK_B",
"CAT", "BTU", "CBS", "C", "CCE", "CCI", "CAG", "CAH", "CAM",
"CERN", "CB", "CFN", "CBG", "CHK", "CI", "CCL", "CINF", "CELG",
"CL", "CF", "CMCSA", "CME", "CHRW", "CMG", "CMS", "CLF", "CLX",
"COF", "CMA", "COL", "CMI", "COST", "CNP", "CNX", "COG", "COH",
"CSX", "COP", "CTL", "COV", "CPB", "CTXS", "CRM", "CSC", "CVS",
"CSCO", "D", "CTAS", "DD", "CTSH", "CVC", "DGX", "CVX", "DHR",
"DAL", "DIS", "DLPH", "DE", "DFS", "DNB", "DG", "DO", "DHI",
"DOW", "DISCA", "DTE", "DLTR", "DTV", "DNR", "DVA", "DVN", "DOV",
"EBAY", "DPS", "ECL", "DRI", "ED", "DUK", "EMC", "EMN", "EA",
"EMR", "EOG", "EQR", "EFX", "ESRX", "EIX", "EL", "ETFC", "ETN",
"ETR", "EXC", "EQT", "EXPE", "ESV", "FAST", "FB", "FCX", "EW",
"EXPD", "FE", "F", "FISV", "FDO", "FDX", "FLS", "FMC", "FFIV",
"FOXA", "FIS", "FRX", "FITB", "FTI", "FLIR", "FLR", "GCI", "FOSL",
"GGP", "FSLR", "FTR", "GAS", "GM", "GD", "GE", "GNW", "GOOG",
"GHC", "GILD", "GIS", "GRMN", "GLW", "GS", "GWW", "GME", "HAL",
"GPC", "GPS", "HCBK", "HCN", "GT", "HAR", "HOG", "HAS", "HON",
"HBAN", "HPQ", "HCP", "HD", "HRL", "HES", "HIG", "HSP", "HST",
"HOT", "HP", "HUM", "HRB", "ICE", "HRS", "INTU", "HSY", "IPG",
"IR", "IBM", "IFF", "ISRG", "IGT", "ITW", "INTC", "JBL", "JCI",
"IP", "IRM", "JNPR", "IVZ", "JWN", "JEC", "KLAC", "JNJ", "KMI",
"JOY", "KO", "K", "KR", "KEY", "KIM", "KSU", "KMB", "KMX", "LEN",
"KORS", "LH", "KRFT", "KSS", "LLL", "L", "LB", "LM", "LEG", "LMT",
"LO", "LIFE", "LLTC", "LLY", "LUK", "LNC", "M", "MA", "LOW",
"MAC", "LRCX", "LSI", "MAS", "LUV", "MCD", "LYB", "MCK", "MCO",
"MAR", "MDT", "MAT", "MCHP", "MJN", "MMC", "MDLZ", "MMM", "MNST",
"MET", "MON", "MHFI", "MHK", "MPC", "MKC", "MRO", "MSI", "MO",
"MU", "MOS", "MWV", "MRK", "NBL", "MS", "MSFT", "NDAQ", "MTB",
"NEE", "MUR", "NFLX", "MYL", "NI", "NBR", "NLSN", "NE", "NOV",
"NEM", "NSC", "NFX", "NTRS", "NKE", "NUE", "NOC", "NWL", "NRG",
"OI", "NTAP", "OMC", "NU", "ORLY", "NVDA", "PAYX", "PBCT", "NWSA",
"PCAR", "OKE", "ORCL", "PCL", "OXY", "PCP", "PEG", "PBI", "PETM",
"PCG", "PFG", "PCLN", "PDCO", "PHM", "PEP", "PLD", "PFE", "PM",
"PG", "PNC", "PNR", "PH", "POM", "PKI", "PPL", "PLL", "PSA",
"PNW", "PVH", "PPG", "PX", "PRGO", "PRU", "R", "PSX", "RDC",
"PWR", "RF", "PXD", "RHI", "QCOM", "QEP", "RIG", "RL", "RAI",
"ROP", "REGN", "RRC", "RTN", "RHT", "ROK", "SEE", "ROST", "SHW",
"RSG", "SBUX", "SCG", "SCHW", "SNA", "SE", "SNI", "SO", "SIAL",
"SJM", "SRCL", "SLB", "SLM", "SNDK", "STJ", "STT", "STX", "SPG",
"SPLS", "SRE", "STI", "SYK", "SYMC", "STZ", "TAP", "SWK", "TDC",
"SWN", "TE", "SWY", "TEG", "SYY", "T", "TIF", "TJX", "TMK", "TEL",
"TRV", "TGT", "TSN", "THC", "TMO", "TRIP", "TROW", "TYC", "TSO",
"TSS", "UNP", "TWC", "TWX", "TXN", "TXT", "UTX", "UNH", "VAR",
"VFC", "UPS", "URBN", "VMC", "USB", "V", "VZ", "VIAB", "WAG",
"WAT", "VLO", "WDC", "WEC", "VNO", "WFC", "VRSN", "WFM", "VRTX",
"WHR", "VTR", "WIN", "WLP", "WM", "WMB", "WY", "WYN", "WYNN",
"WMT", "WPX", "XLNX", "XOM", "WU", "XRAY", "XRX", "X", "XEL",
"YUM", "XL", "XYL", "YHOO", "ZION", "ZMH", "ZTS", "TSCO", "FLWS",
"SRCE", "FUBC"
)
These are all valid tickers in the “WIKI/PRICES” table, though they might not all overlap in time.
Quandl
can’t handle this very well:
result <- Quandl.datatable("WIKI/PRICES", ticker = tickers, date = "2018-01-02", qopts.columns = c("ticker", "date", "close", "volume"))
#> No encoding supplied: defaulting to UTF-8.
#> Error: <html>
#> <head><title>414 Request-URI Too Large</title></head>
#> <body bgcolor="white">
#> <center><h1>414 Request-URI Too Large</h1></center>
#> <hr><center>nginx/1.12.2</center>
#> </body>
#> </html>
We get an error due to the request being too large, and a poorly-formatted error at that. We could write some code to help split tickers
into smaller batches (how small?), or we could let tidyquandl
worry about that for us:
result <- quandl_datatable("WIKI/PRICES", ticker = tickers, date = "2018-01-02", qopts.columns = c("ticker", "date", "close", "volume"))
nrow(result)
#> [1] 413
head(result)
#> # A tibble: 6 x 4
#> ticker date close volume
#> <chr> <date> <dbl> <dbl>
#> 1 A 2018-01-02 67.6 1047772
#> 2 AA 2018-01-02 55.2 2928905
#> 3 AAPL 2018-01-02 172. 25048048
#> 4 ABBV 2018-01-02 98.4 4559726
#> 5 ABC 2018-01-02 94.0 1137335
#> 6 ABT 2018-01-02 58.8 10112783
The batching is a bit naive and overly-cautious, but I have yet to see it fail due to too many parameters.
Some queries return many results, more than Quandl wants to return for one request. What if we want full price history for Apple and Google prior to 2018?
result <- Quandl.datatable("WIKI/PRICES", ticker = c("AAPL", "GOOGL"), date.lt = "2018-01-01", qopts.columns = c("ticker", "date", "close", "volume"))
#> Warning: This call returns more data. To request more pages, please
#> set paginate=TRUE in your Quandl.datatable() call. For more information
#> see our documentation: https://github.com/quandl/quandl-r/blob/master/
#> README.md#datatables
The Quandl
package is happy to fetch many pages of results, but you have to force it to do this with paginate = TRUE
:
result <- Quandl.datatable("WIKI/PRICES", ticker = c("AAPL", "GOOGL"), date.lt = "2018-01-01", qopts.columns = c("ticker", "date", "close", "volume"), paginate = TRUE)
nrow(result)
#> [1] 12706
With tidyquandl
, you never need to worry about this; all pages of results are fetched and combined for all queries.
result <- quandl_datatable("WIKI/PRICES", ticker = c("AAPL", "GOOGL"), date.lt = "2018-01-01", qopts.columns = c("ticker", "date", "close", "volume"))
nrow(result)
#> [1] 12706
Quandl has recently added the ability to grab metadata about any table through the API. You can’t currently get this with Quandl
(Quandl::metaData()
does something else), but you can with tidyquandl
!
quandl_datatable_meta("WIKI/PRICES")
#> $vendor_code
#> [1] "WIKI"
#>
#> $datatable_code
#> [1] "PRICES"
#>
#> $name
#> [1] "Quandl End of Day Stocks Info"
#>
#> $description
#> NULL
#>
#> $columns
#> name type
#> 1 ticker String
#> 2 date Date
#> 3 open BigDecimal(34,12)
#> 4 high BigDecimal(34,12)
#> 5 low BigDecimal(34,12)
#> 6 close BigDecimal(34,12)
#> 7 volume BigDecimal(37,15)
#> 8 ex-dividend BigDecimal(42,20)
#> 9 split_ratio double
#> 10 adj_open BigDecimal(50,28)
#> 11 adj_high BigDecimal(50,28)
#> 12 adj_low BigDecimal(50,28)
#> 13 adj_close BigDecimal(50,28)
#> 14 adj_volume double
#>
#> $filters
#> [1] "ticker" "date"
#>
#> $primary_key
#> [1] "ticker" "date"
#>
#> $premium
#> NULL
#>
#> $status
#> $status$refreshed_at
#> [1] "2018-03-27T23:54:20.000Z"
#>
#> $status$status
#> [1] "ON TIME"
#>
#> $status$expected_at
#> [1] "*"
#>
#> $status$update_frequency
#> [1] "CONTINUOUS"
Notice this returns a list (where one element is a plain data.frame
), rather than forcing it all into a data.frame
/tibble
. This format is subject to change; could be nice to enter a vector of table names and receive a tibble with one row per table and columns
as a nested tibble (see #34).
Quandl.datatable("FOO/BAR", ticker = "AAPL")
#> Error: {"quandl_error":{"code":"QEPx04","message":"The following datatable 'FOO/BAR' does not exist."}}
quandl_datatable("FOO/BAR", ticker = "AAPL")
#> Request failed [404]. Retrying in 1.4 seconds...
#> Request failed [404]. Retrying in 2.4 seconds...
#> No encoding supplied: defaulting to UTF-8.
#> Error in quandl_api(path, "csv", params): The following datatable 'FOO/BAR' does not exist.
#> Quandl Error Code: QEPx04
#> HTTP Status Code: 404
While we think tidyquandl
is now the best way to query Quandl’s Tables API from R, the Quandl
package has a lot more uses we don’t address, like:
Quandl()
, Quandl.dataset.get()
)Quandl.database.bulk_download_to_file()
, Quandl.database.bulk_download_url()
)Quandl.search()
)We’re open to addressing any of these in tidyquandl
if there’s demand for it; let us know in the issues!