class: center, middle, inverse, title-slide

# Time Series Features
## FPP3, Chapter 4
### Robert W. Walker
### AGSM
### 2021-02-06

---

# An Overview

---

## Packages

Getting started

```
library(tidyverse)
library(fpp3)
library(purrr)
library(gganimate)
library(seasonal)
```

---

# When things go wrong

---

## The ABS stuff-up

---

```r
employed
```

```
## # A tsibble: 440 x 4 [1M]
##        Time Month  Year Employed
##       <mth> <ord> <dbl>    <dbl>
##  1 1978 Feb Feb    1978    5986.
##  2 1978 Mar Mar    1978    6041.
##  3 1978 Apr Apr    1978    6054.
##  4 1978 May May    1978    6038.
##  5 1978 Jun Jun    1978    6031.
##  6 1978 Jul Jul    1978    6036.
##  7 1978 Aug Aug    1978    6005.
##  8 1978 Sep Sep    1978    6024.
##  9 1978 Oct Oct    1978    6046.
## 10 1978 Nov Nov    1978    6034.
## # … with 430 more rows
```

---

## The ABS stuff-up

[Details:](https://robjhyndman.com/hyndsight/abs-seasonal-adjustment-3/)

```r
employed %>%
  autoplot(Employed) +
  ggtitle("Total employed") + ylab("Thousands") + xlab("Year")
```

<img src="index_files/figure-html/abs3-1.png" width="576" />

---

## The ABS stuff-up

```r
employed %>%
  filter(Year >= 2005) %>%
  autoplot(Employed) +
  ggtitle("Total employed") + ylab("Thousands") + xlab("Year")
```

<img src="index_files/figure-html/abs4-1.png" width="576" />

---

## The ABS stuff-up

```r
employed %>%
  filter(Year >= 2005) %>%
  gg_season(Employed, label = "right") +
  ggtitle("Total employed") + ylab("Thousands")
```

<img src="index_files/figure-html/abs5-1.png" width="576" />

---

## The ABS stuff-up

```r
employed %>%
  mutate(diff = difference(Employed)) %>%
  filter(Month == "Sep") %>%
  ggplot(aes(y = diff, x = 1)) +
  geom_boxplot() + coord_flip() +
  ggtitle("Sep - Aug: total employed") +
  xlab("") + ylab("Thousands") +
  scale_x_continuous(breaks = NULL, labels = NULL)
```

<img src="index_files/figure-html/abs6-1.png" width="576" />

---

## The ABS stuff-up

```r
dcmp <- employed %>%
  filter(Year >= 2005) %>%
  model(stl = STL(Employed ~ season(window = 11), robust = TRUE))
components(dcmp) %>% autoplot()
```

<img src="index_files/figure-html/abs7-1.png" width="576" />

---

## The ABS stuff-up

```r
components(dcmp) %>%
  filter(year(Time) == 2013) %>%
  gg_season(season_year) +
  ggtitle("Seasonal component") + guides(colour = "none")
```

<img src="index_files/figure-html/abs8-1.png" width="576" />

---

## The ABS stuff-up

```r
components(dcmp) %>%
  as_tsibble() %>%
  autoplot(season_adjust)
```

<img src="index_files/figure-html/abs9-1.png" width="576" />

---

## The ABS stuff-up

* August 2014 employment numbers higher than expected.
* Supplementary survey usually conducted in August for employed people.
* Most likely, some employed people were claiming to be unemployed in August to avoid supplementary questions.
* Supplementary survey not run in 2014, so no motivation to lie about employment.
* In previous years, seasonal adjustment fixed the problem.
* The ABS has now adopted a new method to avoid the bias.

---

## Some Data for Today and General Considerations

Panel data. Multiple time series are often described as a panel, a cross-section of time series, or a time series of cross-sections. The data structure has two [non-overlapping] indices: one identifying the unit (the key) and one identifying time (the index). Let's review, and discuss a bit, what exactly we mean.

---

## Extending the Data

`fredr` has two accompanying support documents. The first forms a partial basis for our homework exercise this week. The second arises from a more general effort to use the nice features of `fredr`.
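
Before pulling in more series, the two-index panel structure described above can be made concrete. The following is a minimal sketch, not part of the original analysis: the column names `Sector`, `Month`, and `Employed` and all of the values are made up purely for illustration.

```r
library(tsibble)
library(dplyr)

# Illustrative values only: two hypothetical series observed over the same three months.
panel <- tibble(
  Sector   = rep(c("A", "B"), each = 3),
  Month    = yearmonth(rep(c("2020 Jan", "2020 Feb", "2020 Mar"), times = 2)),
  Employed = c(100, 102, 101, 50, 49, 51)
) %>%
  as_tsibble(key = Sector, index = Month)  # key = which series, index = when

key_vars(panel)   # column(s) identifying the series
index_var(panel)  # column identifying time
n_keys(panel)     # number of distinct series in the panel
```

Each row is identified by exactly one key value and one index value, which is the sense in which the two indices do not overlap.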
---

```
# One-time download. Assumes a FRED API key has already been registered,
# e.g. with fredr::fredr_set_key().
US.Employment <- map_dfr(
  unique(us_employment$Series_ID),
  ~fredr::fredr_series_observations(.))
save(US.Employment, file="USEmployment.RData")
```

```r
load("USEmployment.RData")
# Lookup table mapping each FRED series id to its title
us_employment %>%
  data.frame() %>%
  group_by(Series_ID) %>%
  summarise(Title = first(Title)) %>%
  mutate(series_id = Series_ID) %>%
  ungroup() %>%
  select(-Series_ID) -> Names.List
# Attach the titles and convert to a tsibble keyed by Title
US.Employment.T <- left_join(US.Employment, Names.List, by = "series_id") %>%
  mutate(YM = yearmonth(date)) %>%
  rename(Employed = value) %>%
  as_tsibble(., index=YM, key=Title)
```

---

## Additional Features

For much of the study of time series, the key issue is stationarity. For now, we will do some hand waving, to be clarified in Chapter 5 and more fully in Chapter 9. We want to compute things first and build out the details later. Let's take my new retail employment data.

---

# A Recreation on New Data

```r
# Newly downloaded series (red) against the series shipped with us_employment (black)
EMPN <- US.Employment.T %>%
  filter(YM > yearmonth("1990-01") & Title=="Retail Trade") %>%
  as_tsibble(index=YM)
EMPO <- us_employment %>%
  filter(Title=="Retail Trade" & Month > yearmonth("1990-01")) %>%
  as_tsibble(., index=Month)
Plot1 <- ggplot(EMPN, aes(x=YM, y=Employed)) +
  geom_line(color = "red") +
  geom_line(data=EMPO, aes(x=Month, y=Employed), inherit.aes=FALSE)
Plot1
```

---

## Data are Revised Occasionally

<img src="index_files/figure-html/P1A-1.png" width="576" />

---

```r
library(patchwork)
dcmp <- EMPO %>% model(stl = STL(Employed))
Plot2 <- components(dcmp) %>% autoplot()
dcmp <- EMPN %>% model(stl = STL(Employed))
Plot3 <- components(dcmp) %>% autoplot()
Plot1 / (Plot2 + Plot3)
```

---

<img src="index_files/figure-html/P2-1.png" width="792" />

---

# Three Sectors

```r
USET <- US.Employment.T %>%
  filter(YM > yearmonth("1990-01"),
         Title%in%c("Retail Trade","Financial Activities","Manufacturing")) %>%
  as_tsibble(., index=YM, key=Title)
USET %>% autoplot(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-4-1.png" width="576" />

---

## Retail (season)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Retail Trade")) %>%
  as_tsibble(., index=YM) %>%
  gg_season(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-5-1.png" width="576" />

---

## Retail (subseries)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Retail Trade")) %>%
  as_tsibble(., index=YM) %>%
  gg_subseries(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-6-1.png" width="576" />

---

## Retail (lag)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Retail Trade")) %>%
  as_tsibble(., index=YM) %>%
  gg_lag(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-7-1.png" width="576" />

---

## Manufacturing (season)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Manufacturing")) %>%
  as_tsibble(., index=YM) %>%
  gg_season(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-8-1.png" width="576" />

---

## Manufacturing (subseries)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Manufacturing")) %>%
  as_tsibble(., index=YM) %>%
  gg_subseries(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-9-1.png" width="576" />

---

## Manufacturing (lag)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Manufacturing")) %>%
  as_tsibble(., index=YM) %>%
  gg_lag(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-10-1.png" width="576" />

---

## Financial (season)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Financial Activities")) %>%
  as_tsibble(., index=YM) %>%
  gg_season(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-11-1.png" width="576" />

---

## Financial (subseries)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Financial Activities")) %>%
  as_tsibble(., index=YM) %>%
  gg_subseries(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-12-1.png" width="576" />

---

## Financial (lag)

```r
US.Employment.T %>%
  filter(YM > yearmonth("1990-01"), Title%in%c("Financial Activities")) %>%
  as_tsibble(., index=YM) %>%
  gg_lag(Employed)
```

<img src="index_files/figure-html/unnamed-chunk-13-1.png" width="576" />

---

## Features: Summary

The `features()` function is the main tool for computing tidy summaries and statistics for time series in this index/key format. For example, a basic summary:

```r
USET %>%
  features(Employed, features=list(mean=mean,min=min,max=max,sd=sd,quantile))
```

```
## # A tibble: 3 x 10
##   Title                mean    min    max    sd   `0%`  `25%`  `50%`  `75%` `100%`
##   <chr>               <dbl>  <dbl>  <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
## 1 Financial Activit…  7767.  6472   8846   640.  6472   7363   7876   8226.  8846
## 2 Manufacturing      14554. 11340  17870  2241. 11340  12333. 14219  17088  17870
## 3 Retail Trade       14746. 12548. 16394.  915. 12548. 14336. 14962. 15387. 16394.
```

---

### Features: Correlation Features

Learning about the time series properties

```r
USET %>% features(Employed, features=feat_acf)
```

```
## # A tibble: 3 x 8
##   Title       acf1 acf10 diff1_acf1 diff1_acf10 diff2_acf1 diff2_acf10 season_acf1
##   <chr>      <dbl> <dbl>      <dbl>       <dbl>      <dbl>       <dbl>       <dbl>
## 1 Financial… 0.990  8.94     0.283        0.165     -0.313       0.415       0.883
## 2 Manufactu… 0.995  9.35     0.0466       0.128     -0.499       0.505       0.925
## 3 Retail Tr… 0.951  7.29     0.133        0.377     -0.198       0.305       0.876
```

```r
USET %>% group_by(Title) %>% ACF(Employed) %>% autoplot()
```

<img src="index_files/figure-html/unnamed-chunk-16-1.png" width="576" />

---

### For Contrast: Ford Returns

```r
library(tidyquant)
Ford <- tq_get("F", from="2000-01-01")
FordT <- Ford %>% as_tsibble(index=date)
FordT %>% autoplot(adjusted)
```

<img src="index_files/figure-html/unnamed-chunk-17-1.png" width="576" />

---

```r
# Monthly returns computed from the adjusted closing price
FC <- Ford %>%
  tq_transmute(adjusted, mutate_fun = periodReturn, period = "monthly") %>%
  mutate(YM = yearmonth(date)) %>%
  as_tsibble(., index=YM)
FC %>% autoplot(monthly.returns)
```

<img src="index_files/figure-html/unnamed-chunk-18-1.png" width="576" />

---

## Ford's ACF

The 6/7 and 12/13 patterns are interesting...

```r
library(patchwork)
FC1 <- FC %>% ACF(monthly.returns) %>% autoplot()
FC2 <- FC %>% PACF(monthly.returns) %>% autoplot()
FC1+FC2
```

<img src="index_files/figure-html/unnamed-chunk-19-1.png" width="576" />

---

## Decomposition Features

```r
USET %>% features(Employed, feat_stl)
```

```
## # A tibble: 3 x 10
##   Title trend_strength seasonal_streng… seasonal_peak_y… seasonal_trough…
##   <chr>          <dbl>            <dbl>            <dbl>            <dbl>
## 1 Fina…          0.999            0.812                6                3
## 2 Manu…          0.999            0.531                8                3
## 3 Reta…          0.982            0.824               11                3
## # … with 5 more variables: spikiness <dbl>, linearity <dbl>, curvature <dbl>,
## #   stl_e_acf1 <dbl>, stl_e_acf10 <dbl>
```

---

## With More Data

```r
# Keep the series whose maximum employment lies between 8,000 and 120,000 (thousands)
NUSET8k <- US.Employment.T %>%
  data.frame() %>%
  group_by(Title) %>%
  summarise(MaxE = max(Employed)) %>%
  arrange(desc(MaxE)) %>%
  filter(MaxE > 8000 & MaxE < 120000)
USET8k <- left_join(NUSET8k, US.Employment.T, by = "Title") %>%
  as_tsibble(., index=YM, key=Title)
```

---

# An Improvement on the Trend/Season

<iframe src="st.html" width="800" height="500" seamless="seamless" frameBorder="0"> </iframe>

---

Definitions of the other statistics are at the bottom of [the FPP3 section on STL features](https://otexts.com/fpp3/stlfeatures.html).
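
For reference, the two strength measures reported by `feat_stl` follow the definitions in the FPP3 section linked above. The notation is an assumption of this note rather than something used elsewhere in these slides: write the STL decomposition as `\(y_t = T_t + S_t + R_t\)`, with trend `\(T_t\)`, seasonal component `\(S_t\)`, and remainder `\(R_t\)`. Then

`$$F_T = \max\left(0,\; 1 - \frac{\mathrm{Var}(R_t)}{\mathrm{Var}(T_t + R_t)}\right), \qquad F_S = \max\left(0,\; 1 - \frac{\mathrm{Var}(R_t)}{\mathrm{Var}(S_t + R_t)}\right)$$`

Both measures lie between zero and one. Values near one mean the remainder is small relative to the trend (or seasonal) component, which is how `trend_strength` and `seasonal_strength_year` in the table should be read.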

```r
library(kableExtra)
USET8k %>%
  features(Employed, feat_stl) %>%
  knitr::kable(format="html") %>%
  scroll_box(width = "100%", height = "300px")
```

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:300px; overflow-x: scroll; width:100%; "><table>
<thead>
<tr>
<th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> Title </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> trend_strength </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> seasonal_strength_year </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> seasonal_peak_year </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> seasonal_trough_year </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> spikiness </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> linearity </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> curvature </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> stl_e_acf1 </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> stl_e_acf10 </th>
</tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Construction </td> <td style="text-align:right;"> 0.9992344 </td> <td style="text-align:right;"> 0.9609181 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 229.8766271 </td> <td style="text-align:right;"> 53917.767 </td> <td style="text-align:right;"> -1083.89554 </td> <td style="text-align:right;"> 0.5727899 </td> <td style="text-align:right;"> 0.5300631 </td> </tr>
<tr> <td style="text-align:left;"> Durable Goods </td> <td style="text-align:right;"> 0.9942068 </td> <td style="text-align:right;"> 0.1901103 </td> <td style="text-align:right;"> 9 </td> <td
style="text-align:right;"> 7 </td> <td style="text-align:right;"> 3750.8679821 </td> <td style="text-align:right;"> 1297.216 </td> <td style="text-align:right;"> -40395.39360 </td> <td style="text-align:right;"> 0.7459027 </td> <td style="text-align:right;"> 1.2253160 </td> </tr> <tr> <td style="text-align:left;"> Education and Health Services </td> <td style="text-align:right;"> 0.9998700 </td> <td style="text-align:right;"> 0.6981878 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 8196.8445837 </td> <td style="text-align:right;"> 217957.384 </td> <td style="text-align:right;"> 58541.46490 </td> <td style="text-align:right;"> 0.5322021 </td> <td style="text-align:right;"> 0.6022304 </td> </tr> <tr> <td style="text-align:left;"> Education and Health Services: Health Care </td> <td style="text-align:right;"> 0.9990282 </td> <td style="text-align:right;"> 0.3143228 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 18072.7192847 </td> <td style="text-align:right;"> 46027.690 </td> <td style="text-align:right;"> -433.39830 </td> <td style="text-align:right;"> 0.4985224 </td> <td style="text-align:right;"> 0.5506820 </td> </tr> <tr> <td style="text-align:left;"> Education and Health Services: Health Care and Social Assistance </td> <td style="text-align:right;"> 0.9989230 </td> <td style="text-align:right;"> 0.2723145 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 70415.0863382 </td> <td style="text-align:right;"> 63330.409 </td> <td style="text-align:right;"> 22.27873 </td> <td style="text-align:right;"> 0.5196615 </td> <td style="text-align:right;"> 0.6020996 </td> </tr> <tr> <td style="text-align:left;"> Financial Activities </td> <td style="text-align:right;"> 0.9999733 </td> <td style="text-align:right;"> 0.8640456 </td> <td style="text-align:right;"> 7 </td> <td 
style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0.7851155 </td> <td style="text-align:right;"> 78437.507 </td> <td style="text-align:right;"> -272.33652 </td> <td style="text-align:right;"> 0.7198773 </td> <td style="text-align:right;"> 0.8848930 </td> </tr> <tr> <td style="text-align:left;"> Goods-Producing </td> <td style="text-align:right;"> 0.9962894 </td> <td style="text-align:right;"> 0.8047913 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 12648.5279515 </td> <td style="text-align:right;"> 43346.420 </td> <td style="text-align:right;"> -64790.70918 </td> <td style="text-align:right;"> 0.7299629 </td> <td style="text-align:right;"> 1.0572237 </td> </tr> <tr> <td style="text-align:left;"> Government </td> <td style="text-align:right;"> 0.9999110 </td> <td style="text-align:right;"> 0.9791623 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 466.3720488 </td> <td style="text-align:right;"> 190943.166 </td> <td style="text-align:right;"> -19813.98634 </td> <td style="text-align:right;"> 0.5978552 </td> <td style="text-align:right;"> 0.5534737 </td> </tr> <tr> <td style="text-align:left;"> Government: Local Government </td> <td style="text-align:right;"> 0.9998498 </td> <td style="text-align:right;"> 0.9846446 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 361.6591621 </td> <td style="text-align:right;"> 96009.466 </td> <td style="text-align:right;"> -15448.65308 </td> <td style="text-align:right;"> 0.6392240 </td> <td style="text-align:right;"> 0.7131609 </td> </tr> <tr> <td style="text-align:left;"> Government: Local Government Education </td> <td style="text-align:right;"> 0.9998035 </td> <td style="text-align:right;"> 0.9957586 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 
42.1496509 </td> <td style="text-align:right;"> 54937.058 </td> <td style="text-align:right;"> -8478.90619 </td> <td style="text-align:right;"> 0.5415432 </td> <td style="text-align:right;"> 0.5148335 </td> </tr> <tr> <td style="text-align:left;"> Leisure and Hospitality </td> <td style="text-align:right;"> 0.9973560 </td> <td style="text-align:right;"> 0.5856421 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 618138.6017445 </td> <td style="text-align:right;"> 136403.031 </td> <td style="text-align:right;"> 22437.16037 </td> <td style="text-align:right;"> 0.5241725 </td> <td style="text-align:right;"> 0.6176190 </td> </tr> <tr> <td style="text-align:left;"> Leisure and Hospitality: Accommodation and Food Services </td> <td style="text-align:right;"> 0.9695570 </td> <td style="text-align:right;"> 0.4433255 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 5950597.7985599 </td> <td style="text-align:right;"> 31625.824 </td> <td style="text-align:right;"> -1145.29901 </td> <td style="text-align:right;"> 0.5008340 </td> <td style="text-align:right;"> 0.5518675 </td> </tr> <tr> <td style="text-align:left;"> Leisure and Hospitality: Food Services and Drinking Places </td> <td style="text-align:right;"> 0.9741965 </td> <td style="text-align:right;"> 0.3712467 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 3677463.4844264 </td> <td style="text-align:right;"> 29897.866 </td> <td style="text-align:right;"> -367.27495 </td> <td style="text-align:right;"> 0.4774776 </td> <td style="text-align:right;"> 0.5049604 </td> </tr> <tr> <td style="text-align:left;"> Manufacturing </td> <td style="text-align:right;"> 0.9963853 </td> <td style="text-align:right;"> 0.4103339 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 
5877.1607931 </td> <td style="text-align:right;"> -8503.497 </td> <td style="text-align:right;"> -64156.40673 </td> <td style="text-align:right;"> 0.7680516 </td> <td style="text-align:right;"> 1.2343823 </td> </tr> <tr> <td style="text-align:left;"> Private Service-Providing </td> <td style="text-align:right;"> 0.9996557 </td> <td style="text-align:right;"> 0.5175363 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 17071086.3318019 </td> <td style="text-align:right;"> 906515.882 </td> <td style="text-align:right;"> 115991.19165 </td> <td style="text-align:right;"> 0.5470875 </td> <td style="text-align:right;"> 0.6429675 </td> </tr> <tr> <td style="text-align:left;"> Professional and Business Services </td> <td style="text-align:right;"> 0.9998321 </td> <td style="text-align:right;"> 0.6562861 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4127.3430793 </td> <td style="text-align:right;"> 187094.395 </td> <td style="text-align:right;"> 41549.15807 </td> <td style="text-align:right;"> 0.6288984 </td> <td style="text-align:right;"> 0.8486224 </td> </tr> <tr> <td style="text-align:left;"> Professional and Business Services: Administrative and Support Services </td> <td style="text-align:right;"> 0.9949367 </td> <td style="text-align:right;"> 0.7960556 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 20889.0709374 </td> <td style="text-align:right;"> 21353.735 </td> <td style="text-align:right;"> -8020.12234 </td> <td style="text-align:right;"> 0.6198112 </td> <td style="text-align:right;"> 0.8347590 </td> </tr> <tr> <td style="text-align:left;"> Professional and Business Services: Administrative and Waste Services </td> <td style="text-align:right;"> 0.9952160 </td> <td style="text-align:right;"> 0.7990821 </td> <td style="text-align:right;"> 10 </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:right;"> 21933.7465993 </td> <td style="text-align:right;"> 22523.211 </td> <td style="text-align:right;"> -8018.84650 </td> <td style="text-align:right;"> 0.6203460 </td> <td style="text-align:right;"> 0.8420384 </td> </tr>
<tr> <td style="text-align:left;"> Professional and Business Services: Professional and Technical Services </td> <td style="text-align:right;"> 0.9995486 </td> <td style="text-align:right;"> 0.6751413 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 207.6969552 </td> <td style="text-align:right;"> 29126.709 </td> <td style="text-align:right;"> -1378.48713 </td> <td style="text-align:right;"> 0.6276247 </td> <td style="text-align:right;"> 0.7556229 </td> </tr>
<tr> <td style="text-align:left;"> Retail Trade </td> <td style="text-align:right;"> 0.9997119 </td> <td style="text-align:right;"> 0.8748516 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 6500.9636969 </td> <td style="text-align:right;"> 135656.221 </td> <td style="text-align:right;"> -6874.57165 </td> <td style="text-align:right;"> 0.5121305 </td> <td style="text-align:right;"> 0.4866360 </td> </tr>
<tr> <td style="text-align:left;"> Trade, Transportation, and Utilities </td> <td style="text-align:right;"> 0.9997406 </td> <td style="text-align:right;"> 0.8359340 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 22975.5085045 </td> <td style="text-align:right;"> 211632.790 </td> <td style="text-align:right;"> -7663.28497 </td> <td style="text-align:right;"> 0.5685737 </td> <td style="text-align:right;"> 0.6108265 </td> </tr>
</tbody>
</table></div>

---

### `coef_hurst`

A measure of long memory: the degree to which observations depend on one another over long spans of time.
Generically, this statistic takes values between zero and one. Values near one indicate very high levels of dependence through time, while values near 0.5 indicate little long-run dependence.

```r
USET %>% features(Employed, coef_hurst)
```

```
## # A tibble: 3 x 2
##   Title                coef_hurst
##   <chr>                     <dbl>
## 1 Financial Activities      1.00
## 2 Manufacturing             1.00
## 3 Retail Trade              0.999
```

---

## Middling for Ford

```r
FC %>% features(monthly.returns, features=coef_hurst)
```

```
## # A tibble: 1 x 1
##   coef_hurst
##        <dbl>
## 1      0.500
```

---

# `feat_spectral`

```r
USET %>% features(Employed, feat_spectral)
```

```
## # A tibble: 3 x 2
##   Title                spectral_entropy
##   <chr>                           <dbl>
## 1 Financial Activities            0.263
## 2 Manufacturing                   0.180
## 3 Retail Trade                    0.412
```

```r
FC %>% features(monthly.returns, features=feat_spectral)
```

```
## # A tibble: 1 x 1
##   spectral_entropy
##              <dbl>
## 1            0.988
```

---

# The Absence of Correlation

Ljung-Box modifies the idea in the Box-Pierce statistic for assessing whether or not a given series [or transformation thereof] is essentially uncorrelated. In both cases, we will get to the details next week [chapter 5]. For now, the idea is simply that, if the series is uncorrelated, a suitably scaled sum of `\(k\)` squared autocorrelations approximately follows a chi-squared distribution with `\(k\)` degrees of freedom. Large autocorrelations produce a large statistic and a small p-value, revealing dependence.

```r
USET %>% features(Employed, features=list(box_pierce, ljung_box))
```

```
## # A tibble: 3 x 5
##   Title                bp_stat bp_pvalue lb_stat lb_pvalue
##   <chr>                  <dbl>     <dbl>   <dbl>     <dbl>
## 1 Financial Activities    365.         0    368.         0
## 2 Manufacturing           368.         0    371.         0
## 3 Retail Trade            337.         0    339.         0
```

```r
FC %>% features(monthly.returns, features=list(box_pierce, ljung_box))
```

```
## # A tibble: 1 x 4
##   bp_stat bp_pvalue lb_stat lb_pvalue
##     <dbl>     <dbl>   <dbl>     <dbl>
## 1 0.00601     0.938 0.00608     0.938
```

---

# `feat_pacf`

```r
USET %>% features(Employed, feat_pacf)
```

```
## # A tibble: 3 x 5
##   Title                pacf5 diff1_pacf5 diff2_pacf5 season_pacf
##   <chr>                <dbl>       <dbl>       <dbl>       <dbl>
## 1 Financial Activities 0.987       0.712       1.00      -0.0555
## 2 Manufacturing        0.994       0.238       0.791      0.0348
## 3 Retail Trade         1.08        0.834       1.06      -0.0188
```

```r
FC %>% features(monthly.returns, features=feat_pacf)
```

```
## # A tibble: 1 x 4
##     pacf5 diff1_pacf5 diff2_pacf5 season_pacf
##     <dbl>       <dbl>       <dbl>       <dbl>
## 1 0.00696       0.665        1.28       0.117
```

---

# Unit Roots

The stationarity issue from earlier gets much attention. Can we reasonably think of the characteristics of a series as fixed over time? There are three means of assessment, with details deferred to Chapter 9.

```r
USET %>%
  features(Employed, features=list(unitroot_kpss, unitroot_pp, unitroot_ndiffs, unitroot_nsdiffs)) %>%
  knitr::kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:left;"> Title </th> <th style="text-align:right;"> kpss_stat </th> <th style="text-align:right;"> kpss_pvalue </th> <th style="text-align:right;"> pp_stat </th> <th style="text-align:right;"> pp_pvalue </th> <th style="text-align:right;"> ndiffs </th> <th style="text-align:right;"> nsdiffs </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Financial Activities </td> <td style="text-align:right;"> 4.634610 </td> <td style="text-align:right;"> 0.01 </td> <td style="text-align:right;"> -1.1928191 </td> <td style="text-align:right;"> 0.1000000 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td style="text-align:left;"> Manufacturing </td> <td style="text-align:right;"> 5.676845 </td> <td style="text-align:right;"> 0.01 </td> <td style="text-align:right;"> -0.9375425 </td> <td style="text-align:right;"> 0.1000000 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> Retail Trade </td> <td style="text-align:right;"> 3.909833 </td> <td style="text-align:right;"> 0.01 </td> <td style="text-align:right;"> -2.6362729 </td> <td style="text-align:right;"> 0.0890691 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> </tr>
</tbody>
</table>

```r
FC %>%
  features(monthly.returns, features=list(unitroot_kpss, unitroot_pp, unitroot_ndiffs, unitroot_nsdiffs))
```

```
## # A tibble: 1 x 6
##   kpss_stat kpss_pvalue pp_stat pp_pvalue ndiffs nsdiffs
##       <dbl>       <dbl>   <dbl>     <dbl>  <int>   <int>
## 1    0.0890         0.1   -15.9      0.01      0       0
```

---

# Tiling

[A reminder](https://davisvaughan.github.io/slider/)

```r
USET %>% features(Employed, features=list(var_tiled_mean, var_tiled_var))
```

```
## # A tibble: 3 x 3
##   Title                var_tiled_mean var_tiled_var
##   <chr>                         <dbl>         <dbl>
## 1 Financial Activities          1.02      0.0000411
## 2 Manufacturing                 1.03      0.0000923
## 3 Retail Trade                  0.922     0.0136
```

```r
FC %>% features(monthly.returns, features=list(var_tiled_mean, var_tiled_var))
```

```
## # A tibble: 1 x 2
##   var_tiled_mean var_tiled_var
##            <dbl>         <dbl>
## 1          0.154          2.77
```

---

# Detecting Shifts

```r
USET %>%
  features(Employed, features=list(shift_level_max, shift_var_max, shift_kl_max)) %>%
  kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:left;"> Title </th> <th style="text-align:right;"> shift_level_max </th> <th style="text-align:right;"> shift_level_index </th> <th style="text-align:right;"> shift_var_max </th> <th style="text-align:right;"> shift_var_index </th> <th style="text-align:right;"> shift_kl_max </th> <th style="text-align:right;"> shift_kl_index </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Financial Activities </td> <td style="text-align:right;"> 370.750 </td> <td style="text-align:right;"> 229 </td> <td style="text-align:right;"> 24036.55 </td> <td style="text-align:right;"> 233 </td>
<td style="text-align:right;"> 0.2990911 </td> <td style="text-align:right;"> 227 </td> </tr>
<tr> <td style="text-align:left;"> Manufacturing </td> <td style="text-align:right;"> 1558.667 </td> <td style="text-align:right;"> 228 </td> <td style="text-align:right;"> 417019.89 </td> <td style="text-align:right;"> 235 </td> <td style="text-align:right;"> 0.5217082 </td> <td style="text-align:right;"> 227 </td> </tr>
<tr> <td style="text-align:left;"> Retail Trade </td> <td style="text-align:right;"> 777.350 </td> <td style="text-align:right;"> 226 </td> <td style="text-align:right;"> 788930.74 </td> <td style="text-align:right;"> 354 </td> <td style="text-align:right;"> 1.8412503 </td> <td style="text-align:right;"> 227 </td> </tr>
</tbody>
</table>

```r
FC %>%
  features(monthly.returns, features=list(shift_level_max, shift_var_max, shift_kl_max)) %>%
  kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:right;"> shift_level_max </th> <th style="text-align:right;"> shift_level_index </th> <th style="text-align:right;"> shift_var_max </th> <th style="text-align:right;"> shift_var_index </th> <th style="text-align:right;"> shift_kl_max </th> <th style="text-align:right;"> shift_kl_index </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 0.2581199 </td> <td style="text-align:right;"> 110 </td> <td style="text-align:right;"> 0.1938302 </td> <td style="text-align:right;"> 113 </td> <td style="text-align:right;"> 37.93281 </td> <td style="text-align:right;"> 112 </td> </tr>
</tbody>
</table>

---

# Crossings and Flat Spots

```r
USET %>%
  features(Employed, features=list(n_crossing_points, longest_flat_spot)) %>%
  kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:left;"> Title </th> <th style="text-align:right;"> n_crossing_points </th> <th style="text-align:right;"> longest_flat_spot </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Financial Activities </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 40 </td> </tr>
<tr> <td style="text-align:left;"> Manufacturing </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 52 </td> </tr>
<tr> <td style="text-align:left;"> Retail Trade </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 10 </td> </tr>
</tbody>
</table>

```r
FC %>%
  features(monthly.returns, features=list(n_crossing_points, longest_flat_spot)) %>%
  kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:right;"> n_crossing_points </th> <th style="text-align:right;"> longest_flat_spot </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 121 </td> <td style="text-align:right;"> 8 </td> </tr>
</tbody>
</table>

---

# ARCH

What proportion of the current squared residual is explained by prior squared residuals? This reports the `\(R^2\)` from regressing the squared series on its own lags; if the variance explained is large, volatility is persistent. **There is a chi-square statistic also.**

```r
USET %>% features(Employed, features=stat_arch_lm) %>% kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:left;"> Title </th> <th style="text-align:right;"> stat_arch_lm </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Financial Activities </td> <td style="text-align:right;"> 0.9894145 </td> </tr>
<tr> <td style="text-align:left;"> Manufacturing </td> <td style="text-align:right;"> 0.9721737 </td> </tr>
<tr> <td style="text-align:left;"> Retail Trade </td> <td style="text-align:right;"> 0.9167522 </td> </tr>
</tbody>
</table>

```r
FC %>% features(monthly.returns, features=stat_arch_lm) %>% kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:right;"> stat_arch_lm </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 0.0509165 </td> </tr>
</tbody>
</table>

---

## The Box-Cox

```r
USET %>% features(Employed, features=guerrero) %>% kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:left;"> Title </th> <th style="text-align:right;"> lambda_guerrero </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Financial Activities </td> <td style="text-align:right;"> 0.9481456 </td> </tr>
<tr> <td style="text-align:left;"> Manufacturing </td> <td style="text-align:right;"> 1.0369662 </td> </tr>
<tr> <td style="text-align:left;"> Retail Trade </td> <td style="text-align:right;"> 1.1860464 </td> </tr>
</tbody>
</table>

```r
FC %>% features(monthly.returns, features=guerrero) %>% kable(format="html")
```

<table>
<thead>
<tr> <th style="text-align:right;"> lambda_guerrero </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 0.6848889 </td> </tr>
</tbody>
</table>

```r
USET %>% features(Employed, features=guerrero)
```

```
## # A tibble: 3 x 2
##   Title                lambda_guerrero
##   <chr>                          <dbl>
## 1 Financial Activities           0.948
## 2 Manufacturing                  1.04
## 3 Retail Trade                   1.19
```

---

# Filtered Manufacturing

```r
# Compare the raw series with its Box-Cox transform at the Guerrero lambda
USET %>%
  filter(Title=="Manufacturing") %>%
  mutate(Filt = box_cox(Employed, 1.0369662)) %>%
  select(YM,Filt,Employed) %>%
  pivot_longer(c(Filt,Employed)) %>%
  autoplot(value)
```

<img src="index_files/figure-html/unnamed-chunk-36-1.png" width="576" />

---

```r
USET %>%
  filter(Title=="Financial Activities") %>%
  autoplot(box_cox(Employed, 0.9481456))
```

<img src="index_files/figure-html/unnamed-chunk-37-1.png" width="576" />

---

```r
USET %>%
  filter(Title=="Retail Trade") %>%
  autoplot(box_cox(Employed, 1.1860464))
```

<img src="index_files/figure-html/unnamed-chunk-38-1.png" width="576" />

---

```r
FC %>% features(monthly.returns, features=guerrero)
```

```
## # A tibble: 1 x 1
##   lambda_guerrero
##             <dbl>
## 1           0.685
```

```r
FC %>% autoplot(box_cox(monthly.returns, 0.6857523))
```

<img src="index_files/figure-html/unnamed-chunk-39-1.png" width="576" />

---

# Australian Tourism

[The example is great.](https://otexts.com/fpp3/exploring-australian-tourism-data.html)

---

# Principal Components

Let's walk through this example.
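
As a starting point for the walk-through, here is a minimal sketch in the spirit of the FPP3 tourism example: compute features for every series, then take principal components of the feature matrix. Two choices here are assumptions of this sketch rather than anything from the slides. It uses only `feat_stl` and `feat_acf` (the book uses a larger feature set), and it relies on `broom::augment()`, which was not among the packages loaded earlier.

```r
library(fpp3)   # attaches tsibble (which ships the `tourism` data), feasts, dplyr, ggplot2
library(broom)  # augment() method for prcomp objects

# STL and ACF features for every Region/State/Purpose series in `tourism`.
tourism_features <- tourism %>%
  features(Trips, list(feat_stl, feat_acf))

# Principal components of the scaled numeric feature columns (keys dropped first).
pca <- tourism_features %>%
  select(-Region, -State, -Purpose) %>%
  prcomp(scale. = TRUE)

# Attach the component scores back to the keys and plot the first two components.
pcs <- augment(pca, tourism_features)
pcs %>%
  ggplot(aes(x = .fittedPC1, y = .fittedPC2, colour = Purpose)) +
  geom_point() +
  theme(aspect.ratio = 1)
```

Colouring by `Purpose` is one way to see whether series with similar features share a travel purpose. The same recipe should carry over to `USET8k` by swapping in `Employed` and dropping `Title` before `prcomp()`.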