install.packages("tseries")
install.packages("stats")
library(tseries)
library(stats)
library(fUnitRoots)
## Loading required package: urca
## Loading required package: timeDate
## Loading required package: timeSeries
## Loading required package: fBasics
##
## Rmetrics Package fBasics
## Analysing Markets and calculating Basic Statistics
## Copyright (C) 2005-2014 Rmetrics Association Zurich
## Educational Software for Financial Engineering and Computational Science
## Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY.
## https://www.rmetrics.org --- Mail to: info@rmetrics.org
##
## Attaching package: 'fUnitRoots'
## The following objects are masked from 'package:urca':
##
## punitroot, qunitroot, unitrootTable
\(e_t \sim i.i.d(0,\sigma^2)\)
Podmnozinou bílého sumu, je Gaussovský bílý sum
\(e_t\sim N(0,\sigma^2)\)
n=1000
WN=rnorm(n,0,7)
plot.ts(WN)
Naprídklad AR(k) proces
\(y_t=\alpha_0+\alpha_1y_{t-1}+...+\alpha_{t-k}y_{t-k}+e_t\)
\(y_t=0.5y_{t-1}+e_t\)
ar_proces=arima.sim(list(order = c(1,0,0), ar = 0.5), n = 1000)
plot.ts(ar_proces)
\(p_t=p_{t-1}+\epsilon_t\)
N=1000
w=rnorm(N)
RW=cumsum(w)
plot.ts(RW)
Autokorelacní funkce ACF
\(\rho_k=\dfrac{Cov(y_t,y_{t-k})}{Var(y_t)}\)
Parcilní autokorelacní funkce PACF.
Princip je, ze se postupne navysuje pocet zpozdení a a testuje se kazdý parametr pro nove pridanou promennenou.
\(y_t=\alpha_0+\alpha_1y_{t-1}+e_t\)
\(y_t=\alpha_0+\alpha_1y_{t-1}+\alpha_2y_{t-2}+e_t\)
\(y_t=\alpha_0+\alpha_1y_{t-1}+\alpha_2y_{t-2}+\alpha_3y_{t-3}+e_t\)
\(y_t=\alpha_0+\alpha_1y_{t-1}+\alpha_2y_{t-2}+\alpha_3y_{t-3}...+\alpha_{t-k}y_{t-k}+e_t\)
Bilý sum
par(mfrow=c(1,2))
acf(WN)
pacf(WN)
AR(1) proces
par(mfrow=c(1,2))
acf(ar_proces)
pacf(ar_proces)
RW proces
par(mfrow=c(1,2))
acf(RW)
pacf(RW)
data=pacf(RW)
data=data$lag
Vyhodnocení pomocí Ljung-Box testu.
Box.test(ar_proces, lag = 4, type ="Ljung-Box")
##
## Box-Ljung test
##
## data: ar_proces
## X-squared = 354.32, df = 4, p-value < 2.2e-16
n=1000
e=rnorm(n,0,70)
t=1:1000
x=rnorm(n,10,20)
y=20+1.2*t+0.5*x+e
plot.ts(y)
par(mfrow=c(1,2))
acf(y)
pacf(y)
Pro testování stacionarity pouzíváme napríklad ADF test. Tento package vybírá nejvhodnejsí zpozdení a tvar procesu. To nemusí být uplne vhodné
adf.test(y)
## Warning in adf.test(y): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: y
## Dickey-Fuller = -10.158, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary
Balícek fUnitRoots obsahuje velmi hezky zpracované testy na unit root
urdfTest(y, lags = 1, type = "ct", doplot = TRUE)
##
## Title:
## Augmented Dickey-Fuller Unit Root Test
##
## Test Results:
##
## Test regression trend
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -250.597 -50.366 -2.078 51.950 249.045
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 27.405656 4.657215 5.885 5.45e-09 ***
## z.lag.1 -1.002456 0.044853 -22.350 < 2e-16 ***
## tt 1.201379 0.054318 22.117 < 2e-16 ***
## z.diff.lag 0.002718 0.031729 0.086 0.932
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 71.14 on 994 degrees of freedom
## Multiple R-squared: 0.4998, Adjusted R-squared: 0.4983
## F-statistic: 331.1 on 3 and 994 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -22.3497 166.7136 249.7543
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
##
## Description:
## Sun Apr 29 21:42:46 2018 by user: Lukas
urersTest(y, type ="DF-GLS", model = "trend",lag.max = 4, doplot = TRUE)
##
## Title:
## Elliott-Rothenberg-Stock Unit Root Test
##
## Test Results:
##
## Test of type DF-GLS
## detrending of series with intercept and trend
##
## Residuals:
## Min 1Q Median 3Q Max
## -245.928 -40.003 9.056 58.660 257.229
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## yd.lag -0.80037 0.06548 -12.223 < 2e-16 ***
## yd.diff.lag1 -0.16986 0.05988 -2.837 0.00465 **
## yd.diff.lag2 -0.14084 0.05293 -2.661 0.00792 **
## yd.diff.lag3 -0.09989 0.04401 -2.270 0.02344 *
## yd.diff.lag4 -0.05668 0.03174 -1.786 0.07445 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 72.18 on 990 degrees of freedom
## Multiple R-squared: 0.4858, Adjusted R-squared: 0.4832
## F-statistic: 187 on 5 and 990 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -12.2225
##
## Critical values of DF-GLS are:
## 1pct 5pct 10pct
## critical values -3.48 -2.89 -2.57
##
## Description:
## Sun Apr 29 21:42:46 2018 by user: Lukas
Nejznámejsím procesem obsahující stochastický trend je proces náhodné procházky (RW)
\(p_t=p_{t-1}+\epsilon_t\)
Integrovaný proces rádu 1. Proces obsahuje tzv. jeden jednotkový koren. Prakticky to znamená, ze pro získání stacionární casové rady je potreba provést jednu diferenci. Pro proces I(2) by bylo nutné provést 2 diference. V ekonomii se takový proces vyskytuje vyjímecne.
adf.test(RW)
##
## Augmented Dickey-Fuller Test
##
## data: RW
## Dickey-Fuller = -0.60984, Lag order = 9, p-value = 0.9768
## alternative hypothesis: stationary
Jedním z problému nestacionárních casových rad, je moznost vzniku zdánlivé regrese
set.seed(100)
N=1000
w=rnorm(N)
RW=cumsum(w)
x2=rnorm(N)
RW2=cumsum(x2)
regrese=lm(RW~RW2)
summary(regrese)
##
## Call:
## lm(formula = RW ~ RW2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.572 -6.849 -1.709 2.845 28.521
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.04025 0.35948 -0.112 0.9109
## RW2 0.11360 0.06540 1.737 0.0827 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.95 on 998 degrees of freedom
## Multiple R-squared: 0.003014, Adjusted R-squared: 0.002015
## F-statistic: 3.017 on 1 and 998 DF, p-value: 0.08268
V ideálním prípade by se mela residua chovat jako bílý sum.
residua=regrese$residuals
adf.test(residua)
##
## Augmented Dickey-Fuller Test
##
## data: residua
## Dickey-Fuller = -1.4519, Lag order = 9, p-value = 0.8103
## alternative hypothesis: stationary
set.seed(5)
N=32
x=rnorm(N,0,1)
sezon=rep(c(1,0,0,0),8)
y=x+2*sezon
plot.ts(y)
adf.test(y)
##
## Augmented Dickey-Fuller Test
##
## data: y
## Dickey-Fuller = -1.6539, Lag order = 3, p-value = 0.7064
## alternative hypothesis: stationary
A nyní odfiltrujeme sezoní slozku pomocí dummy promenných
y_sez=lm(y~sezon)
fit=y_sez$fitted.values
adf.test(fit) # casova rada ocistena od sezonosti je jiz stacionarni
## Warning in summary.lm(res): essentially perfect fit: summary may be
## unreliable
## Warning in adf.test(fit): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: fit
## Dickey-Fuller = -1.0024e+16, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
Proces se zlomem
N1=100
N2=100
ar_proces1=arima.sim(list(order = c(1,0,0), ar = 0.5), n = N1)
ar_proces2=arima.sim(list(order = c(1,0,0), ar = 0.5), n = N2)+10
proces=c(ar_proces1,ar_proces2)
plot.ts(proces)
adf.test(proces)
##
## Augmented Dickey-Fuller Test
##
## data: proces
## Dickey-Fuller = -2.1855, Lag order = 5, p-value = 0.4984
## alternative hypothesis: stationary
set.seed(250)
n=1000
e=rnorm(n,0,70)
t=1:1000
x=rnorm(n,10,20)+0.02*t
y=20+1.2*t+e
regrese=lm(y~x)
summary(regrese)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -737.47 -275.22 -15.78 280.48 791.18
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 528.4333 15.1854 34.799 <2e-16 ***
## x 4.4555 0.5141 8.666 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 343.3 on 998 degrees of freedom
## Multiple R-squared: 0.06999, Adjusted R-squared: 0.06906
## F-statistic: 75.11 on 1 and 998 DF, p-value: < 2.2e-16
regrese2=lm(y~x+t)
summary(regrese2)
##
## Call:
## lm(formula = y ~ x + t)
##
## Residuals:
## Min 1Q Median 3Q Max
## -213.287 -50.409 0.618 48.040 193.155
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.553340 4.672845 3.328 0.000905 ***
## x 0.075235 0.110779 0.679 0.497205
## t 1.205489 0.008103 148.768 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 71.31 on 997 degrees of freedom
## Multiple R-squared: 0.9599, Adjusted R-squared: 0.9598
## F-statistic: 1.194e+04 on 2 and 997 DF, p-value: < 2.2e-16
Stochastický trend a pridání deterministického trednu do regrese
set.seed(4)
N=1000
w=rnorm(N)
RW=cumsum(w)
x2=rnorm(N)
RW2=cumsum(x2)
t=seq(1:N)
regrese=lm(RW~RW2)
summary(regrese)
##
## Call:
## lm(formula = RW ~ RW2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.403 -9.550 -2.244 9.306 26.564
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.06929 0.39927 -17.70 <2e-16 ***
## RW2 -0.70471 0.03525 -19.99 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.53 on 998 degrees of freedom
## Multiple R-squared: 0.286, Adjusted R-squared: 0.2853
## F-statistic: 399.7 on 1 and 998 DF, p-value: < 2.2e-16
regrese=lm(RW~RW2+t)
summary(regrese)
##
## Call:
## lm(formula = RW ~ RW2 + t)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.7533 -2.0703 0.0622 2.5524 9.2601
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.2573031 0.2547982 52.031 <2e-16 ***
## RW2 0.1431759 0.0146198 9.793 <2e-16 ***
## t -0.0484489 0.0005237 -92.515 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.725 on 997 degrees of freedom
## Multiple R-squared: 0.9255, Adjusted R-squared: 0.9254
## F-statistic: 6193 on 2 and 997 DF, p-value: < 2.2e-16
regrese=lm(diff(RW)~diff(RW2))
summary(regrese)
##
## Call:
## lm(formula = diff(RW) ~ diff(RW2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.8393 -0.6441 0.0128 0.6556 3.2252
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.03530 0.03070 -1.150 0.250
## diff(RW2) 0.02559 0.03076 0.832 0.406
##
## Residual standard error: 0.9699 on 997 degrees of freedom
## Multiple R-squared: 0.0006939, Adjusted R-squared: -0.0003085
## F-statistic: 0.6923 on 1 and 997 DF, p-value: 0.4056
Nasimuloval jsem 6 procesu, kdy se pokusíme si jej odhadnout
library(quantmod)
getSymbols("AAPL")
## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
##
## This message is shown once per session and may be disabled by setting
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
##
## WARNING: There have been significant changes to Yahoo Finance data.
## Please see the Warning section of '?getSymbols.yahoo' for details.
##
## This message is shown once per session and may be disabled by setting
## options("getSymbols.yahoo.warning"=FALSE).
## [1] "AAPL"
price_A=AAPL$AAPL.Adjusted
chartSeries(price_A)
getSymbols("WMT")
## [1] "WMT"
price_W=WMT$WMT.Adjusted
chartSeries(price_W)
adf.test(price_A)
##
## Augmented Dickey-Fuller Test
##
## data: price_A
## Dickey-Fuller = -2.0386, Lag order = 14, p-value = 0.562
## alternative hypothesis: stationary
adf.test(price_W)
##
## Augmented Dickey-Fuller Test
##
## data: price_W
## Dickey-Fuller = -2.6818, Lag order = 14, p-value = 0.2897
## alternative hypothesis: stationary
zdanliva_regrese=lm(price_A~I(lag(price_W)))
summary(zdanliva_regrese)
##
## Call:
## lm(formula = price_A ~ I(lag(price_W)))
##
## Residuals:
## Min 1Q Median 3Q Max
## -54.487 -11.694 0.296 10.500 62.546
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -95.13140 1.43956 -66.08 <2e-16 ***
## I(lag(price_W)) 2.78089 0.02437 114.09 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20.37 on 2847 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.8205, Adjusted R-squared: 0.8205
## F-statistic: 1.302e+04 on 1 and 2847 DF, p-value: < 2.2e-16
vynos_A=diff(log(price_A), lag=1)
vynos_W=diff(log(price_W), lag=1)
vynos_A=na.trim(vynos_A)
vynos_W=na.trim(vynos_W)
adf.test(vynos_A)
## Warning in adf.test(vynos_A): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: vynos_A
## Dickey-Fuller = -13.228, Lag order = 14, p-value = 0.01
## alternative hypothesis: stationary
adf.test(vynos_W)
## Warning in adf.test(vynos_W): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: vynos_W
## Dickey-Fuller = -14.198, Lag order = 14, p-value = 0.01
## alternative hypothesis: stationary
regrese=lm(vynos_A~I(lag(vynos_W)))
summary(regrese)
##
## Call:
## lm(formula = vynos_A ~ I(lag(vynos_W)))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.198137 -0.009035 0.000000 0.010180 0.128794
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0010571 0.0003763 2.809 0.00501 **
## I(lag(vynos_W)) -0.0399141 0.0301065 -1.326 0.18502
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02008 on 2846 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.0006172, Adjusted R-squared: 0.0002661
## F-statistic: 1.758 on 1 and 2846 DF, p-value: 0.185