What is a stochastic process?
A stochastic process (= random process) is a collection of random variables which is used to describe the evolution of some random value/system over time. It’s the probabilistic counterpart to a deterministic process.
In the simple case of discrete time, a stochastic process amounts to a sequence of random variables, known as a time series.
What is a time series?
A time series is an ordered sequence of values of a variable at equally spaced intervals. Example: measurements of temperature every hour, measurements of height every 5 minutes.
How would one measure the displacement of a bridge?
measurement on the bridge with an accelerometer
TLS/IBIS-S underneath the bridge, directly below the accelerometer
The resulting raw data is then sampled (averaged over n values) with a specific sampling rate (the time span over which values are taken) and a sampling period (n × sampling rate).
What is time series analysis used for?
process and quality control
deformation monitoring
structural health monitoring (vibration measurements of structures)
Earth Science (plate tectonics, sea level change)
What are the methods for time series?
time domain methods: analyze process directly as it evolves over time; how does process depend on its past values, is the process stationary…?
frequency domain methods: analyze the process in terms of cycles and oscillations using Fourier methods; which frequencies (cycles) are present in the process?
What are statistic quantities?
Statistic quantities are numerical values that summarize or describe data or random variables.
Examples: arithmetic mean, standard deviation, etc.
How does stationarity affect the computation of statistic quantities?
Computation of statistic quantities only makes sense if the time series is stationary, so if there are no systematic changes present.
What does stationarity in this context mean?
A time series is stationary if its probability distribution doesn’t change when shifted in time or space; mean & variance also don’t change -> no systematic changes present!
In the real world, most data isn’t perfectly stationary -> weak stationarity.
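A minimal sketch for eyeballing stationarity (assuming numpy is available and x is a 1-D array of observations; the trend and noise values are invented for illustration): if block-wise means or standard deviations drift systematically, the series is not stationary.

```python
import numpy as np

def rolling_stats(x, window):
    """Return mean and std over non-overlapping blocks of length `window`."""
    x = np.asarray(x, dtype=float)
    n = len(x) // window
    blocks = x[:n * window].reshape(n, window)
    return blocks.mean(axis=1), blocks.std(axis=1)

# Example: a trend makes the block means drift -> not stationary
t = np.arange(200)
x = 0.05 * t + np.random.normal(size=200)   # trend + noise
means, stds = rolling_stats(x, window=50)
print(means)                                # clearly increasing -> systematic change present
```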
Are the following time series stationary or not?
a) trend -> not stationary
b) (weak) stationary with an outlier
c) changing levels -> not stationary
d) seasonality -> not stationary
e) trend -> not stationary
f) changing levels -> not stationary
g) cycles which are aperiodic (not predictable) -> stationary
h) seasonality -> not stationary
i) seasonality & trend -> not stationary
What are examples of statistic quantities?
linear dependence
covariance
correlation coefficient
autocovariance
autocorrelation
What does covariance describe?
How two variables change together (a measure of the linear dependency between x- and y-values). When scaled with the standard deviations: correlation coefficient.
What is the correlation coefficient?
The correlation coefficient is the normalized version of covariance & ranges from -1 to 1.
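A minimal sketch (numpy assumed; the simulated data and noise level are illustrative) showing covariance and its normalization to the correlation coefficient:

```python
import numpy as np

x = np.random.normal(size=500)
y = 2.0 * x + np.random.normal(scale=0.5, size=500)       # linearly dependent + noise

cov_xy = np.cov(x, y)[0, 1]                               # covariance of x and y
r_xy = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))   # scaled by the standard deviations
print(r_xy, np.corrcoef(x, y)[0, 1])                      # both close to +1
```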
Estimate the correlation coefficients from the following time series.
From left to right:
1) strong positive correlation, maybe r = 0.9
2) moderately strong negative correlation, maybe r = -0.8
3) almost no correlation, maybe r = 0.01
Note that the slope doesn’t matter here (aside from telling apart negative/positive). The correlation coefficient does depend on how much “noise” is in the data (a lot of noise = lower r, none = high r).
What is autocovariance?
Autocovariance is the covariance within the same time series (how values relate to their own past): instead of looking at the relation x -> y, we look at x_t -> x_(t-τ).
What is autocorrelation?
Autocorrelation is the normalized version of autocovariance; it shows how strongly a time series relates to its own past. It gives values from -1 to 1 for every time lag τ (see the sketch below).
high r_τ = strong dependence on its past
low r_τ = noise
oscillating r_τ = periodic behavior
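A minimal sketch of the sample autocorrelation (numpy assumed; the sine-plus-noise test series is made up for illustration):

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation r_tau for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    c0 = np.dot(x, x) / n                                           # autocovariance at lag 0
    return np.array([np.dot(x[:n - tau], x[tau:]) / n / c0
                     for tau in range(max_lag + 1)])

t = np.arange(500)
x = np.sin(2 * np.pi * t / 50) + np.random.normal(scale=0.3, size=500)
print(autocorrelation(x, 100))   # oscillating r_tau -> periodic behaviour
```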
What is cross-correlation?
Cross-correlation describes how well two signals resemble each other (large correlation = nearly the same). In time series analysis it is the normalized cross-covariance function.
How do we check for randomness in a time series?
For example by using a correlogram of the autocorrelations. If the series is random, the autocorrelations should be near zero for all time-lag separations; if it is non-random, one or more of the autocorrelations will be significantly non-zero.
Given these (auto-)correlograms, what do they say about the data?
1) low correlations without patterns
2) high positive correlations that only slowly decline with increasing lags. This indicates a lot of autocorrelation = time series depends strongly on its past
3) The slow decrease in the ACF as the lags increase is due to the trend, while the “scalloped” shape is due to the seasonality
4) all values are close to 0 -> noise
5) oscillating autocorrelation -> periodic behavior of time series
When do we obtain the maximum correlation?
If y(t) is shifted by 3 units.
How do we have to shift one of the time series to obtain the maximum correlation?
Compute the cross-correlation for different time delays; the delay at which the cross-correlation is highest gives the maximum correlation (see the sketch below).
The maximum time delay used here is τ = n/4.
If the time delay is < 0: influence from Y -> X
if the time delay is > 0: influence from X -> Y
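A minimal sketch (numpy assumed; the signals, the shift of 3 samples, and the maximum lag are illustrative) that computes the normalized cross-correlation over a range of delays and picks the delay of maximum correlation; with this sign convention a positive lag means y lags behind x (influence X -> Y):

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Normalised cross-correlation of x and y for lags -max_lag..max_lag."""
    x = (np.asarray(x, float) - np.mean(x)) / np.std(x)
    y = (np.asarray(y, float) - np.mean(y)) / np.std(y)
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    r = []
    for tau in lags:
        if tau >= 0:
            r.append(np.dot(x[:n - tau], y[tau:]) / n)   # y delayed relative to x
        else:
            r.append(np.dot(x[-tau:], y[:n + tau]) / n)
    return lags, np.array(r)

t = np.arange(300)
x = np.sin(2 * np.pi * t / 40)
y = np.roll(x, 3)                        # y is x shifted by 3 samples
lags, r = cross_correlation(x, y, max_lag=20)
print(lags[np.argmax(r)])                # lag of maximum correlation (here 3)
```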
How does the autocorrelation function of a stationary series look?
If the series is stationary, mainly pure noise is left; thus the ACF shows only one peak at τ = 0 and all other values are around zero. Therefore, the only information an ACF gives about a stationary series is that it is stationary.
-> very quick decay: one peak at τ = 0, then only low autocorrelation values around 0.
What components is a time series composed of?
trend m (long-term systematic change of the time series)
long-term cyclic influence k (e.g. business cycle: recession -> recovery -> growth -> decline)
short-term cyclic influence s (e.g. seasonal component that repeats every year)
random variable u (all effects that can’t be described by the others)
In total: x = m + k + s + u
Trend & long-term cyclic influence are often summarized as the long-term behavior g.
Long- & short-term cyclic influence together form the cyclic influence z.
What is the functional model of a time series?
m = trend
k = long term cyclic behavior
s = short term cyclic behavior
u = noise
Additive model: x = m + k + s + u.
However, if the variation of the values increases along with the long-term trend, a multiplicative model is used:
x = g · s · u
where g = m + k
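A minimal sketch (numpy assumed; trend, cycle lengths, and amplitudes are invented for illustration) composing a synthetic series from the components and contrasting the additive and multiplicative forms:

```python
import numpy as np

t = np.arange(120)                              # e.g. 10 years of monthly values
m = 0.3 * t                                     # trend
k = 5 * np.sin(2 * np.pi * t / 60)              # long-term cyclic influence
s = 2 * np.sin(2 * np.pi * t / 12)              # short-term (seasonal) component
u = np.random.normal(scale=0.5, size=t.size)    # random component (noise)

x_add = m + k + s + u                           # additive model: x = m + k + s + u

g = m + k                                       # long-term behaviour g = m + k
s_mult = 1 + 0.1 * np.sin(2 * np.pi * t / 12)   # seasonal factor around 1
u_mult = 1 + 0.05 * np.random.normal(size=t.size)
x_mult = g * s_mult * u_mult                    # multiplicative model: x = g * s * u
# in the multiplicative model the seasonal swing grows with the level of g
```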
What is a functional model for this time series?
where g = m + k (trend & long term cyclic behavior)
Here, there is no trend. Whether the cycles are long- or short-term depends on the time scale.
What are the component models for a time series?
Two different component models can be applied:
global component model
local component model
What is the global component model?
For the global component model, we apply a (non-)linear regression model for the entire time series. The parameters are estimated via least squares adjustment.
-> this is what we did in Adjustment Theory
What is the local component model?
In the local component model, the model is applied piecewise to the data. The components are extracted using filters.
What are the 4 ways of doing time series analysis/decomposition of time series?
regression analysis: determination of a function that describes the trend m/sometimes the long term behavior g
filtering: no specific functional model, new values are computed as mean values. Some filters are used to smooth out components
elimination of seasonal component: Special approaches for determination of seasonal component s depending on the underlying problem
harmonic analysis: approximation of a time series by sums of trigonometric functions, e.g. Fourier Analysis
What is the trend in a time series?
Trend in a time series is a slow, gradual change in some property of the series over the whole interval under investigation. Trend can be defined as long term change in the mean, but can also refer to change in other statistical properties.
Why is trend determination useful in time series analysis?
detrending (remove trend from time series in order to make it stationary/determine other effects)
extrapolation (predict future behavior of time series)
When determining a functional model for the trend in a time series, how do we know which degree is appropriate?
-> do a statistical test
What are some linear vs. non-linear trend models?
linear:
straight line, parabola
non-linear:
exponential, logarithmic
What is an appropriate functional model for these time series (with equation)? Give one example for real-world data for time series 3 - 6.
1) straight line: x = a + bt
2) parabola: x = a*t^2 + b*t + c
3) exponential growth: x = a*e^(b*t), b > 0, example: population growth
4) exponential decay (decreasing): x = a*e^(-b*t), b > 0, example: radioactive decay
5) logistic growth: x = a/(1 + b*e^(-c*t)), c > 0, example: sales & advertising, population growth when there’s no capacity for unlimited growth
6) logarithmic model: x = a + b*ln(t), example: magnitude of earthquakes (Richter scale)
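A minimal sketch (numpy assumed; the simulated data and coefficients are made up) of fitting a straight-line, parabola, and exponential trend model by least squares:

```python
import numpy as np

t = np.arange(50, dtype=float)
x = 2.0 + 0.5 * t + np.random.normal(scale=1.0, size=t.size)   # noisy straight line

a_line = np.polyfit(t, x, deg=1)      # straight line x = a + b*t  -> [b, a]
a_para = np.polyfit(t, x, deg=2)      # parabola x = a*t^2 + b*t + c
print(a_line)                         # slope and intercept estimates

# exponential model x = a*e^(b*t): linearise with ln(x) = ln(a) + b*t
x_exp = 3.0 * np.exp(0.05 * t) * np.exp(np.random.normal(scale=0.05, size=t.size))
b, ln_a = np.polyfit(t, np.log(x_exp), deg=1)
print(np.exp(ln_a), b)                # estimates of a and b
```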
What does Smoothing mean in the context of Time Series Analysis?
Smoothing means the elimination of irregular fluctuations via local approximation. Advantage of local approximation: we can apply polynomials with a low degree. This is done by filtering.
What are some filtering techniques?
simple moving average
weighted moving average
exponential moving average
causal moving average
centered moving average
How does the simple moving average filter work?
The moving average filter computes new values depending on whether the order τ is even or odd.
For odd order τ, the moving average is the arithmetic mean of the τ values nearest to the considered point i.
For even τ, we consider the (τ + 1) nearest neighbors, but multiply the two outermost values by ½.
This gives a new, shorter time series (see the sketch below).
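A minimal sketch (numpy assumed; the test series is invented) of a centred moving average that handles both odd and even order τ:

```python
import numpy as np

def moving_average(x, tau):
    """Centred moving average of order tau (odd or even)."""
    x = np.asarray(x, dtype=float)
    if tau % 2 == 1:                           # odd order: tau equal weights
        w = np.ones(tau) / tau
    else:                                      # even order: tau+1 weights,
        w = np.ones(tau + 1)                   # the two outermost get weight 1/2
        w[0] = w[-1] = 0.5
        w /= tau
    return np.convolve(x, w, mode='valid')     # result is shorter than the input

x = np.arange(10, dtype=float) + np.random.normal(scale=0.2, size=10)
print(moving_average(x, 3))   # odd order
print(moving_average(x, 4))   # even order (tau + 1 = 5 points used)
```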
How should the order tau be chosen for the moving average filter?
Moving average yields “smoothing” of a time series: fluctuations with periods below the length of the interval τ are smoothed out.
E.g. when the period of the short-term cyclic behavior is chosen, this (deterministic) component is smoothed out and we obtain the smooth component g (trend + long-term cyclic behavior).
If the period of the long-term cyclic behavior is chosen, this and everything below it is smoothed out so that only the trend remains.
So τ depends on which component we want to smooth out (the noise always gets smoothed out to some degree).
How does the weighted moving average filter work?
The weighted MA is an average with different weight factors for data at different positions in the sample window. The WMA is a weighted average of the last n values where the weight decreases by 1 with each previous value.
Therefore, i has the highest weight, i-1 a slightly lower weight, and so on down to the oldest value in the window.
How does the exponential moving average filter work?
An EMA applies weighting factors that decrease exponentially: the weight for each point decreases exponentially but never reaches zero.
So the current value t gets the highest weight, and the further we go back, the lower the weight becomes (exponentially), as sketched below.
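A minimal recursive EMA sketch (numpy assumed; the smoothing factor α = 0.2 and the random-walk test data are illustrative):

```python
import numpy as np

def ema(x, alpha):
    """Exponential moving average: s_t = alpha*x_t + (1 - alpha)*s_(t-1)."""
    x = np.asarray(x, dtype=float)
    s = np.empty_like(x)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s

x = np.random.normal(size=100).cumsum()
print(ema(x, alpha=0.2))   # the most recent value gets the highest weight;
                           # older values decay exponentially but never to zero
```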
How can repeating phenomena be described?
period: how long a complete cycle is
frequency: how often a phenomenon repeats itself within a time unit (= cycles per time unit); frequency = 1/period
What is the period and frequency for this time series?
period: 10 years
frequency: 1/10 per year
What is the period and frequency of this time series?
Period: 2pi
Frequency: 1/(2π)
When is a function periodic?
When its graph doesn’t change if we shift the time series by p units to the left or the right, so that every integer multiple of p is also a period of the function.
A function is non-periodic if this condition can only be fulfilled for p = 0.
Are these graphs periodic?
1) yes (period: 1/2, frequency: 2)
2) no
3) no
What are harmonic oscillations?
Harmonic oscillations are superpositions of sine and cosine waves. Even very complex periodic functions can be described by superposition (addition) of harmonic oscillations with different frequencies.
What does Fourier analysis do?
Fourier analysis breaks a complicated signal down into simple waves/their frequencies. It switches the perspective to: which waves are inside this signal? After Fourier analysis we get a frequency spectrum in which low frequencies = slow changes and high frequencies = fast changes (since high frequency = short period and vice versa). The height of the peak at each frequency indicates the strength of that frequency.
From 100 observations, a maximum of 50 frequencies can be found (see the sketch below).
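A minimal sketch (numpy assumed; the two test frequencies 0.1 and 0.3 and their amplitudes are invented) of computing a frequency spectrum with the FFT:

```python
import numpy as np

n, dt = 100, 1.0                           # 100 samples, unit spacing
t = np.arange(n) * dt
x = 2 * np.sin(2 * np.pi * 0.1 * t) + 0.5 * np.sin(2 * np.pi * 0.3 * t)

X = np.fft.rfft(x)                         # FFT for a real-valued signal
freqs = np.fft.rfftfreq(n, d=dt)           # frequencies from 0 up to Nyquist (0.5)
amplitude = 2 * np.abs(X) / n              # amplitude spectrum
print(freqs[np.argsort(amplitude)[-2:]])   # the two dominant frequencies (0.1 and 0.3)
```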
What are the downsides of Fourier Analysis? What is the advantage?
We can’t evaluate time series with gaps (need to fill them in first!). We aren’t able to find arbitrary (random) frequencies.
The advantage is that we don’t need starting values.
How do we get from the time domain to the frequency domain?
Fourier Analysis
Least squares spectral analysis
Compare Fourier Analysis and Least Squares Spectral analysis.
Fourier doesn’t need starting values, while LSA needs very good starting values.
When there are gaps in the time series / Δt is not constant, Fourier analysis can’t be applied; LSA can handle this.
With LSA we find frequencies that are really present in our data, while Fourier analysis does an approximation & doesn’t find arbitrary frequencies.
What’s the difference between a symmetric and an asymmetric kernel?
asymmetric: puts more weight on one side -> uses only past OR future values -> phase shift
symmetric: weights look the same on both sides -> zero phase
What are boundary conditions?
Boundary conditions define how a signal is extended beyond the observation interval. Since filtering shortens the time series, boundary conditions specify how to deal with the edges so that the filtered series doesn’t have to be shorter than the original one (see the sketch below).
For example: Zero padding, periodic extension, symmetric extension
Explain the boundary conditions zero padding, periodic extension, mirror (symmetric extension).
Zero padding: assumes the signal is zero outside the observation interval, often used for finite, non-periodic signals (linear convolution)
periodic extension: assumes the signal repeats periodically (circular convolution), often used for Fourier Transform
Mirror (symmetric) extension: commonly used in image processing, preserves continuity at the boundaries
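A minimal sketch of the three extensions using numpy’s padding modes (numpy assumed; the 5-point test array is arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

print(np.pad(x, 2, mode='constant'))   # zero padding:       [0 0 1 2 3 4 5 0 0]
print(np.pad(x, 2, mode='wrap'))       # periodic extension: [4 5 1 2 3 4 5 1 2]
print(np.pad(x, 2, mode='reflect'))    # mirror extension:   [3 2 1 2 3 4 5 4 3]
```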
What is a Kalman Filter?
The Kalman Filter is a multiple input, multiple output digital filter that estimates the states of a system based on its noisy outputs in real time. The states are all the variables needed to describe a system behavior as a function of time.
It yields a recursive solution of a linear filter problem for discrete data. It can run in real time.
It’s mainly used in navigation.
How does the filter circle of the Kalman filter look like?
The time update projects the current state estimate ahead in time. Then, the measurement update adjusts the projected estimate by an actual measurement (innovation).
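A minimal scalar Kalman filter sketch illustrating this predict/update circle (numpy assumed; the constant-state model and the noise values q and r are invented for illustration):

```python
import numpy as np

def kalman_1d(measurements, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a constant state with process noise q
    and measurement noise r (illustrative values)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # time update (prediction): project state and uncertainty ahead
        x_pred, p_pred = x, p + q
        # measurement update (correction): blend prediction and measurement
        k = p_pred / (p_pred + r)          # Kalman gain
        x = x_pred + k * (z - x_pred)      # innovation = z - x_pred
        p = (1 - k) * p_pred
        estimates.append(x)
    return np.array(estimates)

z = 5.0 + np.random.normal(scale=0.3, size=200)   # noisy measurements of a constant
print(kalman_1d(z)[-5:])                          # estimates converge towards 5
```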
What is the difference between amplitude & phase spectrum?
Both can be calculated using Fourier Analysis.
The Amplitude spectrum measures how strong each frequency is.
The Phase spectrum indicates how each frequency component is shifted in time - it shows the phase of each frequency.
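A minimal sketch (numpy assumed; the single test frequency 0.1 and the phase shift π/4 are invented) extracting both spectra from the FFT:

```python
import numpy as np

n = 100
t = np.arange(n)
x = np.cos(2 * np.pi * 0.1 * t + np.pi / 4)   # single frequency with a phase shift

X = np.fft.rfft(x)
amplitude = 2 * np.abs(X) / n                 # amplitude spectrum: strength per frequency
phase = np.angle(X)                           # phase spectrum: shift per frequency
k = np.argmax(amplitude)
print(amplitude[k], phase[k])                 # ~1.0 and ~pi/4 at the 0.1 frequency bin
```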
What is a low-pass filter?
A low-pass filter is a filter which keeps low frequencies and reduces high frequencies. It preserves the trend and smooths out rapid fluctuations.
Example: Centered moving average.