Results_sst_uwnd_Greta

=**PART 1**=
 * 1) What is the fundamental (lowest) frequency possible in this time series? The lowest frequency is constrained by the length of the record: the slowest oscillation the series can resolve completes exactly one cycle over the full record length, T = 240 months. Expressed as an angular frequency, the fundamental frequency is therefore (2*pi)/240 rad/month (equivalently 1/240 cycles/month): fundFreq=(2*pi)/length(ts1); % 0.0262 rad/month.
 * 2) What is the //bandwidth// or spectral resolution (Δf) of the spectrum you will create from it? The bandwidth (spectral resolution) is the "frequency step", Δf, between adjacent spectral estimates. Because the resolvable frequencies are integer multiples of the fundamental, Δf is exactly equal to the fundamental frequency: 0.0262 rad/month.
 * 3) What is the Nyquist (highest resolvable) frequency in this time series? The Nyquist frequency is the highest frequency the sampling can resolve, so it is set by the time step. The fastest resolvable oscillation completes one cycle every two time steps (a cycle cannot be defined by a single sample; it needs at least two). Therefore, the Nyquist frequency equals (2*pi)/(2 months) (since Δt is 1 month), or simply pi: nyqFreq=pi; % 3.1416 rad/month (0.5 cycles/month).
 * 4) Based on the above, make a 1D frequency array f to use as the x axis on your plots in part 2. Since I will be halving my spectrum later, I define f now as running from fundFreq to nyqFreq in steps of fundFreq: f=fundFreq:fundFreq:nyqFreq;
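
The four answers above can be collected into one short script. This is only a sketch: it assumes N = 240 monthly samples in place of length(ts1), and the names dt, df, fCycles and period are introduced here for illustration. It builds f from integer multiples of the fundamental rather than with the colon operator, which is one way to keep round-off from affecting the last element.

```matlab
% Frequency axis for a monthly series (N = 240 is an assumed stand-in for length(ts1)).
N  = 240;                 % number of monthly samples
dt = 1;                   % sampling interval in months

fundFreq = 2*pi/(N*dt);   % fundamental angular frequency, rad/month
nyqFreq  = pi/dt;         % Nyquist angular frequency, rad/month
df       = fundFreq;      % spectral resolution equals the fundamental

% Integer multiples of the fundamental: N/2 positive frequencies ending at Nyquist.
f = (1:N/2) * fundFreq;

fCycles = f / (2*pi);     % the same axis in cycles/month
period  = 1 ./ fCycles;   % corresponding periods in months (240 down to 2)
```

Dividing by 2*pi gives the axis in cycles/month, which is often easier to read on a plot than rad/month.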

=PART 2=


1. **Plot the spectrum** as a power spectral density PSD = Δ(variance)/Δf = Pow/Δf vs. frequency f. Label the axes with the right values and units. Area under the curve should be proportional to total power (total variance). Since it's variance in discrete bins, you should ideally use a bar plot or plotting symbols, not just a line plot connecting the "points".
 * You may want to center it on 0 frequency (by shifting the array) to show a symmetric spectrum with positive and negative frequencies.
 * Or you may prefer to just halve the spectrum (PSD vs. the absolute value of f -- remember to double the positive frequency part of Pow so area = variance).
 * You may also choose to rebin f and PSD to coarser spectral bands, if the plot is too noisy.
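
A minimal sketch of the halved-spectrum option, under stated assumptions: the series ts below is a synthetic stand-in (two sinusoids plus a constant), not the SST or uwnd data, and the normalization by N^2 is one convention chosen so that the bins sum to the variance. The point is the bookkeeping: double every positive-frequency bin except Nyquist, divide by Δf, and check that the area returns the variance.

```matlab
% One-sided PSD whose area equals the variance (synthetic stand-in data).
N  = 240;  dt = 1;  t = (0:N-1)*dt;
ts = 2 + sin(2*pi*t/12) + 0.5*cos(2*pi*t/40);   % hypothetical series

x    = ts - mean(ts);            % remove the mean so the zero-frequency bin is empty
xhat = fft(x);
Pow  = abs(xhat).^2 / N^2;       % two-sided power; sum(Pow) = mean(x.^2) by Parseval

df = 2*pi/(N*dt);                % bandwidth, rad/month
f  = (1:N/2) * df;               % positive frequencies up to Nyquist

PowHalf = Pow(2:N/2+1);                  % bins k = 1 .. N/2
PowHalf(1:end-1) = 2*PowHalf(1:end-1);   % double all but the Nyquist bin (N even)
PSD = PowHalf / df;                      % variance per (rad/month)

areaUnderPSD = sum(PSD) * df;    % should reproduce the variance
totalVar     = mean(x.^2);
[~, iPeak]   = max(PSD);         % index of the largest peak
```

With these arrays, bar(f, PSD) gives the bar-style plot the question asks for.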




2. **Plot the spectrum** as an indefinite integral (cumulative power) vs. period or log period.



3. **Plot the spectrum** as f*power vs. log(f). Area under the curve should still be proportional to total power (variance).




4. **Plot the spectrum** as f*power vs. log(period). Area under the curve should still be proportional to total power (variance).




5. **Plot the spectrum** as log(Pow) vs. log(f).



Why this way? Area under the curve is no longer meaningful. The reason to plot a spectrum this way is to see if it looks like a straight line. If the slope is -1, you have pink noise, aka flicker noise. If the slope is -2, you have Brownian noise. Slopes of -5/3 and -3 are predicted for KE (velocity variance) by 3D and 2D turbulence theory, respectively (based solely on scaling arguments). No matter what the slope, a straight line implies a power law, although the implications of finding a power law may be less profound than they appear.

6. **Plot the spectrum in your favorite format, after rebinning** f and Pow to coarser frequency bins. (Rebinning commands are in HW3 question 4).



7. **Plot the spectrum in your favorite format,** **overplotting** the PSD you get //when you **pad the ends of the time series with zeroes**//. This will highlight the errors associated with treating your time series as if it were periodic.
 * to do this part, just make a new data array tspad = [ts*0, ts-mean(ts), ts*0] (IDL) or [ts*0 ts-mean(ts) ts*0]; (Matlab)
 * also make a new frequency array corresponding to this longer series.
 * Adjust the variance of the padded time series spectrum so that it overlays the unpadded spectrum well.
 * At high frequencies, the two should be almost identical.
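
The padding steps above might look like the following in Matlab. This is a sketch on a synthetic stand-in series, and the variance adjustment shown (multiplying by Np/N) is one reasonable choice, not necessarily the one used for the plots here: the zeros dilute the mean square of the padded series by N/Np, so scaling by Np/N restores the original variance.

```matlab
% Zero-padded spectrum on a finer frequency grid (synthetic stand-in data).
N  = 240;  dt = 1;  t = (0:N-1)*dt;
ts = 2 + sin(2*pi*t/12) + 0.5*cos(2*pi*t/40);   % hypothetical series
x  = ts - mean(ts);

tspad = [ts*0, x, ts*0];         % zeros on both ends, three times the length
Np    = numel(tspad);            % 3*N, still even

dfpad = 2*pi/(Np*dt);            % finer bandwidth of the longer series
fpad  = (1:Np/2) * dfpad;        % new frequency array, same Nyquist frequency

PowPad = abs(fft(tspad)).^2 / Np^2;
PowPad = PowPad * (Np/N);        % undo the variance dilution caused by the zeros

HalfPad = PowPad(2:Np/2+1);              % positive frequencies
HalfPad(1:end-1) = 2*HalfPad(1:end-1);   % double all but the Nyquist bin
PSDpad  = HalfPad / dfpad;

areaPad = sum(PSDpad) * dfpad;   % should match the unpadded variance
```

Because both spectra are expressed as densities with the same area, PSDpad can be overplotted directly on the unpadded PSD.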



3. Significance testing of peaks: Overplot a red noise spectrum and its 95% significance level (the F test).

 * 1) **Estimate** your lag-1 autocorrelation value r1 for your field1.
 * 2) Use r1 to **create** the power spectrum of an autoregressive (AR(1)) or "red noise" process with the same r1 and same variance as your series.
 * 3) Use the [|F test] to **overplot** a line indicating the 99% significance level for spectral peaks.

For this part, I was curious about how significant the smaller peaks were for both SST and uwnd. That is why I decided to plot not only the 99% but also the 95% and 90% significance levels. In particular, I was surprised to see that the peak close to a frequency of 1 in both plots was significant at the 99% level for uwnd, but less than 90% significant for SST.

Background info for 3.2 and 3.3: //creating the power spectrum of the AR(1) process, your null hypothesis ("Red Noise"), and its F test//

 * //For continuous AR(1) red noise, the autocorrelation r(lag) decays away exponentially with lag: r = exp(-lag/T)//
 * //The time constant T for this exponential decay is thus T = -(Δt)/ln(r1), where r1 is your lag-1 autocorrelation.//
 * The power spectrum of red noise is P_red(f) = 2T/(1 + f^2 T^2), with f expressed as an angular frequency (radians per unit time). Compute this from your f array and rescale its total variance to match yours.
 * For the F test 99% significance threshold, the curve you plot is just your red noise null hypothesis curve times a factor given in [|this table] (Ftest99.vonStorchZwiers.png) (from appendix G of the [|vonStorch and Zwiers ebook]).
 * Background: The F statistic applies to the ratio of variances between 2 processes. The null hypothesis process (red noise here) is exact and analytic, not estimated from data, so it has "infinite" degrees of freedom. Your data spectrum has 2 degrees of freedom (DOFs) per fundamental frequency interval (bandwidth), so I highlighted the 2 DOF number in the table. If you rebin your spectrum to a coarser bandwidth, then you have 4 or 6 or 8 or 10 or more DOFs per bin. Noise will tend to disappear with this bin-averaging, so the threshold for a peak to achieve statistical significance gets lower (you get to use a smaller factor from the F test table; there is another page in the book for DOFs greater than 10). Real peaks arising from physical processes (like ENSO) will tend to produce variance in many nearby frequencies, so a real spectral peak will not shrink as fast with rebinning (or averaging), and may still exceed the threshold for 99% significance.
 * Much more info and discussion (by the professor I learned it from...): www.atmos.washington.edu/~dennis/552_Notes_6b.pdf
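
The recipe in the background notes can be sketched as follows. The series is again a synthetic stand-in, and the lag-1 estimator shown is one common choice. One substitution is made and should be read as an assumption: instead of looking up the factor in the table, the script uses the identity F_p(2, inf) = chi2_p(2)/2 = -ln(1 - p), which holds for 2 DOFs against "infinite" DOFs and gives about 4.61 at the 99% level.

```matlab
% Red-noise null spectrum and F-test thresholds (synthetic stand-in data).
N  = 240;  dt = 1;  t = (0:N-1)*dt;
ts = 2 + sin(2*pi*t/12) + 0.5*cos(2*pi*t/40);   % hypothetical series
x  = ts - mean(ts);
df = 2*pi/(N*dt);
f  = (1:N/2) * df;               % angular frequency, rad/month

r1 = sum(x(1:end-1).*x(2:end)) / sum(x.^2);   % lag-1 autocorrelation estimate
T  = -dt / log(r1);              % e-folding time in months (needs 0 < r1 < 1)

PSDred = 2*T ./ (1 + f.^2 * T^2);                  % red-noise spectral shape
PSDred = PSDred * mean(x.^2) / (sum(PSDred)*df);   % rescale: area = data variance

% Threshold factors for 2 DOF vs. infinite DOF: F_p(2,inf) = -log(1-p).
p       = [0.90 0.95 0.99];
Ffactor = -log(1 - p);           % approximately 2.30, 3.00, 4.61
thresh  = Ffactor' * PSDred;     % one row per significance level
```

Each row of thresh can be overplotted on the data PSD; a peak is significant at a given level where the data exceed the corresponding row. After rebinning, the factors would have to be replaced with the tabulated values for the larger number of DOFs.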

4. Cross-spectrum of your 2 variables. (see section 3.1.2 of Hsieh handout).

1. **Compute the 2 FFTs** of your 2 time series. xhat = fft(ts1); yhat = fft(ts2); **and plot the spectrum of your field 2** in your favorite depiction from Part 2.6 above, if you didn't already.

Already done, see above.


2. **Compute the cross-spectrum** by complex multiplication: Cross = xhat .* conj(yhat)


3. **Separate** the cross spectrum into its real and imaginary parts.
 * R and I in Hsieh (handout) section 3.1.2
 * I often see them called P(f) and Q(f) (the "in-phase" and "quadrature" parts). Quadrature means 90 degrees out of phase: sin and cos components.


4. **Plot a cumulative spectrum of the in-phase part P (or R) as in 2.2 above.** Show that it ends up at the covariance of ts1 and ts2, mean( (ts1-mean(ts1)) * (ts2-mean(ts2)) ). What timescales contribute the most to the overall correlation (or covariance) of your 2 time series, according to this plot?

The covariance of my ts1 and ts2 is 0.9697. The cumulative sum of the real (in-phase) part of the cross-spectrum ends at 0.4849, half the total covariance. I believe this makes sense because I summed over only the positive frequencies without doubling them; the negative frequencies carry the other half. The shortest frequencies contribute the most to my overall covariance.
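
The factor of two can be checked numerically. The sketch below uses two synthetic stand-in series (ts1 and ts2 here are hypothetical, not the SST and uwnd data) and a 1/N^2 normalization chosen so that the full two-sided sum of the co-spectrum equals the covariance. Summing only the positive frequencies lands near half the covariance; doubling every bin except Nyquist recovers all of it.

```matlab
% Cumulative co-spectrum vs. covariance (synthetic stand-in data).
N = 240;  n = 1:N;
ts1 = sin(0.37*n.^2) + 0.5*sin(2*pi*n/12);   % hypothetical broadband series
ts2 = 0.8*ts1 + cos(1.3*n.^2);               % partly correlated with ts1

x = ts1 - mean(ts1);   y = ts2 - mean(ts2);
xhat = fft(x);         yhat = fft(y);

Cross = xhat .* conj(yhat) / N^2;   % normalized so the full sum is the covariance
P = real(Cross);                    % in-phase part (co-spectrum)
Q = imag(Cross);                    % quadrature part

Phalf = P(2:N/2+1);                 % positive frequencies only
cumUndoubled = cumsum(Phalf);       % ends near half the covariance
Phalf(1:end-1) = 2*Phalf(1:end-1);  % fold in the negative frequencies (not Nyquist)
cumP = cumsum(Phalf);               % ends at the covariance

covxy = mean(x .* y);
```

Plotting cumP against period (or log period) then shows which timescales contribute most to the covariance: the steepest parts of the curve.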


5. **Plot the squared coherency spectrum** (or just "coherence" in lazy language you will often hear). It is (P^2 + Q^2)/(xPow * yPow). Why is it always 1?? (Hsieh 3.33-3.36)



The squared coherency, in this case, is always one because it is computed one frequency at a time, with no averaging. At any single frequency, each time series contributes exactly one Fourier component, and two sinusoids of the same frequency always have a fixed amplitude ratio and phase difference, so they are perfectly coherent. The squared coherency spectrum is therefore equal to one everywhere.


6. **Plot the squared coherency spectrum after rebinning** P, Q, xPow and yPow to coarser frequency bins.
 * Reasoning: For physically real phenomena operating in a general, broad frequency band (like ENSO), the variables x and y will have the same phase relationship for all frequencies since they are physically linked, so averaging (rebinning) won't decrease coherency much. For random, non physically linked fluctuations of x and y, the in-phase and quadrature components will both be random (positive at one frequency, negative at the next), so the averaging will weaken the coherency when the phase relationship is random. See Hsieh handout, (3.37).

Now the squared coherency is not everywhere equal to one. By averaging (in this case over four bins), we can see whether the covariability between SST and uwnd is essentially random. Especially weak coherency after averaging indicates an essentially random phase relationship. Conversely, frequencies where the averaged coherency remains strong reveal that the covariance between the two variables at those frequencies is not random. In this case, there is strong covariance on a close-to-monthly timescale.
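
A sketch of the rebinning, assuming two synthetic stand-in series and a simple block average over nb = 4 consecutive bins (the helper avg is introduced here for illustration; the HW3 rebinning commands may differ). It shows both results discussed above: the raw squared coherency is identically one, while the band-averaged version falls between zero and one.

```matlab
% Squared coherency before and after averaging over 4 bins (synthetic stand-in data).
N = 240;  n = 1:N;
ts1 = sin(0.37*n.^2) + 0.5*sin(2*pi*n/12);   % hypothetical series
ts2 = 0.8*ts1 + cos(1.3*n.^2);               % partly correlated with ts1

x = ts1 - mean(ts1);   y = ts2 - mean(ts2);
xhat = fft(x);         yhat = fft(y);
k = 2:N/2+1;                                 % positive-frequency bins

Cross = xhat(k) .* conj(yhat(k));
P = real(Cross);   Q = imag(Cross);
xPow = abs(xhat(k)).^2;   yPow = abs(yhat(k)).^2;

coh2raw = (P.^2 + Q.^2) ./ (xPow .* yPow);   % identically 1 before averaging

nb  = 4;                                     % bins per band (N/2 divisible by nb)
avg = @(v) mean(reshape(v, nb, []), 1);      % average consecutive groups of nb bins
Pb = avg(P);  Qb = avg(Q);  xb = avg(xPow);  yb = avg(yPow);

coh2  = (Pb.^2 + Qb.^2) ./ (xb .* yb);       % band-averaged squared coherency
phase = atan2(Qb, Pb);                       % band-averaged phase, radians
```

Using atan2(Qb, Pb) rather than atan(Qb./Pb) keeps the phase in the correct quadrant, and computing it from band-averaged P and Q tends to give a less noisy phase spectrum than the raw bins.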


7. **Plot the phase spectrum** arctan(Q/P), and interpret the phase relationship between your variables in a frequency band where there is strong coherence (like ENSO) by showing this phase relationship using time series plots zoomed in to one dominant case of this strong oscillation (like an ENSO event).



The phase spectrum is so noisy that it is difficult to see what is happening between SST and uwnd on a monthly time scale. In that area though, it does seem to be "more" negative than other parts of the frequency spectrum.

5. 2D FFT

 * 1) **Compute the 2D FFT (xhat) of your primary field's time-longitude section (fft2 in Matlab, fft in IDL).**
 * 2) **Display this 2D fft** as an image or contour plot, as a function of frequency f and zonal wavenumber k.
 * You have an f array: you just need to make a k array, following the logic in question 1 above in the x direction.
 * Shift the Power array (with fftshift in Matlab, or the shifting keyword in IDL8's FFT routine) to put low frequencies in the middle of the image.
 * You may want to display the log or square root of power, so the lowest frequencies don't dominate so strongly and you see more structure.
 * You should probably zoom in on the lowest frequency region.


 * 1) Can you interpret this spectrum directly in terms of the size and orientation of stripes seen in the time-longitude section image?
 * 2) Can you interpret this spectrum in terms of the rebinned variance diagram from HW3, problem 4?

This plot serves to emphasize what was discovered in HW3, problem 4. Essentially, we see that averaging over time space (frequency) reduces the variance much more than averaging over spatial domains (wavenumber). In essence, the variance is spread out a lot in frequency, but very little in wavenumber.


3. **Show a filtered data image**: Multiply the xhat 2-dimensional array by zero (mask it out) in some spectral region. Maybe low pass (exclude wavenumbers and frequencies > 10 times the fundamental). Or try band pass: can you find and zero out the El Nino peak? Anything that interests you. Explore, then teach us what you learned. Transform back to x-t space with ifft2 in Matlab or fft(xhat, /inverse) in IDL and display filtered and unfiltered data.

For this part, I decided to simply apply a single low-pass and a single high-pass filter (rather than a band pass). Though I think it is interesting to filter out the El Nino cycle, I wanted to do something slightly different. I was curious what SST would look like if I removed only the 20 lowest frequencies (middle panel) or only the 20 highest (right). What is interesting to me about these plots is how similar they look to the original plot (left). Most of the structure is coming from intermediate frequencies.
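
For reference, a spectral mask of this kind can be sketched as below. Everything here is illustrative: the field is a synthetic time-longitude section (240 months by 64 longitudes, made of one slow and one fast wave), and the cutoff of 10 bins follows the low-pass suggestion in the question, not the 20-frequency filters used for the plots above. The mask is built from signed bin indices in FFT order so that positive and negative frequencies are treated alike, which keeps the filtered field real.

```matlab
% Low-pass / high-pass filtering with a 2D FFT mask (synthetic stand-in field).
Nt = 240;  Nx = 64;                          % hypothetical grid: months by longitudes
[X, Tm] = meshgrid(0:Nx-1, 0:Nt-1);          % Nt-by-Nx coordinate arrays
slow = sin(2*pi*(3*X/Nx - 5*Tm/Nt));         % low wavenumber, low frequency
fast = 0.3*sin(2*pi*(20*X/Nx + 40*Tm/Nt));   % high wavenumber, high frequency
data = slow + fast;

xhat = fft2(data);

% Signed bin indices in FFT order: 0, 1, ..., N/2-1, -N/2, ..., -1
mt = mod((0:Nt-1) + Nt/2, Nt) - Nt/2;
mx = mod((0:Nx-1) + Nx/2, Nx) - Nx/2;
[MX, MT] = meshgrid(mx, mt);                 % same Nt-by-Nx shape as xhat

cutoff = 10;                                 % keep bins within 10x the fundamental
mask   = double((abs(MT) <= cutoff) & (abs(MX) <= cutoff));

lowRaw   = ifft2(xhat .* mask);              % imaginary part is round-off only
lowpass  = real(lowRaw);
highpass = real(ifft2(xhat .* (1 - mask)));  % everything the low pass removed
```

In this synthetic case the low-pass output recovers the slow wave and the high-pass output recovers the fast one; with real data, displaying data, lowpass and highpass side by side shows how much structure each spectral region carries.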