A new insight into Wavelet Transforms using the concepts of Frame Theory

Wavelet analysis, being a relatively new subject of study, is being explored around the globe using the various mathematical tools currently available. This paper is a humble attempt to provide a comprehensive study of the subject by means of exhaustive mathematical analysis. Since frame theory has been established as a standard notion in applied mathematics, it is used here as the analytical tool to explain the formation, purpose and use of wavelets. The theoretical explanation follows the mathematical analysis and attempts to give an intuitive picture of the theorems, definitions and lemmas.


INTRODUCTION
A standard way of modelling both the physical and the virtual worlds is by writing systems of equations. General systems of equations are hard to deal with in a systematic fashion: they are tough to solve practically, and also difficult to analyse theoretically. Linear systems turn out to be a nice exception: they are relatively straightforward to solve practically, and their theoretical analysis is supported by a rich and powerful theory. This paper covers the short-time Fourier transform, which is frequently utilized for non-stationary signal analysis. Although a powerful tool, it has some limitations in analysing time-localized events. The wavelet transform has similarities with the short-time Fourier transform, but it also possesses a time-localization property that generally renders it superior for analysing non-stationary phenomena. In this paper, we review the Fourier and short-time Fourier transforms, discuss some often desirable properties that the short-time Fourier transform does not possess, and introduce the wavelet transform. Moreover, wavelet transforms are analysed using the concepts of frame theory.

TIME FREQUENCY ATOMS
The basic property of the Fourier transform is that any signal and its Fourier transform, when mapped onto the $\sigma_t \times \sigma_\omega$ domain, are localised to a box of area at least one half. In particular, a signal with compact support in time cannot have compact support in frequency, and vice versa [13]. The mathematical proofs and explanations supporting this claim are given below.

Heisenberg Boxes[6]
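The slice of information provided by $\langle f, \phi_\gamma \rangle$ is represented in a time-frequency plane $(t, \omega)$ by a region whose location and width depend on the time-frequency spread of $\phi_\gamma$ [13]. Since
$$\|\phi_\gamma\|^2 = \int_{-\infty}^{+\infty} |\phi_\gamma(t)|^2 \, dt = 1,$$
we interpret $|\phi_\gamma(t)|^2$ as a probability distribution centred at
$$u_\gamma = \int_{-\infty}^{+\infty} t \, |\phi_\gamma(t)|^2 \, dt.$$
The spread around $u_\gamma$ is measured by the variance
$$\sigma_t^2(\gamma) = \int_{-\infty}^{+\infty} (t - u_\gamma)^2 \, |\phi_\gamma(t)|^2 \, dt.$$
Parseval's identity proves that
$$\int_{-\infty}^{+\infty} |\hat{\phi}_\gamma(\omega)|^2 \, d\omega = 2\pi \, \|\phi_\gamma\|^2.$$
The centre frequency of $\hat{\phi}_\gamma$ is therefore defined by
$$\xi_\gamma = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \omega \, |\hat{\phi}_\gamma(\omega)|^2 \, d\omega,$$
and its spread around $\xi_\gamma$ is
$$\sigma_\omega^2(\gamma) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} (\omega - \xi_\gamma)^2 \, |\hat{\phi}_\gamma(\omega)|^2 \, d\omega.$$
It can be proved rigorously that, for any signal which obeys the Fourier transformability criteria, the product satisfies $\sigma_t^2 \sigma_\omega^2 \ge \frac{1}{4}$, that is, $\sigma_t \sigma_\omega \ge \frac{1}{2}$. Consequently there is a limit to the amount of localization that a signal can achieve in both the time and frequency domains, leading to the representation of a signal as boxes in the $(t, \omega)$ plane called Heisenberg boxes.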
The time-frequency resolution of $\phi_\gamma$ is represented by a Heisenberg box centred at $(u_\gamma, \xi_\gamma)$, whose width along time is $\sigma_t(\gamma)$ and whose width along frequency is $\sigma_\omega(\gamma)$, as shown in the figure below.
March 11, 2014
The Heisenberg uncertainty theorem proves that the area of the rectangle is at least one half, $\sigma_\omega \sigma_t \ge \frac{1}{2}$ [6]. This limits the joint resolution of $\phi_\gamma$ in time and frequency. The time-frequency plane must be manipulated carefully because a point $(t_0, \omega_0)$ is ill-defined. There is no function that is perfectly well concentrated at a point $t_0$ and a frequency $\omega_0$. Only rectangles with area at least $\frac{1}{2}$ may correspond to time-frequency atoms.
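The uncertainty bound $\sigma_t \sigma_\omega \ge \frac{1}{2}$ can be checked numerically. The following is a minimal sketch, assuming NumPy and a uniformly sampled signal; the function name `time_frequency_spreads` and the two test signals are illustrative choices, not part of the original analysis. The Fourier transform is approximated by an FFT, so the results are only as accurate as the sampling grid allows.

```python
import numpy as np

def time_frequency_spreads(f, t):
    """Estimate (sigma_t, sigma_omega) for samples f taken on the uniform grid t."""
    dt = t[1] - t[0]
    # |f(t)|^2, normalised so that it integrates to one (a probability density in time).
    p_t = np.abs(f) ** 2
    p_t = p_t / (np.sum(p_t) * dt)
    u = np.sum(t * p_t) * dt                                  # centre in time
    sigma_t = np.sqrt(np.sum((t - u) ** 2 * p_t) * dt)

    # Same construction in frequency; only |F(omega)|^2 matters, so the FFT phase is irrelevant.
    omega = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(t.size, d=dt))
    d_omega = omega[1] - omega[0]
    p_w = np.abs(np.fft.fftshift(np.fft.fft(f))) ** 2
    p_w = p_w / (np.sum(p_w) * d_omega)
    xi = np.sum(omega * p_w) * d_omega                        # centre frequency
    sigma_w = np.sqrt(np.sum((omega - xi) ** 2 * p_w) * d_omega)
    return sigma_t, sigma_w

t = np.linspace(-20.0, 20.0, 4096)
gauss_t, gauss_w = time_frequency_spreads(np.exp(-t ** 2 / 2), t)   # Gaussian
exp_t, exp_w = time_frequency_spreads(np.exp(-np.abs(t)), t)        # two-sided exponential
print(gauss_t * gauss_w)   # close to the lower bound 1/2
print(exp_t * exp_w)       # noticeably larger than 1/2
```

The Gaussian attains the bound, which is why Gaussian windows are a common choice when the smallest possible Heisenberg box is wanted.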

THE FOURIER TRANSFORM AND THE SHORT-TIME FOURIER TRANSFORM
The Fourier transform is best suited to analyse stationary periodic functions, that is, those that exactly repeat themselves once per period, without modification [15]. It provides a single spectrum for the whole signal. For non-stationary signals we are interested in the frequencies that are dominant at any given time. For example, we perceive a musical melody as a succession of notes, each with its own frequency spectrum, rather than as one big signal with an overall spectrum. To analyse such signals, we may turn to the short-time Fourier transform.
The short-time Fourier transform (or STFT) of a function at some time t is the Fourier transform of that function as examined through some time-limited window centred on t. A different Fourier transform exists for each position t of the window. These transforms, produced by sliding the examination window along in time, constitute the STFT.
If the examination window simply omits the signal outside the window, two problems are encountered. One is the sudden change in the power spectrum as a discontinuity enters or leaves the window, compounded by a lack of sensitivity to the position of the discontinuity within the window. The other problem is spectral leakage: if some component of the signal has a cycle time which is not an integral divisor of the window width, the transform exhibits spurious response at many frequencies. These problems are ameliorated by attenuating samples away from the centre of the window, by a "windowing function," $g$. An example of a windowing function is the Gaussian, $g(t) = e^{-a t^2}$, for some constant $a$. Mathematically, the STFT of a function $f$ at time $\tau$ is given by:
$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{+\infty} f(t) \, g(t - \tau) \, e^{-i \omega t} \, dt.$$
The response of the STFT, centred at time $\tau = \tau_0$, to an impulse function $\delta(t - t_0)$ occurring at time $t = t_0$ is given by:
$$\mathrm{STFT}(\tau_0, \omega) = g(t_0 - \tau_0) \, e^{-i \omega t_0}.$$
The power spectrum of the STFT is $g^2(t_0 - \tau_0)$. As shown in the figure, the power spectrum is the same for all frequencies.
The cross-section of the transform at constant frequency produces a time-reversed copy of the windowing function. Thus, the width (standard deviation) of the windowing function limits the accuracy with which the impulse can be located in time.
Although the STFT windowing function's width is constant, its impact varies with frequency. At high frequencies the number of waves in a window is high, producing good accuracy in frequency measurement; yet the window width prevents good localization of signal discontinuities, which the high frequencies otherwise could provide. Narrowing the window width to accommodate more precise time-localization of discontinuities causes other problems. A narrow window width is inappropriate at low frequencies, because a narrow windowing function spans fewer cycles. It distorts the signal noticeably over one wavelength, degrading accuracy of frequency measurement. Indeed, wavelengths longer than the window width cannot be measured. From these considerations it seems advantageous to let the windowing function be broad for analysing low frequencies and narrow for high frequencies.
For example, let us analyse the following wave using the STFT. There are four frequency components at different times: the interval 0 to 250 ms is a simple sinusoid of 300 Hz, and the other 250 ms intervals are sinusoids of 200 Hz, 100 Hz, and 50 Hz, respectively. This is clearly a non-stationary signal. We will be using a Gaussian window $w(t)$ to evaluate the STFT, where $w(t) = e^{-a t^2 / 2}$ and the constant $a$ determines the width of the window. First we compute the STFT with a = 0.01; the STFT plot is shown below. Note that the four peaks are well separated from each other in time. Also note that, in the frequency domain, every peak covers a range of frequencies instead of a single frequency value. The STFT of the same wave is then computed with a = 0.0001, which gives a much wider window. These examples illustrate the implicit resolution problem of the STFT. Narrow windows give good time resolution, but poor frequency resolution. Wide windows give good frequency resolution, but poor time resolution; furthermore, wide windows may violate the condition of stationarity.
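The resolution trade-off described above can be reproduced with a direct numerical STFT. This is a sketch under stated assumptions: NumPy is used, the sampling rate of 2000 Hz is an arbitrary choice, the Gaussian window is taken as $w(t) = e^{-a t^2 / 2}$ with $t$ measured in milliseconds, and all function names are hypothetical. It counts how many frequency bins lie above half of the peak value for a narrow window (a = 0.01) and a wide window (a = 0.0001), both centred at 125 ms inside the 300 Hz segment.

```python
import numpy as np

def four_tone_signal(fs=2000):
    """One second of signal: 300, 200, 100 and 50 Hz in successive 250 ms intervals."""
    t = np.arange(fs) / fs
    f_inst = np.repeat([300.0, 200.0, 100.0, 50.0], fs // 4)
    return t, np.cos(2 * np.pi * f_inst * t)

def gaussian_stft(x, t, a, centres_ms, freqs_hz):
    """|STFT| on a grid of window centres (ms) and frequencies (Hz).

    Assumed window convention: w(t) = exp(-a * t**2 / 2), with t in milliseconds.
    """
    t_ms = 1000.0 * t
    dt = t[1] - t[0]
    kernel = np.exp(-2j * np.pi * np.outer(freqs_hz, t))    # shape (n_freqs, n_samples)
    out = np.empty((len(centres_ms), len(freqs_hz)))
    for k, tau in enumerate(centres_ms):
        window = np.exp(-a * (t_ms - tau) ** 2 / 2)         # window slid to time tau
        out[k] = np.abs(kernel @ (x * window)) * dt
    return out

def bins_above_half_max(row):
    """Crude measure of spectral width: number of bins at or above half the peak."""
    return int(np.sum(row >= 0.5 * row.max()))

t, x = four_tone_signal()
freqs = np.arange(0.0, 401.0, 5.0)
narrow = gaussian_stft(x, t, 0.01, [125.0], freqs)[0]
wide = gaussian_stft(x, t, 0.0001, [125.0], freqs)[0]
print(freqs[narrow.argmax()], bins_above_half_max(narrow))
print(freqs[wide.argmax()], bins_above_half_max(wide))
```

Both windows locate the peak at 300 Hz, but the narrow window spreads it over more frequency bins than the wide one, which is the frequency-resolution cost of good time resolution.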

CHOICE OF WINDOWS
As we have seen, with the STFT the $\sigma_t \times \sigma_\omega$ domain is divided into an infinite number of Heisenberg boxes of identical shape, the union of which covers the entire domain. We now wish to cover this space using Heisenberg boxes of constant area but of different lengths and widths. The main motivation is that, according to the uncertainty principle, the greater the localisation in time, the poorer the localisation in frequency. We wish to exploit this property using a new transform with a kernel of compact support, the support of which can be dilated by means of a dilation parameter, so that we may control the time localisation of the windowed transform at will. This leads us to the continuous wavelet transform.

WAVELET TRANSFORM
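To analyse signal structures it is necessary to use time-frequency atoms with varying time supports. The wavelet transform decomposes signals over dilated and translated wavelets. A wavelet is a function $\psi \in L^2(\mathbb{R})$ with zero average [6]:
$$\int_{-\infty}^{+\infty} \psi(t) \, dt = 0.$$
It is normalised, $\|\psi\| = 1$, and centred in the neighbourhood of $t = 0$. A family of time-frequency atoms is obtained by scaling $\psi$ by $s$ and translating it by $u$:
$$\psi_{u,s}(t) = \frac{1}{\sqrt{s}} \, \psi\!\left(\frac{t - u}{s}\right).$$
These atoms remain normalised: $\|\psi_{u,s}\| = 1$. The continuous wavelet transform of a function $f \in L^2(\mathbb{R})$ is thus given by:
$$Wf(u, s) = \langle f, \psi_{u,s} \rangle = \int_{-\infty}^{+\infty} f(t) \, \frac{1}{\sqrt{s}} \, \psi^*\!\left(\frac{t - u}{s}\right) dt.$$
Its time-frequency resolution depends on the time-frequency spread of the wavelet atoms $\psi_{u,s}$. We suppose that $\psi$ is centred at $0$, so that $\psi_{u,s}$ is centred at $t = u$. With the change of variable $v = \frac{t - u}{s}$ we see that:
$$\int_{-\infty}^{+\infty} (t - u)^2 \, |\psi_{u,s}(t)|^2 \, dt = s^2 \sigma_t^2, \qquad \sigma_t^2 = \int_{-\infty}^{+\infty} t^2 \, |\psi(t)|^2 \, dt.$$
Similarly, the Fourier transform of $\psi_{u,s}$, denoted by $\hat{\psi}_{u,s}$, is a dilation of $\hat{\psi}$ by $1/s$; if $\eta$ denotes the centre frequency of $\hat{\psi}$ and $\sigma_\omega$ its spread, then $\hat{\psi}_{u,s}$ is centred at $\eta / s$ with spread $\sigma_\omega / s$. The energy spread of a wavelet time-frequency atom $\psi_{u,s}$ thus corresponds to a Heisenberg box centred at $(u, \eta / s)$, of size $s \sigma_t$ along the time axis and $\sigma_\omega / s$ along frequency.
The area of the rectangle remains equal to $\sigma_t \sigma_\omega$ at all scales, but the resolution in time and frequency becomes dependent on $s$.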
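Two of the properties stated above, that dilated and translated atoms remain normalised and that their time spread grows in proportion to the scale, can be verified numerically. The sketch below assumes NumPy and uses the Mexican hat wavelet as an example mother wavelet; that choice and the function names are illustrative assumptions, since the analysis above does not fix a particular wavelet.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat wavelet, normalised to unit L2 norm (example mother wavelet)."""
    return 2.0 / (np.sqrt(3.0) * np.pi ** 0.25) * (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def atom(t, u, s):
    """Wavelet atom: the mother wavelet scaled by s and translated by u."""
    return mexican_hat((t - u) / s) / np.sqrt(s)

def norm_and_time_spread(t, u, s):
    """Return the L2 norm of the atom and its time spread around u."""
    dt = t[1] - t[0]
    energy_density = atom(t, u, s) ** 2
    norm = np.sqrt(np.sum(energy_density) * dt)
    spread = np.sqrt(np.sum((t - u) ** 2 * energy_density) * dt)
    return norm, spread

t = np.linspace(-100.0, 100.0, 200001)
norm1, spread1 = norm_and_time_spread(t, u=0.0, s=1.0)
norm4, spread4 = norm_and_time_spread(t, u=5.0, s=4.0)
print(norm1, norm4)          # both close to 1
print(spread4 / spread1)     # close to the scale ratio 4
```

Since the frequency spread shrinks by the same factor, the area of the Heisenberg box is unchanged while its shape follows the scale.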
We have thus obtained the expression for the CWT, which plays the role of a windowed transform with varying time support for any Fourier-transformable signal. Intuitively, the plane can be divided into many such Heisenberg boxes using wavelet atoms of varying scale, which gives us the entire frequency characterisation of the signal at any time. This is exactly the approach used in the discrete wavelet transform, in which we sample the shift and scaling parameters to obtain orthogonal wavelet bases onto which the signal is projected. Understanding this method requires the frame theory approach, the basic theorems of which are provided in the following section.

FRAME THEORY APPROACH
The Fourier transform has been a major tool in analysis for over 100 years. However, it solely provides frequency information, and hides (in its phases) information concerning the moment of emission and duration of a signal. D. Gabor resolved this problem in 1946 [92] by introducing a fundamental new approach to signal decomposition. Gabor's approach quickly became the paradigm for this area, because it provided resilience to additive noise, quantization, and transmission losses as well as an ability to capture important signal characteristics. Unbeknownst to Gabor, he had discovered the fundamental properties of a frame without any of the formalism.
In 1952, Duffin and Schaeffer were studying some deep problems in nonharmonic Fourier series, for which they required a formal structure for working with highly overcomplete families of exponential functions in $L^2[0, 1]$. For this, they introduced the notion of a Hilbert space frame, of which Gabor's approach is now a special case, falling into the area of time-frequency analysis. Much later, in the late 1980s, the fundamental concept of frames was revived by Daubechies, Grossmann and Meyer [3][5][6][8], who showed its importance for data processing. Traditionally, frames were used in signal and image processing, nonharmonic Fourier series, data compression, and sampling theory. But today, frame theory has ever-increasing applications to problems in both pure and applied mathematics, physics, engineering, and computer science, to name a few.
Thus a typical frame possesses more frame vectors than the dimension of the space, and each vector in the space will have infinitely many representations with respect to the frame. It is this redundancy of frames which is key to their significance for applications. The role of redundancy varies depending on the requirements of the application at hand. First, redundancy gives greater design flexibility, which allows frames to be constructed to fit a particular problem in a manner not possible with a set of linearly independent vectors. A second major advantage of redundancy is robustness. By spreading the information over a wider range of vectors, resilience against losses (erasures) can be achieved. Erasures are, for instance, a severe problem in wireless sensor networks when transmission losses occur or when sensors are intermittently fading out. A further advantage of spreading information over a wider range of vectors is to mitigate the effects of noise in the signal.
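The robustness against erasures mentioned above can be illustrated with a small redundant frame. The sketch below assumes NumPy and uses three unit vectors at 120 degrees in the plane as an example frame; the example vectors and the helper name `reconstruct_after_erasure` are illustrative assumptions. One of the three frame coefficients is discarded, and the vector is recovered from the remaining two by a least-squares (pseudo-inverse) solve.

```python
import numpy as np

# Three unit vectors at 120 degrees: a redundant frame for R^2 (rows are frame vectors).
angles = np.deg2rad([90.0, 210.0, 330.0])
frame = np.column_stack([np.cos(angles), np.sin(angles)])

x = np.array([1.0, 2.0])
coefficients = frame @ x            # the three frame coefficients <x, phi_i>

def reconstruct_after_erasure(frame, coefficients, lost_index):
    """Recover x when the coefficient with index lost_index is missing."""
    keep = [i for i in range(frame.shape[0]) if i != lost_index]
    # The remaining vectors still span R^2, so the pseudo-inverse recovers x exactly.
    return np.linalg.pinv(frame[keep]) @ coefficients[keep]

recovered = [reconstruct_after_erasure(frame, coefficients, i) for i in range(3)]
print(recovered)
```

With an orthonormal basis of two vectors, losing either coefficient would lose one coordinate of the signal for good; here any single loss is tolerated.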

DEFINITION OF FRAMES
The definition of a (Hilbert space) frame originates from early work by Duffin and Schaeffer on nonharmonic Fourier series. The main idea is to weaken Parseval's identity and yet still retain norm equivalence between a signal and its frame coefficients. Some of the basic results of finite frame theory are given here without proof; for the proofs, refer to [16].

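Definition [16]: Let $(\varphi_i)_{i=1}^M$ be a family of vectors in $\mathcal{H}^N$. It is called a frame for $\mathcal{H}^N$ if there exist constants $0 < A \le B < \infty$ such that
$$A \|x\|^2 \le \sum_{i=1}^M |\langle x, \varphi_i \rangle|^2 \le B \|x\|^2 \quad \text{for all } x \in \mathcal{H}^N.$$
The following notions are related to a frame $(\varphi_i)_{i=1}^M$: the constants $A$ and $B$ are called the lower and upper frame bounds; if $A = B$ is possible, the frame is called tight; and if $A = B = 1$ is possible, it is called a Parseval frame.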
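For a finite real frame, the optimal constants $A$ and $B$ in the inequality $A\|x\|^2 \le \sum_i |\langle x, \varphi_i\rangle|^2 \le B\|x\|^2$ are the smallest and largest eigenvalues of the matrix $\sum_i \varphi_i \varphi_i^T$. The following sketch assumes NumPy and real-valued vectors stored as the rows of a matrix; the function name `frame_bounds` and the example frame (three unit vectors at 120 degrees) are illustrative choices.

```python
import numpy as np

def frame_bounds(Phi):
    """Optimal frame bounds (A, B) of the real frame whose vectors are the rows of Phi."""
    S = Phi.T @ Phi                          # N x N matrix sum_i phi_i phi_i^T
    eigenvalues = np.linalg.eigvalsh(S)      # real eigenvalues in ascending order
    return eigenvalues[0], eigenvalues[-1]

angles = np.deg2rad([90.0, 210.0, 330.0])
example_frame = np.column_stack([np.cos(angles), np.sin(angles)])
A, B = frame_bounds(example_frame)
print(A, B)                                   # equal bounds: the frame is tight

# Check the defining inequality on one test vector.
x = np.array([0.3, -1.2])
coefficient_energy = np.sum((example_frame @ x) ** 2)
print(A * (x @ x), coefficient_energy, B * (x @ x))
```

Here both bounds equal 3/2, so the frame is tight but not Parseval; dividing every vector by $\sqrt{3/2}$ would make it a Parseval frame.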

FRAME AND OPERATORS
We set $\ell_2^M := \ell_2(\{1, 2, \dots, M\})$. Note that this space in fact coincides with $\mathbb{R}^M$ or $\mathbb{C}^M$, endowed with the standard inner product and the associated Euclidean norm. The analysis, synthesis, and frame operators determine the operation of a frame when analysing and reconstructing a signal.
Analysis and Synthesis Operators: Two of the main operators associated with a frame are the analysis and synthesis operators. The analysis operator, as the name suggests, analyses a signal in terms of the frame by computing its frame coefficients. We start by formalizing this notion.
Definition [16]: Let $(\varphi_i)_{i=1}^M$ be a family of vectors in $\mathcal{H}^N$. Then the associated analysis operator $T : \mathcal{H}^N \to \ell_2^M$ is defined by
$$Tx := \left( \langle x, \varphi_i \rangle \right)_{i=1}^M, \quad x \in \mathcal{H}^N.$$
In the following lemma we derive two basic properties of the analysis operator.
Lemma: Let $(\varphi_i)_{i=1}^M$ be a sequence of vectors in $\mathcal{H}^N$ with associated analysis operator $T$.
(i) We have
$$\|Tx\|^2 = \sum_{i=1}^M |\langle x, \varphi_i \rangle|^2 \quad \text{for all } x \in \mathcal{H}^N.$$
Hence, $(\varphi_i)_{i=1}^M$ is a frame for $\mathcal{H}^N$ if and only if $T$ is injective.
(ii) The adjoint operator $T^* : \ell_2^M \to \mathcal{H}^N$ of $T$ is given by
$$T^* (a_i)_{i=1}^M = \sum_{i=1}^M a_i \varphi_i.$$
Definition: Let $(\varphi_i)_{i=1}^M$ be a sequence of vectors in $\mathcal{H}^N$ with associated analysis operator $T$. Then the associated synthesis operator is defined to be the adjoint operator $T^*$.
A basic, yet useful, property of the synthesis operator is that $(\varphi_i)_{i=1}^M$ spans $\mathcal{H}^N$ if and only if $T^*$ is surjective. We observe that the analysis operator maps from a space of dimension N to a space of dimension M, where necessarily $M \ge N$ for a frame, and the synthesis operator maps from a space of dimension M back to a space of dimension N. Hence, whenever $M > N$, neither operator is invertible. To get an invertible operator, which is our final goal, we compose the two: $T^*T$ maps from a space of dimension N to a space of dimension N and, for a frame, inherits the injectivity of $T$ and the surjectivity of $T^*$, so it is bijective and hence invertible. Using this operator we can decompose a vector into its frame coefficients and recover the entire information from these coefficients using the inverse operator. This operator is called the frame operator.
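In finite dimensions the three operators are simply matrices. The sketch below assumes NumPy and a real frame stored as the rows of an M x N matrix `Phi`, so that the analysis operator is multiplication by `Phi`, the synthesis operator is multiplication by its transpose, and their composition is the N x N frame operator. The example frame and the function names are illustrative assumptions.

```python
import numpy as np

# Rows are the frame vectors: M = 3 vectors in a space of dimension N = 2.
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])

def analysis(x):
    """T x: the M frame coefficients <x, phi_i>."""
    return Phi @ x

def synthesis(a):
    """T* a: the linear combination sum_i a_i phi_i."""
    return Phi.T @ a

def frame_operator(x):
    """S x = T* T x."""
    return synthesis(analysis(x))

x = np.array([2.0, -1.0])
a = np.array([0.5, 1.0, -2.0])

# T* is the adjoint of T: <T x, a> equals <x, T* a>.
lhs = np.dot(analysis(x), a)
rhs = np.dot(x, synthesis(a))

S = Phi.T @ Phi                      # 2 x 2 matrix of the frame operator
print(lhs, rhs, np.linalg.det(S))    # equal inner products, nonzero determinant
```

`Phi` itself is rectangular and has no inverse, whereas `S` is square with nonzero determinant, which is the point of composing the two operators.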

FRAME THEORY
The frame operator might be considered the most important operator associated with a frame. Although it is "merely" the concatenation of the analysis and synthesis operators, it encodes crucial properties of the frame, as we will see in the sequel. Moreover, it is also fundamental for the reconstruction of signals from frame coefficients.

Fundamental properties:
The precise definition of the frame operator associated with a frame is as follows.

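Definition: Let $(\varphi_i)_{i=1}^M$ be a sequence of vectors in $\mathcal{H}^N$ with associated analysis operator $T$. Then the associated frame operator $S : \mathcal{H}^N \to \mathcal{H}^N$ is defined as
$$Sx := T^*Tx = \sum_{i=1}^M \langle x, \varphi_i \rangle \varphi_i, \quad x \in \mathcal{H}^N.$$
A few very important properties of the frame operator are:
(i) Let $(\varphi_i)_{i=1}^M$ be a sequence of vectors in $\mathcal{H}^N$ with associated frame operator $S$. Then, for all $x \in \mathcal{H}^N$,
$$\langle Sx, x \rangle = \sum_{i=1}^M |\langle x, \varphi_i \rangle|^2.$$
Clearly, the frame operator $S = T^*T$ is self-adjoint and positive. The most fundamental property of the frame operator, if the underlying sequence of vectors forms a frame, is its invertibility, which is crucial for the reconstruction formula.
(ii) The frame operator $S$ of a frame $(\varphi_i)_{i=1}^M$ with frame bounds $A$ and $B$ is a positive, self-adjoint, invertible operator satisfying $A \cdot \mathrm{Id} \le S \le B \cdot \mathrm{Id}$.
(iii) If $(\varphi_i)_{i=1}^M$ is a Parseval frame, then $S = T^*T = \mathrm{Id}$ and the analysis operator $T$ is an isometry.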
The analysis of a signal is typically performed by merely considering its frame coefficients. However, if the task is transmission of a signal, the ability to reconstruct the signal from its frame coefficients, and to do so efficiently, becomes crucial. An exact reconstruction strategy utilises the fact that the frame operator is invertible. Due to this invertibility, the inverse of the frame operator is itself the frame operator of another frame, called the dual frame. The definitions are given below.
Theorem: Let $(\varphi_i)_{i=1}^M$ be a frame for $\mathcal{H}^N$ with associated frame operator $S$. Then for every $x \in \mathcal{H}^N$ we have
$$x = \sum_{i=1}^M \langle x, \varphi_i \rangle S^{-1}\varphi_i = \sum_{i=1}^M \langle x, S^{-1}\varphi_i \rangle \varphi_i.$$
This theorem leads directly to the following result, which can be written formally as a proposition:
Let $(\varphi_i)_{i=1}^M$ be a frame for $\mathcal{H}^N$ with frame bounds $A$ and $B$ and associated frame operator $S$. Then the sequence $(S^{-1}\varphi_i)_{i=1}^M$ is a frame for $\mathcal{H}^N$ with frame bounds $B^{-1}$ and $A^{-1}$ and with frame operator $S^{-1}$.
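This new frame is called the canonical dual frame [16].
Definition: Let $(\varphi_i)_{i=1}^M$ and $(\psi_i)_{i=1}^M$ be frames for $\mathcal{H}^N$, and let $T$ and $\tilde{T}$ be the analysis operators of the two frames, respectively. Then the following conditions are equivalent:
(i) $x = \sum_{i=1}^M \langle x, \psi_i \rangle \varphi_i$ for all $x \in \mathcal{H}^N$.
(ii) $x = \sum_{i=1}^M \langle x, \varphi_i \rangle \psi_i$ for all $x \in \mathcal{H}^N$.
(iii) $\langle x, y \rangle = \sum_{i=1}^M \langle x, \varphi_i \rangle \langle \psi_i, y \rangle$ for all $x, y \in \mathcal{H}^N$.
(iv) $T^*\tilde{T} = \mathrm{Id}$ and $\tilde{T}^*T = \mathrm{Id}$.
With respect to the last property, in the case of Parseval frames the isometric property of the analysis operator gives $T^*T = \mathrm{Id}$, so the condition is satisfied with $\tilde{T} = T$. Thus, for Parseval frames, the canonical dual frame and the original frame are the same.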
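The reconstruction formula $x = \sum_i \langle x, \varphi_i \rangle S^{-1}\varphi_i$ and the statement about the frame operator and bounds of the canonical dual frame can be checked on a small example. This is a minimal sketch assuming NumPy and a real frame stored as matrix rows; the example frame is an illustrative choice with frame bounds 1 and 3.

```python
import numpy as np

# Rows are the frame vectors; the frame operator S has eigenvalues 1 and 3.
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
S = Phi.T @ Phi
S_inv = np.linalg.inv(S)

# Canonical dual frame: row i is S^{-1} phi_i (S_inv is symmetric).
dual = Phi @ S_inv

x = np.array([2.0, -1.0])
x_rec_1 = dual.T @ (Phi @ x)     # sum_i <x, phi_i> S^{-1} phi_i
x_rec_2 = Phi.T @ (dual @ x)     # sum_i <x, S^{-1} phi_i> phi_i

dual_frame_operator = dual.T @ dual
dual_bounds = np.linalg.eigvalsh(dual_frame_operator)
print(x_rec_1, x_rec_2)          # both reproduce x
print(dual_bounds)               # reciprocals of the original bounds: 1/3 and 1
```

Both orderings of the reconstruction formula recover the original vector, and the dual frame operator coincides with `S_inv`.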
Ingrid Daubechies proved that the orthonormal wavelet systems considered here (explained shortly) are Parseval frames [2]. It is on this property that we base the entirety of our treatment of the DWT. The treatment of frame theory given above deals entirely with finite-dimensional Hilbert spaces. For application to wavelets, however, we need to work in subspaces of the general Hilbert space which are infinite-dimensional by nature. The results regarding finite frames used in this paper can be extended to infinite-dimensional frames using transfinite theorems such as Zorn's lemma and other more involved mathematical reasoning. The proof of this extension is beyond the scope of this paper. It suffices to say that the results regarding finite frames explored above are applied to infinite frames in what follows.
An intuitive explanation of this can be given as follows. A Riesz basis $(\varphi_i)_i$, be it finite or infinite, satisfies $A \|x\|^2 \le \sum_i |\langle x, \varphi_i \rangle|^2 \le B \|x\|^2$ for all $x \in \mathcal{H}$. Notice that here we have used $\mathcal{H}$ and not $\mathcal{H}^N$. This inequality is the same as the one defining finite frames, and the properties of finite frames used here carry over to such a basis.
That wavelet bases are indeed Riesz bases was also proved by Ingrid Daubechies in her paper [2].

MULTIRESOLUTION ANALYSIS
In our previous treatment of the CWT we mentioned that, intuitively, the entire plane can be covered using Heisenberg boxes of varying localisation in time and frequency. This is essentially the same as saying that we want to construct orthonormal wavelet bases that are complete in $\mathcal{H}$. One way to do this is to sample the scaling parameter $s$ and the shifting parameter $u$ in the CWT as $s \to a_0^m$ and $u \to n b_0 a_0^m$, where $m$ and $n$ are integers. Thus the continuous wavelet transform becomes
$$Wf(m, n) = \langle f, \psi_{m,n} \rangle = a_0^{-m/2} \int_{-\infty}^{+\infty} f(t) \, \psi^*\!\left(a_0^{-m} t - n b_0\right) dt.$$
For all $m, n \in \mathbb{Z}$ we can cover the entire plane. It is convenient to characterise $a_0^m$ as $2^j$, where $j$ is an integer [6][7]. Each $2^{-j}$ is called a resolution. This approach leads us to the idea of multiresolutions, described as follows. The term multiresolution was first used by Mallat, and a much more comprehensive analysis can be found in his book [6] and the paper by Meyer [7].
Our search for orthogonal wavelets begins with multiresolution approximations. For $f \in L^2(\mathbb{R})$, the underlying theorem provides us with the following result: there exist $A > 0$ and $B$ such that
$$A \|f\|^2 \le \sum_{j,n} |\langle f, \psi_{j,n} \rangle|^2 \le B \|f\|^2.$$
This existence theorem provides the mathematical basis for the existence of wavelet frames. It proves that wavelet bases provide a stable signal representation. It can be proved that the wavelet bases as we have constructed them are Parseval frames, as they are orthonormal. For computer analysis we use finite frames along a single resolution and characterise the signal with the addition of an error function. The detail of $f$ at any resolution $2^j$ is simply a projection of $f$ onto a subspace $W_j$ of $L^2(\mathbb{R})$. This projection may be formally represented by a projection operator $Q_j : L^2(\mathbb{R}) \to W_j$. Furthermore, there exists another projection operator $P_j : L^2(\mathbb{R}) \to V_j$ such that $P_j f$ is the smoothed version of $f$.
As $j$ increases, the resolution of the smoothed version of $f$ becomes coarser, and consequently the finer detail information is contained in the scales corresponding to low values of $j$. For any multiresolution analysis the $W_j$ are orthogonal to each other, and each $W_j$ is orthogonal to $V_j$. In addition, assuming that $V_0 \subset L^2(\mathbb{R})$, we have
$$L^2(\mathbb{R}) = V_0 \oplus \bigoplus_{j \le 0} W_j,$$
and hence we may write
$$f = P_0 f + \sum_{j \le 0} Q_j f$$
as our final expression.
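The splitting of a signal into a smoothed version plus details at successive resolutions can be illustrated with the Haar wavelet, the simplest orthonormal example. The sketch below assumes NumPy and a discrete signal whose length is a power of two; the Haar choice, the sample values, and the function names are illustrative assumptions rather than the construction used in the text. Because the Haar basis is orthonormal, the coefficient energy equals the signal energy, as expected for a Parseval frame.

```python
import numpy as np

def haar_step(approx):
    """Split one approximation into a coarser approximation and the detail lost."""
    even, odd = approx[0::2], approx[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_inverse_step(coarse, detail):
    """Undo haar_step: rebuild the finer approximation."""
    out = np.empty(2 * coarse.size)
    out[0::2] = (coarse + detail) / np.sqrt(2.0)
    out[1::2] = (coarse - detail) / np.sqrt(2.0)
    return out

def haar_decompose(f, levels):
    """Return the coarsest approximation and the details, finest level first."""
    approx, details = np.asarray(f, dtype=float), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def haar_reconstruct(approx, details):
    """Add the details back, coarsest level first."""
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx

f = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
approx, details = haar_decompose(f, levels=3)
energy = np.sum(approx ** 2) + sum(np.sum(d ** 2) for d in details)
print(haar_reconstruct(approx, details))   # reproduces f
print(energy, np.sum(f ** 2))              # equal energies
```

Each call to `haar_step` plays the role of moving to the next coarser resolution: the first output is the smoothed signal and the second is the detail that was removed.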

COMPUTATION OF CWT
Let us compute the CWT of a signal similar to the one for which we computed the STFT, comprising four frequencies: 30, 20, 10 and 5 Hz. The signal is shown below.
Figure 8: A non-stationary signal.
Once the mother wavelet is chosen, the computation starts with s = 1, and the continuous wavelet transform is computed for all values of s, smaller and larger than 1. In this study, some finite intervals of values for s were used. For convenience, the procedure is started from scale s = 1 and continues for increasing values of s. This first value of s corresponds to the most compressed wavelet. As the value of s is increased, the wavelet dilates.
The wavelet is placed at the beginning of the signal, at the point which corresponds to time = 0. The wavelet function at scale 1 is multiplied by the signal and then integrated over all times. The final result is the value of the continuous wavelet transform at time zero and scale s = 1. In other words, it is the value that corresponds to the point tau = 0, s = 1 in the time-scale plane. The wavelet at scale s = 1 is then shifted towards the right by an amount tau to the location t = tau, and the CWT integral is computed again to get the transform value at t = tau, s = 1 in the time-scale plane.
This procedure is repeated until the wavelet reaches the end of the signal. One row of points on the time-scale plane for the scale s = 1 is now completed. Then, s is increased by a small value. Note that this is a continuous transform, and therefore both tau and s must, in principle, be incremented continuously; in a numerical computation both are increased in sufficiently small steps. In the DWT this corresponds to sampling the time-scale plane. The above procedure is repeated for every value of s. Every computation for a given value of s fills the corresponding single row of the time-scale plane. When the process is completed for all desired values of s, the CWT of the signal has been calculated.
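The procedure described above translates almost directly into two nested loops, one over scales and one over shifts. The sketch below assumes NumPy and a signal with 30, 20, 10 and 5 Hz in successive quarter-second intervals; the Mexican hat mother wavelet, the sampling rate, the scale grid and all function names are illustrative assumptions, since the text does not specify them. Scales are expressed in seconds here, so the grid does not start at s = 1.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat wavelet with unit L2 norm (assumed mother wavelet)."""
    return 2.0 / (np.sqrt(3.0) * np.pi ** 0.25) * (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt(x, t, scales, taus):
    """Direct evaluation of W(tau, s) = integral of x(t) psi((t - tau) / s) / sqrt(s) dt."""
    dt = t[1] - t[0]
    W = np.empty((len(scales), len(taus)))
    for i, s in enumerate(scales):          # one row of the time-scale plane per scale
        for k, tau in enumerate(taus):      # shift the wavelet along the signal
            wavelet = mexican_hat((t - tau) / s) / np.sqrt(s)
            W[i, k] = np.sum(x * wavelet) * dt   # multiply by the signal and integrate
    return W

fs = 1000
t = np.arange(fs) / fs
x = np.cos(2 * np.pi * np.repeat([30.0, 20.0, 10.0, 5.0], fs // 4) * t)
scales = np.geomspace(0.004, 0.1, 30)       # sampled scales, in seconds
taus = np.arange(0.0, 1.0, 0.01)            # sampled shifts, in seconds
W = cwt(x, t, scales, taus)

def dominant_scale(t_start, t_stop):
    """Scale with the largest transform energy for shifts inside [t_start, t_stop]."""
    inside = (taus >= t_start) & (taus <= t_stop)
    return scales[np.argmax(np.sum(W[:, inside] ** 2, axis=1))]

print(dominant_scale(0.08, 0.17))   # 30 Hz segment: small scale
print(dominant_scale(0.83, 0.92))   # 5 Hz segment: larger scale
```

The high-frequency segment responds most strongly at a small scale and the low-frequency segment at a larger one, which is the inverse relation between scale and frequency.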
The figure below illustrates the entire process step by step. At every location, the wavelet is multiplied by the signal. Obviously, the product is nonzero only where the signal falls in the region of support of the wavelet, and it is zero elsewhere. By shifting the wavelet in time, the signal is localized in time, and by changing the scale s of the wavelet, the signal is localized in frequency. Unlike the STFT, which has a constant resolution at all times and frequencies, the WT has good time and poor frequency resolution at high frequencies, and good frequency and poor time resolution at low frequencies. In Figure 11, lower scales (higher frequencies) have better scale resolution (narrower in scale, which means that it is less ambiguous what the exact value of the scale is), which corresponds to poorer frequency resolution. Similarly, higher scales have poorer scale resolution (wider support in scale, which means it is more ambiguous what the exact value of the scale is), which corresponds to better frequency resolution of lower frequencies.
Biorthogonal wavelets: First we need to construct an orthogonal family of wavelets that is applicable to images. In the analysis so far we have not dealt with any particular family of wavelets, but we clearly had signals in the time domain in mind. The main difference between signals in the time domain and images is that signals in the time domain have only one degree of freedom, in the sense that their amplitude is a function of only one variable, namely time.
For images, the amplitude of the signal is a function of two independent coordinates x and y. Thus the analysis of images requires two independent wavelets, one along the x direction and the other along the y direction, the concatenation of which forms a family of orthogonal wavelets to which the multiresolution analysis can be extended exactly as was done previously.
The biorthogonal family of wavelets offers just such a family of wavelets.
The construction is as follows: Let $(\varphi_i)_{i=1}^M$ be a frame in $\mathcal{H}$, not necessarily a Parseval frame, and let $(\tilde{\varphi}_i)_{i=1}^M$ be its canonical dual. If we create a new family of wavelets using the pairs $(\varphi_i, \tilde{\varphi}_i)$ for all $i \le M$, then this family of wavelets creates a Parseval frame of orthogonal wavelets.
It can also be proved that the new family of wavelets forms a Riesz basis. The proof of this statement can be found in a paper by Ingrid Daubechies. Thus we may use this new family of wavelets, which has two degrees of freedom in independent orthogonal directions and which forms a Riesz basis, to apply the DWT to an image.
In applying biorthonormal wavelets to an image we state some formal properties. The last property, $\langle \psi_{j,n}, \tilde{\psi}_{j',n'} \rangle = \delta[j - j'] \, \delta[n - n']$, is taken as the defining equation for biorthonormality, and the two wavelet families $(\psi_{j,n})$ and $(\tilde{\psi}_{j',n'})$ individually form Riesz bases.

CONCLUSION
In this paper we have used the methods and theorems of frame theory to describe the construction of orthogonal wavelet frames. We have put the theory of wavelets, specifically the discrete wavelet transform, in the context of vector algebra using frame theory. This analysis makes it easier to understand the DWT in a mathematical context. Algorithms for the efficient implementation of wavelet analysis in signal processing and image processing can be derived directly from this analysis. The only requirement is to change the kind of wavelet according to the specific application, for example using biorthogonal wavelet bases for edge detection in images or for texture mapping of images.