Learning from Tensors: Tensor Learning for Tensorial Data Analysis
Access status:
USyd Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Bai, Mingyuan
Abstract
Data with rich spatial information are commonly acquired in the real world. These data are often represented by multi-way arrays, i.e., tensors. Those that also carry temporal information can be described as tensorial time series. Tensorial data, including tensorial time series, are closely related to “big data” because they often exhibit the “4V” features: they come in large volume and variety, require high velocity to process, and in practice can suffer from veracity issues caused by outliers, noise, missing values, etc. Existing methods either flatten tensors into vectors or impose strong assumptions. The former can produce an extremely large number of parameters and fail to process large-volume data at high velocity, whereas the latter cannot effectively deal with the challenges of veracity and variety. This thesis consists of three topics that together form a pipeline for analysing tensorial data, including tensorial time series, with efficacy and other desired characteristics, in order to address the “4V” features. Firstly, for data preprocessing, we propose a dimensionality reduction model, Tensor-Train Parameterisation for Ultra Dimensionality Reduction (TTPUDR), designed specifically for ultra-dimensional data converted from tensors. Such data have more dimensions than samples, which violates the assumption of many earlier methods. TTPUDR efficiently and effectively captures complicated spatial information in these data, avoids the curse of dimensionality and copes with extreme outliers. In the second and third topics, we propose a series of tensor neural differential equations that exploit complicated nonlinear spatial and temporal information for tensorial time series prediction, including irregular series with unequally spaced time steps, which violate the equidistance assumption of many existing methods. For the models proposed in all three topics, efficacy is established with theoretical guarantees.
In numerical experiments, all proposed models outperform the existing models and demonstrate their efficiency and effectiveness in analysing complicated spatial and/or temporal information in tensorial data, including tensorial time series.
Date
2022
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
The University of Sydney Business School
Department, Discipline or Centre
Discipline of Business Analytics
Awarding institution
The University of Sydney