Training Theory of Variational Quantum Machine Learning
Access status: Open Access
Type: Thesis
Thesis type: Doctor of Philosophy
Author/s: Zhang, Kaining
Abstract:
Recent advancements in machine learning have revolutionised research across various fields. Despite their success, conventional learning techniques are hindered by their substantial computational and energy requirements. Prompted by recent experimental breakthroughs in quantum computing, variational quantum machine learning (QML) – machine learning integrated with variational quantum circuits (VQCs) – has emerged as a promising alternative. Nonetheless, the theoretical framework underpinning the advantages of variational QML is still rudimentary. Specifically, the training of VQCs faces several challenges, such as the barren plateau problem, where the gradient diminishes exponentially as the number of qubits increases. A related issue arises in variational QML training, where the convergence rate is exponentially small. In this thesis, we present theoretically guaranteed solutions to these challenges. First, we construct innovative circuit architectures to address the vanishing gradient problem in deep VQCs. We propose quantum controlled-layer and quantum ResNet structures, demonstrating that the lower bound on the expected gradient norm is unaffected by increases in qubit number and circuit depth. Next, we introduce an initialisation strategy to mitigate the vanishing gradient issue in general deep quantum circuits. We prove that Gaussian-initialised parameters ensure that the gradient norm decays at most inversely polynomially as the number of qubits and the circuit depth increase. Finally, we propose a novel and effective theory for analysing the training of quantum neural networks of moderate depth. We prove that, under certain randomness conditions on the circuits and datasets, training converges linearly with a rate inversely proportional to the dataset size. Our approach surpasses previous results, achieving exponentially larger convergence rates at modest depth or, conversely, requiring exponentially less depth for equivalent rates.
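
To make the barren-plateau discussion and the Gaussian-initialisation remedy concrete, the following is a minimal NumPy sketch, not code from the thesis: it estimates the variance of a single gradient component of a hardware-efficient-style circuit under a uniform parameter initialisation and under a narrow Gaussian one. The RY/CZ ansatz, the Pauli-Z observable on qubit 0, and the Gaussian standard deviation 1/sqrt(n_layers) are illustrative assumptions rather than the specific constructions analysed in the thesis.

# Illustrative sketch only: a hardware-efficient-style variational circuit
# simulated with dense matrices, used to estimate how the variance of one
# gradient component behaves under different parameter initialisations.
import numpy as np

rng = np.random.default_rng(0)

CZ = np.diag([1.0, 1.0, 1.0, -1.0])


def ry(theta):
    """Single-qubit RY rotation matrix."""
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])


def cz_chain(n_qubits):
    """Entangling layer: CZ gates on neighbouring qubit pairs (0,1), (1,2), ..."""
    U = np.eye(2 ** n_qubits)
    for q in range(n_qubits - 1):
        gate = np.kron(np.eye(2 ** q),
                       np.kron(CZ, np.eye(2 ** (n_qubits - q - 2))))
        U = gate @ U
    return U


def expectation_z0(params, n_qubits):
    """<psi|Z_0|psi> after alternating RY and CZ layers applied to |0...0>."""
    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0
    entangler = cz_chain(n_qubits)
    for layer in params:                      # params has shape (n_layers, n_qubits)
        rotations = np.array([[1.0]])
        for theta in layer:
            rotations = np.kron(rotations, ry(theta))
        state = entangler @ (rotations @ state)
    z0 = np.kron(np.diag([1.0, -1.0]), np.eye(2 ** (n_qubits - 1)))
    return float(state @ (z0 @ state))


def grad_first_angle(params, n_qubits):
    """Parameter-shift gradient with respect to the first rotation angle."""
    shift = np.zeros_like(params)
    shift[0, 0] = np.pi / 2
    return 0.5 * (expectation_z0(params + shift, n_qubits)
                  - expectation_z0(params - shift, n_qubits))


def gradient_variance(n_qubits, n_layers, sampler, n_samples=200):
    """Empirical variance of that gradient component over random initialisations."""
    grads = [grad_first_angle(sampler((n_layers, n_qubits)), n_qubits)
             for _ in range(n_samples)]
    return float(np.var(grads))


if __name__ == "__main__":
    depth = 8
    uniform = lambda shape: rng.uniform(0.0, 2 * np.pi, size=shape)
    gaussian = lambda shape: rng.normal(0.0, 1.0 / np.sqrt(shape[0]), size=shape)
    for n in (2, 4, 6):
        print(f"qubits={n}  "
              f"var(uniform init)={gradient_variance(n, depth, uniform):.3e}  "
              f"var(Gaussian init)={gradient_variance(n, depth, gaussian):.3e}")

Run as a script, this prints the estimated variance of the chosen gradient component for 2, 4 and 6 qubits under each initialisation, illustrating how the spread of gradients depends on both the qubit count and the initialisation scheme; the thesis proves the corresponding guarantees rigorously for its specific architectures and its Gaussian initialisation.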
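The final claim, linear convergence at a rate inversely proportional to the dataset size, can be read schematically as a geometric contraction of the training error. The display below is a generic linear-convergence form written only for illustration; the constant c, the exact dependence on the dataset size N, and the randomness conditions under which it holds are those stated in the thesis, not reproduced here.

% Schematic linear-convergence bound (illustrative form; c > 0 and the exact
% dependence on the dataset size N are assumptions, not the thesis's theorem).
\[
  L(\theta_{t+1}) - L^{\star} \;\le\; \Bigl(1 - \tfrac{c}{N}\Bigr)\bigl(L(\theta_{t}) - L^{\star}\bigr)
  \quad\Longrightarrow\quad
  L(\theta_{t}) - L^{\star} \;\le\; \Bigl(1 - \tfrac{c}{N}\Bigr)^{t}\bigl(L(\theta_{0}) - L^{\star}\bigr).
\]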
Date: 2023
Rights statement: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School: Faculty of Engineering, School of Computer Science
Awarding institution: The University of Sydney