Paper • Open access

Polynomial T-depth quantum solvability of noisy binary linear problem: from quantum-sample preparation to main computation


Published 13 October 2022 © 2022 The Author(s). Published by IOP Publishing Ltd on behalf of the Institute of Physics and Deutsche Physikalische Gesellschaft
Citation: Wooyeong Song et al 2022 New J. Phys. 24 103014. DOI: 10.1088/1367-2630/ac94ef


Abstract

The noisy binary linear problem (NBLP) is known to be computationally hard and therefore offers primitives for post-quantum cryptography. An efficient quantum NBLP algorithm that exhibits polynomial quantum-sample and time complexities has recently been proposed. However, the algorithm requires a large number of samples to be loaded in a highly entangled state, and it is unclear whether such a precondition for the quantum speedup can be met efficiently. Here, we present a complete analysis of the quantum solvability of the NBLP by considering the entire algorithm process, namely from the preparation of the quantum sample to the main computation. By assuming that the algorithm runs on 'fault-tolerant' quantum circuitry, we introduce a reasonable measure of the computational time cost. The measure is defined in terms of the overall number of T-gate layers, referred to as the T-depth complexity. We show that the cost of solving the NBLP can be polynomial in the problem size, at the expense of an exponentially increasing number of logical qubits.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Owing to their simplicity, linear problems have been studied in various applications in science and engineering [1, 2]. However, once noise is added, solving them becomes exponentially difficult. One such challenging problem, called the noisy binary linear problem (NBLP), is defined as follows: given a set $\mathfrak{S}=\left\{\left(\mathbf{a},{b}_{\mathbf{a}}\right)\right\}$ with sampled inputs a = a0 a1...an−1 ∈ {0, 1}^n and outputs ${b}_{\mathbf{a}}=\mathbf{a}\cdot \mathbf{s}+{e}_{\mathbf{a}}\ (\mathrm{mod}\ 2)\in \{0,1\}$, the problem is to determine the 'secret' string s = s0 s1...sn−1 ∈ {0, 1}^n from the samples in the presence of noise ${e}_{\mathbf{a}}\sim B(\eta )$, where B(η) is a Bernoulli distribution (specifically, ea = 0 with probability $\frac{1}{2}+\eta $ and ea = 1 with probability $\frac{1}{2}-\eta $) and $\eta \in (0,\frac{1}{2}]$. This problem is hard to solve: no known classical algorithm achieves better than sub-exponential sample/time complexities [3]. The problem has thus served as a useful primitive in modern post-quantum cryptography [4].
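For concreteness, the following is a minimal classical sketch (our illustration, not part of the paper) of how NBLP samples are generated; the secret, the seed, and the function name are hypothetical.

```python
# Minimal sketch (ours): draw NBLP samples (a, b_a) with b_a = a.s + e_a (mod 2),
# where the noise e_a is 0 with probability 1/2 + eta, as defined in the text.
import numpy as np

rng = np.random.default_rng(seed=7)

def nblp_samples(s, m, eta):
    """Return m noisy samples (rows of A, entries of b) for secret s."""
    n = len(s)
    A = rng.integers(0, 2, size=(m, n))           # random inputs a
    e = (rng.random(m) >= 0.5 + eta).astype(int)  # e_a = 1 w.p. 1/2 - eta
    b = (A @ s + e) % 2                           # noisy outputs b_a
    return A, b

s = np.array([1, 0, 1, 1])                        # a hypothetical 4-bit secret
A, b = nblp_samples(s, m=20, eta=0.25)
```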

Recently, Cross et al [5] and Grilo et al [6] have opened the possibility that quantum computation (QC) could solve a class of NBLPs by exponentially reducing the sample/time complexities. The key feature of the proposed algorithms is the use of a quantum-superposed sample, which is defined as

$\left\vert \psi \right\rangle =\frac{1}{\sqrt{\left\vert \mathfrak{R}\right\vert }}\sum _{(\mathbf{a},{b}_{\mathbf{a}})\in \mathfrak{R}}\left\vert (\mathbf{a},{b}_{\mathbf{a}})\right\rangle , \qquad (1)$

where $\left\vert (\mathbf{a},{b}_{\mathbf{a}})\right\rangle =\left\vert \mathbf{a}\right\rangle \left\vert {b}_{\mathbf{a}}\right\rangle $, $\mathfrak{R}\subseteq \mathfrak{S}$ is a set of arbitrarily chosen samples, and $\left\vert \mathfrak{R}\right\vert $ is the cardinality of $\mathfrak{R}$. The algorithm repeatedly loads, processes, and tests the quantum sample $\left\vert \psi \right\rangle $ until the solution s is confirmed. A crucial condition for achieving a quantum speedup is that the number of samples (a, ba ) in $\left\vert \psi \right\rangle $ should scale exponentially with n; in other words, $\left\vert \mathfrak{R}\right\vert $ should be O(2^n ). A conventional approach has hence been to employ a black-box operation (as in equation (1)), often called an oracle, for accessing the quantum sample. However, when $\left\vert \mathfrak{R}\right\vert $ is large, such an approach is not feasible because it would be costly and difficult to prepare and use a (largely-)superposed quantum sample [7]. In the worst case, such an approach could offset the quantum speedup achieved [8]. Therefore, although the fullest use of the quantum sample to efficiently solve the NBLP is possible in QC, it is not clear whether the hardness of the NBLP can be completely overcome. Accordingly, the security level of post-quantum cryptography has yet to be determined.
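To make the object in equation (1) concrete, the quantum sample can be emulated as a state vector; below is a NumPy sketch (ours; the qubit-ordering convention, with a as the high bits, is our assumption).

```python
# Sketch (ours): amplitude vector of |psi> = (1/sqrt(|R|)) sum |a>|b_a> over a
# chosen sample set R, represented on n + 1 qubits.
import numpy as np

def quantum_sample(A, b):
    """Return the 2**(n+1)-dimensional amplitude vector of |psi>."""
    m, n = A.shape
    psi = np.zeros(2 ** (n + 1))
    for a_row, b_a in zip(A, b):
        a_int = int("".join(map(str, a_row)), 2)  # bit string a -> integer
        psi[(a_int << 1) | int(b_a)] = 1.0        # basis state |a>|b_a>
    return psi / np.linalg.norm(psi)
```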

In this paper, we present a complete analysis of the quantum solvability of the NBLP. In the analysis, we consider two essential and independent processes of the algorithm. One is loading the samples $(\in \mathfrak{S})$ into a highly entangled state $\left\vert \psi \right\rangle $, which we denote by ${\mathcal{P}}_{\left\vert \psi \right\rangle }$. We design an optimised circuit for ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ by parallelising the layers of the expensive (i.e., T) quantum gates at the fault-tolerant level. The other process is the application of the main algorithm kernel ${\mathcal{P}}_{A}$, which is an optimised set of elementary gate operations. We analyse an extendable form of ${\mathcal{P}}_{A}$, which can cover multiple problems, and apply the result to the binary setting. Thus far, ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ and ${\mathcal{P}}_{A}$ have been studied independently, in separate contexts. For example, a recipe for optimising a process similar to ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ has been studied for a fixed architecture [9], designed to localise the error propagation [10]. Likewise, the algorithms (i.e., ${\mathcal{P}}_{A}$ in our case) have been analysed under a prior assumption of quantum-sample accessibility, hence separately, without any consideration of ${\mathcal{P}}_{\left\vert \psi \right\rangle }$. However, ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ and ${\mathcal{P}}_{A}$ are systematically combined to form the quantum NBLP algorithm, and they should be studied together in a single framework (footnote 8). Thus, we analyse the number of repetitions of ${\mathcal{P}}_{\left\vert \psi \right\rangle }+{\mathcal{P}}_{A}$ required to determine the solution s, taking into account the interconnection between ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ and ${\mathcal{P}}_{A}$. In the analysis, the exponential reduction in the quantum-sample complexity is derived from a crucial condition of the solution test that has been overlooked in previous works. This analysis allows us to account for the overall resource consumption, thereby facilitating a more comprehensive discussion of the quantum solvability of the NBLP.

The analysis is conducted in the context of fault-tolerant QC (FTQC), under the assumption that an effective quantum error-correction code is embedded. Logical operations in FTQC consist of Clifford and non-Clifford gates. The Clifford group contains the Hadamard gate, the π/4-rotation gate, the controlled-NOT gate, etc. Logical Clifford gates usually exhibit low cost because they can be implemented transversally [12] (in the case of topological codes, the code-deformation scheme can be used [13, 14]). On the other hand, logical non-Clifford gates, typically the π/8 rotation, i.e., the T gate, lack a transversal implementation and are much more costly than any logical Clifford gate [12, 15]. This is mainly because logical non-Clifford gates are executed via so-called magic-state distillation, whose cost is large [16, 17]. Therefore, the runtime of a quantum algorithm is well approximated by the depth of the logical non-Clifford gates [18–20]. In this context, we consider the overall number of non-Clifford gate layers, particularly those of T or T† gates, which is called the T-depth complexity [21]. Specifically, we analyse a reasonable runtime cost of the quantum NBLP algorithm, denoted by C, as follows:

$C=S\times \left({T}_{D,{\mathcal{P}}_{\left\vert \psi \right\rangle }}+{T}_{D,{\mathcal{P}}_{A}}\right), \qquad (2)$

where S is the number of repetitions of ${\mathcal{P}}_{\left\vert \psi \right\rangle }+{\mathcal{P}}_{A}$ required to complete the algorithm, and ${T}_{D,{\mathcal{P}}_{\left\vert \psi \right\rangle }}$ and ${T}_{D,{\mathcal{P}}_{A}}$ denote the T-depths of the two processes.

We analyse (I) the T-depth of ${\mathcal{P}}_{\left\vert \psi \right\rangle }$, (II) the T-depth of ${\mathcal{P}}_{A}$, and (III) the number of repetitions S, and finally evaluate C. We note (again) that the analyses (I), (II), and (III) are interrelated, and the quantum solvability of the NBLP cannot be established through any of them individually. By managing the issues that arise in such a comprehensive analysis, we show that the runtime cost C of the quantum NBLP algorithm can be polynomial in n. Further analysis reveals that a tradeoff between the depth and width of the entire algorithm circuit exists in solving the quantum NBLP. A back-of-envelope preview of this cost model is sketched below.
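As a preview, the cost model of equation (2) can be evaluated numerically; this sketch (ours) uses the scalings derived in REs 1–3 below as assumptions.

```python
# Preview (our sketch) of equation (2): C = S x (T-depth of P_psi + T-depth of
# P_A), assuming T-depth of P_psi bounded by 4n (RE 1), T-depth of the Hadamard
# kernel P_A equal to zero (RE 2), and S repetitions from RE 3.
def runtime_cost(n, repetitions, t_depth_kernel=0):
    t_depth_sample = 4 * n            # RE 1 bound on the sample preparation
    return repetitions * (t_depth_sample + t_depth_kernel)

# e.g. n = 32 with a constant number of repetitions (the q = n case of RE 3)
print(runtime_cost(n=32, repetitions=100))   # grows only linearly in n
```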

2. Algorithm overview

We briefly outline the entire procedure of the quantum NBLP algorithm.

  • (A.1)  
A state $\left\vert \psi \right\rangle $ of a quantum sample is prepared in the form $\left\vert \psi \right\rangle =\frac{1}{\sqrt{{2}^{q}}}{{\sum }_{\mathbf{a}}}^{\prime }\left\vert \mathbf{a}\right\rangle \left\vert {b}_{\mathbf{a}}\right\rangle $, where the summation ${{\sum }_{\mathbf{a}}}^{\prime }$ runs only over the inputs in $\mathfrak{S}$, and $q\leqslant n=\lceil {\mathrm{log}}_{2}\left\vert \mathfrak{S}\right\vert \rceil $; q can be regarded as the factor that determines the size of the quantum sample $\left\vert \psi \right\rangle $. Here, by the 'size of a quantum sample', we mean the number of (classical) pairs (a, ba ) that are quantum-superposed to constitute $\left\vert \psi \right\rangle $. ⌈x⌉ is the ceiling of x, i.e., the smallest integer greater than or equal to x.
  • (A.2)  
Given a quantum sample $\left\vert \psi \right\rangle $, we run ${\mathcal{P}}_{A}$. Formally, ${\mathcal{P}}_{A}$ is the Bernstein–Vazirani (BV) kernel, given by
${\mathcal{P}}_{A}={\text{QFT}}_{d}^{\otimes n+1}, \qquad (3)$
where ${\text{QFT}}_{d}$ is the d-dimensional quantum Fourier transform (QFT): ${\text{QFT}}_{d}\left\vert j\right\rangle =\frac{1}{\sqrt{d}}{\sum }_{k=0}^{d-1}{\omega }^{jk}\left\vert k\right\rangle $ with $\omega ={\text{e}}^{\text{i}\frac{2\pi }{d}}$. In NBLPs, ${\mathcal{P}}_{A}$ becomes ${\text{QFT}}_{d=2}^{\otimes n+1}={\hat{H}}^{\otimes n+1}$, where $\hat{H}$ is the Hadamard transform: $\left\vert j\right\rangle \to \frac{1}{\sqrt{2}}{\sum }_{k}{(-1)}^{jk}\left\vert k\right\rangle $ (j, k = 0, 1). The output state ${\hat{H}}^{\otimes n+1}\left\vert \psi \right\rangle $ is expressed as
${\hat{H}}^{\otimes n+1}\left\vert \psi \right\rangle =\frac{1}{\sqrt{{2}^{q+n+1}}}{{\sum }_{\mathbf{a}}}^{\prime }\sum _{\mathbf{k},{k}^{\star }}{(-1)}^{\mathbf{a}\cdot (\mathbf{k}+\mathbf{s}{k}^{\star })+{e}_{\mathbf{a}}{k}^{\star }}\left\vert \mathbf{k}\right\rangle \left\vert {k}^{\star }\right\rangle , \qquad (4)$
    where k ∈ {0, 1}n and k ∈ {0, 1}.
  • (A.3)  
We measure the qubit state $\left\vert {k}^{\star }\right\rangle $. Here, if we measure ${k}^{\star }=0$, no information on s can be retrieved from the remaining state, which is given by
$\frac{1}{\sqrt{{2}^{q+n}}}{{\sum }_{\mathbf{a}}}^{\prime }\sum _{\mathbf{k}}{(-1)}^{\mathbf{a}\cdot \mathbf{k}}\left\vert \mathbf{k}\right\rangle , \qquad (5)$
and a failure is returned. Otherwise (i.e., if ${k}^{\star }=1$), we obtain the remaining state
$\frac{1}{\sqrt{{2}^{q+n}}}{{\sum }_{\mathbf{a}}}^{\prime }\sum _{\mathbf{k}}{(-1)}^{\mathbf{a}\cdot (\mathbf{k}+\mathbf{s})+{e}_{\mathbf{a}}}\left\vert \mathbf{k}\right\rangle . \qquad (6)$
By measuring the state in equation (6), we obtain a candidate k. Here, the true solution s is obtained (i.e., k = s) with probability P(k = s|k* = 1); the most exact form of this probability is $P(\mathbf{k}=\mathbf{s}\vert {k}^{\star }=1,\{{e}_{\mathbf{a}}\})$. However, we drop the dependence on {ea } because the errors occur completely at random.
  • (A.4)  
Repeating (A.1)–(A.3), we determine the most frequently measured k as the true solution s; this is referred to as 'majority voting'. The condition for the majority voting is analysed later. Figure 1 shows a schematic of the algorithm, and a minimal numerical sketch of steps (A.1)–(A.4) follows below. Additional mathematical details are provided in appendix A.
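The following self-contained NumPy sketch (our illustration, assuming q = n, a full superposition over all inputs, and small n) emulates steps (A.1)–(A.4): it prepares $\left\vert \psi \right\rangle $, applies ${\hat{H}}^{\otimes n+1}$, samples a measurement outcome, and majority-votes over repetitions.

```python
# Sketch (ours) of the full loop (A.1)-(A.4) for small n with q = n.
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)

def hadamard_all(psi, num_qubits):
    """Apply H to every qubit of a 2**num_qubits amplitude vector."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    psi = psi.reshape([2] * num_qubits)
    for ax in range(num_qubits):
        psi = np.tensordot(H, psi, axes=([1], [ax]))
        psi = np.moveaxis(psi, 0, ax)
    return psi.reshape(-1)

def run_once(s, eta):
    n = len(s)
    # (A.1) quantum sample over all 2^n inputs (q = n), freshly drawn noise
    psi = np.zeros(2 ** (n + 1))
    for a in range(2 ** n):
        bits = [(a >> (n - 1 - j)) & 1 for j in range(n)]
        e = int(rng.random() >= 0.5 + eta)          # e_a = 1 w.p. 1/2 - eta
        b = (np.dot(bits, s) + e) % 2
        psi[(a << 1) | b] = 1.0
    psi /= np.linalg.norm(psi)
    # (A.2) BV kernel: Hadamard on all n + 1 qubits
    psi = hadamard_all(psi, n + 1)
    # (A.3) sample a basis state; the lowest bit is k*
    idx = rng.choice(len(psi), p=np.abs(psi) ** 2)
    k_star, k = idx & 1, idx >> 1
    return k if k_star == 1 else None               # k* = 0 is a failure

s = [1, 0, 1]
votes = Counter(k for k in (run_once(s, eta=0.3) for _ in range(200))
                if k is not None)
guess = votes.most_common(1)[0][0]                  # (A.4) majority voting
print(format(guess, f"0{len(s)}b"), "vs secret", "".join(map(str, s)))
```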


Figure 1. Schematic of the algorithm. The algorithm uses the superposed quantum sample defined in equation (1) and the kernel of the QFT. In the algorithm, a candidate solution $\tilde{\mathbf{s}}$ is obtained and used to perform the majority-voting test (for details, see the main text or references [5, 6]).


3. Analysis (I): resource counts for ${\mathcal{P}}_{\left\vert \psi \right\rangle }$

Let us consider a scenario in which data, denoted by ${D}_{\boldsymbol{\gamma }}$, are addressed (or indexed) by symbols $\boldsymbol{\gamma }$. The addressing (or indexing) is arbitrary, and the database (or table) of the ${D}_{\boldsymbol{\gamma }}$ is unsorted. We define the state of the entire data, say $\left\vert T\right\rangle $, as

$\left\vert T\right\rangle =\bigotimes _{\boldsymbol{\gamma }}\left\vert {D}_{\boldsymbol{\gamma }}\right\rangle . \qquad (7)$

Here, we note that $\left\vert T\right\rangle $ is not a superposition state, and each datum $\left\vert {D}_{\boldsymbol{\gamma }}\right\rangle $ is deterministic (or equivalently, classical [22]). We also emphasise that the state $\left\vert T\right\rangle $ itself is not computable. Our approach for analysing ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ [or step (A.1)] is to adopt the following machinery:

$\frac{1}{\sqrt{\left\vert \mathfrak{R}\right\vert }}\sum _{\boldsymbol{\gamma }\in \mathfrak{R}}\left\vert \boldsymbol{\gamma }\right\rangle \left\vert \text{null}(D)\right\rangle \left\vert T\right\rangle \to \frac{1}{\sqrt{\left\vert \mathfrak{R}\right\vert }}\sum _{\boldsymbol{\gamma }\in \mathfrak{R}}\left\vert \boldsymbol{\gamma }\right\rangle \left\vert {D}_{\boldsymbol{\gamma }}\right\rangle \left\vert T\right\rangle , \qquad (8)$

where $\left\vert \boldsymbol{\gamma }\right\rangle $ denotes the address and $\left\vert \text{null}(D)\right\rangle $ is the null state into which the data brought from $\left\vert T\right\rangle $ are duplicated; hereafter, $\mathfrak{R}$ denotes the address space.

Now, we outline how the machinery in equation (8) can be used to prepare $\left\vert \psi \right\rangle $. First, by letting $\left\vert \mathfrak{R}\right\vert ={2}^{q}$, we can express the address symbol γ as a q-tuple of binary numbers: γ = γ0 γ1...γq−1, where γj ∈ {0, 1} and j = 0, 1, ..., q − 1. Subsequently, we set $\left\vert {D}_{\boldsymbol{\gamma }=\mathbf{a}}\right\rangle =\left\vert \mathbf{a}\right\rangle \left\vert {b}_{\mathbf{a}}\right\rangle $ for all samples in $\mathfrak{S}$. Such a setting is possible by matching the symbol γ to the input a. Then, from the address state $\frac{1}{\sqrt{{2}^{q}}}{\sum }_{\boldsymbol{\gamma }\in \mathfrak{R}}\left\vert \boldsymbol{\gamma }\right\rangle $, the machinery of equation (8) provides the address-data entangled state

$\frac{1}{\sqrt{{2}^{q}}}\sum _{\mathbf{a}\in \mathfrak{R}}\left\vert \mathbf{a}\right\rangle \left\vert {b}_{\mathbf{a}}\right\rangle \left\vert T\right\rangle , \qquad (9)$

where the data $\left\vert {b}_{\mathbf{a}}\right\rangle $ are taken from $\left\vert T\right\rangle $ and the summation ${{\sum }_{\mathbf{a}}}^{\prime }$ (in equations (4)–(6)) can be replaced by ${\sum }_{\mathbf{a}\in \mathfrak{R}}$. Lastly, we can retrieve $\left\vert \psi \right\rangle $ by disregarding $\left\vert T\right\rangle $.

A naive approach for implementing the process described above is to directly load the data $\left\vert {b}_{\mathbf{a}}\right\rangle $ by using Toffoli gates. However, such a data-loading scheme requires a T-depth that increases exponentially with the address-qubit size q, because the ${2}^{q}$ Toffoli gates would have to be applied sequentially (see figure 2(a)). Therefore, our design strategy for an efficient machinery (equation (8)) is to use the unary (one-hot) address encoding [23], as depicted in figure 2(b). The unary basis states can be written as

$\left\vert \boldsymbol{\gamma }\right\rangle _{\text{unary}}=\vert \underbrace{0\cdots 0}_{\boldsymbol{\gamma }}\,1\,\underbrace{0\cdots 0}_{{2}^{q}-\boldsymbol{\gamma }-1}\rangle , \qquad (10)$

i.e., a single 1 located at the position labelled by γ among the ${2}^{q}$ unary qubits.

The unary representation does not use all of the available Hilbert space; its advantage over the binary representation is that it simplifies the circuit structure [24]. To implement this approach, we consider two subdivided processes: (1) binary–unary (de)coupling and (2) data loading. In subprocess (1), the unary addresses are correlated with the binary addresses. For example, for four addresses (i.e., q = 2), one can consider

$\frac{1}{2}\sum _{{\gamma }_{0},{\gamma }_{1}\in \{0,1\}}\left\vert {\gamma }_{0}{\gamma }_{1}\right\rangle \left\vert 1000\right\rangle \to \frac{1}{2}\left(\left\vert 00\right\rangle \left\vert 1000\right\rangle +\left\vert 01\right\rangle \left\vert 0100\right\rangle +\left\vert 10\right\rangle \left\vert 0010\right\rangle +\left\vert 11\right\rangle \left\vert 0001\right\rangle \right). \qquad (11)$


Figure 2. Two schematics of the machinery in equation (8). (a) A naive approach for the data loading, where 2^q Toffoli gates are implemented sequentially. In this scheme, it is impractical to reduce the T-depth of O(2^q ). (b) Our scheme designed for implementing equation (8); the unary addresses are used to bring in the data. Since each unary qubit is correlated with a different datum (as seen in the red dashed box), the Toffoli gates can be parallelised using ancillary qubits. Thus, if the binary–unary (de)coupling is not demanding, the advantage is straightforward (see the main text).


The circuit for this example is presented in figure 2(b). Subprocess (2) duplicates the data $\left\vert {b}_{\mathbf{a}}\right\rangle $ in $\left\vert T\right\rangle $ by using the unary addresses. Lastly, by decoupling the unary address qubits, we can obtain equation (9); the decoupling is equivalent to subprocess (1). Note that the unary address qubits, each of which is correlated with a different data qubit, can easily be parallelised. Parallelisation considerably reduces the T-depth complexity of the data loading (as described below). Thus, if the cost of the binary–unary coupling is low, the advantage of this approach is apparent [23].

Let us now analyse subprocesses (1) and (2).

  • (1)  
Binary–unary (de)coupling. For the analysis of subprocess (1), let us recall the circuit of the four-address example shown in figure 2(b). The circuit comprises Toffoli and CNOT gates. Such a circuit structure can be generalised to arbitrary q address qubits, as shown in figure 3(a), where each green box contains the gates conditioned on the lth binary address qubit. The gate arrangement in the boxes can be designed generally as in figure 3(b), where ${2}^{l}-2$ Toffoli gates are used in the lth box. This directly imposes a large T-depth. Thus, to minimise the depth of the circuit, we should compress the CNOT and Toffoli gates [21, 25]. For this, we can design an optimised circuit for each (lth) green box by decomposing the Toffoli gates into Clifford + T gates and parallelising the $\hat{T}$ and ${\hat{T}}^{{\dagger}}$ gates, as shown in figure 3(c) (termed 'four-T-depth optimisation'). This reduces the T-depth of the entire process to polynomial; specifically, to 4(q − 1).
  • (2)  
Data loading. In the data-loading circuit, 2^q Toffoli gates are used to duplicate the data $\left\vert {D}_{\boldsymbol{\gamma }}\right\rangle $ in $\left\vert T\right\rangle $ into the computable space. Thus, in the naive approach, a T-depth of O(2^n ) is required. However, since the control qubits of the Toffoli gates are assigned one-to-one to the unary qubits in our scheme, the Toffoli gates can be implemented in parallel; this is possible precisely because of the availability of the unary address. Such an implementation immediately leads to the parallelisation of the T gates. To avoid any restriction on the overall circuit optimisation imposed by the control-qubit sharing of the Toffoli gates, we use extra ancillary qubits (denoted by E1, E2, ...), as in figure 4(a). Then, every Toffoli gate can be parallelised, and the T-depth complexity can be optimised to O(1). The detailed technique is shown in figure 4(b), and a classical emulation of the unary decoding is sketched after this list.
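The following classical emulation (ours; the splitting rule mirrors the stage structure described above, though the exact gate arrangement is that of figure 3) illustrates why the Toffoli gates within one stage act on disjoint qubits and can therefore share a single parallel T-gate layer.

```python
# Classical emulation (ours) of the binary-unary coupling: stage l conditions
# on address bit l and doubles the number of active one-hot positions; each
# split is one Toffoli (plus a CNOT), and all Toffolis within a stage act on
# disjoint qubits, so they can run as parallel T-gate layers.
def binary_to_unary(addr_bits):
    """Return the one-hot vector over 2**q positions for binary address bits."""
    unary = [1]                        # a single active position to start
    toffoli_layers = 0
    for bit in addr_bits:              # one stage per binary address qubit
        nxt = []
        for u in unary:                # each active slot splits into two
            nxt.append(u & (1 - bit))  # child kept when bit = 0
            nxt.append(u & bit)        # child kept when bit = 1 (the Toffoli)
        unary = nxt
        toffoli_layers += 1            # disjoint targets -> one parallel layer
    return unary, toffoli_layers

unary, layers = binary_to_unary([1, 0])    # q = 2, address gamma = "10"
assert unary == [0, 0, 1, 0] and layers == 2
```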


Figure 3. Circuit for the binary–unary (de)coupling. (a) Generalisation of the circuit for the q = 2 example, which comprises Toffoli and CNOT gates. (b) The general form of the red-box part. (c) This part (i.e., the red box) can be optimised to achieve a T-depth of four overall, where the Toffoli gates are decomposed into gates of the Clifford + T library [25] and parallelised.


Figure 4. (a) Circuit for the data loading; the extra ancillary qubits (denoted by E1, E2, ...) are used to avoid any constraint on the parallelisation. (b) Parallelisation of the Toffoli gates that share data qubits. Note that the gates inside the blue dashed boxes can be operated in parallel; thus, a constant T-depth is possible.


On the basis of the above analysis, our first result can be stated as

Resource estimation RE 1. The resource counts for implementing ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ are as follows: the T-depth complexity of ${\mathcal{P}}_{\left\vert \psi \right\rangle }$, denoted by ${T}_{D,{\mathcal{P}}_{\left\vert \psi \right\rangle }}$, is bounded by O(4n) with $q\leqslant n=\lceil {\mathrm{log}}_{2}\left\vert \mathfrak{S}\right\vert \rceil $. The total number of logical qubits required to implement ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ is

${W}_{{\mathcal{P}}_{\left\vert \psi \right\rangle }}={\omega }_{\text{adr}}+{\omega }_{\text{a}}+{\omega }_{\text{extra}}+{\omega }_{D}=q+{2}^{q+1}+1, \qquad (12)$

where ${\omega }_{\text{adr}}=q$, ${\omega }_{\text{a}}={2}^{q}$, ${\omega }_{\text{extra}}={2}^{q}$, and ${\omega }_{D}=1$; these denote the numbers of logical qubits for the binary address, the unary address, the extra ancillary system, and the data, respectively.

4. Analysis (II): resource counts for ${\mathcal{P}}_{A}$

Next, we consider the resources for ${\mathcal{P}}_{A}$. Considering the formal definition of the BV kernel (equation (3)), we start by investigating the T-depth of an arbitrary l-qubit QFT. Usually, the quantum circuit for an l-qubit QFT is synthesised from controlled-${\hat{R}}_{k}$ gates and $\hat{H}$, where ${\hat{R}}_{k}$ denotes the single-qubit rotation ${\hat{R}}_{k}=\left\vert 0\right\rangle \left\langle 0\right\vert +{\text{e}}^{\text{i}\pi {\theta }_{k}}\left\vert 1\right\rangle \left\langle 1\right\vert $. Typically, an ideal QFT circuit requires $\frac{l(l-1)}{2}=O({l}^{2})$ controlled-${\hat{R}}_{k}$ gates together with ${\hat{H}}^{\otimes l}$, with ${\theta }_{k}={2}^{-k}$ (k = 1, 2, ..., l − 1). In practice, however, an l-qubit QFT can be implemented within a small fixed error Δ, with ${\theta }_{k}={2}^{-k}$ (k = 1, 2, ..., β) and 2 ⩽ β ⩽ l − 1. The resulting (so-called) approximate QFT (AQFT) is performed using $\frac{(2l-\beta )(\beta -1)}{2}=O(l\beta )$ controlled-${\hat{R}}_{k}$ gates. However, the condition β < l − 1 implies that a finite error Δ is unavoidable, because the rotation angles θk smaller than the threshold value ${2}^{-\beta }$ are discarded, limiting the choice of β. The lower bound on the order of β is O(log l) (chapter 5 of reference [26]).

To realise an l-qubit AQFT circuit in a fault-tolerant manner, we can take β = O(log l). Then, all controlled-${\hat{R}}_{k}$ gates with ${\theta }_{k}\leqslant {2}^{-O(\mathrm{log}\,l)}$ are discarded with an error bounded by Δ, and the controlled-${\hat{R}}_{k}$ gate count is reduced from O(l2) to $O(l\,\mathrm{log}\,\frac{l}{{\Delta}})$ [27]. The remaining controlled-${\hat{R}}_{k}$ gates are decomposed into Clifford + T gates, with the decomposition incurring a fault-tolerance overhead. Consequently, we can obtain an l-qubit AQFT circuit in which the number of T (or T†) gates is $O(l\,\mathrm{log}\,\frac{l}{{\Delta}}\times \mathrm{log}(\frac{l\mathrm{log}\frac{l}{{\Delta}}}{{\Delta}}))$, which yields a T-count of O(l log2l). For all effective QC (specifically, for ${\Delta}\succ l{2}^{-l}$), we can neglect the dependence on Δ. Noting that the T-depth is in general upper bounded by the T-count, we obtain

${T}_{D,{\text{AQFT}}_{{2}^{l}}}\leqslant {T}_{C,{\text{AQFT}}_{{2}^{l}}}=O(l\,{\mathrm{log}}^{2}\,l), \qquad (13)$

where ${T}_{C,{\text{AQFT}}_{{2}^{l}}}$ denotes the T-count of the l-qubit AQFT. Note that, in theory, ${T}_{C,{\text{AQFT}}_{{2}^{l}}}$ can be reduced further, from O(l log2(l)) to O(l log l), by using a semi-classical AQFT [28]. Very recently, Nam et al proposed a fully coherent AQFT that can have a T-count of O(l log l) [29].
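The gate-count saving can be made concrete with a short script (ours) that evaluates the two counts quoted above for a few register sizes l; the choice β = ⌈log₂ l⌉ is our instantiation of β = O(log l).

```python
# Quick count (our sketch): exact l-qubit QFT uses l(l-1)/2 controlled-R_k
# gates, whereas the AQFT keeping only k <= beta uses (2l - beta)(beta - 1)/2.
import math

def qft_gate_counts(l):
    beta = max(2, math.ceil(math.log2(l)))   # beta = O(log l), beta >= 2
    exact = l * (l - 1) // 2
    approx = (2 * l - beta) * (beta - 1) // 2
    return exact, approx

for l in (8, 64, 512):
    exact, approx = qft_gate_counts(l)
    print(f"l={l:4d}: exact QFT {exact:7d} c-R_k gates, AQFT {approx:6d}")
```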

On the basis of the above analysis, we obtained the second result, which is as follows.

Resource estimation RE 2. We can implement ${\mathcal{P}}_{A}$ in the NBLP, with a constant scaling of ${T}_{D,{\mathcal{P}}_{A}}$. The number of (logical) qubits required to execute ${\mathcal{P}}_{A}$ is only O(n).

The estimation can be validated as follows. In the NBLP (i.e., a binary problem), ${\mathcal{P}}_{A}$ is the (n + 1)-fold product of the Hadamard transform: ${\mathcal{P}}_{A}={\text{QFT}}_{d=2}^{\otimes n+1}={\hat{H}}^{\otimes n+1}$. Hence, the number of logical qubits is n + 1. Although the circuit may be operated with some additional ancilla qubits, ${W}_{{\mathcal{P}}_{A}}$ scales as O(n). This implies zero T-depth complexity, since no controlled-${\hat{R}}_{k}$ gates are required (a one-line numerical check is given below). Hence, RE 2 holds. This result is a straightforward consequence of ${\mathcal{P}}_{A}={\hat{H}}^{\otimes n+1}$. Nevertheless, the analysis of the AQFT is useful, particularly when the BV kernel is applied to a more general problem setting, such as a noisy multinary linear problem.
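The following short numerical check (ours) confirms that the d = 2 QFT coincides with the Hadamard transform, so the NBLP kernel needs no controlled-${\hat{R}}_{k}$ gates and hence no T gates at all.

```python
# Check (ours): QFT_{d=2} equals the Hadamard matrix, so P_A = H^{(n+1)} is
# Clifford-only and contributes zero T-depth.
import numpy as np

d = 2
omega = np.exp(2j * np.pi / d)
qft2 = np.array([[omega ** (j * k) for k in range(d)]
                 for j in range(d)]) / np.sqrt(d)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
assert np.allclose(qft2, H)
```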

5. Majority-voting conditions

Before analysing (III), we derive the condition for the majority voting [performed in (A.4)], which has not been considered in previous studies despite its influence on the algorithm's performance. First, we calculate the probability PS = P(k = s) that the k measured in (A.3) is equal to the true solution s. By substituting k = s into equation (6), we obtain

${P}_{S}=P({k}^{\star }=1)\times \frac{{2}^{q}}{{2}^{n}}{\left\vert \frac{1}{{2}^{q}}{{\sum }_{\mathbf{a}}}^{\prime }{(-1)}^{{e}_{\mathbf{a}}}\right\vert }^{2}, \qquad (14)$

where $P({k}^{\star }=1)=\frac{1}{2}$. Here, we apply a useful concentration bound, the so-called Chernoff–Hoeffding (CH) inequality [30]: for any t > 0,

$P\left(\left\vert \bar{\mathcal{U}}-\mathbb{E}({\mathcal{U}}_{\mathbf{a}})\right\vert \geqslant t\right)\leqslant 2\,{e}^{-\frac{1}{2}{2}^{q}{t}^{2}}, \qquad (15)$

where ${\mathcal{U}}_{\mathbf{a}}={(-1)}^{{e}_{\mathbf{a}}}$, $\bar{\mathcal{U}}=\frac{1}{{2}^{q}}{{\sum }_{\mathbf{a}}}^{\prime }{\mathcal{U}}_{\mathbf{a}}$, and $\mathbb{E}({\mathcal{U}}_{\mathbf{a}})$ denotes the expectation of ${\mathcal{U}}_{\mathbf{a}}$. If we assume that the order of q is greater than $O({\mathrm{log}}_{2}\,n)$, the right-hand side of equation (15) is negligible, and $P\left(\left\vert \bar{\mathcal{U}}-\mathbb{E}({\mathcal{U}}_{\mathbf{a}})\right\vert \geqslant t\right)=0$ for large n. Note that we have used the following definition [D]: if a factor is as small as O(e^{−n}), it can be neglected for large n and set to zero. We then obtain the following expression:

$\mathbb{E}({\mathcal{U}}_{\mathbf{a}})-t\leqslant \bar{\mathcal{U}}\leqslant \mathbb{E}({\mathcal{U}}_{\mathbf{a}})+t. \qquad (16)$

Using equations (14) and (16), we can obtain the lower bound of PS such that

${P}_{S}\geqslant \frac{{2}^{q}}{{2}^{n+1}}{\left(2\eta -t\right)}^{2}\equiv {P}_{S,\text{inf}}, \qquad (17)$

where we have used $\mathbb{E}({\mathcal{U}}_{\mathbf{a}})=\left(\frac{1}{2}+\eta \right)-\left(\frac{1}{2}-\eta \right)=2\eta $.

We then consider the probability PF = P(k ≠ s) that the measured k is not equal to the solution s. For convenience, we write P(k ≠ s) as $P(\mathbf{k}=\tilde{\mathbf{s}})$, where $\tilde{\mathbf{s}}=\mathbf{s}+\boldsymbol{\phi }$ and ϕ = ϕ0 ϕ1...ϕn−1 is an arbitrary n-tuple of binary numbers ϕj ∈ {0, 1}, except ϕ = 00...0. Then, from equation (6), PF can be calculated as

${P}_{F}=P({k}^{\star }=1)\times \frac{{2}^{q}}{{2}^{n}}{\left\vert \frac{1}{{2}^{q}}{{\sum }_{\mathbf{a}}}^{\prime }{(-1)}^{\mathbf{a}\cdot \boldsymbol{\phi }+{e}_{\mathbf{a}}}\right\vert }^{2}. \qquad (18)$

Here, we recall the CH inequality in equation (15) and let ${\mathcal{U}}_{\mathbf{a}}={(-1)}^{\mathbf{a}\cdot \boldsymbol{\phi }+{e}_{\mathbf{a}}}$ and $\bar{\mathcal{U}}=\frac{1}{{2}^{q}}{{\sum }_{\mathbf{a}}}^{\prime }{\mathcal{U}}_{\mathbf{a}}$. It should be noted that, in this case, $\mathbb{E}({\mathcal{U}}_{\mathbf{a}})=0$, because $\mathbf{a}\cdot \boldsymbol{\phi }$ and $\mathbf{a}\cdot \boldsymbol{\phi }+{e}_{\mathbf{a}}$ are each 0 or 1 with probability $\frac{1}{2}$. Because O(q) is greater than $O({\mathrm{log}}_{2}\,n)$ and ${e}^{-\frac{1}{2}{2}^{q}{t}^{2}}$ is negligible by definition [D], we have $P\left(\left\vert \bar{\mathcal{U}}\right\vert \geqslant t\right)=0$. Hence, we can write

$-t\leqslant \bar{\mathcal{U}}\leqslant t. \qquad (19)$

By using equations (18) and (19), the upper bound for PF is obtained as follows:

${P}_{F}\leqslant \frac{{2}^{q}}{{2}^{n+1}}{t}^{2}\equiv {P}_{F,\text{sup}}. \qquad (20)$

We can finally specify the conditions required for the majority voting to be valid:

${P}_{S,\text{inf}} > {P}_{F,\text{sup}},\quad \text{or equivalently},\quad t< \eta . \qquad (21)$

If this condition is not satisfied, the possibility of a 'false' solution $\tilde{\mathbf{s}}$ being identified in (A.4) cannot be ruled out. A numeric check of this condition is sketched below.
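The following numeric check (ours; the prefactor ${2}^{q}/{2}^{n+1}$ follows our reconstruction of equations (17) and (20)) evaluates both bounds and verifies that condition (21) reduces to t < η.

```python
# Check (ours): with the common prefactor 2^q / 2^(n+1), the bounds reduce to
# P_S,inf ~ (2*eta - t)**2 and P_F,sup ~ t**2, so P_S,inf > P_F,sup iff t < eta.
def voting_margin(eta, t, q, n):
    scale = 2.0 ** (q - n - 1)           # prefactor 2^q / 2^(n+1)
    p_s_inf = scale * (2 * eta - t) ** 2
    p_f_sup = scale * t ** 2
    return p_s_inf, p_f_sup

p_s, p_f = voting_margin(eta=0.25, t=0.1, q=10, n=10)
assert p_s > p_f                         # condition (21) holds since t < eta
```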

6. Analysis (III): number of repetitions S

Lastly, we determine the number of repetitions S. Assume that a candidate solution k is obtained upon completing (A.1)–(A.3). The process is then repeated until M candidates are collected, and finally the most frequently occurring k is chosen from the candidates in (A.4). We assign xk = 1 (or xk = 0) when the true solution s (or a false solution $\tilde{\mathbf{s}}$) is measured after (A.1)–(A.3). Let X be the number of times the true solution k = s appears among the M candidates. Then $X={\sum }_{k=1}^{M}{x}_{k}$, where all values of xk are independent. In this setting, we can use a statistical inequality, namely the Chernoff bound [31]: for any $\epsilon > 0$,

$P\left(\left\vert X-\mu \right\vert \geqslant {\epsilon}\mu \right)\leqslant 2\,{e}^{-\frac{{{\epsilon}}^{2}\mu }{2+{\epsilon}}}, \qquad (22)$

where $\mu =\mathbb{E}(X)=M{P}_{S}$, with ${x}_{k}$ the indicator of the event $\mathbf{k}=\mathbf{s}$. By letting $2{e}^{-\frac{{{\epsilon}}^{2}\mu }{2+{\epsilon}}}\leqslant \delta $ with δ ∈ (0, 1], we can derive the following theorem:

$P\left(\left\vert \bar{X}-{P}_{S}\right\vert \leqslant {{\epsilon}}^{\prime }\right)\geqslant 1-\delta \quad \text{for}\quad M\geqslant \frac{3}{{{\epsilon}}^{\prime 2}}\,\mathrm{ln}\,\frac{2}{\delta }, \qquad (23)$

where $\bar{X}=\frac{X}{M}=\frac{1}{M}{\sum }_{k=1}^{M}{x}_{k}$ and ${{\epsilon}}^{\prime }={\epsilon}{P}_{S}$ (here, we consider a slightly weaker bound; the tight bound is $M\geqslant \frac{2+{{\epsilon}}^{\prime }}{{{\epsilon}}^{\prime 2}}\,\mathrm{ln}\,\frac{2}{\delta }$). This theorem implies that if we use more than $M=\frac{3}{{{\epsilon}}^{\prime 2}}\,\mathrm{ln}\,\frac{2}{\delta }$ samples, $\bar{X}$ can be estimated within the interval $\left[{P}_{S}-{{\epsilon}}^{\prime },{P}_{S}+{{\epsilon}}^{\prime }\right]$ with probability at least 1 − δ. This is sometimes referred to as the sampling theorem. Since the Chernoff bound gives the minimal (Bayesian) error probability when discriminating between 'a priori' and 'observations', the sampling theorem translates into the following statement: majority voting identifies the true solution s with at least $M=\frac{3}{{{\epsilon}}^{\prime 2}}\,\mathrm{ln}\,\frac{2}{\delta }$ repetitions of (A.1)–(A.3), provided the following condition is satisfied:

${{\epsilon}}^{\prime } < \frac{{P}_{S,\text{inf}}-{P}_{F,\text{sup}}}{2}. \qquad (24)$

We point out that ${P}_{S,\text{inf}}-{P}_{F,\text{sup}}$ is greater than 0 owing to the majority-voting condition in equation (21).

Furthermore, by noting that S is the number of repetitions of (A.1)–(A.3), we achieve our third result, which is as follows.

Resource estimation RE 3. Given the constants t, $\epsilon$, and δ, the number of repetitions S is given by

$S=2M=O\left(\frac{{4}^{n-q}}{{{\epsilon}}^{2}{\left\vert 2\eta -t\right\vert }^{4}}\,\mathrm{ln}\,\frac{1}{\delta }\right), \qquad (25)$

where we have assumed S = 2M because half of the trials of (A.1)–(A.3) return a failure with ${k}^{\star }=0$ (note that the factor 2 has no influence on the order of S). The following crucial conditions should be satisfied:

$0< t< \eta \quad \text{and}\quad {\epsilon}< \frac{{P}_{S,\text{inf}}-{P}_{F,\text{sup}}}{2{P}_{S,\text{inf}}}, \qquad (26)$

where the former is acquired from the majority-voting condition in equation (21), and the latter is derived using ${{\epsilon}}^{\prime }={\epsilon}{P}_{S}\geqslant {\epsilon}{P}_{S,\text{inf}}$ and equation (24).

Note that ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ boots up whenever ${\mathcal{P}}_{A}$ runs, with a single use of $\left\vert \psi \right\rangle $ per run; it is thus straightforward to see that S corresponds to the quantum-sample complexity (footnote 9). Accordingly, RE 3 shows that the reduction in the complexity depends on the size of the superposition, that is, $\left\vert \mathfrak{R}\right\vert ={2}^{q}$. For example, if we use the fullest (exponential-scale) superposition of the sample with $\left\vert \mathfrak{R}\right\vert =\left\vert \mathfrak{S}\right\vert ={2}^{n}$ (or equivalently, q = n), S becomes $O({{\epsilon}}^{-2}{\left\vert 2\eta -t\right\vert }^{-4}\,\mathrm{ln}\,{\delta }^{-1})$, which is consistent with the results of reference [5]. The opposite extreme can also be considered, that is, using a non-superposed sample $\left\vert \psi \right\rangle =\left\vert \mathbf{a}\right\rangle \left\vert {b}_{\mathbf{a}}\right\rangle $ with $\left\vert \mathfrak{R}\right\vert =1$ (or equivalently, q = 0), which still allows quantum parallelism to be exploited by the BV kernel. In this case, however, PS becomes exponentially small in n (equation (17)) and is therefore negligible (by definition [D]); hence, a majority-voting condition cannot be established. Moreover, the order of q must be at least $O({\mathrm{log}}_{2}\,n)$. Note that if $q=O({\mathrm{log}}_{2}\,n)$, the polynomial quantum-sample complexity cannot be achieved; that is, $S=O({4}^{n-\mathrm{log}\,n})$. A numeric illustration of equation (25) is sketched below.
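The repetition count of RE 3 can be evaluated directly; the following sketch (our arithmetic, using ${P}_{S,\text{inf}}$ from our reconstruction of equation (17)) computes S = 2M for given t, $\epsilon$, δ, q, and n.

```python
# Sketch (ours) of RE 3: S = 2M with M = (3 / eps'**2) * ln(2/delta) and
# eps' = eps * P_S,inf, where P_S,inf = (2^q / 2^(n+1)) * (2*eta - t)**2.
import math

def repetitions(eta, t, eps, delta, q, n):
    p_s_inf = 2.0 ** (q - n - 1) * (2 * eta - t) ** 2
    eps_prime = eps * p_s_inf
    M = 3.0 / eps_prime ** 2 * math.log(2.0 / delta)
    return 2 * math.ceil(M)              # S = 2M: half the runs give k* = 0

S = repetitions(eta=0.25, t=0.1, eps=0.4, delta=0.01, q=20, n=20)
print(S)    # constant in n once q = n, consistent with reference [5]
```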

7. Discussion

From the results of REs 1, 2, and 3, we can draw the following conclusion: the cost C, defined in equation (2), can be polynomial in the problem size n. The first step towards achieving the polynomial-scaling C is the optimisation of the machinery of ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ by using the unary (one-hot) sample input. Such a technique has been used to parallelise expensive quantum gates in various contexts [23, 24]. In our case, the focus is on reducing the layers of T and T† gates in the context of FTQC. The second key enabler of our result is the BV kernel in the main computation ${\mathcal{P}}_{A}$, which leads to a considerable reduction in the quantum-sample complexity. Note, however, that the unary qubit encoding is useful for ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ but not at all for ${\mathcal{P}}_{A}$. Thus, we need to transform the input from unary to binary to run ${\mathcal{P}}_{A}$ efficiently. In summary, the polynomial-T-depth quantum solvability of NBLPs can successfully be established by allowing ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ and ${\mathcal{P}}_{A}$ to each use their favourable encodings. Note further that such a result can be achieved only when the two computational features, i.e., those of ${\mathcal{P}}_{\left\vert \psi \right\rangle }$ and ${\mathcal{P}}_{A}$, are analysed in a single framework. This approach will be a milestone towards confirming quantum advantages in various tasks (for example, those of quantum machine learning) in which one should account for everything from the largely-superposed data to the main process [33–35].

Another insight from our comprehensive analysis of ${\mathcal{P}}_{\left\vert \psi \right\rangle }+{\mathcal{P}}_{A}$ is the depth–width tradeoff in the NBLP. It can be specified by equations (12) and (25): roughly, $(\text{depth})\times {(\text{width})}^{2}\simeq O({4}^{n})$. For example, if q = n, the polynomial quantum-sample complexity can be obtained. However, this entails an exponential scaling of the number of logical qubits (RE 1). Such a large consumption of logical qubits is intrinsic to the problem rather than a weakness specific to our scheme, since the quantum (data) register (e.g., as large as $\left\vert {\Xi}\right\rangle $) must be initialised even in the original implementations [5, 6, 11] (footnote 10). If we attempt to reduce the number of qubits to a polynomial in n, for example by letting q = O(log n), an exponential reduction in the quantum-sample complexity cannot be achieved, and hence neither can the polynomial T-depth.

A further improvement could be achieved by developing a more efficient error-correcting code or a more efficient sample-preparation scheme, which would reduce the required number of noisy physical qubits.

Acknowledgments

WS and JB thank Nana Liu for the discussions. This work was supported by the National Research Foundation of Korea (Nos. NRF-2021M3E4A1038213, NRF-2021R1I1A1A01042199, NRF-2020M3E4A1077861, NRF-2019M3E4A1079666, and NRF-2019R1A2C2005504), and the Ministry of Science, ICT and Future Planning (MSIP) by the Institute of Information and Communications Technology Planning and Evaluation grants funded by the Korean government (No. 2019-0-00003, 'Research and Development of Core Technologies for Programming, Running, Implementing and Validating of Fault-Tolerant Quantum Computing System'), (No. 2020-0-00890, 'Development of trusted node core and interfaces for the interoperability among QKD protocols') and (No. 2022-0-00463, 'Development of a quantum repeater in optical fiber networks for quantum internet'). WS acknowledges the KIST research program (2E31021). YL and JJP were supported by KIAS Individual Grants (CG073301 and CG075502) at the Korea Institute for Advanced Study. MSK acknowledges financial support from the Samsung GRC grant, a KIAS visiting professorship, and the EPSRC Quantum Computing and Simulations Hub grant.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Appendix A.: Additional details of the algorithm

In the absence of noise (linear function learning). To understand the operation of the algorithm, let us consider the case of no noise, which is often referred to as 'linear function learning'. Given the sample state,

$\left\vert \psi \right\rangle =\frac{1}{\sqrt{{2}^{q}}}{{\sum }_{\mathbf{a}}}^{\prime }\left\vert \mathbf{a}\right\rangle \left\vert \mathbf{a}\cdot \mathbf{s}\right\rangle , \qquad (\text{A}.1)$

with ea = 0 (or equivalently, $\eta =\frac{1}{2}$), the QFTs are applied, such that

${\mathcal{P}}_{A}\left\vert \psi \right\rangle ={\text{QFT}}_{d=2}^{\otimes n+1}\left\vert \psi \right\rangle ={\hat{H}}^{\otimes n+1}\left\vert \psi \right\rangle , \qquad (\text{A}.2)$

where ${\text{QFT}}_{d=2}$ is the Hadamard transform: $\left\vert j\right\rangle \to \frac{1}{\sqrt{2}}{\sum }_{k}{(-1)}^{jk}\left\vert k\right\rangle $ (j, k = 0, 1). The output state is expressed as follows:

$\frac{1}{\sqrt{{2}^{q+n+1}}}{{\sum }_{\mathbf{a}}}^{\prime }\sum _{\mathbf{k},{k}^{\star }}{(-1)}^{\mathbf{a}\cdot (\mathbf{k}+\mathbf{s}{k}^{\star })}\left\vert \mathbf{k}\right\rangle \left\vert {k}^{\star }\right\rangle . \qquad (\text{A}.3)$

Subsequently, we measure the qubit state $\left\vert {k}^{\star }\right\rangle $. If ${k}^{\star }=1$ is measured, we obtain the final state encoding the true solution:

$\left\vert \mathbf{k}\right\rangle =\left\vert \mathbf{s}\right\rangle . \qquad (\text{A}.4)$

The proof is simple. First, we can set $\omega ={\text{e}}^{\text{i}\frac{2\pi }{d}}=-1$ with d = 2, and the probability amplitude $\frac{1}{\sqrt{2}}$ is eliminated by the measurement of $\left\vert {k}^{\star }\right\rangle $. Then, by using the discrete delta function

$\frac{1}{{2}^{n}}\sum _{\mathbf{a}\in {\{0,1\}}^{n}}{(-1)}^{\mathbf{a}\cdot (\mathbf{k}+\mathbf{s})}={\delta }_{\mathbf{k},\mathbf{s}}, \qquad (\text{A}.5)$

we can prove that equation (A.4) holds. For a simpler analysis, we have assumed q = n here, i.e., ${{\sum }_{\mathbf{a}}}^{\prime }={\sum }_{\mathbf{a}\in {\left\{0,1\right\}}^{n}}$. If ${k}^{\star }=0$ is measured, we cannot retrieve any information on s; that is, the algorithm returns a failure. A numerical check of equation (A.5) is sketched below.
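The discrete delta function of equation (A.5) can be verified numerically; the following short check (ours, for a hypothetical 3-bit secret) confirms it over all k.

```python
# Check (ours): (1/2^n) * sum_a (-1)^{a.(k+s)} equals 1 iff k = s, else 0.
import itertools
import numpy as np

n, s = 3, np.array([1, 0, 1])
for k in itertools.product([0, 1], repeat=n):
    total = sum((-1) ** int(np.dot(a, (np.array(k) + s) % 2))
                for a in itertools.product([0, 1], repeat=n))
    delta = total / 2 ** n
    assert delta == (1 if tuple(s) == k else 0)
```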

In the presence of noise (NBLP). Given the sample state, that is,

$\left\vert \psi \right\rangle =\frac{1}{\sqrt{{2}^{q}}}{{\sum }_{\mathbf{a}}}^{\prime }\left\vert \mathbf{a}\right\rangle \left\vert \mathbf{a}\cdot \mathbf{s}+{e}_{\mathbf{a}}\right\rangle , \qquad (\text{A}.6)$

with non-zero noise (i.e., $\eta \ne \frac{1}{2}$), the n + 1 QFTs are applied as described above. We then obtain the following output state:

$\frac{1}{\sqrt{{2}^{q+n+1}}}{{\sum }_{\mathbf{a}}}^{\prime }\sum _{\mathbf{k},{k}^{\star }}{(-1)}^{\mathbf{a}\cdot (\mathbf{k}+\mathbf{s}{k}^{\star })+{e}_{\mathbf{a}}{k}^{\star }}\left\vert \mathbf{k}\right\rangle \left\vert {k}^{\star }\right\rangle , \qquad (\text{A}.7)$

which is equal to equation (4) of the main manuscript. Note that we cannot use the delta function of equation (A.5) because, unlike in equation (A.3), $\left\vert \mathbf{k}\right\rangle $ and $\left\vert {k}^{\star }\right\rangle $ are not perfectly correlated, owing to the error term ${e}_{\mathbf{a}}{k}^{\star }$. Thus, equation (A.7) allows a candidate $\mathbf{k}=\tilde{\mathbf{s}}$ that is generally not equal to the true solution s. We can calculate the success probability, denoted by PS = P(k = s), by substituting $\mathbf{k}=\mathbf{s}$ and ${k}^{\star }=1$ into equation (A.7):

${P}_{S}={\left\vert \left\langle {k}^{\star }\vert 1\right\rangle \right\vert }^{2}\times \frac{{2}^{q}}{{2}^{n}}{\left\vert \frac{1}{{2}^{q}}{{\sum }_{\mathbf{a}}}^{\prime }{(-1)}^{{e}_{\mathbf{a}}}\right\vert }^{2}, \qquad (\text{A}.8)$

where we use ${\left\vert \left\langle {k}^{\star }\vert 1\right\rangle \right\vert }^{2}=\frac{1}{2}$. This is equal to equation (14) in the main manuscript.

Footnotes

  • 8 In this context, it was recently pointed out that, in discussing the quantum solvability of noisy linear problems, not only the sample/time complexity but also the superposition size of the prepared quantum sample should be considered [11].

  • 9 The framework within which we analyse the sample complexity S is borrowed, for example, from the (noisy) probably-approximately-correct learning model [32].

  • 10 We note further that the number of logical qubits is arguably less important than the depth of the quantum circuit in terms of algorithm speed, because the logical qubit is by definition scalable. This issue in the quantum NBLP can be alleviated by using a divide-and-conquer strategy [11] or the concept of a multi-core quantum processing unit [36].
