% vim: ft=tex
\section{Implementation}

This chapter discusses how the concepts introduced before are implemented in
a simulator. Further, the infrastructure around the simulation and some tools
are explained.

The implementation is written as the \lstinline{python3} package
\lstinline{pyqcs} \cite{pyqcs}. This allows users to quickly construct
circuits, apply them to states and measure amplitudes. Full access to the
state (including intermediate states) has been prioritized over execution
speed. To keep the simulation speed as high as possible under these
constraints some parts are implemented in \lstinline{C}.

\subsection{Dense State Vector Simulation}

\subsubsection{Representation of Dense State Vectors}

Recalling \eqref{eq:ci}, any $n$-qbit state can be represented as a $2^n$
component vector in the integer state basis. This representation has some
useful features when it comes to computations:

\begin{itemize}
    \item{The projection on the integer states is trivial.}
    \item{For any qbit $j$ and $0 \le i \le 2^n-1$ the coefficient $c_i$ is part of the $\ket{1}_j$ amplitude iff
        $i \& (1 << j)$ is nonzero and part of the $\ket{0}_j$ amplitude otherwise.}
    \item{For a qbit $j$ the coefficients $c_i$ and $c_{i \hat{ } (1 << j)}$ are the conjugated coefficients,
        i.e. their indices differ only in bit $j$.}
\end{itemize}

Here $\hat{}$ denotes the bitwise XOR, $\&$ the bitwise AND and $<<$ the bitwise
left-shift operator.

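The index arithmetic listed above can be illustrated with a few lines of
Python. This is only a sketch and not part of \lstinline{pyqcs}; the helper
names \lstinline{qbit_is_set} and \lstinline{partner_index} are hypothetical
and chosen for this example.

\begin{lstlisting}[language=Python, breaklines=true]
def qbit_is_set(i: int, j: int) -> bool:
    """True if basis index i contributes to the |1>_j amplitude."""
    return bool(i & (1 << j))


def partner_index(i: int, j: int) -> int:
    """Index that differs from i only in bit j."""
    return i ^ (1 << j)


n = 3  # number of qbits
j = 1  # qbit under consideration

# Basis indices belonging to the |1>_j amplitude.
ones = [i for i in range(2**n) if qbit_is_set(i, j)]
# Pairs of indices that a gate acting on qbit j mixes.
pairs = [(i, partner_index(i, j)) for i in range(2**n) if not qbit_is_set(i, j)]

print(ones)   # [2, 3, 6, 7]
print(pairs)  # [(0, 2), (1, 3), (4, 6), (5, 7)]
\end{lstlisting}
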

While implementing the dense state vectors two key points were a simple and
readable way to use them and simple access to the states for users who want
more information than an abstracted view could allow. To meet both
requirements the states are implemented as Python objects providing abstract
features such as normalization checking, checking for a sufficient qbit number
when applying a circuit, computing overlaps with other states, a stringify
method and stored measurement results. To store the measurement results
a NumPy \lstinline{int8} array \cite{numpy_array} is used; this is called the
classical state. The Python states also have a NumPy \lstinline{cdouble} array
that stores the quantum mechanical state. Using NumPy arrays has the advantage
that access to the data is simple and safe while operations on the states can
be implemented in \lstinline{C} \cite{numpy_ufunc}, providing a considerable
speedup (see \ref{ref:benchmark_ufunc_py}).

This quantum mechanical state is the component vector in the integer basis;
therefore it has $2^n$ components. Storing those components is feasible in
a range from $1$ to roughly $30$ qbits: a \lstinline{cdouble} occupies $16$
bytes, so at $30$ qbits the state requires
$2^{30} \cdot 16 \mbox{ B} = 16 \mbox{ GiB}$, which is at the upper end of
usual RAM sizes for personal computers. For higher qbit numbers moving to high
performance computers and other simulators is necessary.

\subsubsection{Gates}

Gates on dense state vectors are implemented as NumPy Universal Functions
(ufuncs) \cite{numpy_ufunc} mapping a classical and a quantum state to a new
classical state, a new quantum state and a $64 \mbox{ bit}$ integer indicating
which qbits have been measured. Using ufuncs has the great advantage that
memory management is done by NumPy and an application programmer just has to
implement the logic of the function. Because ufuncs are written in
\lstinline{C} they provide a considerable speedup compared to an implementation
in Python (see \ref{ref:benchmark_ufunc_py}).

The logic of gates is usually easy to implement using the integer basis. The
example below implements the Hadamard gate (see \ref{ref:singleqbitgates}):

\lstinputlisting[title={Implementation of the Hadamard Gate in C}, language=C, firstline=153, lastline=178, breaklines=true]{../pyqcs/src/pyqcs/gates/implementations/basic_gates.c}

A basic set of gates is implemented in PyQCS:

\begin{itemize}
    \item{Hadamard $H$ gate.}
    \item{Pauli $X$ or \textit{NOT} gate.}
    \item{Pauli $Z$ gate.}
    \item{The $S$ phase gate.}
    \item{$Z$ rotation $R_\phi$ gate.}
    \item{Controlled $X$ gate: $CX$.}
    \item{Controlled $Z$ gate: $CZ$.}
    \item{The ``measurement gate'' $M$.}
\end{itemize}

To allow the implementation of possibly hardware related gates the class
\lstinline{GenericGate} takes a unitary $2\times2$ matrix as a NumPy
\lstinline{cdouble} array and builds a gate from it.

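How a $2\times2$ matrix acts on a dense state vector in the integer basis can
be sketched in pure Python with NumPy. This is a minimal illustration under
the assumption that the state is a plain \lstinline{cdouble} array; it is
neither the \lstinline{C} ufunc used by PyQCS nor the interface of
\lstinline{GenericGate}, and the function name
\lstinline{apply_single_qbit_gate} is hypothetical.

\begin{lstlisting}[language=Python, breaklines=true]
import numpy as np


def apply_single_qbit_gate(state, matrix, j):
    """Apply a 2x2 matrix to qbit j of a dense state vector."""
    new_state = np.zeros_like(state)
    for i in range(state.size):
        if i & (1 << j):
            continue  # handled together with its partner index below
        k = i ^ (1 << j)  # partner index: same bits, but bit j set
        new_state[i] = matrix[0, 0] * state[i] + matrix[0, 1] * state[k]
        new_state[k] = matrix[1, 0] * state[i] + matrix[1, 1] * state[k]
    return new_state


# The Hadamard matrix as an example of a unitary 2x2 cdouble array.
hadamard = np.array([[1, 1], [1, -1]], dtype=np.cdouble) / np.sqrt(2)

zero_state = np.zeros(4, dtype=np.cdouble)  # two qbits, state |0b00>
zero_state[0] = 1

plus_state = apply_single_qbit_gate(zero_state, hadamard, 0)
print(plus_state)
\end{lstlisting}

Applying the Hadamard matrix twice to the same qbit returns the original
state, which is a quick sanity check for such an implementation.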
\subsubsection{Circuits}
\label{ref:pyqcs_circuits}

As mentioned in \ref{ref:quantum_circuits}, quantum circuits are central in
quantum programming. In the implementation great care was taken to make
writing circuits as convenient and readable as possible. Users will almost
never access the actual gates that perform the operation on a state; instead
they will handle circuits.\\ Circuits can be applied to a state by multiplying
them from the left on a state object:

\begin{lstlisting}[language=Python]
new_state = circuit * state
\end{lstlisting}

The elementary gates such as $H, R_\phi, CX$ are implemented as single gate
circuits and can be constructed using the built-in generators. The generators
take the act-qbit as first argument and parameters such as the control qbit or
an angle as second argument:

\begin{lstlisting}[language=Python, breaklines=true, caption={Using Single Gate Circuits}]
In [1]: from pyqcs import CX, CZ, H, R, Z, X
   ...: from pyqcs import State
   ...:
   ...: state = State.new_zero_state(2)
   ...: intermediate_state = H(0) * state
   ...:
   ...: bell_state = CX(1, 0) * intermediate_state

In [2]: bell_state
Out[2]: (0.7071067811865476+0j)*|0b0>
+ (0.7071067811865476+0j)*|0b11>
\end{lstlisting}

Large circuits can be constructed using the binary OR operator \lstinline{|} in
analogy to the pipeline operator of many *NIX shells. As with shell pipelines,
circuits are read from left to right:

%\adjustbox{max width=\textwidth}{
\begin{lstlisting}[language=Python, breaklines=true, caption={Constructing Circuits Using \lstinline{|}}]
In [1]: from pyqcs import CX, CZ, H, R, Z, X
   ...: from pyqcs import State
   ...:
   ...: state = State.new_zero_state(2)
   ...:
   ...: # This is the same as
   ...: # circuit = H(0) | CX(1, 0)
   ...: circuit = H(0) | H(1) | CZ(1, 0) | H(1)
   ...:
   ...: bell_state = circuit * state

In [2]: bell_state
Out[2]: (0.7071067811865477+0j)*|0b0>
+ (0.7071067811865477+0j)*|0b11>
\end{lstlisting}
%}

A quick way to generate circuits programmatically is to use the \lstinline{list_to_circuit}
function:

%\adjustbox{max width=\textwidth}{
\begin{lstlisting}[language=Python, breaklines=true, caption={Constructing Circuits Using Python Lists}]
In [1]: from pyqcs import CX, CZ, H, R, Z, X
   ...: from pyqcs import State, list_to_circuit
   ...:
   ...: circuit_CX = list_to_circuit([CX(i, i-1) for i in range(1, 5)])
   ...:
   ...: state = (H(0) | circuit_CX) * State.new_zero_state(5)

In [2]: state
Out[2]: (0.7071067811865476+0j)*|0b0>
+ (0.7071067811865476+0j)*|0b11111>
\end{lstlisting}
%}

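The operator syntax used in the listings above relies on Python's operator
overloading. The following sketch shows one way this can be done with
\lstinline{__or__} and \lstinline{__mul__}. The classes
\lstinline{ToyCircuit} and \lstinline{ToyState} are hypothetical stand-ins
that only record gate names; they are not the classes used in
\lstinline{pyqcs}.

\begin{lstlisting}[language=Python, breaklines=true]
class ToyState:
    """Hypothetical state that only records the gates applied to it."""

    def __init__(self, history=()):
        self.history = list(history)


class ToyCircuit:
    """Hypothetical circuit: an ordered list of gate names."""

    def __init__(self, gates):
        self.gates = list(gates)

    def __or__(self, other):
        # self | other: the gates of self run first, then those of other.
        return ToyCircuit(self.gates + other.gates)

    def __mul__(self, state):
        # circuit * state: return a new state, the old one is unchanged.
        return ToyState(state.history + self.gates)


circuit = ToyCircuit(["H(0)"]) | ToyCircuit(["CX(1, 0)"])
new_state = circuit * ToyState()
print(new_state.history)  # ['H(0)', 'CX(1, 0)']
\end{lstlisting}
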
\subsection{Graphical State Simulation}

\subsubsection{Graphical States}

For the graphical state $(V, E, O)$ the list of vertices $V$ can be stored implicitly
by demanding $V = \{0, ..., n - 1\}$. This leaves two components that have to be stored:
the edges $E$ and the vertex operators $O$. Storing the vertex operators is
done using a \lstinline{uint8_t} array. Every local Clifford operator is
associated with an integer ranging from $0$ to $23$; their order is
\begin{equation}
    \begin{aligned}
        &\left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\end{matrix}\right),
        \left(\begin{matrix}1 & 0\\0 & i\end{matrix}\right),
        \left(\begin{matrix}1 & 0\\0 & 1\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\\frac{\sqrt{2} i}{2} & - \frac{\sqrt{2} i}{2}\end{matrix}\right), \\
        &\left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2}}{2} & - \frac{\sqrt{2} i}{2}\end{matrix}\right),
        \left(\begin{matrix}1 & 0\\0 & -1\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2} i}{2} & \frac{\sqrt{2}}{2}\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\\\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{matrix}\right), \\
        &\left(\begin{matrix}1 & 0\\0 & - i\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\\\frac{\sqrt{2} i}{2} & \frac{\sqrt{2} i}{2}\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2} i}{2} & - \frac{\sqrt{2}}{2}\end{matrix}\right), \\
        &\left(\begin{matrix}\frac{1}{2} + \frac{i}{2} & \frac{1}{2} - \frac{i}{2}\\\frac{1}{2} - \frac{i}{2} & \frac{1}{2} + \frac{i}{2}\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\- \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{matrix}\right),
        \left(\begin{matrix}0 & 1\\1 & 0\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\- \frac{\sqrt{2} i}{2} & \frac{\sqrt{2} i}{2}\end{matrix}\right), \\
        &\left(\begin{matrix}0 & 1\\i & 0\end{matrix}\right),
        \left(\begin{matrix}\frac{1}{2} - \frac{i}{2} & \frac{1}{2} + \frac{i}{2}\\- \frac{1}{2} + \frac{i}{2} & \frac{1}{2} + \frac{i}{2}\end{matrix}\right),
        \left(\begin{matrix}0 & i\\1 & 0\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\\- \frac{\sqrt{2} i}{2} & - \frac{\sqrt{2}}{2}\end{matrix}\right), \\
        &\left(\begin{matrix}\frac{1}{2} - \frac{i}{2} & - \frac{1}{2} + \frac{i}{2}\\- \frac{1}{2} + \frac{i}{2} & - \frac{1}{2} + \frac{i}{2}\end{matrix}\right),
        \left(\begin{matrix}0 & -1\\1 & 0\end{matrix}\right),
        \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\\- \frac{\sqrt{2} i}{2} & - \frac{\sqrt{2} i}{2}\end{matrix}\right),
        \left(\begin{matrix}\frac{1}{2} - \frac{i}{2} & - \frac{1}{2} - \frac{i}{2}\\- \frac{1}{2} + \frac{i}{2} & - \frac{1}{2} - \frac{i}{2}\end{matrix}\right).
    \end{aligned}
\end{equation}

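That the local Clifford operators can be indexed by a small integer follows
from the size of the group. The sketch below generates the group from $H$ and
$S$ and identifies operators that differ only by a global phase. It is an
independent check written for this text and does not reproduce the ordering
used in the simulator; the helper \lstinline{canonical_key} is hypothetical.

\begin{lstlisting}[language=Python, breaklines=true]
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)


def canonical_key(matrix):
    """Strip the global phase and round, so equal operators share a key."""
    flat = matrix.flatten()
    first = next(x for x in flat if abs(x) > 1e-9)
    normalized = flat * (abs(first) / first)
    return tuple(
        (float(round(x.real, 6)), float(round(x.imag, 6))) for x in normalized
    )


identity = np.eye(2, dtype=complex)
group = {canonical_key(identity): identity}
todo = [identity]
while todo:
    element = todo.pop()
    for generator in (H, S):
        candidate = generator @ element
        key = canonical_key(candidate)
        if key not in group:
            group[key] = candidate
            todo.append(candidate)

print(len(group))  # 24
\end{lstlisting}
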

The edges are stored in an adjacency matrix

\begin{equation}
    A = (a_{i,j})_{i,j = 0, ..., n-1}
\end{equation}

with

\begin{equation}
    \begin{aligned}
        a_{i,j} = \left\{ \begin{array}{ll} 1 & \mbox{, if } \{i,j\} \in E\\
                    0 & \mbox{, if } \{i,j\} \notin E \end{array}\right.
        .
    \end{aligned}
\end{equation}

Recalling some operations on the graph as described in
\ref{ref:dynamics_graph}, \ref{ref:meas_graph} or Lemma \ref{lemma:M_a} one
sees that it is important to efficiently access and modify the neighbourhood of
a vertex. To ensure good performance when accessing the neighbourhood while
keeping the required memory low, a linked list-array hybrid is used to store
the adjacency matrix. For every vertex the neighbourhood is stored in a sorted
linked list (which is a sparse representation of a column vector) and these
adjacency lists are stored in a length $n$ array.

Using this storage method all operations including searching and toggling edges
inherit their time complexity from the sorted linked list, i.e. they scale
linearly with the size of the neighbourhood.

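The storage scheme described above can be illustrated with a short Python
sketch. It assumes Python lists instead of the linked lists of the
\lstinline{C} implementation, so a binary search via \lstinline{bisect} is
available here, which a linked list does not offer. The class
\lstinline{ToyGraph} is hypothetical and only demonstrates the idea of one
sorted neighbourhood per vertex.

\begin{lstlisting}[language=Python, breaklines=true]
from bisect import bisect_left


class ToyGraph:
    """Hypothetical sketch: one sorted neighbour list per vertex."""

    def __init__(self, n):
        self.neighbours = [[] for _ in range(n)]

    def has_edge(self, i, j):
        row = self.neighbours[i]
        pos = bisect_left(row, j)
        return pos < len(row) and row[pos] == j

    def _toggle_directed(self, i, j):
        row = self.neighbours[i]
        pos = bisect_left(row, j)
        if pos < len(row) and row[pos] == j:
            del row[pos]  # edge present: remove it
        else:
            row.insert(pos, j)  # edge absent: insert, keeping the order

    def toggle_edge(self, i, j):
        if i == j:
            raise ValueError("no self-loops allowed")
        # The adjacency matrix is symmetric, so both lists are updated.
        self._toggle_directed(i, j)
        self._toggle_directed(j, i)


graph = ToyGraph(4)
graph.toggle_edge(0, 2)
graph.toggle_edge(0, 1)
graph.toggle_edge(0, 2)  # toggling twice removes the edge again
print(graph.neighbours)  # [[1], [0], [], []]
\end{lstlisting}
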
\subsubsection{Operations on Graphical States}

Operations on graphical states are divided into three classes: local Clifford
operations, the $CZ$ operation and measurements. The graphical states are
implemented in \lstinline{C} and are exported to Python in the class
\lstinline{RawGraphState}. This class has three main methods to implement the
three classes of operations.

\begin{description}
    \item[\hspace{-1em}]{\lstinline{RawGraphState.apply_C_L}\\
        This method implements local Clifford gates. It takes the qbit index
        and the index of the local Clifford operator (ranging from $0$ to $23$).}
    \item[\hspace{-1em}]{\lstinline{RawGraphState.apply_CZ}\\
        Applies the $CZ$ gate to the state. The first argument is the
        act-qbit, the second the control qbit (note that this is just for
        consistency with the $CX$ gate).}
    \item[\hspace{-1em}]{\lstinline{RawGraphState.measure}\\
        Using this method one can measure a qbit. It takes the qbit index
        as first argument and a floating point (double precision) random
        number as second argument. This random number is used to decide the
        measurement outcome in non-deterministic measurements. This method
        returns either $1$ or $0$ as a measurement result.}
\end{description}

Because this way of modifying the state is rather inconvenient and might lead to many
errors the \lstinline{RawGraphState} is wrapped by the pure Python class\\
\lstinline{pyqcs.graph.state.GraphState}. It allows the use of circuits as
described in \ref{ref:pyqcs_circuits} and provides the method
\lstinline{GraphState.to_naive_state} to convert the graphical state to a dense
vector state.

\subsubsection{Pure C Implementation}

Because Python tends to be rather slow \cite{benchmarkgame} and might not run on every architecture
a pure \lstinline{C} implementation of the graphical simulator is also provided.
It should be seen as a reference implementation that can be extended to the needs
of the user.

This implementation reads bytecode from a file and executes it. The execution is
always done in three steps:

\begin{enumerate}[1]
    \item{Initializing the state according to the header of the bytecode file.}
    \item{Applying operations given by the bytecode to the state. This includes local
        Clifford gates, $CZ$ gates and measurements (the measurement outcome is ignored).}
    \item{Sampling the state according to the description given in the header of the bytecode
        file and writing the sampling results to either a file or \lstinline{stdout}.}
\end{enumerate}

\subsection{Utilities}

To make using the simulators more convenient and to help with using them
in a scientific or educational context several utilities have been written.
This chapter explains some of them.

\subsubsection{Sampling and Circuit Generation}

The function \lstinline{pyqcs.sample} provides a simple way to sample from
a state. Copies of the state are made when necessary and the results are
returned in a \lstinline{collections.Counter} object. Several qbits can be
sampled at once; they can be passed to the function either as an integer,
which is interpreted as a bit mask (the least significant bit is sampled
first), or as a list of integers, which are interpreted as qbit indices and
are measured in the order they appear.

If the keyword argument \lstinline{keep_states} is \lstinline{True} the
sampling function will include the resulting states in the result. At the
moment this works for dense vectors only. Checking graphical states for
equality has yet to be implemented; this problem has $NP$ computational
hardness \cite{dahlberg_ea2019}.

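Sampling selected qbits of a dense state vector into a
\lstinline{collections.Counter} can be sketched as follows. The sketch draws
basis indices directly from the probabilities $|c_i|^2$ instead of measuring
qbit by qbit, which yields the same distribution for the sampled bits. The
function \lstinline{sample_bits} and its signature are assumptions made for
this example and differ from \lstinline{pyqcs.sample}.

\begin{lstlisting}[language=Python, breaklines=true]
from collections import Counter

import numpy as np


def sample_bits(amplitudes, bit_mask, nsamples, rng):
    """Sample the qbits selected by bit_mask from a dense state vector."""
    probabilities = np.abs(amplitudes) ** 2
    probabilities = probabilities / probabilities.sum()
    indices = rng.choice(len(amplitudes), size=nsamples, p=probabilities)
    # Keep only the bits selected by the mask and count the outcomes.
    return Counter(int(i) & bit_mask for i in indices)


# Bell state (|0b00> + |0b11>) / sqrt(2)
bell = np.array([1, 0, 0, 1], dtype=np.cdouble) / np.sqrt(2)
counts = sample_bits(bell, 0b11, 1000, np.random.default_rng(0))
print(sorted(counts))  # [0, 3]
\end{lstlisting}
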

Writing circuits out by hand can be rather painful. The function\\
\lstinline{pyqcs.list_to_circuit} converts a list of circuits to a circuit.
This is particularly helpful in combination with Python's list comprehensions
(\lstinline{listcomp}):

\begin{lstlisting}[caption={Generating a Large Circuit Efficiently}]
circuit_H = list_to_circuit([H(i) for i in range(nqbits)])
\end{lstlisting}

The module \lstinline{pyqcs.util.random_circuits} provides the method described
in \ref{ref:performance} to generate random circuits for both graphical and
dense vector simulation. Using the module \lstinline{pyqcs.util.random_graphs}
one can generate random graphical states, which is faster than using
random circuits.

The function \lstinline{pyqcs.util.to_circuit.graph_state_to_circuit} converts
graphical states to circuits (mapping the state $\ket{0b0..0}$ to this state).
Using these circuits the graphical state can be copied or converted to a
dense vector state. Further, it is a way to optimize circuits and later run them on
other simulators. The circuits can also be exported to \lstinline{qcircuit} code
(see below), which is a relatively readable way to represent graphical states.

\subsubsection{Exporting and Flattening Circuits}

Circuits can be drawn using the \LaTeX{} package \lstinline{qcircuit}; all
circuits in this document use \lstinline{qcircuit}. To visualize the circuits
built using \lstinline{pyqcs} the function\\
\lstinline{pyqcs.util.to_diagram.circuit_to_diagram} can be used to generate
\lstinline{qcircuit} code that can be used in \LaTeX{} documents or exported to
PDFs directly. The diagrams produced by this function are not optimized and
can be unnecessarily long. Usually this can be fixed easily by editing
the produced code manually.

The circuits constructed using the \lstinline{|} operator have a tree structure
which is rather inconvenient when optimizing circuits or exporting them.
The function \\
\lstinline{pyqcs.util.flatten.flatten} converts a circuit
to a list of single gate circuits that can be analyzed or exported easily.

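The idea of flattening such a tree can be illustrated with a recursive
traversal. The sketch assumes a hypothetical representation in which a
composed circuit is a \lstinline{(left, right)} tuple and a single gate is
a string; the actual data structure in \lstinline{pyqcs} may differ, and
\lstinline{flatten_tree} is not the library function.

\begin{lstlisting}[language=Python, breaklines=true]
def flatten_tree(circuit):
    """Flatten a nested (left, right) tree into a list, left to right."""
    if isinstance(circuit, tuple):
        left, right = circuit
        return flatten_tree(left) + flatten_tree(right)
    return [circuit]  # a leaf, i.e. a single gate


# Tree produced by chaining: ((H(0) | H(1)) | CZ(1, 0)) | H(1)
tree = ((("H(0)", "H(1)"), "CZ(1, 0)"), "H(1)")
print(flatten_tree(tree))  # ['H(0)', 'H(1)', 'CZ(1, 0)', 'H(1)']
\end{lstlisting}
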
\subsection{Performance}
\label{ref:performance}

To test the performance of the graphical simulator and compare it to the dense
vector simulator the Python module is used. Although the pure \lstinline{C}
implementation has potential for better performance, the Python module is more
comparable to the dense vector simulator, which is a Python module as well.

For performance tests (and for tests against the dense vector simulator) random
circuits are used. Length $m$ circuits are generated from the probability space

\begin{equation}
    \Omega = \left(\{1, ..., 4n\} \times \{1, ..., n-1\} \times [0, 2\pi)\right)^{m}
\end{equation}

with the uniform distribution. The continuous part $[0, 2\pi)$ is unused when
generating random circuits for the graphical simulator; when generating random
circuits for dense vector simulations this is the argument $\phi$ of the
$R_\phi$ gate.

For $m=1$ an outcome is mapped to a gate using

\begin{equation}
    \begin{aligned}
        F(i, k, x) = \left\{\begin{array}{ll} X(i - 1) & \mbox{, if } i \le n \\
                    H(i - n - 1) & \mbox{, if } n < i \le 2n\\
                    S(i - 2n - 1) & \mbox{, if } 2n < i \le 3n\\
                    CZ(i - 3n - 1, k - 1) & \mbox{, if } i > 3n \mbox{ and } k \le i - 3n - 1 \\
                    CZ(i - 3n - 1, k) & \mbox{, if } i > 3n \mbox{ and } k > i - 3n - 1
        \end{array}\right.
        .
    \end{aligned}
\end{equation}

This method provides equal probability for the $X, H, S$ and $CZ$ gates. For the
dense vector simulator $S$ can be replaced by $R_\phi$ with the parameter $x$.

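The mapping $F$ translates directly into code. The following sketch is not
the code of \lstinline{pyqcs.util.random_circuits}; it returns plain tuples
instead of circuits, and the names \lstinline{outcome_to_gate} and
\lstinline{random_circuit} are hypothetical. The parameter \lstinline{x} is
drawn but unused, as in the graphical case.

\begin{lstlisting}[language=Python, breaklines=true]
import math
import random


def outcome_to_gate(i, k, x, n):
    """Map one outcome (i, k, x) to a gate description, following F."""
    if i <= n:
        return ("X", i - 1)
    if i <= 2 * n:
        return ("H", i - n - 1)
    if i <= 3 * n:
        return ("S", i - 2 * n - 1)
    act = i - 3 * n - 1
    # Skipping the act-qbit ensures that both qbits are different.
    control = k - 1 if k <= act else k
    return ("CZ", act, control)


def random_circuit(n, m, rng):
    """Draw m outcomes uniformly and map each of them to a gate."""
    return [
        outcome_to_gate(
            rng.randint(1, 4 * n),
            rng.randint(1, n - 1),
            rng.uniform(0, 2 * math.pi),
            n,
        )
        for _ in range(m)
    ]


circuit = random_circuit(4, 10, random.Random(0))
print(len(circuit))  # 10
print(outcome_to_gate(16, 3, 0.0, 4))  # ('CZ', 3, 2)
\end{lstlisting}
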

Using this method circuits are generated and applied both to graphical and
dense vector states and the time required to execute the operations
\cite{timeit} is measured. The resulting plots can be seen in
Figure \ref{fig:scaling_qbits_linear} and Figure \ref{fig:scaling_qbits_log}.
Note that in both cases the length of the circuits has been scaled linearly
with the number of qbits and the measured time was divided by the number of
qbits:

\begin{equation}
    \begin{aligned}
        L_{\mbox{circuit}} &= \alpha n \\
        T_{\mbox{rescaled}} &= \frac{T_{\mbox{execution}}(L_{\mbox{circuit}})}{n}
    \end{aligned}
\end{equation}

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{../performance/scaling_qbits_linear.png}
    \caption[Runtime Behaviour for Scaling Qbit Number]{Runtime Behaviour for Scaling Qbit Number}
    \label{fig:scaling_qbits_linear}
\end{figure}

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{../performance/scaling_qbits_log.png}
    \caption[Runtime Behaviour for Scaling Qbit Number (Logarithmic Scale)]{Runtime Behaviour for Scaling Qbit Number (Logarithmic Scale)}
    \label{fig:scaling_qbits_log}
\end{figure}

The reason for this scaling will become clear later; one can observe that the
performance of the graphical simulator increases in some cases with a growing
number of qbits when the circuit length is constant. The code used to generate the
data for these plots can be found in \ref{ref:code_benchmarks}.

As described by \cite{andersbriegel2005} the graphical simulator is exponentially
faster than the dense vector simulator. According to \cite{andersbriegel2005} it
is also considerably faster than a simulator using the straightforward approach of simulating
the stabilizer tableaux, like CHP \cite{CHP}, with an average runtime behaviour
of $\mathcal{O}\left(n\log(n)\right)$ instead of $\mathcal{O}\left(n^2\right)$.

One should be aware that for the graphical simulator the gate execution time (the time required to apply a gate
to the state) highly depends on the state it is applied to. For the dense vector
simulator and CHP this is not true: gate execution time is constant for all gates
and states. Because the graphical simulator has to toggle neighbourhoods the
gate execution time of the $CZ$ gate varies greatly. Figure \ref{fig:scaling_circuits_linear}
shows the circuit execution time for two different numbers of qbits. One can observe three
regimes:

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{../performance/regimes/scaling_circuits_linear.png}
    \caption[Circuit Execution Time for Scaling Circuit Length]{Circuit Execution Time for Scaling Circuit Length}
    \label{fig:scaling_circuits_linear}
\end{figure}

\begin{description}
    \item[Low-Linear Regime] {Here the circuit execution time scales approximately linearly
        with the number of gates in the circuit (i.e. the $CZ$ gate execution time is approximately constant).
        }
    \item[Intermediate Regime]{The circuit execution time has a nonlinear dependence on the circuit length.}
    \item[High-Linear Regime]{This regime shows a linear dependence on the circuit length; the slope is
        higher than in the low-linear regime.}
\end{description}

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{../performance/regimes/scaling_circuits_measurements_linear.png}
    \caption[Circuit Execution Time for Scaling Circuit Length with Random Measurements]{Circuit Execution Time for Scaling Circuit Length with Random Measurements}
    \label{fig:scaling_circuits_measurements_linear}
\end{figure}

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{graphics/graph_low_linear_regime.png}
    \caption[Typical Graphical State in the Low-Linear Regime]{Typical Graphical State in the Low-Linear Regime}
    \label{fig:graph_low_linear_regime}
\end{figure}

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{graphics/graph_intermediate_regime_cut.png}
    \caption[Window of a Typical Graphical State in the Intermediate Regime]{Window of a Typical Graphical State in the Intermediate Regime}
    \label{fig:graph_intermediate_regime}
\end{figure}

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{graphics/graph_high_linear_regime_cut.png}
    \caption[Window of a Typical Graphical State in the High-Linear Regime]{Window of a Typical Graphical State in the High-Linear Regime}
    \label{fig:graph_high_linear_regime}
\end{figure}

These regimes can be explained when considering the graphical states that
typically live in these regimes. With increased circuit length the number of
edges increases, which makes toggling neighbourhoods harder. Graphs from the
low-linear, intermediate and high-linear regime can be seen in Figure
\ref{fig:graph_low_linear_regime}, Figure \ref{fig:graph_intermediate_regime}
and Figure \ref{fig:graph_high_linear_regime}. Due to the great number of edges
in the intermediate and high-linear regime the pictures show a window of the
actual graph. The full images are in \ref{ref:complete_graphs}. Further, the
regimes are not clearly visible for $n<30$ qbits so choosing smaller graphs is
not possible. The code that was used to generate these images can be found
in \ref{ref:code_example_graphs}.

Figure \ref{fig:scaling_circuits_measurements_linear} supports this
interpretation. In this simulation the Pauli $X$ gate has been replaced
by the measurement gate $M$, i.e. in every gate drawn from the probability
space a qbit is measured with probability $\frac{1}{4}$. As described in
\cite{hein_eisert_briegel2008} the Schmidt measure for entropy is bounded from
above by the Pauli persistency, i.e. the minimal number of Pauli measurements
required to disentangle a state. This Pauli persistency is closely related to
the number (and structure) of vertices in the graph
\cite{hein_eisert_briegel2008}. In particular Pauli measurements decrease the
entanglement (and the number of edges) in a state
\cite{hein_eisert_briegel2008}\cite{li_chen_fisher2019}. The frequent
measurements in the simulation therefore keep the number of edges low, thus
preventing a transition from the low-linear regime to the intermediate regime.

Because states with more qbits reach the intermediate regime at higher circuit
lengths it is important to account for this virtual performance boost when
comparing with other simulation methods. This explains why the circuit length
in Figure \ref{fig:scaling_qbits_linear} had to be scaled with the qbit number.

\subsection{Future Work}

Although the simulator(s) are in a working state and have been tested there is
still some work that can be done. A noise model helping to teach and analyze
noisy execution is one particularly interesting piece of work. To allow a user
to execute circuits on other machines, including both real hardware and
simulators, a module that exports circuits to OpenQASM \cite{openqasm} seems
useful.

The current implementation of some graphical operations can be optimized. While
clearing VOPs as described in \ref{ref:dynamics_graph} the neighbourhood of
a vertex is toggled for every $L_a$ transformation. This is the most
straightforward implementation but often the $L_a$ transformation is performed several
times on the same vertex. The neighbourhood would have to be toggled either
once or not at all depending on whether the number of $L_a$ transformations is
odd or even.

When toggling an edge the simulator uses a series of well tested basic linked
list operations: searching an element in the list, inserting an element into
the list and deleting an element from the list. This is known to have no bugs
but the performance could be increased by operating directly on the linked
list. Some initial work to improve this behaviour has already been done but does not
work at the moment.