% vim: ft=tex \section{Implementation} This chapter discusses how the concepts introduced before are implemented into a simulator. Futher the infrastructure around the simulation and some tools are explained. The implementation is written as a \lstinline{python3} module. This allows users to quickly construct circuit, apply them to a state and measure amplitudes. Full access to the state (including intermediate) state has been priorized over execution speed. To keep the simulation speed as high as possible under these constraints some parts are implemented in \lstinline{C} \subsection{Dense State Vector Simulation} \subsubsection{Representation of Dense State Vectors} Recalling \eqref{eq:ci} any $n$-qbit state can be represented as a $2^n$ component vector in the integer state basis. This representation has some useful features when it comes to computations: \begin{itemize} \item{The projection on the integer states is trivial.} \item{For any qbit $j$ and $0 \le i \le 2^n-1$ the coefficient $c_i$ is part of the $\ket{1}_j$ amplitude iff $i \& (1 << j)$ and part of the $\ket{0}_j$ amplitude otherwise.} \item{For a qbit $j$ the coefficients $c_i$ and $c_{i \hat{} (1 << j)}$ are the conjugated coefficients.} \end{itemize} Where $\hat{}$ is the binary XOR, $\&$ the binary AND and $<<$ the binary leftshift operator. While implementing the dense state vectors two key points were allowing a simple and readable way to use them and simple access to the states by users that want more information than an abstracted view could allow. To meet both requirements the states are implemented as Python objects providing abstract features such as normalization checking, checking for sufficient qbit number when applying a circuit, computing overlaps with other states, a stringify method and stored measurement results. To store the measurement results a NumPy \lstinline{int8} array \cite{numpy_array} is used; this is called the classical state. The Python states also have a NumPy \lstinline{cdouble} array that stores the quantum mechanical state. Using NumPy arrays has the advantage that access to the data is simple and safe while operations on the states can be implemented in \lstinline{C} \cite{numpy_ufunc} providing a considerable speedup. This quantum mechanical state is the component vector in integer basis therefore it has $2^n$ components. Storing those components is acceptable in a range from $1$ to $30$ qbits; above this range the state requires space in the order of $1 \mbox{ GiB}$ which is in the range of usual RAM sizes for personal computers. For higher qbit numbers moving to high performance computers and other simulators is necessary. \subsubsection{Gates} Gates on dense state vectors are implemented as NumPy Universal Functions (ufuncs) \cite{numpy_ufunc} mapping a classical and a quantum state to a new classical state, a new quantum state and a $64 \mbox{ bit}$ integer indicating what qbits have been measured. Using ufuncs has the great advantage that managing memory is done by NumPy and an application programmer just has to implement the logic of the function. Because ufuncs are written in \lstinline{C} they provide a considerable speedup compared to an implementation in Python. The logic of gates is usually easy to implement using the integer basis. The example below implements the Hadamard gate \ref{ref:singleqbitgates}: \adjustbox{max width=\textwidth}{\lstinputlisting[language=C, firstline=153, lastline=178]{../pyqcs/src/pyqcs/gates/implementations/basic_gates.c}} A basic set of gates is implemented in PyQCS: \begin{itemize} \item{Hadamard $H$ gate.} \item{Pauli $X$ or \textit{NOT} gate.} \item{Pauli $Z$ gate.} \item{The $S$ phase gate.} \item{$Z$ rotation $R_\phi$ gate.} \item{Controlled $X$ gate: $CX$.} \item{Controlled $Z$ gate: $CZ$.} \item{The measurement "gate" $M$.} \end{itemize} To allow the implementation of possible hardware related gates the class \lstinline{GenericGate} takes a unitary $2\times2$ matrix as a NumPy \lstinline{cdouble} array and builds a gate from it. \subsubsection{Circuits} \label{ref:pyqcs_circuits} As mentioned in \ref{ref:quantum_circuits} quantum circuits are central in quantum programming. In the implementation great care was taken to make writing circuits as convenient and readable as possible. Users will almost never access the actual gates that perform the operation on a state; instead they will handle circuits.\\ Circuits can be applied to a state by multiplying them from the left on a state object: \begin{lstlisting}[language=Python] new_state = circuit * state \end{lstlisting} The elementary gates such as $H, R_\phi, CX$ are implemented as single gate circuits and can be constructing using the built-in generators. The generators take the act-qbit as first argument, parameters such as the control qbit or an angle as second argument: %\adjustbox{max width=\textwidth}{ \begin{lstlisting}[language=Python] In [1]: from pyqcs import CX, CZ, H, R, Z, X ...: from pyqcs import State ...: ...: state = State.new_zero_state(2) ...: intermediate_state = H(0) * state ...: ...: bell_state = CX(1, 0) * intermediate_state In [2]: bell_state Out[2]: (0.7071067811865476+0j)*|0b0> + (0.7071067811865476+0j)*|0b11> \end{lstlisting} %} Large circuits can be constructed using the binary OR operator \lstinline{|} in an analogy to the pipeline operator on many *NIX systems. As usual circuits are read from left to right similar to pipelines on *NIX systems: %\adjustbox{max width=\textwidth}{ \begin{lstlisting}[language=Python] In [1]: from pyqcs import CX, CZ, H, R, Z, X ...: from pyqcs import State ...: ...: state = State.new_zero_state(2) ...: ...: # This is the same as ...: # circuit = H(0) | CX(1, 0) ...: circuit = H(0) | H(1) | CZ(1, 0) | H(1) ...: ...: bell_state = circuit * state In [2]: bell_state Out[2]: (0.7071067811865477+0j)*|0b0> + (0.7071067811865477+0j)*|0b11> \end{lstlisting} %} A quick way to generate circuits programatically is to use the \lstinline{list_to_circuit} function: %\adjustbox{max width=\textwidth}{ \begin{lstlisting}[language=Python] In [1]: from pyqcs import CX, CZ, H, R, Z, X ...: from pyqcs import State, list_to_circuit ...: ...: circuit_CX = list_to_circuit([CX(i, i-1) for i in range(1, 5)]) ...: ...: state = (H(0) | circuit_CX) * State.new_zero_state(5) In [2]: state Out[2]: (0.7071067811865476+0j)*|0b0> + (0.7071067811865476+0j)*|0b11111> \end{lstlisting} %} \subsection{Graphical State Simulation} \subsubsection{Graphical States} For the graphical state $(V, E, O)$ the list of vertices $V$ can be stored implicitly by demanding $V = \{0, ..., n - 1\}$. This leaves two components that have to be stored: The edges $E$ and the vertex operators $O$. Storing the vertex operators is done using a \lstinline{uint8_t} array. Every local Clifford operator is associated from $0$ to $24$, their order is \begin{equation} \begin{aligned} &\left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\end{matrix}\right), \left(\begin{matrix}1 & 0\\0 & i\end{matrix}\right), \left(\begin{matrix}1 & 0\\0 & 1\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\\frac{\sqrt{2} i}{2} & - \frac{\sqrt{2} i}{2}\end{matrix}\right), \\ &\left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2}}{2} & - \frac{\sqrt{2} i}{2}\end{matrix}\right), \left(\begin{matrix}1 & 0\\0 & -1\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2} i}{2} & \frac{\sqrt{2}}{2}\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\\\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{matrix}\right), \\ &\left(\begin{matrix}1 & 0\\0 & - i\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\\\frac{\sqrt{2} i}{2} & \frac{\sqrt{2} i}{2}\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2} i}{2}\\\frac{\sqrt{2} i}{2} & - \frac{\sqrt{2}}{2}\end{matrix}\right), \\ &\left(\begin{matrix}\frac{1}{2} + \frac{i}{2} & \frac{1}{2} - \frac{i}{2}\\\frac{1}{2} - \frac{i}{2} & \frac{1}{2} + \frac{i}{2}\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\- \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{matrix}\right), \left(\begin{matrix}0 & 1\\1 & 0\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\- \frac{\sqrt{2} i}{2} & \frac{\sqrt{2} i}{2}\end{matrix}\right), \\ &\left(\begin{matrix}0 & 1\\i & 0\end{matrix}\right), \left(\begin{matrix}\frac{1}{2} - \frac{i}{2} & \frac{1}{2} + \frac{i}{2}\\- \frac{1}{2} + \frac{i}{2} & \frac{1}{2} + \frac{i}{2}\end{matrix}\right), \left(\begin{matrix}0 & i\\1 & 0\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2} i}{2}\\- \frac{\sqrt{2} i}{2} & - \frac{\sqrt{2}}{2}\end{matrix}\right), \\ &\left(\begin{matrix}\frac{1}{2} - \frac{i}{2} & - \frac{1}{2} + \frac{i}{2}\\- \frac{1}{2} + \frac{i}{2} & - \frac{1}{2} + \frac{i}{2}\end{matrix}\right), \left(\begin{matrix}0 & -1\\1 & 0\end{matrix}\right), \left(\begin{matrix}\frac{\sqrt{2}}{2} & - \frac{\sqrt{2}}{2}\\- \frac{\sqrt{2} i}{2} & - \frac{\sqrt{2} i}{2}\end{matrix}\right), \left(\begin{matrix}\frac{1}{2} - \frac{i}{2} & \frac{i \left(-1 + i\right)}{2}\\- \frac{1}{2} + \frac{i}{2} & \frac{i \left(-1 + i\right)}{2}\end{matrix}\right) \end{aligned} \end{equation} The edges are stored in an adjacency matrix \begin{equation} A = (a_{i,j})_{i,j = 0, ..., n-1} \end{equation} \begin{equation} \begin{aligned} a_{i,j} = \left\{ \begin{array}{c} 1 \mbox{, if } \{i,j\} \in E\\ 0 \mbox{, if} \{i,j\} \notin E \end{array}\right. . \end{aligned} \end{equation} Recalling some operations on the graph as described in \ref{ref:dynamics_graph}, \ref{ref:meas_graph} or Lemma \ref{lemma:M_a} one sees that it is important to efficiently access and modify the neighbourhood of a vertex. To ensure good performance when accessing the neighbourhood while keeping the memory requirements low a linked list-array hybrid is used to store the adjacency matrix. For every vertex the neighbourhood is stored in a sorted linked list (which is a sparse representation of a column vector) and these adjacency lists are stored in a length $n$ array. Using this storage method all operations including searching and toggling edges are inherite their time complexity from the sorted linked list. \subsubsection{Operations on Graphical States} Operations on Graphical States are divided into three classes: Local Clifford operations, the CZ operation and measurements. The graphical states are implemented in \lstinline{C} and are exported to python3 in the class \lstinline{RawGraphState}. This class has three main methods to implement the three classes of operations. %\begin{description} % \item[\lstinline{RawGraphState.apply_C_L}]{This method % implements local clifford gates. It takes the qbit index and the index % of the local Clifford operator (ranging form $0$ to $23$).} % \item[\lstinline{RawGraphState.apply_CZ}]{Applies the $CZ$ gate to the % state. The first argument is the act-qbit, the second the control % qbit (note that this is just for consistency to the $CX$ gate).} % \item[\lstinline{RawGraphState.measure}]{Using this method one can % measure a qbit. It takes the qbit index as first argument and % a floating point (double precision) random number as second % argument. This random number is used to decide the measurement outcome % in non-deterministic measurements. This method returns either $1$ or $0$ as % a measurement result.} %\end{description} Because this way of modifying the state is rather unconvenient and might lead to many errors the \lstinline{RawGraphState} is wrapped by the pure python class \lstinline{pyqcs.graph.state.GraphState}. It allows the use of circuits as described in \ref{ref:pyqcs_circuits} and provides the method \lstinline{GraphState.to_naive_state} to convert the graphical state to a dense vector state. \subsubsection{Pure \lstinline{C} Implementation} Because python tends to be rather slow and might not run on any architecture a pure \lstinline{C} implementation of the graphical simulator is also provided. It should be seen as a reference implementation that can be extended to the needs of the user. This implementation reads byte code from a file and executes it. The execution is always done in three steps: \begin{enumerate}[1] \item{Initializing the state according the the header of the bytecode file.} \item{Applying operations given by the bytecode to the state. This includes local Clifford gates, $CZ$ gates and measurements (the measurement outcome is ignored).} \item{Sampling the state according the the description given in the header of the byte code file and writing the sampling results to either a file or \lstinline{stdout}. } \end{enumerate} \subsection{Utilities} TODO \subsection{Performance} TODO