% vim: ft=tex
\section{Appendix}

\subsection{Source Code for the Benchmarks}
\label{ref:code_benchmarks}

The benchmarks used in \ref{ref:performance} were run with the following code.
Note that the measured quantity is the execution time, which is inherently
noisy. To account for the noise, several strategies are used:

\begin{enumerate}[1]
    \item{The same circuit is applied to the starting state several times. The
        minimal result is used, because the noise can only increase the
        measured time.}
    \item{Several circuits are applied to the starting state. The remaining
        noise is mixed with the variance due to the different circuits.}
    \item{Because the noise can be correlated in time (i.e. another process
        requires processor time for a longer period), the order of the tests
        has been randomized such that the time-correlated noise is distributed
        randomly over several uncorrelated measurements.}
\end{enumerate}

The code used to benchmark the three regimes is analogous and not included here.

\lstinputlisting[title={Generating Data for the Dense State Vector vs. Graphical Simulator Benchmark}, language=Python, breaklines=true]{../performance/generate_data_scaling_qbits.py}
\lstinputlisting[title={Code for Measuring and Computing the Execution Time and Statistics}, language=Python, breaklines=true]{../performance/measure_circuit.py}

\subsection{Complete Graphical States from the Three Regimes}
\label{ref:complete_graphs}

Because the whole graphs are barely perceptible, only windows of them are shown
in Figure~\ref{fig:graph_high_linear_regime} and
Figure~\ref{fig:graph_intermediate_regime}. For the sake of completeness the
whole graphs are included here in Figure~\ref{fig:graph_low_linear_regime_full},
Figure~\ref{fig:graph_intermediate_regime_full} and
Figure~\ref{fig:graph_high_linear_regime_full}.

\begin{figure}[H]
    \centering
    \includegraphics[width=\linewidth]{graphics/graph_low_linear_regime.png}
    \caption[Typical Graphical State in the Low-Linear Regime]{Typical Graphical State in the Low-Linear Regime}
    \label{fig:graph_low_linear_regime_full}
\end{figure}

\begin{figure}[H]
    \centering
    \includegraphics[width=\linewidth]{graphics/graph_intermediate_regime.png}
    \caption[Typical Graphical State in the Intermediate Regime]{Typical Graphical State in the Intermediate Regime}
    \label{fig:graph_intermediate_regime_full}
\end{figure}

\begin{figure}[H]
    \centering
    \includegraphics[width=\linewidth]{graphics/graph_high_linear_regime.png}
    \caption[Typical Graphical State in the High-Linear Regime]{Typical Graphical State in the High-Linear Regime}
    \label{fig:graph_high_linear_regime_full}
\end{figure}

\subsection{Code to Generate the Example Graphs}
\label{ref:code_example_graphs}

This code has been used to generate the example graphs used in
\ref{ref:performance}. Note that the graph is generated by applying a random
circuit, as in \ref{ref:code_benchmarks}. The generated \lstinline{dot} code is
converted to an image using
\lstinline{dot i_regime.dot -Tpng -o i_regime.png}.

\lstinputlisting[title={Code used to Generate the Example Graphs}, language=Python, breaklines=true]{../performance/regimes/graph_intermediate_regime.py}

\subsection{Code to Benchmark \lstinline{ufunc} Gates against Python}
\label{ref:benchmark_ufunc_py}

It has been mentioned several times that the implementation using
\lstinline{ufuncs} as gates is faster than a \lstinline{python}
implementation. To support this statement a simple benchmark can be used. The
relatively simple Pauli $X$ gate is used; more complicated gates like $CX$ or
$H$ have worse performance when implemented in \lstinline{python}. The
performance improvement when using the \lstinline{ufunc} is a factor of around
$6.4$ in this tested case.
One must however note that the tested \lstinline{python} code is not realistic
and that in a real application there would be a significant additional overhead.

\lstinputlisting[title={Code to Benchmark \lstinline{ufunc} Gates against Python}, language=Python, breaklines=true]{extra_benchmark/benchmark.py}

When using \lstinline{result_py[0::2] = qm_state[1::2]} the result is identical
and the performance gain is only a factor of around $1.7$. This method is
however not applicable to general act-qbits, so the bit mask has to be used
there.
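
The difference between the two indexing schemes can be illustrated with a
minimal sketch. It is written in plain \lstinline{python} on lists and is not
taken from \lstinline{benchmark.py}; the function names
\lstinline{pauli_x_bitmask} and \lstinline{pauli_x_slicing} and the example
state are assumptions made for illustration only. The sketch assumes that the
act-qbit $j$ corresponds to bit $j$ of the amplitude index, so that Pauli $X$
exchanges the amplitudes at indices $i$ and $i \oplus 2^j$. The slicing variant
only reproduces this for $j = 0$, where the partner indices are the even and
odd entries.

\begin{lstlisting}[language=Python, breaklines=true]
def pauli_x_bitmask(state, act):
    """Pauli X on qbit `act` (assumed to be bit `act` of the index)."""
    mask = 1 << act
    # The new amplitude at index i is the old amplitude at i with bit `act` flipped.
    return [state[i ^ mask] for i in range(len(state))]


def pauli_x_slicing(state):
    """Pauli X on qbit 0 only: exchange even and odd entries via slices."""
    result = [0] * len(state)
    result[0::2] = state[1::2]
    result[1::2] = state[0::2]
    return result


# Hypothetical 3-qbit example state (not normalized, for illustration only).
state = [complex(i, -i) for i in range(8)]

# For act-qbit 0 both variants agree.
assert pauli_x_bitmask(state, 0) == pauli_x_slicing(state)

# For any other act-qbit only the bit mask variant is available;
# applying X twice restores the original state.
assert pauli_x_bitmask(pauli_x_bitmask(state, 2), 2) == state
\end{lstlisting}

The same slice expressions also work on NumPy arrays. For a general act-qbit
the partner entries are no longer described by a single stride, which is why
the bit mask is needed there.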