\TUchapter{RELATED WORKS}
Many authors and researchers have developed or extended attack graphs since their origin as attack trees. This chapter reviews a selection of these efforts as they relate to this work and to graph generation.
\TUsection{Introduction to Graph Generation}
Graph generation as a broad topic faces challenges that prevent it from reaching the performance expected from a theoretical standpoint.
In practice, graph generation often achieves only a small fraction of its expected performance \cite{berry_graph_2007}. The reasons lie in the underlying mechanisms of
graph generation. The workload is predominantly memory bound rather than compute bound, so performance is tied to memory access time, the complexity of data dependencies,
and the coarseness of available parallelism \cite{berry_graph_2007, zhang_boosting_2017, ainsworth_graph_2016}. Graphs consume large amounts of memory for their
nodes and edges, graph data structures suffer from poor cache locality, and the memory latency imposed by the processor-memory gap further slows the generation process
\cite{berry_graph_2007, ainsworth_graph_2016}. Section \ref{sec:gen_improv} discusses works that can be used to improve graph generation in general, and Section
\ref{sec:related_works} discusses works specific to improving attack graph generation.
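To illustrate the memory-bound character of these workloads, the following sketch shows the data-dependent access pattern of a typical adjacency-list sweep. The graph layout, function name, and sizes are placeholders for illustration, not drawn from the cited works.

\begin{verbatim}
#include <cstdint>
#include <vector>

// Illustration only: sum a per-vertex value over each vertex's neighbors.
// The index v is read from the adjacency list, so the access data[v] is
// data dependent and effectively random; the loop is dominated by cache
// misses and memory latency rather than by arithmetic.
std::uint64_t neighbor_sum(const std::vector<std::vector<int>>& adj,
                           const std::vector<std::uint64_t>& data) {
    std::uint64_t sum = 0;
    for (std::size_t u = 0; u < adj.size(); ++u)
        for (int v : adj[u])
            sum += data[v];        // irregular, data-dependent access
    return sum;
}
\end{verbatim}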
\TUsection{Graph Generation Improvements} \label{sec:gen_improv}
Several architectural and hardware techniques target generation performance. The authors of \cite{ainsworth_graph_2016} examine the high cache miss rates of graph workloads
and note that general-purpose prefetching yields little benefit because graph structures are nonsequential and their access patterns are data dependent. However, because
traversal algorithms are known in advance, a hardware prefetcher explicitly tuned to follow the traversal order can perform far better; the authors achieve over a 2x
performance improvement for breadth-first search with this method. Another hardware approach is to use accelerators. The authors of \cite{yao_efficient_2018} present an
approach for minimizing the slowdown caused by the underlying atomic graph operations: by exploiting the patterns of those atomic operations, they construct pipeline stages
in which vertex updates are processed in parallel dynamically. Other works, such as those of \cite{zhang_boosting_2017} and \cite{dai_fpgp_2016}, leverage field-programmable
gate arrays (FPGAs) for graph generation in the HPC space, for example by reducing memory strain, by keeping repeatedly accessed lists and intermediate results in
on-chip block RAM, or by leveraging Hybrid Memory Cubes to optimize parallel access.
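Returning to the traversal-aware prefetching idea above: the cited work implements it in hardware, but a rough software analogue can convey the intuition. The sketch below issues explicit prefetches a fixed distance ahead in the breadth-first search frontier; the prefetch distance, data layout, and function name are assumptions for illustration, not parameters from \cite{ainsworth_graph_2016}.

\begin{verbatim}
#include <cstddef>
#include <vector>

// Software analogue (illustration only) of traversal-aware prefetching:
// while processing vertex u, prefetch the adjacency data of a vertex a
// few positions ahead in the BFS frontier so its edges are already in
// cache when the traversal reaches it.  __builtin_prefetch is a
// GCC/Clang builtin.  Assumes 0 <= source < adj.size().
void bfs_with_prefetch(const std::vector<std::vector<int>>& adj, int source) {
    std::vector<char> visited(adj.size(), 0);
    std::vector<int> frontier{source}, next;
    visited[source] = 1;

    const std::size_t kDistance = 4;   // assumed prefetch look-ahead
    while (!frontier.empty()) {
        for (std::size_t i = 0; i < frontier.size(); ++i) {
            if (i + kDistance < frontier.size())
                __builtin_prefetch(adj[frontier[i + kDistance]].data());
            for (int v : adj[frontier[i]]) {
                if (!visited[v]) { visited[v] = 1; next.push_back(v); }
            }
        }
        frontier.swap(next);
        next.clear();
    }
}
\end{verbatim}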
From a data structure standpoint, the authors of \cite{arifuzzaman_fast_2015} describe the infeasibility of adjacency matrices for large-scale graphs, and this work, together
with others such as \cite{yu_construction_2018} and \cite{liakos_memory-optimized_2016}, discusses the appeal of distributing a graph representation across systems. The
authors of \cite{liakos_memory-optimized_2016} use distributed adjacency lists to assign vertices to workers, and the authors of \cite{liakos_memory-optimized_2016} and
\cite{balaji_graph_2016} present further techniques for minimizing communication costs by achieving high compression ratios at a low compression cost. The Boost Graph Library
and the Parallel Boost Graph Library both provide appealing features for working with graphs, with the latter notably offering interoperability with MPI, Graphviz, and METIS
\cite{noauthor_overview_nodate, noauthor_boost_nodate}.
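As a brief illustration of the Boost Graph Library interface, the following sketch builds a small directed graph and records its breadth-first discovery order. The graph itself is a placeholder, not one drawn from the cited works.

\begin{verbatim}
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/breadth_first_search.hpp>
#include <iostream>
#include <vector>

using Graph  = boost::adjacency_list<boost::vecS, boost::vecS,
                                     boost::directedS>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

// Visitor that records the order in which BFS discovers vertices.
struct DiscoveryRecorder : boost::default_bfs_visitor {
    std::vector<Vertex>* out;
    void discover_vertex(Vertex v, const Graph&) { out->push_back(v); }
};

int main() {
    Graph g(5);                  // five vertices, indexed 0..4
    boost::add_edge(0, 1, g);
    boost::add_edge(1, 2, g);
    boost::add_edge(1, 3, g);
    boost::add_edge(3, 4, g);

    std::vector<Vertex> discovered;
    DiscoveryRecorder vis;
    vis.out = &discovered;
    boost::breadth_first_search(g, boost::vertex(0, g),
                                boost::visitor(vis));

    for (Vertex v : discovered) std::cout << v << ' ';   // 0 1 2 3 4
    std::cout << '\n';
}
\end{verbatim}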
\TUsection{Improvements Specific to Attack Graph Generation} \label{sec:related_works}
As a means of improving the scalability of attack graphs, the authors of \cite{ou_scalable_2006} present a new representation scheme. Traditional attack graphs encode the state
of the entire network at each node, whereas the representation presented by the authors, called a logical attack graph, uses logical statements so that each node represents
only a portion of the network. This approach reduces generation to quadratic time and bounds the number of nodes in the resulting graph by $\mathcal{O}(n^2)$, although it
requires additional analysis to identify attack vectors. Another approach, presented by the authors of \cite{cook_scalable_2016}, represents a description of the systems,
their qualities, and their topology as a state and maintains a queue of unexplored states. This work was continued by the authors of \cite{li_concurrency_2019}, who added
a hash table among other features. Each of these works improves scalability by refining the information that must be produced as output.
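The following sketch illustrates this state-queue style of exploration, with a hash set used to skip states that have already been generated. The string state encoding and the successor function are placeholders, not the data structures of \cite{cook_scalable_2016} or \cite{li_concurrency_2019}.

\begin{verbatim}
#include <functional>
#include <queue>
#include <string>
#include <unordered_set>
#include <vector>

// Breadth-first exploration of an attack state space.  A state is a
// placeholder string describing host qualities and topology facts, and
// `successors` stands in for the rule engine that derives new states.
// The hash set `seen` keeps a state reachable along several attack
// paths from being expanded more than once.
std::size_t explore(
    const std::string& initial,
    const std::function<std::vector<std::string>(const std::string&)>&
        successors) {
    std::queue<std::string> frontier;
    std::unordered_set<std::string> seen{initial};
    frontier.push(initial);

    std::size_t expanded = 0;
    while (!frontier.empty()) {
        const std::string s = frontier.front();
        frontier.pop();
        ++expanded;
        for (const std::string& next : successors(s))
            if (seen.insert(next).second)   // true only for unseen states
                frontier.push(next);
    }
    return expanded;
}
\end{verbatim}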
Another avenue for generation improvement is parallelization. The authors of \cite{li_concurrency_2019} use OpenMP, with dynamic scheduling, to parallelize the exploration
of a FIFO queue of states. Each thread receives a state to explore, and a critical section guards the merging of new state information so that collisions, race conditions,
and the use of stale data are avoided; a sketch of this pattern appears at the end of this section. With this approach the authors measure a 10x speedup over the serial
algorithm. The authors of \cite{9150145} present a parallel generation approach using CUDA, where speedup is obtained from the large number of CUDA cores. For a distributed
approach, the authors of \cite{7087377} present a technique that uses reachability hypergraph partitioning and a virtual shared-memory abstraction to prevent duplicate work
across nodes. This work showed promising results in limiting state-space explosion and in speedup as the number of network hosts increases.
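As referenced above, the following sketch illustrates the general OpenMP pattern of dynamically scheduled state exploration with a critical section around updates to shared structures. It is a simplified illustration of that pattern, not the implementation of \cite{li_concurrency_2019}; the frontier is processed level by level, and the state representation is again a placeholder.

\begin{verbatim}
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Level-by-level parallel exploration.  The current frontier is divided
// among threads with OpenMP dynamic scheduling, and a critical section
// serializes the merge of newly discovered states into the shared set
// and the next frontier, avoiding races and stale reads.
void explore_parallel(
    const std::string& initial,
    const std::function<std::vector<std::string>(const std::string&)>&
        successors) {
    std::unordered_set<std::string> seen{initial};
    std::vector<std::string> frontier{initial};

    while (!frontier.empty()) {
        std::vector<std::string> next;
        #pragma omp parallel for schedule(dynamic)
        for (long i = 0; i < static_cast<long>(frontier.size()); ++i) {
            // Successor generation is the thread-local, parallel portion.
            std::vector<std::string> found = successors(frontier[i]);
            #pragma omp critical
            {
                for (const std::string& s : found)
                    if (seen.insert(s).second)   // only unseen states advance
                        next.push_back(s);
            }
        }
        frontier.swap(next);
    }
}
\end{verbatim}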