Intro to Intermediate Database Storage

Noah L. Schrick 2022-04-03 18:55:27 -05:00
parent 7843b67e65
commit e6324fc2a3
3 changed files with 24 additions and 11 deletions


@ -11,8 +11,10 @@
url = {https://www.boost.org/doc/libs/1_75_0/libs/graph_parallel/doc/html/overview.html},
}
@article{noauthor_graduate_2018,
title = {{THE} {GRADUATE} {SCHOOL} by {Ahmet} {Burak} {Cengiz} {A} thesis submitted in partial fulfillment of the requirements for the degree of {Master} of {Science} in {Engineering} in the {Discipline} of {Petroleum} {Engineering} {The} {Graduate} {School} {The} {University} of {Tulsa}},
@phdthesis{nichols_2018,
title = {{Hybrid} {Attack} {Graphs} for {Use} with a {Simulation} of a {Cyber-Physical} {System}},
author = {Nichols, Will M.},
school = {The {University} of {Tulsa}},
year = {2018},
file = {Will_Nichols_Thesis_FINAL_VER:/home/noah/Zotero/storage/8AXSZXJN/Will_Nichols_Thesis_FINAL_VER.pdf:application/pdf},
}
@ -23,6 +25,12 @@
url = {https://www.boost.org/doc/libs/1_75_0/libs/graph/doc/index.html},
}
@misc{Graphviz,
title = {{DOT} {Language}},
author = {{The} {Graphviz} {Authors}},
url = {https://graphviz.org/doc/info/lang.html}
}
@misc{noauthor_parallel_nodate,
title = {Parallel {BGL} {Parallel} {Boost} {Graph} {Library} - 1.75.0},
author = {Edmonds, Nick and Gregor, Douglas and Lumsdaine, Andrew},


@ -3,7 +3,7 @@
\TUsection{Path Walking} \label{sec:PW}
\par Due to the large-scale nature of attack graphs, analysis can become difficult and time-consuming. With some graphs reaching millions of states and edges,
analyzing the entire graph can be overwhelmingly complex. As a means of simplifying analysis, a potential strategy could be to consider only small subsets of
the graph at a time, rather than feeding the entire graph into an analysis algorithm. To aid in this effort, a path walking feature was implemented as a
the graph at a time, rather than feeding the entire graph into an analysis algorithm. This allows users to focus on specific states or items of interest without analyzing the graph in its entirety. To aid in this effort, a path walking feature was implemented as a
separate program, and has two primary modes of usage. The goal of this feature is to output a subset of the graph that includes all possible paths from the
root state to a designated state. The first mode is a manual mode, where a user can input the desired state to walk to, and the program will output a separate
graph of all possible paths to the specified state. The second mode is an automatic mode, where the program will output separate subgraphs to all states in
@ -25,9 +25,9 @@ As a visual aid for analysis purposes, color coding was another feature implemen
visibly identical in appearance apart from number of edges, edge IDs, and state IDs. To allow for visual differentiation, color coding can be enabled in the run script.
Color coding currently functions by working through the graph output text file, but it can be extended to read directly from the PostgreSQL database instead. The feature scans through the
output file, and locates states that have $``compliance$\_$vios > 0"$ or $``compliance$\_$vio = true"$. For states that meet these
properties, the color coding feature will add a color to the Graphviz DOT file through the $[color=COL]$ attribute for the given node, where \textit{COL} is assigned based on severity.
properties, the color coding feature will add a color to the Graphviz DOT \cite{Graphviz} file through the $[color=COL]$ attribute for the given node, where \textit{COL} is assigned based on severity.
For this version of color coding, severity is determined by the total number of compliance violations a node has, but future versions can alter the severity measure through alternative means.
Figure \ref{fig:CC} displays an example graph that leverages color coding to easily identify problem states.
Figure \ref{fig:CC} displays an example graph that leverages color coding to easily identify problem states. Each shaded state is a problem state, in which the system contains items that are out of compliance or in violation of a regulation. In this specific example, each shaded state is one in which a vehicle component is in need of maintenance because it has exceeded the recommended time since its last maintenance.
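The severity-based coloring described above can be sketched as follows. This is a minimal illustration rather than the RAGE implementation: the function names \texttt{severity\_color} and \texttt{dot\_node}, the thresholds, and the color names are all assumptions chosen for the example. Only the $[color=COL]$ attribute and the use of the violation count as the severity measure come from the text.

```cpp
#include <string>

// Hypothetical severity scale. The thresholds and color names are
// illustrative assumptions, not the values used by RAGE.
std::string severity_color(int compliance_vios) {
    if (compliance_vios <= 0) return "";        // compliant state: no color
    if (compliance_vios == 1) return "yellow";  // single violation
    if (compliance_vios <= 3) return "orange";  // a few violations
    return "red";                               // many violations
}

// Build a DOT node statement for a state, appending [color=COL]
// only when the state has at least one compliance violation.
std::string dot_node(int state_id, int compliance_vios) {
    const std::string id = std::to_string(state_id);
    const std::string col = severity_color(compliance_vios);
    if (col.empty()) return id + ";";
    return id + " [color=" + col + "];";
}
```

Under these assumed thresholds, a state with no violations is emitted unchanged, while a state with two violations receives the attribute \texttt{[color=orange]}. A different severity measure would only require replacing \texttt{severity\_color}.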
\begin{figure}[htp]
\includegraphics[width=\linewidth]{"./Chapter3_img/CC.png"}
\vspace{.2truein} \centerline{}
@ -40,7 +40,7 @@ Many of the graphs previously generated by RAGE comprise of states with features
established set of qualities that was used, with an established set of values. These typically have included $``compliance$\_$vio=true/false"$,
$``root=true/false"$, or other general $``true/false"$ values or $``version=X"$ qualities. To expand on the types and complexities of graphs that can be
generated, compound operators have been added to RAGE. When updating a state, rather than setting a quality to a specific value, the previous value can now
be modified by an amount specified through standard compound operators such as $\mathrel{+}=$, $\mathrel{-}=$, $\mathrel{*}=$, or $\mathrel{/}=$.
be modified by an amount specified through standard compound operators such as $\mathrel{+}=$, $\mathrel{-}=$, $\mathrel{*}=$, or $\mathrel{/}=$. Compound operators were previously implemented in an attack graph generator by the author of \cite{nichols_2018}. However, that work was conducted on an earlier iteration of the generator written in Python. The generator has since been rewritten in C++ by the author of \cite{cook_rage_2018}, and compound operators had not been included in RAGE prior to this work.
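As a rough illustration of the compound update described above, the sketch below applies one of the four operators to a quality's previous value. The helper name \texttt{apply\_compound}, the use of plain integers, and the string form of the operator are assumptions made for the example. RAGE stores qualities in an encoded form, so its actual code will differ.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: update a quality value relative to its previous
// value instead of overwriting it. Values are plain ints in this sketch.
int apply_compound(int previous, const std::string& op, int amount) {
    if (op == "+=") return previous + amount;
    if (op == "-=") return previous - amount;
    if (op == "*=") return previous * amount;
    if (op == "/=") {
        if (amount == 0) throw std::invalid_argument("division by zero");
        return previous / amount;  // integer division in this sketch
    }
    throw std::invalid_argument("unknown compound operator: " + op);
}
```

For instance, an exploit that advances a mileage counter by one could be expressed as \texttt{apply\_compound(24000, "+=", 1)}, which yields 24001 without the exploit file needing to know the previous value.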
The work conducted by the author of \cite{cook_rage_2018} when designing the software architecture of RAGE included specifications for a quality encoding scheme. As the
author discusses, qualities have four fields, which include the asset ID, attributes, operator, and value. The operator field is 4 bits, which allows for a total
@ -68,10 +68,15 @@ of relational operators, to determine whether this exploit was applicable to a n
\textit{version=3.0.0}, or \textit{version=2.0.0}, or \textit{version=1.0.0}, or \textit{version=0.4.3}, etc. For the compliance graph exploit check, this could lead to even worse scaling where checks needed to be conducted at a much more granular level like \textit{engine$\_$coolant$\_$miles=24001}, or \textit{engine$\_$coolant$\_$miles=24002}, or \textit{engine$\_$coolant$\_$miles=24003}, etc. This becomes increasingly tedious when there are many checks to perform, and this not only reduces readability, but is also more
prone to human error when creating the exploit files. Relational operators work to alleviate these difficulties.
Similar to the compound operators discussed in Section \ref{sec:compops}, relational operators were implemented in the previous Python attack graph generator as part of the work of \cite{nichols_2018}, but were not a feature of RAGE. Implementing relational operators in RAGE allows users to keep the performance benefits and parallelization capability of the C++ generator while making exploit files easier to create.
To implement the relational operators, operator overloads were placed into the Quality class. At the time of writing, the following are implemented: $==$, $<$, $>$, $\leq$, $\geq$. However, these operators do not take up room in the
encoding scheme, so additional operators can be freely implemented as needed. The overloads ensure that the Quality asset IDs and Quality names match, and then compare the Quality values based on the operator in question.
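The overload behavior described above can be sketched with a simplified stand-in for the Quality class. The field names, the integer value type, and the \texttt{comparable} helper are assumptions for illustration. The actual RAGE class uses an encoded representation and may be structured differently.

```cpp
#include <string>

// Hypothetical, simplified Quality. Field names and types are assumptions.
struct Quality {
    int asset_id;
    std::string name;
    int value;

    // Two qualities are comparable only if they describe the same
    // attribute of the same asset.
    bool comparable(const Quality& o) const {
        return asset_id == o.asset_id && name == o.name;
    }

    // Each overload first checks asset ID and name, then compares values.
    bool operator==(const Quality& o) const { return comparable(o) && value == o.value; }
    bool operator<(const Quality& o) const  { return comparable(o) && value <  o.value; }
    bool operator>(const Quality& o) const  { return comparable(o) && value >  o.value; }
    bool operator<=(const Quality& o) const { return comparable(o) && value <= o.value; }
    bool operator>=(const Quality& o) const { return comparable(o) && value >= o.value; }
};
```

With this shape, a single check such as \texttt{state > limit} can stand in for a long list of equality checks on individual mileage values. In this sketch, a mismatch in asset ID or name makes every comparison evaluate to false, which is one possible design choice and not necessarily the one RAGE makes.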
\TUsection{Intermediate Database Storage}\label{sec:db-stor}
\TUsubsection{Introduction to Intermediate Database Storage}
Chapter 2 and the author of \cite{cook_rage_2018} discuss the scalability challenges of attack graph generation. Specifically, the author of \cite{cook_rage_2018} presents results from generation runs based on 11 assets and 11 exploits that lead to 14,200 total states. Generating an attack or compliance graph for a large network with a multitude of assets, and with more thorough exploit or compliance violation checking, will prevent the entire graph from being stored in memory as originally designed. This section discusses the memory challenges of graph generation and a solution through the implementation of intermediate database storage.
\TUsubsection{Memory Constraint Difficulties}
Previous work on RAGE, such as that by the authors of \cite{cook_rage_2018},
\cite{li_concurrency_2019}, and \cite{li_combining_2019}, has been designed around maximizing performance to limit the long runtimes caused by state space explosion. To this end, the output graph is held in memory during the generation process to minimize disk writes and reads. This also allows for leveraging the


@ -140,17 +140,17 @@ K.~Kaynar and F.~Sivrikaya, ``Distributed attack graph generation,'' {\em IEEE
Transactions on Dependable and Secure Computing}, vol.~13, no.~5,
pp.~519--532, 2016.
\bibitem{li_combining_2019}
M.~Li, P.~Hawrylak, and J.~Hale, ``Combining {OpenCL} and {MPI} to support
heterogeneous computing on a cluster,'' {\em ACM International Conference
Proceeding Series}, 2019.
\bibitem{CVE-2019-10747}
``{set-value is vulnerable to Prototype Pollution in versions lower than 3.0.1.
The function mixin-deep could be tricked into adding or modifying properties
of Object.prototype using any of the constructor, prototype and $\_$proto$\_$
payloads.}.'' National Vulnerability Database, Aug. 2019.
\bibitem{li_combining_2019}
M.~Li, P.~Hawrylak, and J.~Hale, ``Combining {OpenCL} and {MPI} to support
heterogeneous computing on a cluster,'' {\em ACM International Conference
Proceeding Series}, 2019.
\bibitem{louthan_hybrid_2011}
G.~Louthan, {\em Hybrid {Attack} {Graphs} for {Modeling} {Cyber}-{Physical}
{Systems}}.