Minor edits

2023-04-25 01:00:29 -05:00
commit c9186348ee
parent a5f84f51ff
4 changed files with 31 additions and 28 deletions
--- a/Schrick-Noah_AG-CG-CR.aux
+++ b/Schrick-Noah_AG-CG-CR.aux
@@ -57,6 +57,7 @@
 \citation{CR-Simple}
 \bibdata{Bibliography}
 \bibcite{schneier_modeling_1999}{1}
+\bibcite{j_hale_compliance_nodate}{2}
 \@writefile{lof}{\contentsline {figure}{\numberline {2}{\ignorespaces Time Taken to Checkpoint as the Size of the Instance Grows}}{4}{figure.2}\protected@file@percent }
 \newlabel{fig:inst-time}{{2}{4}{Time Taken to Checkpoint as the Size of the Instance Grows}{figure.2}{}}
 \@writefile{lof}{\contentsline {figure}{\numberline {3}{\ignorespaces Time Taken to Checkpoint as the Size of the Frontier Grows}}{4}{figure.3}\protected@file@percent }
@@ -64,7 +65,7 @@
 \@writefile{lof}{\contentsline {figure}{\numberline {4}{\ignorespaces Time Taken to Restart as the Size of the Frontier Grows}}{4}{figure.4}\protected@file@percent }
 \newlabel{fig:front-rest-time}{{4}{4}{Time Taken to Restart as the Size of the Frontier Grows}{figure.4}{}}
 \@writefile{toc}{\contentsline {section}{\numberline {V}Conclusions and Future Work}{4}{section.5}\protected@file@percent }
-\bibcite{j_hale_compliance_nodate}{2}
+\@writefile{toc}{\contentsline {section}{References}{4}{section*.1}\protected@file@percent }
 \bibcite{cook_rage_2018}{3}
 \bibcite{berry_graph_2007}{4}
 \bibcite{zhang_boosting_2017}{5}
@@ -80,5 +81,4 @@
 \bibcite{li_combining_2019}{15}
 \bibcite{CR-Simple}{16}
 \bibstyle{ieeetr}
-\@writefile{toc}{\contentsline {section}{References}{5}{section*.1}\protected@file@percent }
 \gdef \@abspage@last{5}
--- a/Schrick-Noah_AG-CG-CR.log
+++ b/Schrick-Noah_AG-CG-CR.log
@@ -1,4 +1,4 @@
-This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023/Arch Linux) (preloaded format=pdflatex 2023.4.3)  25 APR 2023 00:40
+This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023/Arch Linux) (preloaded format=pdflatex 2023.4.3)  25 APR 2023 01:00
 entering extended mode
 restricted \write18 enabled.
 %&-line parsing enabled.
@@ -488,6 +488,16 @@ Underfull \hbox (badness 2158) in paragraph at lines 55--62
 []\OT1/ptm/m/n/10 Despite their advantages, graph generation has many
 []

+
+Underfull \hbox (badness 3557) in paragraph at lines 68--69
+\OT1/ptm/m/n/10 reliability. User-level checkpointing, though has greater
+ []
+
+
+Underfull \hbox (badness 3417) in paragraph at lines 68--69
+\OT1/ptm/m/n/10 Checkpoint/Restart) library, which has seen widespread
+ []
+
 [1{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}{/usr/share/texmf-dist/fon
 ts/enc/dvips/base/8r.enc}

@@ -504,15 +514,6 @@ LaTeX Font Info:    Trying to load font information for U+msb on input line 72.
 (/usr/share/texmf-dist/tex/latex/amsfonts/umsb.fd
 File: umsb.fd 2013/01/14 v3.01 AMS symbols B
 )
-Underfull \hbox (badness 1552) in paragraph at lines 70--75
-\OT1/ptm/m/n/10 compliance graphs attempt to improve performance and
- []
-
-
-Underfull \hbox (badness 1960) in paragraph at lines 70--75
-\OT1/ptm/m/n/10 scalability to mitigate state space explosion or lengthy
- []
-
 <./images/checkpoint.png, id=91, 755.82375pt x 402.50375pt>
 File: ./images/checkpoint.png Graphic file (type png)
 <use ./images/checkpoint.png>
@@ -531,7 +532,7 @@ static
 []

 [2 <./images/checkpoint.png>]
-Underfull \hbox (badness 4660) in paragraph at lines 125--130
+Underfull \hbox (badness 4660) in paragraph at lines 124--129
 \OT1/ptm/m/it/10 3) Portability: [][][] \OT1/ptm/m/n/10 The checkpointing proc
 ess is greatly
 []
@@ -540,32 +541,35 @@ ess is greatly
 <./images/instance_time.png, id=116, 606.265pt x 341.275pt>
 File: ./images/instance_time.png Graphic file (type png)
 <use ./images/instance_time.png>
-Package pdftex.def Info: ./images/instance_time.png  used on input line 141.
+Package pdftex.def Info: ./images/instance_time.png  used on input line 140.
 (pdftex.def)             Requested size: 252.0pt x 141.8556pt.
 <./images/frontier_checkpoint_time.png, id=118, 607.26875pt x 341.275pt>
 File: ./images/frontier_checkpoint_time.png Graphic file (type png)
 <use ./images/frontier_checkpoint_time.png>
 Package pdftex.def Info: ./images/frontier_checkpoint_time.png  used on input l
-ine 150.
+ine 149.
 (pdftex.def)             Requested size: 252.0pt x 141.61606pt.
 <./images/frontier_restart_time.png, id=120, 606.265pt x 341.275pt>
 File: ./images/frontier_restart_time.png Graphic file (type png)
 <use ./images/frontier_restart_time.png>
 Package pdftex.def Info: ./images/frontier_restart_time.png  used on input line
- 159.
+ 158.
 (pdftex.def)             Requested size: 252.0pt x 141.8556pt.

-Underfull \hbox (badness 1622) in paragraph at lines 165--166
+Underfull \hbox (badness 1622) in paragraph at lines 164--165
 \OT1/ptm/m/n/10 function calls or snapshots that are required. The C/R
 []


-Underfull \hbox (badness 2150) in paragraph at lines 167--168
+Underfull \vbox (badness 1776) has occurred while \output is active []
+
+
+Underfull \hbox (badness 2150) in paragraph at lines 166--167
 \OT1/ptm/m/n/10 checkpoint times and sizes, as well as time taken to
 []


-Underfull \hbox (badness 1565) in paragraph at lines 167--168
+Underfull \hbox (badness 1565) in paragraph at lines 166--167
 \OT1/ptm/m/n/10 settings to alter or enable, or communication strategies
 []

@@ -611,7 +615,7 @@ Here is how much of TeX's memory you used:
 32330 multiletter control sequences out of 15000+600000
 544489 words of font info for 89 fonts, out of 8000000 for 9000
 1141 hyphenation exceptions out of 8191
- 75i,8n,76p,1314b,588s stack positions out of 5000i,500n,10000p,200000b,80000s
+ 75i,8n,76p,1314b,592s stack positions out of 5000i,500n,10000p,200000b,80000s
 </usr/share/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb></usr/share/
 texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb></usr/share/texmf-dist/font
 s/type1/public/amsfonts/cm/cmr7.pfb></usr/share/texmf-dist/fonts/type1/public/a
@@ -619,7 +623,7 @@ msfonts/cm/cmsy10.pfb></usr/share/texmf-dist/fonts/type1/urw/times/utmb8a.pfb><
 /usr/share/texmf-dist/fonts/type1/urw/times/utmbi8a.pfb></usr/share/texmf-dist/
 fonts/type1/urw/times/utmr8a.pfb></usr/share/texmf-dist/fonts/type1/urw/times/u
 tmri8a.pfb>
-Output written on Schrick-Noah_AG-CG-CR.pdf (5 pages, 208691 bytes).
+Output written on Schrick-Noah_AG-CG-CR.pdf (5 pages, 208554 bytes).
 PDF statistics:
 184 PDF objects out of 1000 (max. 8388607)
 151 compressed objects within 2 object streams
--- a/Schrick-Noah_AG-CG-CR.pdf
+++ b/Schrick-Noah_AG-CG-CR.pdf
--- a/Schrick-Noah_AG-CG-CR.tex
+++ b/Schrick-Noah_AG-CG-CR.tex
@@ -50,7 +50,7 @@ Attack Graph; Compliance Graph; MPI; High-Performance Computing; Checkpoint/Rest
 \end{IEEEkeywords}

 \section{Introduction} \label{sec:Intro}
-In order to predict and prevent the risk of cyber attacks, various modeling and tabletop approaches are implemented to best prepare for attack scenarios. One approach is through the use of attack graphs, originally presented by the author of \cite{schneier_modeling_1999}. Attack graphs represent possible attack scenarios or vulnerability paths in a network. These graphs consist of nodes and edges, with various information encoded at the topological level as well as within the nodes themselves. Similarly, compliance graphs are used to predict and prevent violations of compliance or regulation mandates \cite{j_hale_compliance_nodate}. These graphs are now generated through the use of attack or compliance generators, rather than by hand. The generator tool used by this work is RAGE (the RAGE Attack Graph Engine) \cite{cook_rage_2018}.
+In order to predict and prevent the risk of cyber attacks, various modeling and tabletop approaches are implemented to best prepare for attack scenarios. One approach is through the use of attack graphs, originally presented by the author of \cite{schneier_modeling_1999}. Attack graphs represent possible attack scenarios or vulnerability paths in a network. These graphs consist of nodes and edges, with various information encoded at the topological level as well as within the nodes themselves. Similarly, compliance graphs are used to predict and prevent violations of compliance or regulation mandates \cite{j_hale_compliance_nodate}. These graphs are now generated through the use of attack or compliance graph generators, rather than by hand. The generator tool used for the implementation of this work is RAGE (the RAGE Attack Graph Engine) \cite{cook_rage_2018}.

 Despite their advantages, graph generation has many challenges that prevent full actualization of computation seen from a theoretical standpoint, and these challenges extend to attack and compliance graphs. 
 In practice, graph generation often achieves only a very low percentage of its expected performance \cite{berry_graph_2007}. A few reasons
@@ -62,12 +62,12 @@ nodes and edges, graph data structures suffer from poor cache locality, and memo

 The author of \cite{cook_rage_2018} discusses the challenges of attack graph generation in regards to its scalability. Specifically, the author of \cite{cook_rage_2018} displays results from generations based on small networks that result in a large state space. The authors of \cite{ou_scalable_2006} also present the scalability challenges of attack graphs. Their findings indicate that small networks result in graphs with total edges and nodes in the order of millions. Generating an attack or compliance graph based on a large network with a multitude of assets and involving a more thorough exploit or compliance violation checking will prevent the entire graph from being stored in memory as originally designed. 

-Due to the runtime requirements and scalability challenges imposed by graph generation, fault-tolerance is critical to ensure reliable generation. These difficulties highlight the need for fault-tolerance and memory relief approaches. The ability to safely checkpoint and recover from a system error is crucial to avoid duplicated work or needing to request more cycles on an HPC cluster. In addition, having the ability to handle the memory strain without requesting excess RAM on an HPC cluster assists in reducing incurred cost. This work presents an application-level checkpoint/restart (C/R) approach tailored to large-scale graph generation. This work illustrates the advantages in having a C/R system built into the generation process itself, rather than using alternative libraries. By having native C/R, performance can be maximized and runtime interruption and overhead can be minimized. This C/R approach allows the user to ensure fault-tolerance for graph generation without the reliance on a system-level, HPC cluster implementation of C/R.
+Due to the runtime requirements and scalability challenges imposed by graph generation, fault-tolerance is critical to ensure reliable generation. These difficulties highlight the need for fault-tolerance and memory relief approaches. The ability to safely checkpoint and recover from a system error is crucial to avoid duplicated work or needing to request more cycles on an HPC cluster. In addition, having the ability to minimize the memory strain without requesting excess RAM on an HPC cluster assists in reducing incurred cost. This work presents an application-level checkpoint/restart (C/R) approach tailored to large-scale graph generation. This work illustrates the advantages in having a C/R system built into the generation process itself, rather than using alternative libraries. By having native C/R, performance can be maximized and runtime interruption and overhead can be minimized. This C/R approach allows the user to ensure fault-tolerance for graph generation without the reliance on a system-level, HPC cluster implementation of C/R.

 \section{Related Work} \label{sec:Rel-Works}
-Numerous efforts have been presented for C/R techniques with various categories available. The authors of \cite{CR-Survey} and \cite{hursey2010coordinated} discuss three categories of C/R, which include application-level, user-level, and system-level. Each approach draws upon advantages that appeal toward different aspects of reliability. Notably, application-level requiring additional work for the implementation but resulting in smaller, faster C/R, user-level with its simplicity, but resulting in larger checkpoints, and system-level requiring compatibility with the operating system and any libraries used for the application. The authors of \cite{SCR} present the SCR (Scalable Checkpoint/Restart) library, which has seen widespread adoption due to its minimal overhead. DMTCP (Distributed MultiThreaded Checkpointing) \cite{dmtcp} and BLCR (Berkely Lab Checkpoint/Restart) \cite{BLCR} are two other commonly-seen C/R approaches.
+Numerous efforts have been presented for C/R techniques with various categories available. The authors of \cite{CR-Survey} and \cite{hursey2010coordinated} discuss three categories of C/R, which include application-level, user-level, and system-level. Each approach draws upon advantages that appeal toward different aspects of reliability. User-level checkpointing, though simpler, results in larger checkpoints. System-level checkpointing requires compatibility with the operating system and any libraries used for the application. Application-level checkpointing requires additional work for the implementation, but results in smaller, faster C/R. The authors of \cite{SCR} present the SCR (Scalable Checkpoint/Restart) library, which has seen widespread adoption due to its minimal overhead. DMTCP (Distributed MultiThreaded Checkpointing) \cite{dmtcp} and BLCR (Berkeley Lab Checkpoint/Restart) \cite{BLCR} are two other commonly used C/R approaches.

-Rather than using C/R, investigations into attack and compliance graphs attempt to improve performance and scalability to mitigate state space explosion or lengthy runtimes. As a means of improving scalability of attack graphs themselves, the authors of \cite{ou_scalable_2006} present a new representation scheme. Traditional attack graphs encode the entire network at each state,
+Other investigations into attack and compliance graphs attempt to improve performance and scalability to mitigate state space explosion or lengthy runtimes, rather than focus on C/R. As a means of improving scalability of attack graphs themselves, the authors of \cite{ou_scalable_2006} present a new representation scheme. Traditional attack graphs encode the entire network at each state,
 but the representation presented by the authors uses logical statements to represent a portion of the network at each node. This is called a logical attack graph. This approach led to the reduction of the generation process
 to quadratic time and reduced the number of nodes in the resulting graph to $\mathcal{O}({n}^2)$. However, this approach does require more analysis for identifying attack vectors. Another approach
 presented by the authors of \cite{cook_scalable_2016} represents a description of systems and their qualities and topologies as a state, with a queue of unexplored states. This work was continued by the
@@ -109,9 +109,8 @@ Previous works with RAGE have been designed around maximizing performance to lim
 To decide when to checkpoint due to memory capacity, two separate checks are made. The first check is for the frontier. If the size of the frontier consumes equal to or more than the allowed allocated memory, then all new states
 are stored into a new table in the database called “unexplored states”. Each new state from this point forward is stored in the table, regardless of if room is freed in the frontier. This is to ensure proper ordering of the FIFO queue.
 The only time new states are stored directly into the frontier is when the unexplored states table is empty. Once the frontier has been completely emptied, new states are then pulled from the database into the frontier. To pull from
- the database, the parent loop for the generator process has been altered. Instead of a while loop for when the frontier is not empty, it has been adjusted to when the frontier is not empty or the unexplored states table is not empty. Due
- to C++ using short-circuit evaluation where the first argument is completely evaluated before processing the second, some performance is gained. The performance gained is due to not having to pass a SQL statement to disk to check the size of the unexplored states table unless the frontier is empty. The original generation design stored new states
- into the frontier during the critical section to avoid testing on already-explored states. To follow this design decision, writing new states to the database is also performed during the critical section.
+ the database, the parent loop for the generator process has been altered. Instead of a while loop that runs while the frontier is not empty, it now runs while the frontier is not empty or, if the frontier is empty, while the unexplored states table is not empty. The original generation design stored new states
+ into the frontier during an OpenMP critical section to avoid testing on already-explored states. To follow this design decision, writing new states to the database is also performed during the critical section.

 For the graph instance, a check in the critical section determines if the size of the graph instance consumes more than its allocated share of the memory. If it does, the edges, network states, and network state items are written to the database,
 and are then removed from memory.
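
Reviewer note: the frontier spill-over rule described in the hunk above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions, not the RAGE implementation. The names `UnexploredTable`, `SpillingFrontier`, and `explore` are hypothetical, the database table is replaced by an in-memory FIFO, states are plain integers, the memory limit is a simple element count, and the OpenMP critical section is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for the "unexplored states" database table.
// The paper stores these rows on disk; here they live in a FIFO in memory.
struct UnexploredTable {
    std::deque<int> rows;
    bool empty() const { return rows.empty(); }
    void insert(int state) { rows.push_back(state); }
    int pop_oldest() { int s = rows.front(); rows.pop_front(); return s; }
};

// Frontier with a capacity limit (assumed here to be a count of states).
class SpillingFrontier {
public:
    explicit SpillingFrontier(std::size_t capacity) : capacity_(capacity) {}

    // New states enter memory only while the table is empty and there is room.
    // Once anything has been spilled, every new state goes to the table,
    // even if room has been freed, so that FIFO order is preserved.
    void push(int state) {
        if (table_.empty() && memory_.size() < capacity_) memory_.push_back(state);
        else table_.insert(state);
    }

    // Loop condition from the text: `||` evaluates the table check
    // only when the in-memory frontier is empty.
    bool has_work() const { return !memory_.empty() || !table_.empty(); }

    // Precondition: has_work() is true.
    int pop() {
        if (memory_.empty()) refill();
        int s = memory_.front();
        memory_.pop_front();
        return s;
    }

    std::size_t spilled() const { return table_.rows.size(); }

private:
    // Called only when the in-memory frontier has been completely emptied.
    void refill() {
        while (!table_.empty() && memory_.size() < capacity_)
            memory_.push_back(table_.pop_oldest());
    }

    std::size_t capacity_;
    std::deque<int> memory_;
    UnexploredTable table_;
};

// Toy generator: state s produces children 2s+1 and 2s+2 below `limit`.
// Returns the order in which states were explored.
std::vector<int> explore(std::size_t capacity, int limit) {
    SpillingFrontier frontier(capacity);
    std::vector<int> order;
    frontier.push(0);
    while (frontier.has_work()) {
        int s = frontier.pop();
        order.push_back(s);
        for (int k = 1; k <= 2; ++k) {
            int child = 2 * s + k;
            if (child < limit) frontier.push(child);
        }
    }
    return order;
}
```

With this toy successor rule, `explore(2, 15)` and `explore(1000, 15)` visit the states in the same order, which is the property the "always spill once the table is non-empty" rule is meant to protect.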