Incorporating edits by PJH

Noah L. Schrick 2023-11-07 20:05:32 -06:00
parent 6fd055c3fc
commit 3dc9b125f4
11 changed files with 24 additions and 20 deletions


@ -0,0 +1 @@
,noah,NovaArchSys,07.11.2023 20:03,file:///home/noah/.config/libreoffice/4;

Binary file not shown.

Binary file not shown.

BIN
Data/instance_plot.ods Normal file

Binary file not shown.


@ -62,6 +62,7 @@
\citation{CR-Simple}
\bibdata{Bibliography}
\bibcite{schneier_modeling_1999}{1}
\bibcite{j_hale_compliance_nodate}{2}
\@writefile{lof}{\contentsline {figure}{\numberline {2}{\ignorespaces Time Taken to Checkpoint as the Size of the Instance Grows}}{4}{figure.2}\protected@file@percent }
\newlabel{fig:inst-time}{{2}{4}{Time Taken to Checkpoint as the Size of the Instance Grows}{figure.2}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {3}{\ignorespaces Time Taken to Checkpoint as the Size of the Frontier Grows}}{4}{figure.3}\protected@file@percent }
@ -69,7 +70,7 @@
\@writefile{lof}{\contentsline {figure}{\numberline {4}{\ignorespaces Time Taken to Restart as the Size of the Frontier Grows}}{4}{figure.4}\protected@file@percent }
\newlabel{fig:front-rest-time}{{4}{4}{Time Taken to Restart as the Size of the Frontier Grows}{figure.4}{}}
\@writefile{toc}{\contentsline {section}{\numberline {V}Conclusions and Future Work}{4}{section.5}\protected@file@percent }
\bibcite{j_hale_compliance_nodate}{2}
\@writefile{toc}{\contentsline {section}{References}{4}{section*.1}\protected@file@percent }
\bibcite{cook_rage_2018}{3}
\bibcite{berry_graph_2007}{4}
\bibcite{zhang_boosting_2017}{5}
@ -90,5 +91,4 @@
\bibcite{li_combining_2019}{20}
\bibcite{CR-Simple}{21}
\bibstyle{ieeetr}
\@writefile{toc}{\contentsline {section}{References}{5}{section*.1}\protected@file@percent }
\gdef \@abspage@last{5}


@ -1,4 +1,4 @@
This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023/Arch Linux) (preloaded format=pdflatex 2023.4.3) 25 APR 2023 02:18
This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023/Arch Linux) (preloaded format=pdflatex 2023.9.6) 7 NOV 2023 20:03
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
@ -238,7 +238,7 @@ File: pdftex.def 2022/09/22 v1.2b Graphics/color driver for pdftex
Package: babel 2023/02/13 3.86 The Babel package
\babel@savecnt=\count282
\U@D=\dimen175
\l@unhyphenated=\language87
\l@unhyphenated=\language3
(/usr/share/texmf-dist/tex/generic/babel/txtbabel.def)
\bbl@readstream=\read2
@ -248,7 +248,7 @@ Package babel Info: You haven't specified a language as a class or package
(/usr/share/texmf-dist/tex/generic/babel/nil.ldf
Language: nil 2023/02/13 3.86 Nil language
\l@nil=\language88
\l@nil=\language4
))
(/usr/share/texmf-dist/tex/latex/base/textcomp.sty
Package: textcomp 2020/02/02 v2.0n Standard LaTeX package
@ -537,29 +537,32 @@ ess is greatly
[]
[3]
<./images/instance_time.png, id=123, 606.265pt x 341.275pt>
<./images/instance_time.png, id=123, 720.6925pt x 297.11pt>
File: ./images/instance_time.png Graphic file (type png)
<use ./images/instance_time.png>
Package pdftex.def Info: ./images/instance_time.png used on input line 136.
(pdftex.def) Requested size: 252.0pt x 141.8556pt.
<./images/frontier_checkpoint_time.png, id=125, 607.26875pt x 341.275pt>
(pdftex.def) Requested size: 252.0pt x 103.88577pt.
<./images/frontier_checkpoint_time.png, id=125, 695.59875pt x 341.275pt>
File: ./images/frontier_checkpoint_time.png Graphic file (type png)
<use ./images/frontier_checkpoint_time.png>
Package pdftex.def Info: ./images/frontier_checkpoint_time.png used on input l
ine 145.
(pdftex.def) Requested size: 252.0pt x 141.61606pt.
<./images/frontier_restart_time.png, id=127, 606.265pt x 341.275pt>
(pdftex.def) Requested size: 252.0pt x 123.6348pt.
<./images/frontier_restart_time.png, id=127, 612.2875pt x 331.2375pt>
File: ./images/frontier_restart_time.png Graphic file (type png)
<use ./images/frontier_restart_time.png>
Package pdftex.def Info: ./images/frontier_restart_time.png used on input line
154.
(pdftex.def) Requested size: 252.0pt x 141.8556pt.
(pdftex.def) Requested size: 252.0pt x 136.32883pt.
Underfull \hbox (badness 1622) in paragraph at lines 160--161
\OT1/ptm/m/n/10 function calls or snapshots that are required. The C/R
[]
Underfull \vbox (badness 4060) has occurred while \output is active []
Underfull \hbox (badness 2150) in paragraph at lines 162--163
\OT1/ptm/m/n/10 checkpoint times and sizes, as well as time taken to
[]
@ -605,18 +608,18 @@ Package rerunfilecheck Info: File `Schrick-Noah_AG-CG-CR.out' has not changed.
(rerunfilecheck) Checksum: CC85FF3DB94FE8393E2ED734D36908F3;1379.
)
Here is how much of TeX's memory you used:
12086 strings out of 476025
191517 string characters out of 5796533
12086 strings out of 477985
191516 string characters out of 5840059
1871388 words of memory out of 5000000
32336 multiletter control sequences out of 15000+600000
32083 multiletter control sequences out of 15000+600000
544489 words of font info for 89 fonts, out of 8000000 for 9000
1141 hyphenation exceptions out of 8191
75i,8n,76p,1431b,588s stack positions out of 5000i,500n,10000p,200000b,80000s
14 hyphenation exceptions out of 8191
75i,8n,76p,1431b,592s stack positions out of 10000i,1000n,20000p,200000b,200000s
</usr/share/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb></usr/share/
texmf-dist/fonts/type1/urw/times/utmb8a.pfb></usr/share/texmf-dist/fonts/type1/
urw/times/utmbi8a.pfb></usr/share/texmf-dist/fonts/type1/urw/times/utmr8a.pfb><
/usr/share/texmf-dist/fonts/type1/urw/times/utmri8a.pfb>
Output written on Schrick-Noah_AG-CG-CR.pdf (5 pages, 185791 bytes).
Output written on Schrick-Noah_AG-CG-CR.pdf (5 pages, 194649 bytes).
PDF statistics:
179 PDF objects out of 1000 (max. 8388607)
152 compressed objects within 2 object streams

Binary file not shown.


@ -62,7 +62,7 @@ nodes and edges, graph data structures suffer from poor cache locality, and memo
The author of \cite{cook_rage_2018} discusses the scalability challenges of attack graph generation. Specifically, the author presents results from generation runs on small networks that nonetheless produce a large state space. The authors of \cite{ou_scalable_2006} also present the scalability challenges of attack graphs. Their findings indicate that small networks result in graphs with total edges and nodes on the order of millions. Generating an attack or compliance graph based on a large network with a multitude of assets and involving more thorough exploit or compliance violation checking will prevent the entire graph from being stored in memory as originally designed.
Due to the runtime requirements and scalability challenges imposed by graph generation, fault-tolerance is critical to ensure reliable generation. These difficulties highlight the need for fault-tolerance and memory relief approaches. The ability to safely checkpoint and recover from a system error is crucial to avoid duplicated work or needing to request more cycles on an HPC cluster. In addition, having the ability to minimize the memory strain without requesting excess RAM on an HPC cluster assists in reducing incurred cost. This work presents an application-level checkpoint/restart (C/R) approach tailored to large-scale graph generation. This work illustrates the advantages in having a C/R system built into the generation process itself, rather than using alternative libraries. By having native C/R, performance can be maximized and runtime interruption and overhead can be minimized. This C/R approach allows the user to ensure fault-tolerance for graph generation without the reliance on a system-level, HPC cluster implementation of C/R.
Due to the runtime requirements and scalability challenges imposed by graph generation, fault-tolerance is critical to ensure reliable generation. These difficulties highlight the need for fault-tolerance and non-volatile memory (memory) relief approaches. The ability to safely checkpoint and recover from a system error is crucial to avoid duplicated work or needing to request more cycles on an HPC cluster. In addition, having the ability to minimize the memory strain without requesting excess RAM on an HPC cluster assists in reducing incurred cost. This work presents an application-level checkpoint/restart (C/R) approach tailored to large-scale graph generation. This work illustrates the advantages in having a C/R system built into the generation process itself, rather than using alternative libraries. By having native C/R, performance can be maximized and runtime interruption and overhead can be minimized. This C/R approach allows the user to ensure fault-tolerance for graph generation without the reliance on a system-level, HPC cluster implementation of C/R.
\section{Related Work} \label{sec:Rel-Works}
Numerous C/R techniques have been presented, and they fall into several categories. The authors of \cite{CR-Survey} and \cite{hursey2010coordinated} discuss three categories of C/R: application-level, user-level, and system-level. Each approach offers advantages that appeal to different aspects of reliability. User-level checkpointing, though simpler, results in larger checkpoints. System-level checkpointing requires compatibility with the operating system and any libraries used by the application. Application-level checkpointing requires additional implementation work, but results in smaller, faster C/R. The authors of \cite{SCR} present the SCR (Scalable Checkpoint/Restart) library, which has seen widespread adoption due to its minimal overhead. DMTCP (Distributed MultiThreaded Checkpointing) \cite{dmtcp} and BLCR (Berkeley Lab Checkpoint/Restart) \cite{BLCR} are two other commonly used C/R approaches.
@ -98,14 +98,14 @@ Previous works with RAGE have been designed around maximizing performance to lim
Rather than only a static implementation of storing to the database on disk at a set interval or a set size, the goal was to also allow for dynamically storing to the database only when necessary. This would allow for proper utilization of systems with greater memory, and would reduce fine-tuning of a maximum size variable before database writes on different systems. Since there is an associated cost with preparing
the writes to disk, the communication cost across nodes, the writing to disk itself, and a cost for retrieving items from disk, it may be desirable to store as much in memory for as long as possible and only checkpoint when necessary. When
running RAGE, a new argument can be passed \textit{(-a $<$double$>$)} to specify the amount of memory the tool should use before writing to disk. This argument is a value between 0 and 0.99 to specify a percentage. Alternatively, an integer greater than or equal to 1 can be passed, which allows for a discrete number of states to be held in memory before checkpointing.
If the value passed is between 0 and 1, it is immediately reduced by 10\%. For instance, if 0.6 is passed, it is immediately reduced to 0.5. This acts as a buffer for PostgreSQL. Since queries will consume a variable amount of memory through parsing or preparation,
If the value passed is between 0 and 1, it is immediately reduced by a static subtraction of 10\%. For instance, if 0.6 (60\%) is passed, it is immediately reduced by a static subtraction of 10\% to yield 0.5 (0.6-0.1=0.5, or 60\%-10\%=50\%). This acts as a buffer for PostgreSQL. Since queries will consume a variable amount of memory through parsing or preparation,
an additional 10\% is saved as a precaution. This can be changed as needed or desired for future optimizations. For the graph data specifically, the frontier is allowed to consume half of the allocated memory,
and the graph instance is allowed to consume the other half.
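
The handling of the memory argument described above can be illustrated with a minimal Python sketch. It is not the RAGE implementation: the names \texttt{parse\_memory\_argument}, \texttt{MemoryBudget}, and \texttt{total\_bytes} are hypothetical, the choice to clamp a fraction below 0.1 to zero is an assumption, and rejecting non-integer values of 1 or greater is likewise assumed rather than taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Static subtraction reserved as a buffer for PostgreSQL (10%, per the text).
POSTGRES_BUFFER = 0.10


@dataclass(frozen=True)
class MemoryBudget:
    """Hypothetical container for the limits derived from the -a argument."""
    max_states: Optional[int]      # set when an integer >= 1 is passed
    frontier_bytes: Optional[int]  # set when a fraction is passed
    instance_bytes: Optional[int]  # set when a fraction is passed


def parse_memory_argument(value: float, total_bytes: int) -> MemoryBudget:
    """Turn the -a value into either a state-count limit or byte budgets."""
    if value <= 0:
        raise ValueError("the memory argument must be positive")
    if value >= 1:
        # Discrete number of states held in memory before checkpointing.
        if value != int(value):
            raise ValueError("values of 1 or more must be whole numbers (assumed)")
        return MemoryBudget(int(value), None, None)
    # Fractional case: subtract the static 10% buffer (e.g. 0.6 -> 0.5).
    # Clamping at zero is an assumption; the text does not cover values < 0.1.
    fraction = max(value - POSTGRES_BUFFER, 0.0)
    # round() guards against floating-point error in the subtraction.
    usable = round(fraction * total_bytes)
    frontier = usable // 2          # the frontier receives half
    instance = usable - frontier    # the graph instance receives the rest
    return MemoryBudget(None, frontier, instance)
```

For example, passing 0.6 with a hypothetical 1000-byte total leaves 500 usable bytes after the buffer, split evenly between the frontier and the graph instance.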
To decide when to checkpoint due to memory capacity, two separate checks are made. The first check is for the frontier. If the size of the frontier consumes equal to or more than the allowed allocated memory, then all new states
are stored into a new table in the database called ``unexplored states''. Each new state from this point forward is stored in the table, regardless of whether room is freed in the frontier. This is to ensure proper ordering of the FIFO queue.
The only time new states are stored directly into the frontier is when the unexplored states table is empty. Once the frontier has been completely emptied, new states are then pulled from the database into the frontier. To pull from
the database, the parent loop for the generator process has been altered. Instead of a while loop for when the frontier is not empty, it has been adjusted to when the frontier is not empty or, if the frontier is empty, if the unexplored states table is not empty.The original generation design stored new states
the database, the parent loop for the generator process has been altered. Instead of a while loop for when the frontier is not empty, it has been adjusted to when the frontier is not empty or, if the frontier is empty, if the unexplored states table is not empty. The original generation design stored new states
into the frontier during an OpenMP critical section to avoid testing on already-explored states. To follow this design decision, writing new states to the database is also performed during the critical section.
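
The frontier overflow behavior and the altered parent loop can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the RAGE code: an in-memory deque stands in for the PostgreSQL ``unexplored states'' table, the capacity is counted in states rather than bytes, the OpenMP critical section is omitted, and the names \texttt{SpillingFrontier}, \texttt{generate}, and \texttt{successors} are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Hashable, Iterable, List, Set


class SpillingFrontier:
    """FIFO frontier that overflows into a stand-in 'unexplored states' table."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.frontier: Deque[Hashable] = deque()
        # Stand-in for the database table (assumption: a real system would
        # issue INSERT/SELECT statements here instead).
        self.unexplored_table: Deque[Hashable] = deque()

    def push(self, state: Hashable) -> None:
        # Once anything has spilled, every new state goes to the table, even
        # if the frontier has room again; this preserves FIFO ordering.
        if self.unexplored_table or len(self.frontier) >= self.capacity:
            self.unexplored_table.append(state)
        else:
            self.frontier.append(state)

    def has_work(self) -> bool:
        # Altered loop condition: frontier not empty, or table not empty.
        return bool(self.frontier) or bool(self.unexplored_table)

    def pop(self) -> Hashable:
        # Refill from the table only after the frontier is completely empty.
        if not self.frontier:
            while self.unexplored_table and len(self.frontier) < self.capacity:
                self.frontier.append(self.unexplored_table.popleft())
        return self.frontier.popleft()


def generate(initial: Hashable,
             successors: Callable[[Hashable], Iterable[Hashable]],
             capacity: int) -> List[Hashable]:
    """Explore states breadth-first and return them in exploration order."""
    queue = SpillingFrontier(capacity)
    seen: Set[Hashable] = {initial}
    explored: List[Hashable] = []
    queue.push(initial)
    while queue.has_work():
        state = queue.pop()
        explored.append(state)
        for nxt in successors(state):
            if nxt not in seen:  # skip states that were already discovered
                seen.add(nxt)
                queue.push(nxt)
    return explored
```

Because new states bypass the frontier whenever the table is non-empty, the exploration order is the same as with an unbounded in-memory queue, regardless of the capacity chosen.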
For the graph instance, a check in the critical section determines if the size of the graph instance consumes more than its allocated share of the memory. If it does, the edges, network states, and network state items are written to the database,

Binary file not shown.

Size before: 29 KiB. Size after: 32 KiB.

Binary file not shown.

Size before: 29 KiB. Size after: 32 KiB.

Binary file not shown.

Size before: 39 KiB. Size after: 42 KiB.