1 year ago · e651eea350
--- a/IEEE-conference-template-062824.tex
+++ b/IEEE-conference-template-062824.tex
@@ -27,14 +27,14 @@
 
															 \title{Intermittent Systems at Small Scale: Execution Model and Design Guidelines \\
														
 
															 % \thanks{This work was supported by IITP grant funded by the Korea government (MSIT) (No.2021-0-00360, Development of Core Technology for Autonomous Energy-driven Computing System SW in Power-instable Environment).}
														
 
															-\thanks{This work was supported by IITP grant funded by the Korea government (MSIT) (No.2021-0-00360).}
														
 
															+% \thanks{This work was supported by IITP grant funded by the Korea government (MSIT) (No.2021-0-00360).}
														
 
															 }
														
 
															-\author{\IEEEauthorblockN{Youngbin Kim and Yoojin Lim}
														
 
															-\IEEEauthorblockA{yb.kim@etri.re.kr, yoojin.lim@etri.re.kr \\
														
 
															-Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
														
 
															-}
														
 
															-}
														
 
															+% \author{\IEEEauthorblockN{Youngbin Kim and Yoojin Lim}
														
 
															+% \IEEEauthorblockA{yb.kim@etri.re.kr, yoojin.lim@etri.re.kr \\
														
 
															+% Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
														
 
															+% }
														
 
															+% }
														
 
															 \maketitle
														
--- a/figs/plot_expr_8b_cropped.pdf
+++ b/figs/plot_expr_8b_cropped.pdf
--- a/sections/Introduction.tex
+++ b/sections/Introduction.tex
@@ -27,7 +27,7 @@ Software designers aim to leverage this execution model to implement intermitten
 
															 In the meantime, recent research on intermittent systems is increasingly exploring shorter operation times by using smaller capacitors.
														
 
															 % In the meantime, researches on intermittent systems are increasingly exploring shorter operation times by using smaller capacitors (e.g., less than 1mF~\cite{ahmedEfficient2019}).
														
 
															 Operating on small capacitors is generally desirable, as it reduces device volume and enhances responsiveness by enabling the system to wake up more frequently~\cite{bakarProtean2023a,maengAdaptive2020,alsubhiStash2024}.
														
 
															-As a result, recent studies have targeted operation times in the range of tens of milliseconds~\cite{reymondSCHEMATIC2024,wuIntOS2024,yildizEfficient2023,choiCompilerDirected2022} or even microseconds~\cite{reymondSCHEMATIC2024,wuIntOS2024}.
														
 
															+As a result, recent studies have targeted operation times in the range of tens of milliseconds~\cite{reymondSCHEMATIC2024,wuIntOS2024,yildizEfficient2023,choiCompilerDirected2022,netoDiCA2023,kortbeekWARio2022,kortbeekTimesensitive2020} or even microseconds~\cite{reymondSCHEMATIC2024,wuIntOS2024}.
														
 
															 % However, as energy storage size decreases, the traditional execution model no longer provides an accurate abstraction of actual execution behavior.
														
 
															 However, as energy storage sizes decrease, the traditional execution model is failing to provide an accurate abstraction of actual execution behavior.
														
 
															 % The challenge is that the traditional execution model does not provide precise abstraction of the real execution anymore when the energy storage is very small.
														
--- a/sections/OurApproach.tex
+++ b/sections/OurApproach.tex
@@ -10,7 +10,7 @@ In \emph{static}, checkpoint triggers are inserted at every loop latch in the pr
 
															 At runtime, checkpoint triggers examine $V_{ES}$ and execute a checkpoint only when it is below a predefined threshold.
														
 
															 In contrast, \emph{dynamic}~\cite{jayakumarQUICKRECALL2014,maengSupporting2019,balsamoHibernus2016,balsamoHibernus2015,kortbeekTimesensitive2020} does not modify the original program code.
														
 
															 Instead, it executes checkpoints via interrupts from the power management system, generated when $V_{ES}$ reaches $V_l$.
														
 
															-These schemes are considered since most checkpoint techniques exploit $V_{ES}$ by either actively polling it (as in \emph{static}) or receiving a signal (as in \emph{dynamic}).
														
 
															+These schemes are considered since most checkpoint techniques utilize $V_{ES}$ by either actively polling it (as in \emph{static}) or by receiving a signal (as in \emph{dynamic}).
														
 
															 All the evaluations are conducted with 470uF energy storage and 1mA of input current at 1.9V, unless otherwise stated.
														
 
															 \subsection{Delaying Checkpoint Executions}
														
@@ -19,7 +19,7 @@ All the evaluations are conducted with 470uF energy storage and 1mA of input cur
 
															 The first design practice we propose is to delay checkpoint executions until the last possible moment.
														
 
															 While this practice is generally regarded as desirable in existing works~\cite{ransfordMementos2011,bhattiHarvOS2017}, it has not been recognized as a critical property.
														
 
															 Under the traditional execution model, early checkpoint execution is often considered acceptable as it makes the system wake up sooner, incurring only minor costs for initialization and recovery.
														
 
															-For example, some approaches have explored proactive power-offs based on the program's worst-case execution time~\cite{choiCompilerDirected2022,reymondSCHEMATIC2024}.
														
 
															+For example, some approaches have explored proactive power-offs based on the program's worst-case execution time~\cite{choiCompilerDirected2022,reymondSCHEMATIC2024,raffeckWoCA2024}.
														
 
															 % For example, some approaches have explored proactive power-offs based on the program's worst-case execution time~\cite{choiCompilerDirected2022,reymondSCHEMATIC2024}, which can be overly pessimistic~\cite{raffeckWoCA2024}.
														
 
															 In contrast, our model reveals that significant energy is wasted each time the system powers off (Sec.~\ref{sec:power_efficiency}).%, highlighting the impact of delaying checkpoint executions.
														
 
															 % As a result, the importance of delaying checkpoint executions is greater than previously assumed.
														
@@ -35,7 +35,7 @@ We evaluate the impact of delaying checkpoint executions in \emph{dynamic}, by v
 
															 A 1100uF capacitor is used for $C_{ES}$.
														
 
															 % Fig.~\ref{fig:expr_checkpoint_voltages} presents the benchmark execution times in dynamic checkpoint scheme, across various checkpoint execution voltages.
														
 
															 % A 1100uF capacitor is used for $C_{ES}$ and the execution times are normalized to the 3.4V configuration. 
														
 
															-Fig.~\ref{fig:expr_checkpoint_voltages} presents the average execution times of the benchmarks over 30 runs, normalized to the 3.V configuration.
														
 
															+Fig.~\ref{fig:expr_checkpoint_voltages} presents the average execution times of the benchmarks over 30 runs, normalized to the 3.4V configuration.
														
 
															 Contrary to existing expectations, the results show that executing checkpoints earlier is significantly less efficient: execution is slower by 1.38x in the 3.7V setup and by 2.45x in the 4.0V setup, on average.
														
 
															 Moreover, the overhead is consistent across all benchmarks since early checkpoint executions directly reduce the energy available for the computing system.
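															 As a rough illustration, assume an ideal capacitor with no leakage or conversion loss (a simplification introduced here, not part of the measured setup), and let $V_{ckpt}$ be a hypothetical symbol for the voltage at which the checkpoint is executed.
															 The energy still stored in $C_{ES}$ at that moment is then
															 \[
															 E(V_{ckpt}) = \frac{1}{2} C_{ES} V_{ckpt}^{2},
															 \]
															 which for $C_{ES} = 1100$uF gives approximately 6.36mJ at 3.4V, 7.53mJ at 3.7V, and 8.80mJ at 4.0V.
															 Under this assumption, checkpointing at 3.7V or 4.0V instead of 3.4V forgoes roughly 1.17mJ or 2.44mJ, respectively, that could otherwise have driven computation before the checkpoint.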
														
 
															 % Consequently, to design efficient checkpoint techniques, it important to minimize the margin between checkpoint execution and the power-off.
														
@@ -62,7 +62,7 @@ For consistent operation of ADCs, we adopt a voltage source with a known value o
 
															 In STM32L5 and MSP430, an internal reference voltage source of 1.2V is available; alternatively, an external voltage reference (e.g., TI LMV431~\cite{texasinstrumentsLMV431}) can be used.
														
 
															 Note that $V_{ref}$ should be lower than the minimum operating voltage of the MCU (e.g., 1.7V) as $V_{ref}$ is generated by regulating $V_{dd}$.
														
 
															-$S_{sta}$ is designed for the techniques like \emph{static}, which query to decide checkpoint execution at checkpoint triggers.
														
 
															+$S_{sta}$ is designed for techniques similar to \emph{static}, which query whether to execute a checkpoint at checkpoint triggers.
														
 
															 Since directly reading $V_{dd}$ is infeasible (i.e., $V_{dd}$ itself is a reference voltage), $S_{sta}$ reads $V_{ref}$ instead.
														
 
															 % Instead of reading $V_{ES}$ at checkpoint triggers, $S_{sta}$ reads $V_{ref}$. 
														
 
															 This results in the same value of $\lfloor V_{ref}/V_{dd} \cdot 2^n \rfloor$ when operating on normal voltage, where $n$ is the ADC resolution.
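															 As a worked example, assume a 12-bit ADC ($n = 12$) and a nominal $V_{dd}$ of 3.3V; both values are illustrative assumptions, while $V_{ref} = 1.2$V is the internal reference mentioned above.
															 The reading at normal voltage would then be
															 \[
															 \left\lfloor \frac{1.2}{3.3} \cdot 2^{12} \right\rfloor = \lfloor 1489.45 \rfloor = 1489 .
															 \]
															 If $V_{dd}$ sagged to 2.0V, the same expression would give $\lfloor 2457.6 \rfloor = 2457$, so under these assumptions a rising reading of $V_{ref}$ indicates a falling $V_{dd}$.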
														
@@ -102,7 +102,7 @@ Specifically, we configure $R1$ and $R2$ to satisfy $\frac{R2}{R1+R2} \cdot V_{t
 
															 Fig.~\ref{fig:expr_precise_checkpoint_timings} compares the average execution times of the benchmarks over 30 iterations between traditional systems and the proposed setups.
														
 
															 Fig.~\ref{fig:expr_precise_checkpoint_timings_static} and Fig.~\ref{fig:expr_precise_checkpoint_timings_dynamic} illustrate the results of $S_{sta}$ and $S_{dyn}$, respectively.
														
 
															 % illustrates the performance of $S_{sta}$ and Fig.~\ref{fig:expr_precise_checkpoint_timings_dynamic} presents the result for $S_{dyn}$.
														
 
															-The error bars indicate the minimum and maximum execution times for each benchmark.
														
 
															+The whiskers indicate the minimum and maximum execution times for each benchmark.
														
 
															 The results show significant improvements in execution times for both systems, with average gains of 3.04x in $S_{sta}$ and 2.85x in $S_{dyn}$.
														
 
															 While the effectiveness of checkpoint schemes varies depending on application characteristics, our setups evenly enhance performance across all benchmarks.
														
 
															 This underscores the importance of accurately detecting power-off events for efficient intermittent system operation.
														
@@ -135,7 +135,7 @@ F2 is set to have the lowest access latency but requires the system stop operati
 
															 Fig.~\ref{fig:expr_peripheral_voltage} presents the execution times of the benchmarks for the two configurations in $S_{dyn}$, averaged over 30 runs.
														
 
															 Despite its doubled latency, F1 completes the workloads 1.46x faster on average, with consistent improvements across all benchmarks.
														
 
															 These results suggest that using a slower FRAM that operates down to 1.8V (e.g.,~\cite{fujitsuMB85R4M2T}) could considerably improve the performance of our reference system.
														
 
															-This example clearly shows that operating voltage, often overlooked in the traditional execution model, should be considered a critical design parameter.
														
 
															+This example clearly shows that operating voltage, often overlooked in the traditional model, should be considered a critical design parameter.
														
 
															 Finally, our model highlights the advantages of using smaller decoupling capacitors.
														
 
															 Larger buffers not only increase the ratio of sub-normal voltage operations but also raise the amount of discharged energy during power-offs.