1 year ago · 8e38396e23
--- a/figs/plot_expr_10_cropped.pdf
+++ b/figs/plot_expr_10_cropped.pdf
--- a/figs/plot_expr_11_cropped.pdf
+++ b/figs/plot_expr_11_cropped.pdf
--- a/figs/plot_expr_8a_cropped.pdf
+++ b/figs/plot_expr_8a_cropped.pdf
--- a/figs/plot_expr_8b_cropped.pdf
+++ b/figs/plot_expr_8b_cropped.pdf
--- a/sections/Introduction.tex
+++ b/sections/Introduction.tex
@@ -41,7 +41,7 @@ For example, during power-on, decoupling capacitors are rapidly charged using th
 
				 This buffered energy also allows the system to operate for a while at sub-normal voltages after the power supply is stopped.
			
 
				 Additionally, between power cycles, decoupling capacitors discharge due to the resistance of the system, considerably lowering the power efficiency.
			
 
				 In systems with smaller capacitors, these effects dominate the behaviors that are modeled in the traditional execution model.
			
 
				-Consequently, highly efficient techniques according to the traditional model may introduce substantial power overhead and even correctness issues in small-scale systems.
			
 
				+Consequently, checkpoint techniques that are highly efficient according to the traditional model may introduce substantial power overhead and even correctness issues in small-scale systems.
			
 
				 % Consequently, designing software techniques based on the traditional model brings significant power overhead and even correctness issues, even they are extremely efficient in the traditional model.
			
 
				 % While this seems merely delaying the start and the end of the operations at first glance, we will show that it significantly affects the power efficiency and even correctness of software designs.
			
 
				 
			
--- a/sections/OurApproach.tex
+++ b/sections/OurApproach.tex
@@ -12,7 +12,7 @@ In contrast, the dynamic scheme~\cite{jayakumarQUICKRECALL2014,maengSupporting20
 
				 Instead, it executes checkpoints via interrupts from the power management system, generated when $V_{ES}$ reaches $V_l$.
			
 
				 All the evaluations are conducted with 470uF energy storage and 1mA of input current at 1.9V, unless otherwise stated.
			
 
				 
			
 
				-\subsection{Delay Checkpoint Executions}
			
 
				+\subsection{Delaying Checkpoint Executions}
			
 
				 \label{sec:delay_checkpoint_execution}
			
 
				 
			
 
				 The first design practice we propose is to delay checkpoint executions until the last possible moment.
			
@@ -38,27 +38,27 @@ Achieving this fundamentally depends on accurately predicting imminent power fai
 
				 % Consequently, it is important to execute as long as possible whenever the system wakes up.
			
 
				 % In the next section, we discuss how this can be implemented in the existing intermittent systems.
			
 
				 
			
 
				-\subsection{Use $V_{dd}$ with a Reference Voltage for Checkpoint Signals}
			
 
				+\subsection{Using $V_{dd}$ with a Reference Voltage for Checkpoint Signals}
			
 
				 \label{sec:use_vdd_for_checkpoint}
			
 
				 
			
 
				 Sec.~\ref{sec:predicting_power_failures} demonstrates that $V_{ES}$ is not a good estimate for the system's remaining execution time.
			
 
				 Instead, we propose using $V_{dd}$ to more accurately estimate the imminent power-off events, similar to approaches used in systems without a power management system (Sec.~\ref{sec:related_work}).
			
 
				-Also, when obtaining $V_{dd}$, it is important to account for the operations of ADC in sub-normal voltage conditions (Sec.~\ref{sec:sub_normal_execution}).
			
 
				+Additionally, when obtaining $V_{dd}$, it is important to account for the behavior of the ADC under sub-normal voltage conditions (Sec.~\ref{sec:sub_normal_execution}).
			
 
				 
			
 
				 For consistent operation of ADCs, we adopt a voltage source with a known value of $V_{ref}$.
			
 
				 In STM32L5 and MSP430, an internal reference voltage source of 1.2V is available; alternatively, an external voltage reference (e.g., TI LMV431~\cite{texasinstrumentsLMV431}) can be used.
			
 
				 Note that $V_{ref}$ should be lower than the minimum operating voltage of the MCU (e.g., 1.7V) as $V_{ref}$ is generated by regulating $V_{dd}$.
			
 
				-We propose two efficient implementations, $T_{sta}$ and $T_{dyn}$, to accurately detect the imminent power-off events in static and dynamic checkpoint schemes.
			
 
				+We propose two efficient implementations, $S_{sta}$ and $S_{dyn}$, to accurately detect the imminent power-off events in static and dynamic checkpoint schemes, respectively.
			
 
				 
			
 
				-$T_{sta}$ is designed for static checkpoint techniques.
			
 
				-Instead of reading $V_{ES}$ at checkpoint triggers, $T_{sta}$ reads $V_{ref}$. 
			
 
				+$S_{sta}$ is designed for static checkpoint techniques.
			
 
				+Instead of reading $V_{ES}$ at checkpoint triggers, $S_{sta}$ reads $V_{ref}$. 
			
 
				 This results in the same value of $\lfloor V_{ref}/V_{dd} \cdot 2^n \rfloor$ when operating on normal voltage, where $n$ is the ADC resolution.
			
 
				 During sub-voltage execution, this value increases as $V_{dd}$ decreases, as discussed in Sec.~\ref{sec:sub_normal_execution}.
			
 
				 Given that the target threshold voltage for checkpoint execution is $V_{th}$, software designers can compare the ADC value against $\lfloor V_{ref}/V_{th} \cdot 2^n \rfloor$ to determine whether to execute a checkpoint.
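				+As a worked illustration of this comparison (a sketch under assumed values, not the measured configuration): take the internal reference $V_{ref} = 1.2$\,V mentioned above, and assume a 12-bit ADC ($n = 12$), a nominal $V_{dd}$ of 3.3\,V, and a target threshold $V_{th} = 2.5$\,V. The two ADC codes are then
				+\[
				+\left\lfloor \frac{1.2}{3.3} \cdot 2^{12} \right\rfloor = 1489, \qquad \left\lfloor \frac{1.2}{2.5} \cdot 2^{12} \right\rfloor = 1966 .
				+\]
				+Under these assumptions, an ideal ADC reads 1489 at normal voltage, and a checkpoint would be triggered once the reading reaches 1966.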
			
 
				 
			
 
				-On the other hand, $T_{dyn}$ utilizes an on-chip comparator, which is available in most modern MCUs including STM32L5 and MSP430.
			
 
				+On the other hand, $S_{dyn}$ utilizes an on-chip comparator, which is available in most modern MCUs including STM32L5 and MSP430.
			
 
				 As $V_{ref}$ is always lower than $V_{dd}$, we use a voltage divider consisting of two resistors, $R1$ and $R2$, to scale $V_{dd}$ and compare it with $V_{ref}$.
			
 
				-Specifically, we configure $R1$ and $R2$ to satisfy $\frac{R2}{R1+R2} \cdot V_{th} = V_{ref}$, so the comparator generates an interrupt when $V_{dd}$ reaches the threshold voltage $V_{th}$.
			
 
				+Specifically, we configure $R1$ and $R2$ to satisfy $\frac{R2}{R1+R2} \cdot V_{th} = V_{ref}$, so the comparator generates an interrupt when $V_{dd}$ reaches the threshold $V_{th}$.
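				+For illustration, solving this condition for the resistor ratio gives
				+\[
				+\frac{R1}{R2} = \frac{V_{th}}{V_{ref}} - 1 .
				+\]
				+With $V_{ref} = 1.2$\,V and an assumed threshold of $V_{th} = 2.5$\,V (an example value), the ratio is $2.5/1.2 - 1 \approx 1.083$. For instance, the hypothetical values $R1 = 1.3\,\mathrm{M}\Omega$ and $R2 = 1.2\,\mathrm{M}\Omega$ satisfy the condition, since $\frac{1.2}{1.3+1.2} \cdot 2.5 = 1.2$.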
			
 
				 
			
 
				 % T2 is setup for static checkpoint techniques, which poll the capacitor voltage to determine whether execute checkpoint or not.
			
 
				 % Instead of reading the capacitor voltage, it reads the reference voltage.
			
@@ -73,13 +73,13 @@ Specifically, we configure $R1$ and $R2$ to satisfy $\frac{R2}{R1+R2} \cdot V_{t
 
				     \centering
			
 
				     \begin{subfigure}{\linewidth}
			
 
				         \includegraphics[width=\textwidth]{figs/plot_expr_11_cropped.pdf}
			
 
				-        \caption{Static checkpointing with $T_{sta}$.}
			
 
				+        \caption{Static checkpointing with $S_{sta}$.}
			
 
				         \label{fig:expr_precise_checkpoint_timings_static}
			
 
				         \vspace{7pt}
			
 
				     \end{subfigure}
			
 
				     \begin{subfigure}{\linewidth}
			
 
				         \includegraphics[width=\textwidth]{figs/plot_expr_10_cropped.pdf}
			
 
				-        \caption{Dynamic checkpointing with $T_{dyn}$.}
			
 
				+        \caption{Dynamic checkpointing with $S_{dyn}$.}
			
 
				         \label{fig:expr_precise_checkpoint_timings_dynamic}
			
 
				     \end{subfigure}
			
 
				     \caption{Impact of precise checkpoint timings on the end-to-end execution times.}
			
@@ -87,9 +87,9 @@ Specifically, we configure $R1$ and $R2$ to satisfy $\frac{R2}{R1+R2} \cdot V_{t
 
				 \end{figure}
			
 
				 
			
 
				 Fig.~\ref{fig:expr_precise_checkpoint_timings} shows the average end-to-end execution times of the benchmarks over 30 iterations, comparing the traditional systems with the proposed setups.
			
 
				-Fig.~\ref{fig:expr_precise_checkpoint_timings_static} illustrates the performance of $T_{sta}$ and Fig.~\ref{fig:expr_precise_checkpoint_timings_dynamic} presents the result for $T_{dyn}$.
			
 
				+Fig.~\ref{fig:expr_precise_checkpoint_timings_static} illustrates the performance of $S_{sta}$ and Fig.~\ref{fig:expr_precise_checkpoint_timings_dynamic} presents the result for $S_{dyn}$.
			
 
				 The error bars indicate the minimum and maximum measured execution times for each benchmark.
			
 
				-The results clearly demonstrate that the execution time is significantly improved in both systems by extending the operation at sub-normal voltages: 3.04x in $T_{sta}$ and 2.85x in $T_{dyn}$.
			
 
				+The results clearly demonstrate that the execution time is significantly improved in both systems by extending the operation at sub-normal voltages: 3.04x in $S_{sta}$ and 2.85x in $S_{dyn}$.
			
 
				 Furthermore, these improvements are consistent across all benchmarks, regardless of the application characteristics, highlighting the general effectiveness of the proposed setups.
			
 
				 
			
 
				 Another advantage of the proposed setups is their simplicity and practical applicability.
			
--- a/sections/OurModel.tex
+++ b/sections/OurModel.tex
@@ -23,14 +23,14 @@ The computing system equips NVMs along with the MCU and peripherals, and utilize
 
				 
			
 
				 This setup includes two notable decoupling capacitors that affect the execution model of intermittent systems.
			
 
				 The first one (C1 in the figure) is located within the power management system as voltage regulators require a capacitor larger than the device-specific minimum to ensure stable operation.
			
 
				-The second capacitor (C2) is part of the computing system and is used for stabilizing the operating voltage against sudden current draw.
			
 
				+The second capacitor (C2) is part of the computing system and it stabilizes the operating voltage against sudden current draw.
			
 
				 
			
 
				 Recent studies have increasingly explored 32-bit architectures for computing systems~\cite{shihIntermittent2024,wuIntOS2024,kimRapid2024,akhunovEnabling2023,kimLACT2024,kimLivenessAware2023,parkEnergyHarvestingAware2023,kortbeekWARio2022,khanDaCapo2023,barjamiIntermittent2024,songTaDA2024}, as emerging applications on intermittent systems, such as Deep Neural Networks (DNNs)~\cite{houTale2024,yenKeep2023,khanDaCapo2023,gobieskiIntelligence2019,islamEnabling2022,kangMore2022,leeNeuro2019,islamZygarde2020,custodeFastInf2024,barjamiIntermittent2024,songTaDA2024}, demand greater computational capabilities~\cite{bakarProtean2023a,carontiFinegrained2023}.
			
 
				 In this context, we employ a custom-built board featuring a 32-bit ARM Cortex-M33 processor (STM32L5, operating at 16 MHz) with 512 KB of Ferroelectric RAM (FRAM) as our reference system.
			
 
				-For the power management system, we use a TI BQ25570-based board with configuration of $V_h$ = 4.9V and $V_l$ = 3.4V.
			
 
				+For the power management system, we use a TI BQ25570-based board configured with $V_h$ = 4.9V and $V_l$ = 3.4V.
			
 
				 % For the power management system, we use a TI BQ25570-based board with power-on and power-off thresholds of 4.9 V and 3.4 V, respectively.
			
 
				 % A TI BQ25570 based board is used for the power management system, with power-on and off thresholds of 4.9V and 3.4V, respectively.
			
 
				-We empirically select 22uF and 220uF capacitors for C1 and C2, respectively, as the capacitors smaller then these cannot provide a reliable voltage for stable checkpoint and recovery.
			
 
				+We empirically select 22 uF and 220 uF capacitors for C1 and C2, respectively, as smaller capacitors fail to provide a reliable voltage for stable checkpoint and recovery.
			
 
				 % We empirically select 22uF and 220uF capacitors for C1 and C2, respectively, as these are the minimum capacitor sizes for stable checkpoint and recovery.
			
 
				 Sec.~\ref{sec:other_architectures} evaluates the generality of our model across different architectures, such as systems with Magnetic RAM (MRAM) and a 16-bit core (e.g., MSP430).
			
 
				 
			
@@ -53,23 +53,23 @@ Sec.~\ref{sec:other_architectures} evaluates the generality of our model across
 
				     \end{subfigure}
			
 
				     \begin{subfigure}{\linewidth}
			
 
				         \includegraphics[width=\textwidth]{figs/plot_expr_8b_cropped.pdf}
			
 
				-        \caption{Voltage traces around the first power-on.}
			
 
				+        \caption{Voltage traces of the first execution cycle.}
			
 
				         \label{fig:execution_trace_detailed}
			
 
				     \end{subfigure}
			
 
				-    \caption{Voltages trace of energy storage and Vdd.}
			
 
				+    \caption{Voltage traces of energy storage ($V_{ES}$) and $V_{dd}$.}
			
 
				     \label{fig:execution_trace}
			
 
				 \end{figure}
			
 
				 
			
 
				-To derive general execution model with the effects of decoupling capacitors, we first present a sample measurement from our reference system.
			
 
				-Within this paper, we denote the voltage of the energy storage $C_{ES}$ as $V_{ES}$ and the MCU operating voltage as $V_{dd}$.
			
 
				-To achieve an operation time of 50ms under 1.5mA current supply, we use a 470uF capacitor for $C_{ES}$.
			
 
				+To derive a general execution model with the effects of decoupling capacitors, we first present a sample measurement from our reference system.
			
 
				+In this paper, we denote the voltage of the energy storage $C_{ES}$ as $V_{ES}$ and the MCU operating voltage as $V_{dd}$.
			
 
				+To achieve an operation time of 50 ms under a 1.5 mA current supply, we use a 470 uF capacitor for $C_{ES}$.
			
 
				 Fig.~\ref{fig:execution_trace_one_cycle} illustrates the voltage traces of $V_{ES}$ and $V_{dd}$ over a single power cycle.
			
 
				-Note that $V_{dd}$ is maintained by decoupling capacitors after current supply from the power management system stops.
			
 
				+Note that $V_{dd}$ is maintained by decoupling capacitors once the power supply from the power management system stops.
			
 
				 The shaded areas represent the periods during which the system executes the application code.
			
 
				 % Fig.~\ref{fig:execution_trace_one_cycle} shows the trace during one power cycle, and Fig.~\ref{fig:execution_trace_detailed} presents the first execution cycle in more detail.
			
 
				 
			
 
				 Fig.~\ref{fig:execution_trace_detailed} presents the first execution cycle in more detail. It reveals several differences between the traditional execution model and the actual operation.
			
 
				-Among them, we highlight three key observations that affect software design decisions.
			
 
				+Among them, we highlight three key observations that affect software design decisions:
			
 
				 
			
 
				 \begin{itemize}
			
 
				     \item \textbf{O1}: The capacitor voltage ($V_{ES}$) drops rapidly to charge decoupling capacitors when the system wakes up ($t1$--$t2$).
			
@@ -111,19 +111,19 @@ This indicates that much smaller energy may be used for the useful computation c
 
				     \label{fig:power_distribution}
			
 
				 \end{figure}
			
 
				 
			
 
				-Fig.~\ref{fig:power_distribution} shows the distribution of the energy consumption for each stage of operation within one power cycle, averaged over 50 executions, where 1mA of input current is provided at 1.9V.
			
 
				+Fig.~\ref{fig:power_distribution} shows the distribution of the energy consumption for each stage of operation within one power cycle, averaged over 50 executions, where 1 mA of input current is provided at 1.9V.
			
 
				 The x-axis represents energy storage sizes and the line on the secondary axis represents the average operation times for application code.
			
 
				 The checkpoint is executed by the interrupt from the power management system~\cite{jayakumarQUICKRECALL2014,maengSupporting2019,balsamoHibernus2016,balsamoHibernus2015,kortbeekTimesensitive2020}, which is generated when $V_{ES}$ reaches $V_l$ (3.4V).
			
 
				 Note that this is the most efficient point for checkpoint execution according to the traditional model (i.e., just before the power-off).
			
 
				 
			
 
				 The results show that significant energy is wasted in the decoupling capacitors.
			
 
				-For example, 60.7\% of power is wasted during the power-off duration (denoted as \emph{Dischrged}) in 470uF case, leaving just 13.1\% of the energy for computation.
			
 
				+For example, in the 470 uF case, 60.7\% of the energy is lost during the power-off duration (denoted as \emph{Discharged}), leaving only 13.1\% of the energy for computation.
			
 
				 The discharging behavior can be modeled as an RC-discharging circuit (i.e., $q=CVe^{-\frac{t}{RC}}$), which exhibits an exponential discharge rate.
			
 
				-Indeed, we observe that 50\% of energy is discharged within the first 161 ms.
			
 
				-Since recharging $C_{ES}$ takes xx secs even in 470uF configuration, most of the buffered energy is lost at the next power-on regardless of the capacitor sizes.
			
 
				-As a result, the energy loss ratio due to discharging is larger when the capacitor size is small.
			
 
				-While this ratio decreases with larger $C_{ES}$, it remains significant;
			
 
				-for example, in the 1320uF case, 28.5\% of energy is discharged, which is still non-negligible.
			
 
				+Indeed, 50\% of the energy is discharged within the first 161 ms in our measurements.
			
 
				+Since recharging $C_{ES}$ takes xx secs even in the 470 uF configuration, most of the buffered energy is lost before the next power-on, regardless of the capacitor size.
			
 
				+As a result, the energy loss ratio due to discharging is larger with smaller capacitors.
			
 
				+While this ratio decreases with larger $C_{ES}$, it remains substantial;
			
 
				+for example, in the 1320 uF case, 28.5\% of energy is discharged, which is still non-negligible.
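				+As a rough consistency check on the RC model above (a sketch assuming an ideal RC discharge with a single effective time constant $RC$ and stored energy $E = \frac{1}{2}CV^2$), the energy decays twice as fast as the voltage:
				+\[
				+E(t) = E_0 e^{-\frac{2t}{RC}}, \qquad t_{1/2} = \frac{RC \ln 2}{2} .
				+\]
				+Under this assumption, the half-energy time of 161 ms would correspond to an effective time constant of $RC = 2 \cdot 161\,\mathrm{ms} / \ln 2 \approx 465$ ms.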
			
 
				 % The discharge rate decreases as the capacitor size increases, down to 28.5\% in 1320uF case, which is still not negligible.
			
 
				 % The cost is more expensive when the capacitor size is small since the discharge rate follows the RC-discharging circuits.
			
 
				 % While using smaller capacitors shortens the power-off durations, the discharging behavior penalizes them most since RC-discharging circuits discharge exponentially (in our case, 50\% of energy is discharged at the first 161 ms).
			
@@ -131,8 +131,8 @@ for example, in the 1320uF case, 28.5\% of energy is discharged, which is still
 
				 
			
 
				 Another important observation is the error introduced by the traditional model.
			
 
				 The traditional model assumes that both the \emph{Execution} and \emph{Discharged} energies are used for computation.
			
 
				-This introduces huge errors, up to 5.62x in 470uF setup, for example.
			
 
				-In the same context, the traditional model predicts that using a 470uF capacitor instead of a 1320uF would result in only 1.22x overhead in energy efficiency, while the actual difference is 4.71x.
			
 
				+This introduces large errors, for example up to 5.62x in the 470 uF setup.
			
 
				+In the same context, the traditional model predicts that using a 470 uF $C_{ES}$ instead of a 1320 uF would result in only 1.22x overhead in energy efficiency, while the actual difference is 4.71x.
			
 
				 % However, our model shows that the actual energy efficiency differs by xx\% in reality, brining xx\% error in the traditional model.
			
 
				 This can significantly mislead system designers when they select capacitor sizes by considering tradeoffs between overall efficiency and reactiveness.
			
 
				 In Sec.~\ref{sec:design_guidelines}, we explore strategies to minimize the inefficiencies caused by discharging when designing software techniques.
			
@@ -148,9 +148,9 @@ In Sec.~\ref{sec:design_guidelines}, we explore strategies to minimize the ineff
 
				 \subsection{Impact on Predicting Power Failures}
			
 
				 \label{sec:predicting_power_failures}
			
 
				 
			
 
				-According to the traditional model, system states should be saved to NVM before $V_{ES}$ reaches $V_l$, as the system halts at this point.
			
 
				+According to the traditional model, system states should be saved to NVM before $V_{ES}$ reaches $V_l$, as the system is expected to halt at this point.
			
 
				 On the other hand, our model shows that the system may continue operating using the energy stored in the decoupling capacitors (\textbf{O2}). 
			
 
				-Since modern MCUs can operate across a wide range of supply voltages (e.g., from 1.7V to 3.6V in STM32L5 and MSP430), the computing system is executed until the voltage of decoupling capacitors drops to the minimum operating level.
			
 
				+Since modern MCUs can operate across a wide range of supply voltages (e.g., 1.7V to 3.6V in STM32L5 and MSP430), the computing system operates until the voltage of decoupling capacitors drops to the minimum operating level.
			
 
				 % Modern MCUs can operate on a range of supply voltages (e.g., from 1.7V to 3.6V for STM32L5 and MSP430).
			
 
				 % Since the voltage of decoupling capacitors decreases as the discharge, the computing system is executed until the voltage reaches the minimum operating voltage.
			
 
				 % While the voltage of decoupling capacitors decreases as they discharge, the computing system operates since modern MCUs can operate on a range of supply voltages (e.g., from 1.7V to 3.6V for STM32L5 and MSP430).
			
@@ -179,16 +179,16 @@ This makes $V_{ES}$ not a reliable indicator for the imminent power-off.
 
				 Fig.~\ref{fig:sub_voltage_execution} presents the ratio of the times executed under sub-normal voltage to the total execution times, averaged over 30 measurements.
			
 
				 The x-axis represents different capacitor sizes and the colors indicate the voltage levels at which the system stops operation.
			
 
				 We evaluate a range of stop voltages from 1.7V to 2.5V since not all components in the computing system may function at the lowest voltage level (Sec.~\ref{sec:sub_normal_execution}).
			
 
				-Also, we examine two cases with different input currents of 1mA (Fig.~\ref{fig:sub_voltage_execution_1mA}) and 3mA (Fig.~\ref{fig:sub_voltage_execution_3mA}), to assess the impact of varying input power.
			
 
				+Also, we examine two cases with different input currents of 1 mA (Fig.~\ref{fig:sub_voltage_execution_1mA}) and 3 mA (Fig.~\ref{fig:sub_voltage_execution_3mA}), to assess the impact of input power.
			
 
				 
			
 
				-The figure shows that a significant portion of MCU operation occurs at sub-normal voltage.
			
 
				+The figure shows that a significant portion of MCU operation occurs at sub-normal voltages.
			
 
				 For example, when a 470uF capacitor is used at 1mA input current (Fig.~\ref{fig:sub_voltage_execution_1mA}), 82.8\% of computation takes place \emph{after} the power-off threshold.
			
 
				 This ratio decreases as the system stops earlier (reducing sub-voltage operation time) or the input current increases (extending operation time at normal voltage).
			
 
				-However, at least 13.0\% of computations are operated in sub-normal voltage even in a highly optimistic configurations (1320uF in Fig.~\ref{fig:sub_voltage_execution_3mA}).
			
 
				+However, at least 13.0\% of computation takes place at sub-normal voltages even in highly optimistic configurations (1320uF in Fig.~\ref{fig:sub_voltage_execution_3mA}).
			
 
				 % Overall, the average sub-voltage operation ratio is xx\% for the configurations exhibiting less than 100 ms, which is the main focus of this paper. 
			
 
				 
			
 
				 These values can be directly translated to the inefficiencies of the systems based on the traditional model.
			
 
				-For example, in the case of 470uF with 1mA input current, systems executing checkpoint at $V_l$ may operate 16.3ms.
			
 
				+For example, in the case of 470uF with 1mA input current, systems executing checkpoint at $V_l$ may operate for 16.3 ms.
			
 
				 However, the system could operate for an additional 29.4ms if the checkpoint is executed at 2.5V.
			
 
				 At the next power-on, the decoupling capacitors discharge to similar voltage levels in both cases, as their discharge behavior follows an exponential curve (Sec.~\ref{sec:power_efficiency}).
			
 
				 As a result, failing to utilize the available energy at sub-normal voltage introduces significant power efficiency overhead.
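				+To make this example concrete using the numbers above, the total operation time with a checkpoint at 2.5V is
				+\[
				+16.3\,\mathrm{ms} + 29.4\,\mathrm{ms} = 45.7\,\mathrm{ms}, \qquad \frac{45.7}{16.3} \approx 2.80 ,
				+\]
				+that is, roughly 2.8x the operation time obtained when checkpointing at $V_l$.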
			
@@ -200,11 +200,11 @@ In Sec.~\ref{sec:use_vdd_for_checkpoint}, we validate this aspect and propose me
 
				 
			
 
				 The traditional model leads software designers to assume that the system is executed under a stable voltage.
			
 
				 However, a significant portion of execution may happen after the power-off threshold at sub-normal voltages (\textbf{O3}), as discussed in Sec.~\ref{sec:predicting_power_failures}.
			
 
				-Being aware of this is crucial to software designers since analog components and peripherals may function differently at sub-normal voltage.
			
 
				+Being aware of this is crucial to software designers since analog components and peripherals may function differently at sub-normal voltages.
			
 
				 
			
 
				 Two of the most relevant examples are Analog-to-Digital Converters (ADCs) and external NVMs.
			
 
				-They play an important role in checkpointing, since ADCs are often used to determine checkpoint execution by reading $V_{ES}$ and NVM serves as the storage for checkpoints.
			
 
				-At the same time, they are likely to operate at sub-normal voltages, as it is most efficient to execute checkpoint just before power-off.
			
 
				+They play an important role in checkpointing: ADCs are commonly used to determine when to execute a checkpoint by reading $V_{ES}$ and NVM serves as the storage for the checkpoints.
			
 
				+At the same time, they are likely to operate at sub-normal voltages, as checkpoint executions typically happen just before the power-off.
			
 
				 % Incorrect execution of these components may lead to unsafe or incomplete checkpoint executions.
			
 
				 
			
 
				 \begin{figure}
			
@@ -225,17 +225,17 @@ At the same time, they are likely to operate at sub-normal voltages, as it is mo
 
				 \end{figure}
			
 
				 
			
 
				 Fig.~\ref{fig:adc_error} shows the behavior of ADCs at sub-normal voltages.
			
 
				-ADC quantizes the input analog voltage into the range of discrete $2^n$ values from 0 to the given reference voltage, where $n$ is a resolution.
			
 
				-Since $n$ is fixed, using smaller reference voltage results in higher resolution, at the cost of reduced representation range.
			
 
				+An ADC quantizes the input analog voltage into $2^n$ discrete values, ranging from 0 to the given reference voltage, where $n$ is the ADC resolution.
			
 
				+Since $n$ is fixed, using a smaller reference voltage increases the sensitivity of the ADC at the cost of a reduced representation range.
			
 
				 As STM32L5 is designed to use $V_{dd}$ as a reference voltage, accessing the ADC during sub-normal voltage operation leads to inconsistent results.
			
 
				 As shown in the figure, the ADC returns values higher than the measurements since its representation range is decreased as $V_{dd}$ drops.
			
 
				 As a result, during sub-normal voltage operation, the system may incorrectly interpret ADC results as if there is sufficient energy in $C_{ES}$ and decide not to execute a checkpoint, resulting in loss of the progress during the entire power cycle.
			
 
				 
			
 
				-Also, intermittent systems typically designed to be deployed with peripherals, including sensors~\cite{yildizAdaptable2024,dangIoTree2022,afanasovBatteryless2020,maengAdaptive2020}, wireless communication modules~\cite{katanbafMultiScatter2021,dewinkelIntermittentlypowered2022,babatundeGreentooth2024} or external NVMs~\cite{dewinkelIntermittentlypowered2022,kimLACT2024,kimLivenessAware2023,akhunovEnabling2023}, which have their own minimum operating voltage requirements.
			
 
				+Also, intermittent systems are typically designed to operate with peripherals such as sensors~\cite{yildizAdaptable2024,dangIoTree2022,afanasovBatteryless2020,maengAdaptive2020}, wireless communication modules~\cite{katanbafMultiScatter2021,dewinkelIntermittentlypowered2022,babatundeGreentooth2024} or external NVMs~\cite{dewinkelIntermittentlypowered2022,kimLACT2024,kimLivenessAware2023,akhunovEnabling2023}, which have their own minimum operating voltage requirements.
			
 
				 % Also, some peripherals may not work below certain voltage.
			
 
				 Fig.~\ref{fig:fram_drror} illustrates the error rate of FRAM in the reference system at different voltages, showing FRAM cannot operate reliably below 2.4V.
			
 
				 Since the system continues operating until it reaches the lowest MCU operation voltage (e.g., 1.7V), software designers must ensure that peripherals are accessed only at safe voltage levels.
			
 
				-Failing to this can result in corrupted sensor data or unsafe checkpointing.
			
 
				+Failing to do so can result in corrupted sensor data or unsafe checkpointing.
			
 
				 % In Sec.~\ref{sec:design_guidelines}, we propose two techniques that can safely estimate the power-off time under sub-normal voltage conditions.
			
 
				 
			
 
				 \subsection{Sensitivity to Architectural Designs}
			
@@ -262,11 +262,11 @@ Failing to this can result in corrupted sensor data or unsafe checkpointing.
 
				     }
			
 
				     \end{table}
			
 
				 
			
 
				-To assess generality, we evaluate the proposed model across two additional architectural setups.
			
 
				+To evaluate the generality of the proposed model, we assess it across two additional architectural setups.
			
 
				 Table~\ref{tab:architectures} shows the detailed parameters of the target architectures.
			
 
				 A1 shares the same configuration as the reference system but is equipped with MRAM (Everspin MR5A16ACYS35) instead of FRAM.
			
 
				 This setup is included since MRAM is also gaining attention as a next generation NVM~\cite{akhunovEnabling2023,bakarProtean2023a,dewinkelIntermittentlypowered2022,wuIntOS2024}.
			
 
				-Second target is MSP430, which has been the mostly adopted 16-bit platform in intermittent system research.
			
 
				+The second target is MSP430, a widely adopted 16-bit platform in intermittent system research.
			
 
				 For both systems, the architectural parameters are set to achieve an operation time of approximately 50 ms.
			
 
				 
			
 
				 \begin{figure}
			
@@ -277,10 +277,10 @@ For both systems, the architectural parameters are set to achieve an operation t
 
				 \end{figure}
			
 
				 
			
 
				 Fig.~\ref{fig:other_architectures} shows the results for different power-off voltages.
			
 
				-The bars on the left illustrate the energy breakdown in a single power cycle, and the bars on the right represent the ratio of the execution time operated at sub-voltage.
			
 
				+The bars on the left illustrate the energy breakdown in a single power cycle, and the bars on the right represent the ratio of the execution time operated at sub-normal voltages.
			
 
				 The most noticeable difference is the ratio of energy consumed during the \emph{Ramp-up \& Init} stage.
			
 
				 While A1 consumes 63.4\% of the energy at this stage on average, only 5.6\% of the energy is consumed in A2.
			
 
				-This is because A1 is configured with an external MRAM, which exhibits significantly higher leakage current even compared to the FRAM used in the reference system.
			
 
				+This is because A1 is configured with an external MRAM, which exhibits significantly higher leakage current, even compared to the FRAM used in the reference system.
			
 
				 In contrast, A2 is equipped with on-chip FRAM, which has much lower leakage.
			
 
				 
			
 
				 Despite these differences, both architectures exhibit high sub-voltage execution rates, up to 55.5\% in A1 and 70.1\% in A2.
			
--- a/sections/RelatedWork.tex
+++ b/sections/RelatedWork.tex
@@ -2,13 +2,13 @@
 
				 \label{sec:related_work}
			
 
				 
			
 
				 This work can be compared to existing modeling-based approaches that estimate the timing or efficiency of intermittent systems~\cite{kimRapid2024,houTale2024,erataETAP2023,ghasemiPES2023,sanmiguelEH2018a,sanmiguelEH2018}.
			
 
				-The primary focus of these works is to identify the most efficient design configurations (e.g., capacitor size, input power or checkpoint techniques~\cite{kimRapid2024}) for a given application.
			
 
				+The primary focus of these works is identifying the most efficient design configurations (e.g., capacitor size, input power or checkpoint techniques~\cite{kimRapid2024}) for a given application.
			
 
				 Zhan et al.~\cite{zhanExploring2022} especially examined the trade-offs between capacitor sizes and forward progress.
			
 
				-However, the modelings in these works assume that the entire energy discharged from the capacitor is utilized by computing system, overlooking the buffering effects addressed in this work.
			
 
				+However, these works assume that the entire energy discharged from the capacitor is utilized by the computing system, overlooking the buffering effects addressed in this work.
			
 
				 Furthermore, our work proposes several practical guidelines to improve the efficiency of existing techniques with minimal effort.
			
 
				 
			
 
				 In some works that do not have a dedicated power management system and directly supply unregulated power to the computing system~\cite{balsamoHibernus2015,balsamoHibernus2016,netoDiCA2023,raffeckCO2CoDe2024,reymondEarlyBird2024}, the MCU operating voltage ($V_{dd}$) has been used as a checkpoint signal.
			
 
				-This is natural in these works since the voltage of the energy storage ($V_{ES}$) is always identical to $V_{dd}$.
			
 
				+This is natural in these works since the voltage of the energy storage is always identical to $V_{dd}$.
			
 
				 % This is natural in these works since the voltage of the energy storage is always same as Vdd and the MCU operates in varying voltage levels.
			
 
				 In contrast, our work demonstrates that accounting for sub-normal voltage operation is also critical in systems with regulated power supplies, which represent the majority of intermittent system setups.
			
 
				 % Especially, this work reveals that these impacts come from the buffering effects of the inherent capacitance, which are not exist in these works.