1 year ago · dd5e860a63
--- a/files/figures.pptx
+++ b/files/figures.pptx
--- a/sections/Introduction.tex
+++ b/sections/Introduction.tex
@@ -2,12 +2,12 @@
 
															 Batteryless systems are emerging as a promising future platform of Internet-of-Things (IoT) devices.
														
 
															 These systems adopt a small capacitor as an energy storage and operate by harvesting power from environmental sources.
														
 
															 This setup effectively addresses challenges associated with traditional battery-based systems, such as the need for human intervention for recharging or replacement~\cite{choiCompilerDirected2022} and harmful environmental impacts~\cite{ahmedInternet2024}.
														
 
															-These systems are also known as intermittent systems, since the computation happens intermittently during short time only when there exist sufficient power to compute.
														
 
															+They are also known as intermittent systems, since computation happens intermittently, for short periods, only when there is sufficient power to compute.
														
 
															-Intermittent systems require software support to sustain long-running executions across power failures.
														
 
															+Intermittent systems require software supports to sustain long-running executions across power failures.
														
 
															 % An intermittent system requires software support to retain volatile system state information across power interruptions. 
														
 
															 During operation, volatile data (e.g., registers or SRAM data) must be saved to Non-Volatile Memories (NVMs) through a process known as checkpointing.
														
 
															-When power is restored, this saved state is recovered to allow operations to continue the execution (recovery). 
														
 
															+When power is restored, this saved state is recovered so that the system can resume execution from the last saved context (recovery). 
														
 
															 In designing these state retention techniques, software designers rely on an \emph{execution model} that abstracts hardware-level operations and represents behavior of intermittent systems necessary for software design.
														
 
															 Figure 1 illustrates such an execution model, commonly adopted in the literature~\cite{}. 
														
@@ -31,15 +31,15 @@ As a result, recent studies have targeted operation times in the range of tens o
 
															 However, as energy storage size decreases, the traditional execution model is failing to provide an accurate abstraction of actual execution behavior.
														
 
															 % The challenge is that the traditional execution model does not provide precise abstraction of the real execution anymore when the energy storage is very small.
														
 
															 The major source of this discrepancy is the buffering effect of the system's inherent capacitance, mostly coming from its decoupling capacitors.
														
 
															-This aspect is overlooked in the traditional model, as the inherent capacitance was considered negligible compared to the main energy storage.
														
 
															+This aspect is overlooked in the traditional model, as the inherent capacitance was considered negligible compared to that of the main energy storage.
														
 
															 Decoupling capacitors are on-board capacitors that act as energy buffers.
														
 
															 They are mandatory components since the buffered energy prevents transient voltage drops when the system suddenly draws a large current, such as during checkpointing (Sec.~\ref{sec:system_description}).
														
 
															 However, at the same time, their buffering effect introduces discrepancies between the execution model and the actual system behavior.
														
 
															-For example, when the system powers on, decoupling capacitors are quickly charged using the energy in the storage, making the energy storage voltage an unreliable estimate of available energy.
														
 
															+For example, when the system powers on, decoupling capacitors are quickly charged using the energy in the storage, making the capacitor voltage an unreliable estimate of available energy.
														
 
															 This buffered energy also allows the system to operate for a while at a sub-normal voltage after the power supply is stopped.
														
 
															 Additionally, between power cycles, decoupling capacitors discharge due to the resistance of the system, considerably lowering the power efficiency.
														
 
															-In the systems with small capacitors, these effects dominate the behaviors that are modeled in the traditional model.
														
 
															+In systems with small capacitors, these effects dominate the behaviors that are modeled in the traditional execution model.
														
 
															 Consequently, techniques that are highly efficient according to the traditional model may introduce substantial power overhead and even correctness issues in small-scale systems.
														
 
															 % Consequently, designing software techniques based on the traditional model brings significant power overhead and even correctness issues, even they are extremely efficient in the traditional model.
														
 
															 % While this seems merely delaying the start and the end of the operations at first glance, we will show that it significantly affects the power efficiency and even correctness of software designs.
														
@@ -60,4 +60,4 @@ In this paper, we propose a new execution model for intermittent systems which i
 
															 In Sec.~\ref{sec:detailed_execution_model}, we demonstrate that understanding this model is critical for software designers:
														
 
															 intermittent systems designed upon the traditional model can be up to 5.62x more energy-inefficient than expected and may fail to predict power-off timings accurately, leading to unsafe checkpointing. 
														
 
															 In Sec.~\ref{sec:design_guidelines}, we propose design guidelines to implement efficient and safe intermittent systems with small energy storages, based on the insights from our model.
														
 
															-Without incurring any extra overhead, our proposed power failure prediction methods improve end-to-end execution latency of both dynamic and static checkpointing schemes, by 2.86x and 3.04x on average, respectively.
														
 
															+Our proposed power failure prediction methods improve end-to-end execution latencies by 2.86x in dynamic and 3.04x in static checkpointing schemes on average, without incurring any additional overhead.
														
--- a/sections/OurModel.tex
+++ b/sections/OurModel.tex
@@ -26,9 +26,8 @@ The first one (C1 in the figure) is placed at the power management system as vol
 
															 Also, the computing system has its own decoupling capacitor (C2) to stabilize operating voltage.
														
 
															 Recent studies increasingly explore 32-bit architectures for the computing system~\cite{shihIntermittent2024,wuIntOS2024,kimRapid2024,akhunovEnabling2023,kimLACT2024,kimLivenessAware2023,parkEnergyHarvestingAware2023,kortbeekWARio2022,khanDaCapo2023,barjamiIntermittent2024,songTaDA2024}, as emerging applications on intermittent systems, such as Deep Neural Networks (DNNs)~\cite{houTale2024,yenKeep2023,khanDaCapo2023,gobieskiIntelligence2019,islamEnabling2022,kangMore2022,leeNeuro2019,islamZygarde2020,custodeFastInf2024,barjamiIntermittent2024,songTaDA2024}, demand greater computational capabilities~\cite{bakarProtean2023a,carontiFinegrained2023}.
														
 
															-% Emerging intermittent applications demand increasing computing capability~\cite{bakarProtean2023a} due to computat
														
 
															-In this context, we employ a custom-built board featuring a 32-bit ARM Cortex-M33 processor (operating at 16Mhz) with 1MB of Ferroelectric RAM (FRAM, Infineon FM22L16) as a reference system.
														
 
															-A TI BQ25570 based board is used for the power management system.
														
 
															+In this context, we employ a custom-built board featuring a 32-bit ARM Cortex-M33 processor (STM32L5, operating at 16MHz) with 512KB of Ferroelectric RAM (FRAM) as a reference system.
														
 
															+A TI BQ25570-based board is used for the power management system, with power-on and power-off thresholds of 4.9V and 3.4V, respectively.
														
 
															 We empirically select 22uF and 220uF capacitors for C1 and C2, respectively, as these are the minimum capacitor sizes for stable checkpoint and recovery.
														
 
															 Sec.~\ref{sec:other_architectures} evaluates generality of our model in different architectures, such as systems with Magnetic RAM (MRAM) and a 16-bit core (e.g., MSP430).
														
@@ -83,11 +82,11 @@ Among them, we highlight three key observations that affect software designer's
 
															 % As we discuss in the following sections, all three observations significantly impact the performance of intermittent system designs.
														
 
															 % We propose a detailed execution model which reflects these observations.
														
 
															-Fig.~\ref{fig:detailed_execution_model} shows our detailed execution model, which reflects all the key observations.
														
 
															+Fig.~\ref{fig:detailed_execution_model} shows our detailed execution model, which reflects these key observations.
														
 
															 When the capacitor voltage reaches the power-on threshold, the voltage experiences a quick drop due to the buffering effect (\circled{1}), instead of a gradual reduction.
														
 
															 After initialization (\circled{2}), the system starts to execute at normal voltage (\circled{3}), 3.3V for example.
														
 
															 When the voltage hits the power-off threshold, the power supply stops, but the system continues to execute using the buffered energy (\circled{4}).
														
 
															-Since voltage of the decoupling capacitor decreases as it discharges, the system executes at sub-normal voltage until it reaches the voltage it cannot operate (e.g., 1.7V).
														
 
															+Since the voltage of the decoupling capacitor decreases as it discharges, the system executes at sub-normal voltage until it reaches a voltage at which it can no longer operate (e.g., 2.5V).
														
 
															 % This voltage is known as Brown-Out Reset (BOR) voltage and is typically in a range of 1.7V to 2.5V in modern MCUs~\cite{}.
														
 
															 Finally, until the next power-on, the remaining energy in decoupling capacitors continues to discharge (\circled{5}).
														
@@ -97,7 +96,7 @@ In the following sections, we discuss the impact of this model to software desig
 
															 \subsection{Impact on Power Efficiency}
														
 
															 \label{sec:power_efficiency}
														
 
															-The traditional model implies that the energy consumed between power-on and power-off thresholds are entirely used for the computing system.
														
 
															+The traditional model implies that the energy consumed between the power-on and power-off thresholds is entirely used in the computing system.
														
 
															 However, our model reveals that considerable energy is used for charging the decoupling capacitors (\textbf{O1}) and dissipated during power-off durations (\textbf{O3}).
														
 
															 This implies that much less energy may be available for useful computation than the designer expects.
														
@@ -108,15 +107,14 @@ This implies that much smaller energy may be used for the useful computation com
 
															     \label{fig:power_distribution}
														
 
															 \end{figure}
														
 
															-Fig.~\ref{fig:power_distribution} shows the distribution of the energy consumed for each stage of operation within one power cycle, averaged over 50 executions.
														
 
															-An 1mA of input current is provided at 1.9V.
														
 
															+Fig.~\ref{fig:power_distribution} shows the distribution of the energy consumed for each stage of operation within one power cycle, averaged over 50 executions, where 1mA of input current is provided at 1.9V.
														
 
															 The x-axis represents different capacitor sizes, and the line on the secondary axis represents the average operation time for application code.
														
 
															 The checkpoint is executed by the interrupt from the power management system~\cite{}, which is generated when the capacitor voltage reaches the power-off threshold (3.4V).
														
 
															 Note that this is the most efficient point for checkpoint execution according to the traditional model.
														
 
															 The results show that significant energy is wasted in the decoupling capacitors.
														
 
															 For example, 60.7\% of power is wasted during the power-off duration (denoted as \emph{Discharged}) in the 470uF case.
														
 
															-The discharging behavior can be modeled as RC-discharging circuits (i.e., $q=CVe^{-\frac{1}{RC}t}$), which show exponential discharge rate.
														
 
															+The discharging behavior can be modeled as an RC discharging circuit (i.e., $q=CVe^{-\frac{1}{RC}t}$), which has an exponential discharge rate.
														
 
															 As a result, the cost of discharging is higher when the capacitor size is small;
														
 
															 in our case, 50\% of energy is discharged in the first 161 ms.
														
 
															 The discharge rate decreases as the capacitor size increases, down to 28.5\% in the 1320uF case, which is still not negligible.
														
@@ -130,7 +128,7 @@ This introduces significant errors, up to 5.62x in 470uF setup.
 
															 In the same context, the traditional model expects that using a 470uF capacitor instead of a 1320uF one results in merely a 1.22x overhead in energy efficiency, but the actual energy efficiency differs by 4.71x.
														
 
															 % However, our model shows that the actual energy efficiency differs by xx\% in reality, brining xx\% error in the traditional model.
														
 
															 This can significantly mislead system designers when they choose the capacitor size by weighing the tradeoff between overall efficiency and reactiveness.
														
 
															-In Sec.~\ref{sec:design_guidelines}, we discuss our guidelines to minimize overhead from discharging when designing software techniques.
														
 
															+In Sec.~\ref{sec:design_guidelines}, we discuss options to minimize overhead from discharging when designing software techniques.
														
 
															 % More importantly, this wasted energy is expected to be used for the computation in traditional execution model, as all the energy except for the initialization and checkpoint/recovery is expected to be used in computations.
														
 
															 % It brings significant errors between the two models in available energy for the execution.
														
@@ -141,11 +139,13 @@ In Sec.~\ref{sec:design_guidelines}, we discuss our guidelines to minimize overh
 
															 % In Sec.~\ref{sec:design_guidelines}, we discuss our guidelines to maximize power efficiency with software-level designs.
														
 
															 \subsection{Impact on Predicting Power Failures}
														
 
															+\label{sec:predicting_power_failures}
														
 
															 According to the traditional model, the system state should be saved to NVM before the power-off threshold, as the system halts at this point.
														
 
															 On the other hand, our model shows that the system may operate afterward using the energy stored in the decoupling capacitors (\textbf{O2}). 
														
 
															-Modern MCUs can operate on a range of supply voltages (e.g., from 1.7V to 3.6V for STM32L5 and MSP430).
														
 
															-Since the voltage of decoupling capacitors decreases as the discharge, the computing system is executed until the voltage reaches the minimum operating voltage.
														
 
															+Since modern MCUs can operate on a range of supply voltages (e.g., from 1.7V to 3.6V in STM32L5 and MSP430), the computing system keeps executing until the voltage of the decoupling capacitors reaches the minimum operating voltage.
														
 
															+% Modern MCUs can operate on a range of supply voltages (e.g., from 1.7V to 3.6V for STM32L5 and MSP430).
														
 
															+% Since the voltage of decoupling capacitors decreases as the discharge, the computing system is executed until the voltage reaches the minimum operating voltage.
														
 
															 % While the voltage of decoupling capacitors decreases as they discharge, the computing system operates since modern MCUs can operate on a range of supply voltages (e.g., from 1.7V to 3.6V for STM32L5 and MSP430).
														
 
															 This makes the energy storage voltage a poor estimate of the remaining time for which the system can execute.
														
@@ -169,43 +169,46 @@ This makes the energy storage voltage not a good estimate of the remaining time
 
															 % Modern MCUs can operate on wide range of operating voltages (e.g., from 1.7V to 3.6V for STM32L5 and MSP430).
														
 
															 Fig.~\ref{fig:sub_voltage_execution} shows the ratio of the time executed at sub-normal voltage to the total execution time, averaged over 30 measurements.
														
 
															-The x-axis shows the different capacitor sizes and the colors represent the voltages that system enters sleep state.
														
 
															+The x-axis shows the different capacitor sizes and the colors represent the voltages at which the system stops its operation.
														
 
															 We evaluate various voltages ranging from 1.7V to 2.5V since not all components in the computing system may operate at the lowest voltage (Sec.~\ref{sec:sub_normal_execution}).
														
 
															 Also, we present two different cases with input current of 1mA (Fig.~\ref{fig:sub_voltage_execution_1mA}) and 3mA (Fig.~\ref{fig:sub_voltage_execution_3mA}) to evaluate the impact of input power.
														
 
															-The figure shows that significant MCU operation is executed under sub-normal voltage.
														
 
															+The figure shows that a significant portion of MCU operation is executed at sub-normal voltage.
														
 
															 For example, when a 470uF capacitor is used at 1mA input current (Fig.~\ref{fig:sub_voltage_execution_1mA}), 82.8\% of computation is executed \emph{after} the power-off threshold.
														
 
															 The ratio decreases as the system powers off earlier (reduced sub-voltage operation time) or the input current increases (longer operation time at normal voltage).
														
 
															 Capacitor sizes under 1000uF are the major focus of this paper.
														
 
															 These values can be directly translated to the inefficiency of the system based on the traditional model.
														
 
															-For example, in 470uF with 1mA input current case, systems executing checkpoint at power-off threshold execute 16.3 ms while it can operate 29.4 ms more if it execute checkpoint at 2.5V.
														
 
															-Although executing checkpoint early may save some energy in decoupling capacitors, the saved energy is not preserved as discussed in Sec.~\ref{sec:power_efficiency}.
														
 
															-In Sec.~\ref{sec:design_guidelines}, we validate this aspect and propose a method to execute checkpoint truly just before the poweroff.   
														
 
															+For example, in the 470uF case with 1mA input current, a system that executes the checkpoint at the power-off threshold operates for 16.3ms, although it could operate 29.4ms longer if it executed the checkpoint at 2.5V.
														
 
															+At the next power-on, the decoupling capacitors have discharged to similar voltages in either case, as capacitors discharge exponentially (Sec.~\ref{sec:power_efficiency}).
														
 
															+As a result, failing to execute at sub-normal voltage introduces significant power efficiency overhead.
														
 
															+% Although early checkpoint execution may save some energy in decoupling capacitors, the saved energy is not preserved as discussed in Sec.~\ref{sec:power_efficiency}.
														
 
															+In Sec.~\ref{sec:design_guidelines}, we validate this aspect and propose a method to predict the power-off time more accurately.
														
 
															 \subsection{Impact of Sub-normal Voltage Execution}
														
 
															 \label{sec:sub_normal_execution}
														
 
															 The traditional model leads software designers to assume that the system executes at a stable voltage.
														
 
															-However, the execution after the power-off threshold (\textbf{O3}) happens in sub-normal voltage.
														
 
															-Being aware of this is important to the software designers since the peripherals and analog components may function differently.
														
 
															+However, the majority of execution may happen after the power-off threshold at sub-normal voltage (\textbf{O2}), as discussed in Sec.~\ref{sec:predicting_power_failures}.
														
 
															+Being aware of this is important to software designers since the peripherals and analog components may function differently at sub-normal voltage.
														
 
															-The two most critical examples are Analog-Digital Converter (ADC) and external memory.
														
 
															+The two most critical examples are Analog-Digital Converters (ADCs) and external NVMs.
														
 
															+They play an important role in checkpointing, since ADCs are often used to estimate power-off time by reading the capacitor voltage and NVMs have to save the checkpoint data safely.
														
 
															 \begin{figure}
														
 
															     \centering
														
 
															     \begin{subfigure}{0.45\linewidth}
														
 
															         \includegraphics[width=\textwidth]{figs/plot_expr_2_cropped.pdf}
														
 
															-        \caption{Trace of one power cycle.}
														
 
															+        \caption{Analog-Digital Converter.}
														
 
															         \label{fig:adc_error}
														
 
															     \end{subfigure}
														
 
															     \hfill
														
 
															     \begin{subfigure}{0.52\linewidth}
														
 
															         \includegraphics[width=\textwidth]{figs/plot_expr_3_cropped.pdf}
														
 
															-        \caption{Detailed trace.}
														
 
															+        \caption{External FRAM.}
														
 
															         \label{fig:fram_drror}
														
 
															     \end{subfigure}
														
 
															-    \caption{Voltage of the capacitor and Vdd, sampled 470uF and 1.5mA.}
														
 
															+    \caption{Incorrectly functioning components at sub-normal voltage.} 
														
 
															     \label{fig:adc_and_fram_error}
														
 
															 \end{figure}
														
@@ -225,13 +228,13 @@ Fig.~\ref{fig:fram_drror}: FRAM error.
 
															     \label{tab:architectures}
														
 
															     \renewcommand{\arraystretch}{0.9} % Reduce vertical spacing
														
 
															     \setlength{\tabcolsep}{3pt} % Reduce horizontal spacing
														
 
															-    \resizebox{\columnwidth}{!}{%
														
 
															+    \resizebox{0.95\columnwidth}{!}{%
														
 
															     \begin{tabular}{@{}cccccccc@{}}
														
 
															     \toprule
														
 
															     \multirow{2}{*}{} & \multirow{2.5}{*}{Core} & \multirow{2.5}{*}{\begin{tabular}[c]{@{}c@{}}Core\\ Freq.\end{tabular}} & \multicolumn{3}{c}{Capacitance (uF)} & \multirow{2.5}{*}{Current} & \multirow{2.5}{*}{Memory}                                   \\ \cmidrule(lr){4-6}
														
 
															                       &                       &                                                                       & C1       & C2        & Storage       &                          &                                                           \\ \midrule
														
 
															-    MRAM              & STM32L5               & 16MHz                                                                 & 22       & 220       & 1,320         & 3mA                      & \begin{tabular}[c]{@{}c@{}}MRAM\\ (off-chip)\end{tabular} \\
														
 
															-    MSP430            & MSP430FR5994          & 8MHz                                                                  & 22       & 10        & 40            & 100uA                    & \begin{tabular}[c]{@{}c@{}}FRAM\\ (on-chip)\end{tabular}  \\ \bottomrule
														
 
															+    A1              & STM32L5               & 16MHz                                                                 & 22       & 220       & 1,320         & 3mA                      & \begin{tabular}[c]{@{}c@{}}MRAM\\ (off-chip)\end{tabular} \\
														
 
															+    A2            & MSP430FR5994          & 8MHz                                                                  & 22       & 10        & 40            & 100uA                    & \begin{tabular}[c]{@{}c@{}}FRAM\\ (on-chip)\end{tabular}  \\ \bottomrule
														
 
															     \end{tabular}%
														
 
															     }
														
 
															     \end{table}