- \section{Design Guidelines}
- \label{sec:design_guidelines}
- Based on the insights from our model, we propose design guidelines for efficient and safe intermittent systems.
- The effectiveness of the guidelines is evaluated using seven benchmarks on the reference system used in Sec.~\ref{sec:detailed_execution_model}.
- We ported five benchmarks from the MiBench~\cite{guthausMiBench2001} benchmark suite and implemented two computation kernels (\emph{matmul} and \emph{conv2d}) commonly used for evaluating intermittent systems in the literature~\cite{kimLACT2024,maengSupporting2019,bhattacharyyaNvMR2022,ganesanWhat2019,akhunovEnabling2023}.
- We evaluate two popular existing checkpointing schemes: \emph{static} and \emph{dynamic}.
- The static scheme~\cite{ransfordMementos2011,kimLivenessAware2023,kimLACT2024,maengAdaptive2018} inserts checkpoint triggers at every loop latch in the program during compilation.
- At runtime, each checkpoint trigger examines $V_{ES}$ and executes a checkpoint only when $V_{ES}$ is below a predefined threshold.
- In contrast, the dynamic scheme~\cite{jayakumarQUICKRECALL2014,maengSupporting2019,balsamoHibernus2016,balsamoHibernus2015,kortbeekTimesensitive2020} does not modify the original program code.
- Instead, it executes checkpoints via interrupts from the power management system, generated when $V_{ES}$ reaches $V_l$.
- Unless otherwise stated, all evaluations use a 470\,$\mu$F energy storage and an input current of 1\,mA at 1.9\,V.
- \subsection{Delaying Checkpoint Executions}
- \label{sec:delay_checkpoint_execution}
- The first design practice we propose is to delay checkpoint executions until the last possible moment.
- While this practice is generally regarded as desirable in existing works~\cite{ransfordMementos2011,bhattiHarvOS2017}, it has not been recognized as a critical property.
- Under the traditional execution model, early checkpoint execution is often considered acceptable as it allows the system to wake up sooner, incurring only minor costs for initialization and recovery.
- For example, some approaches have explored proactive power-offs based on the program's worst-case execution time~\cite{choiCompilerDirected2022,reymondSCHEMATIC2024}, although such worst-case estimates can be overly pessimistic~\cite{raffeckWoCA2024}.
- On the other hand, our model reveals that significant energy is wasted each time the system powers off (Sec.~\ref{sec:power_efficiency}), highlighting the impact of delaying checkpoint executions.
- % As a result, the importance of delaying checkpoint executions is greater than previously assumed.
- \begin{figure}
- \centering
- \includegraphics[width=\linewidth]{figs/plot_expr_7_cropped.pdf}
- \caption{Execution times across various checkpoint voltages, normalized to the 3.4V configuration.}
- \label{fig:expr_checkpoint_voltages}
- \end{figure}
- Fig.~\ref{fig:expr_checkpoint_voltages} presents the benchmark execution times under the dynamic checkpoint scheme across various checkpoint execution voltages.
- A 1100\,$\mu$F capacitor is used as the energy storage, and the execution times are normalized to the 3.4V case.
- The results show that executing checkpoints earlier substantially increases execution time: by 1.38$\times$ and 2.45$\times$ in the 3.7V and 4.0V configurations, respectively.
- Moreover, the overhead is consistent across all benchmarks since early checkpoint executions directly reduce the energy available for the computing system.
- Consequently, delaying checkpoint executions is crucial when designing state-retention techniques.
- Achieving this fundamentally depends on accurately predicting imminent power failures, which is the focus of the next section.
- % Consequently, it is important to execute as long as possible whenever the system wakes up.
- % In the next section, we discuss how this can be implemented in the existing intermittent systems.
- \subsection{Using $V_{dd}$ with a Reference Voltage for Checkpoint Signals}
- \label{sec:use_vdd_for_checkpoint}
- Sec.~\ref{sec:predicting_power_failures} demonstrates that $V_{ES}$ is a poor estimate of the system's remaining execution time.
- Instead, we propose using $V_{dd}$ to estimate imminent power-off events more accurately, similar to approaches used in systems without a power management system (Sec.~\ref{sec:related_work}).
- Additionally, when obtaining $V_{dd}$, it is important to account for the behavior of the ADC under sub-normal voltage conditions (Sec.~\ref{sec:sub_normal_execution}).
- For consistent operation of ADCs, we adopt a voltage source with a known value of $V_{ref}$.
- In STM32L5 and MSP430, an internal reference voltage source of 1.2V is available; alternatively, an external voltage reference (e.g., TI LMV431~\cite{texasinstrumentsLMV431}) can be used.
- Note that $V_{ref}$ should be lower than the minimum operating voltage of the MCU (e.g., 1.7V), as $V_{ref}$ is generated by regulating $V_{dd}$.
- We propose two efficient implementations, $S_{sta}$ and $S_{dyn}$, to accurately detect the imminent power-off events in static and dynamic checkpoint schemes, respectively.
- $S_{sta}$ is designed for static checkpoint techniques.
- Instead of reading $V_{ES}$ at checkpoint triggers, $S_{sta}$ reads $V_{ref}$.
- While the system operates at normal voltage, this yields a constant ADC value of $\lfloor V_{ref}/V_{dd} \cdot 2^n \rfloor$, where $n$ is the ADC resolution.
- During sub-normal voltage execution, this value increases as $V_{dd}$ decreases, as discussed in Sec.~\ref{sec:sub_normal_execution}.
- Given that the target threshold voltage for checkpoint execution is $V_{th}$, software designers can compare the ADC value against $\lfloor V_{ref}/V_{th} \cdot 2^n \rfloor$ to determine whether to execute a checkpoint.
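- As a worked example of this comparison, assume a 12-bit ADC ($n = 12$), a 1.2V reference, a nominal $V_{dd}$ of 3.3V, and a target threshold of $V_{th} = 1.8$V; the resolution, the nominal $V_{dd}$, and $V_{th}$ are illustrative assumptions rather than measured parameters of the reference system.
- \[
- \left\lfloor \frac{V_{ref}}{V_{dd}} \cdot 2^{n} \right\rfloor = \left\lfloor \frac{1.2}{3.3} \cdot 2^{12} \right\rfloor = \lfloor 1489.45 \rfloor = 1489
- \]
- \[
- \left\lfloor \frac{V_{ref}}{V_{th}} \cdot 2^{n} \right\rfloor = \left\lfloor \frac{1.2}{1.8} \cdot 2^{12} \right\rfloor = \lfloor 2730.67 \rfloor = 2730
- \]
- Under these assumptions, a checkpoint trigger would read roughly 1489 at the nominal voltage and would execute a checkpoint once the reading reaches 2730 or more.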
- On the other hand, $S_{dyn}$ utilizes an on-chip comparator, which is available in most modern MCUs including STM32L5 and MSP430.
- As $V_{ref}$ is always lower than $V_{dd}$, we use a voltage divider consisting of two resistors, $R1$ and $R2$, to scale $V_{dd}$ and compare it with $V_{ref}$.
- Specifically, we configure $R1$ and $R2$ to satisfy $\frac{R2}{R1+R2} \cdot V_{th} = V_{ref}$, so the comparator generates an interrupt when $V_{dd}$ reaches the threshold $V_{th}$.
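- Rearranging this condition gives the required resistor ratio. As an illustrative sketch, assume $V_{ref} = 1.2$V and a hypothetical threshold of $V_{th} = 1.8$V:
- \[
- \frac{R1}{R2} = \frac{V_{th}}{V_{ref}} - 1 = \frac{1.8}{1.2} - 1 = 0.5
- \]
- For instance, $R1 = 500\,\mathrm{k}\Omega$ and $R2 = 1\,\mathrm{M}\Omega$ would satisfy the condition, since $\frac{1}{1.5} \cdot 1.8 = 1.2$; these resistor values are assumptions chosen for illustration, and large resistances would typically be preferred to limit the current drawn by the divider.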
- % T2 is setup for static checkpoint techniques, which poll the capacitor voltage to determine whether execute checkpoint or not.
- % Instead of reading the capacitor voltage, it reads the reference voltage.
- % As we discussed in Sec.~\ref{sec:sub_normal_execution}, the voltage remains same while the system executes at normal voltage but the value increases during sub-normal voltage execution.
- % \begin{itemize}
- % \item T1 utilizes a on-chip comparator (available both in STM32L5 and MSP430) with a reference voltage.
- % \item T2.
- % \end{itemize}
- \begin{figure}
- \centering
- \begin{subfigure}{\linewidth}
- \includegraphics[width=\textwidth]{figs/plot_expr_11_cropped.pdf}
- \caption{Static checkpointing with $S_{sta}$.}
- \label{fig:expr_precise_checkpoint_timings_static}
- \vspace{7pt}
- \end{subfigure}
- \begin{subfigure}{\linewidth}
- \includegraphics[width=\textwidth]{figs/plot_expr_10_cropped.pdf}
- \caption{Dynamic checkpointing with $S_{dyn}$.}
- \label{fig:expr_precise_checkpoint_timings_dynamic}
- \end{subfigure}
- \caption{Impact of precise checkpoint timings on the end-to-end execution times.}
- \label{fig:expr_precise_checkpoint_timings}
- \end{figure}
- Fig.~\ref{fig:expr_precise_checkpoint_timings} shows the average end-to-end execution times of the benchmarks over 30 iterations, comparing the traditional systems with the proposed setups.
- Fig.~\ref{fig:expr_precise_checkpoint_timings_static} illustrates the performance of $S_{sta}$ and Fig.~\ref{fig:expr_precise_checkpoint_timings_dynamic} presents the result for $S_{dyn}$.
- The error bars indicate the minimum and maximum measured execution times for each benchmark.
- The results show that extending operation at sub-normal voltages significantly improves the execution time in both systems: by 3.04$\times$ with $S_{sta}$ and 2.85$\times$ with $S_{dyn}$.
- Furthermore, these improvements are consistent across all benchmarks, regardless of the application characteristics, highlighting the general effectiveness of the proposed setups.
- Another advantage of the proposed setups is their simplicity and practical applicability.
- Since both setups only modify the method used to detect imminent power failures and leave the checkpoint algorithms unchanged, they are straightforward to apply to existing techniques.
- Furthermore, the proposed setups can reduce system complexity, as they eliminate the need for communication (e.g., interrupts or access to $V_{ES}$) between the energy storage system and the computing system.
- % \subsection{Checkpoint Techniques and Evaluation Methods}
- \subsection{On Selecting Hardware Components}
- Our model also helps designers select efficient hardware components.
- For example, it implies that the operating voltage of peripherals (e.g., external NVMs) is a critical design parameter, often more important than their latency.
- % We evaluate this tradeoff by simulating an external FRAM having faster access latency but smaller operating voltage.
- We evaluate this tradeoff by simulating two FRAM configurations, F1 and F2, in our reference system.
- F1 represents a slower setup that operates down to 2.5V; for this setup, we double the software-configurable wait time for FRAM accesses.
- F2 uses the fastest FRAM access parameters, but the system stops operating at 2.8V.
- \begin{figure}
- \centering
- \includegraphics[width=\linewidth]{figs/plot_expr_12_cropped.pdf}
- \caption{Impact of peripheral operating voltage.}
- \label{fig:expr_peripheral_voltage}
- \end{figure}
- Fig.~\ref{fig:expr_peripheral_voltage} presents the results.
- The results show that the operating voltage should be considered, even though it is ignored in the traditional execution model.
- Finally, our model highlights the advantages of using smaller decoupling capacitors.
- Using larger buffers not only increases the ratio of sub-normal voltage operations but also increases the amount of energy discharged during power-offs.
- Indeed, we observe that our reference system takes xx\% and xx\% longer on average to execute the benchmarks when xx\,$\mu$F and xx\,$\mu$F decoupling capacitors are used, compared to our design with 220\,$\mu$F.
- As a result, it is good design practice to use the smallest feasible decoupling capacitors to improve the efficiency of intermittent systems.
- % \begin{figure}
- % \centering
- % \includegraphics[width=\linewidth]{figs/plot_expr_12_cropped.pdf}
- % \caption{Execution times with varying decoupling capacitors.}
- % % \label{fig:expr_checkpoint_voltages}
- % \end{figure}
- % Power failure injection (soft reset)~\cite{wuIntOS2024,yildizEfficient2023}.