%$Header$
\chapter{Rational Linear Approximation}
\label{crat0}
\beginchapterquote{``Die ganzen Zahlen hat der liebe Gott gemacht,
alles andere ist Menschenwerk.''\footnote{German
language: God made the integers; everything
else was made by man.}}
{Leopold Kronecker}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
%Section tag: INT0
\label{crat0:sint0}
In this chapter, we consider practical applications of
rational approximation.
Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}
have presented algorithms for finding
the closest rational numbers to an arbitrary real number,
subject to constraints on the numerator and denominator.
The basis of these algorithms is complex and comes from number theory, and so
these algorithms and their basis have been presented in separate chapters.
In Section \ref{crat0:srla0}, rational linear approximation itself
and associated error bounds are presented. By \emph{rational linear
approximation} we mean simply the approximation of a line
$y = r_I x$ ($y, r_I, x \in \vworkrealset$) by a line
\begin{equation}
\label{eq:crat0:sint0:01}
y = \left\lfloor
\frac{h \lfloor x \rfloor + z}{k}
\right\rfloor ,
\end{equation}
\noindent{}where we choose $h/k \approx r_I$ and optionally choose $z$ to
shift the error introduced. Note that (\ref{eq:crat0:sint0:01}) is
very economical for microcontroller instruction sets, since only integer
arithmetic is required. We may choose $h/k$ from a Farey series (see
Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}), or
we may choose a ratio $h/2^q$ so that the division in (\ref{eq:crat0:sint0:01})
can be implemented
by a bitwise right shift.
Section \ref{crat0:srla0} discusses linear rational approximation
in general, with a special eye on error analysis.
Section \ref{crat0:spwi0} discusses piecewise linear rational approximation,
which is the approximation of a curve or complex mapping by a
number of joined line segments.
Section \ref{crat0:sfdv0} discusses frequency division and rational counting.
Such techniques share the same mathematical framework as rational linear
approximation, and as with rational linear approximation the ratio
involved may be chosen from a Farey series or with a denominator of $2^q$, depending
on the algorithm employed.
Section \ref{crat0:sbla0} discusses Bresenham's classic line algorithm,
which is a practical application of rational linear approximation.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Rational Linear Approximation}
%Section tag: RLA0
\label{crat0:srla0}
It occurs frequently in embedded software design that one wishes to
implement a linear scaling from a domain to a range of the form
\begin{equation}
\label{eq:crat0:srla0:01}
f(x) = r_I x ,
\end{equation}
\noindent{}where $r_I$ is the \emph{ideal}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Model Functions}
%Section tag: mfu0
\label{crat0:srla0:smfu0}
In general, we seek to approximate the ideal function
\noindent{}by some less ideal function where
\begin{itemize}
\item $r_A \neq r_I$, although we seek to choose $r_A \approx r_I$.
\item The input to the function, $x$, may already contain
quantization error.
\item Although $r_I x \in \vworkrealsetnonneg$, we must choose
an integer as the function output.
\end{itemize}
In modeling quantization error, we use the floor function\index{floor function}
($\lfloor\cdot\rfloor$)
for algebraic simplicity. The floor function precisely
describes the behavior of integer division instructions (where
remainders are discarded), but may not describe other sources of
quantization, such as quantization that occurs in A/D conversion.
However, techniques identical to those presented in this
section may be used when quantization is not best described
by the floor function, and these results are left to the reader.
Traditionally, because addition of integers is an inexpensive
machine operation, a parameter $z \in \vworkintset$ may optionally
be added to the product $hx$ in order to round or otherwise
shift the result.
If $x$ is assumed to be without error, the ideal function is
given by (\ref{eq:crat0:srla0:smfu0:01}), whereas the function
that can be economically implemented is
\begin{equation}
\label{eq:crat0:srla0:smfu0:02}
g(x) = \left\lfloor \frac{hx + z}{k} \right\rfloor
=
\left\lfloor r_A x + \frac{z}{k} \right\rfloor .
\end{equation}
If, on the other hand, $x$ may be already quantized,
the function that can actually be implemented is
\begin{equation}
\label{eq:crat0:srla0:smfu0:03}
h(x) = \left\lfloor \frac{h \lfloor x \rfloor + z}{k} \right\rfloor
=
\left\lfloor r_A \lfloor x \rfloor + \frac{z}{k} \right\rfloor .
\end{equation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation]
{\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
%Section tag: HQQ0
\label{crat0:shqq0}
\index{h/2q@$h/2^q$ rational linear approximation}
\index{rational linear approximation!h/2q@$h/2^q$}
The algorithms presented in
Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}
will always provide the rational number $h/k$ closest to
an arbitrary real number $r_I$ subject to the constraints
$h \leq h_{MAX}$ and $k \leq k_{MAX}$.
However, because shifting in order
to implement multiplication or division by a power of 2
is at least as fast (and often \emph{much} faster)
on all processors as arbitrary multiplication or division,
and because not all processors have multiplication and division instructions,
it is worthwhile to examine choosing $h/k$ so that either $h$ or $k$ are
powers of 2.
There are thus three rational linear approximation techniques to be
examined:
\begin{enumerate}
\item \emph{$h/k$ rational linear approximation}, in which an arbitrary
$h \leq h_{MAX}$ and an arbitrary $k \leq k_{MAX}$ are used,
with $r_A = h/k$. $h$ and $k$ can be chosen using the algorithms
presented in Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}.
Implementation of this technique would most often involve a single integer
multiplication instruction to form the product $hx$, followed by an optional single
addition instruction to form the sum $hx+z$, and then
followed by by a single division instruction
to form the quotient $\lfloor (hx+z)/k \rfloor$. Implementation may also less commonly involve
multiplication, addition, and division of operands too large to be processed
with single machine instructions.
\item \emph{$h/2^q$ rational linear approximation}, in which an arbitrary
$h \leq h_{MAX}$ and an integral power of two $k=2^q$ are used, with
$r_A = h/2^q$.
Implementation of this technique would most often involve a single integer
multiplication instruction to form the product $hx$, followed by an optional single
addition instruction to form the sum $hx+z$, and then
followed by right shift instruction(s)
to form the quotient $\lfloor (hx+z)/2^q \rfloor$. Implementation may also less commonly involve
multiplication, addition, and right shift of operands too large to be processed
with single machine instructions.
\item \emph{$2^q/k$ rational linear approximation}, in which an integral
power of two $h=2^q$ and an arbitrary $k \leq k_{MAX}$ are used, with
$r_A = 2^q/k$.
Implementation of this technique would most often involve left shift
instruction(s) to form the product $2^qx$, followed by an optional single
addition instruction to form the sum $2^qx+z$, and then
followed by a single division instruction to form
the quotient $\lfloor (2^qx+z)/k \rfloor$. Implementation may also less
commonly involve
left shift, addition, and division of operands too large to be processed
with single machine instructions.
\end{enumerate}
We use the nomenclature ``\emph{$h/k$ rational linear approximation}'',
``\emph{$h/2^q$ rational linear approximation}'', and
``\emph{$2^q/k$ rational linear approximation}'' to identify the three
techniques enumerated above.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Integer Arithmetic and Processor Instruction Set Characteristics}
%Subsection tag: pis0
\label{crat0:shqq0:pis0}
The following observations about integer arithmetic and about processors
used in embedded control can be made:
\begin{enumerate}
\item \label{enum:crat0:shqq0:pis0:01:01a}
\emph{Shifting is the fastest method of integer multiplication or division
(by $2^q$ only),
followed by utilization of the processor multiplication or division instructions (for arbitrary
operands),
followed by software implementation of multiplication or division (for arbitrary operands).}
Relative costs vary depending on the processor, but the monotonic
ordering always holds.
$h/2^q$ and $2^q/k$ rational linear
approximation are thus worthy of investigation. (Note also that in many practical
applications of $h/2^q$ and $2^q/k$ rational linear approximation,
the required shift is performed by
addressing the operand with an offset,
and so has no cost.)
\item \label{enum:crat0:shqq0:pis0:01:01b}
\emph{Shifting is $O(N)$ (where $N$ is the number of bits in the argument),
but both
multiplication and division are $O(N^2)$ for
practical\footnote{\index{Karatsuba multiplication}Karatsuba
multiplication, for example, is
$O(N^{\log_2 3}) \approx O(N^{1.58}) \ll O(N^2)$. However, Karatsuba
multiplication cannot be applied economically to the small
operands that typically occur in embedded control work. It would
be rare in embedded control applications
for the length of a multiplication operand to exceed four
times the length that is accommodated by a machine instruction; and this
is far below the threshold at which Karatsuba multiplication is
economical. Thus, for all intents and purposes in embedded control work,
multiplication is $O(N^2)$.} operands (where
$N$ is the number of bits in each
operand).} It follows that $2^q/k$ and $h/2^q$ rational
linear approximation
will scale to large operands better than $h/k$ rational linear approximation.
\item \label{enum:crat0:shqq0:pis0:01:02a}
\emph{Integer division instructions take as long or longer than
integer multiplication instructions.} In designing digital logic
to implement basic integer arithmetic, division is the operation most difficult
to perform economically.\footnote{For some processors, the penalty is extreme.
For example, on the NEC V850 (a RISC processor),
a division requires 36 clock cycles,
whereas multiplication, addition, and subtraction each effectively
require 1 clock cycle.}
It follows that multiplication using operands that exceed the machine's word size
is often far less expensive than division using operands that exceed the
machine's word size.
\item \label{enum:crat0:shqq0:pis0:01:03a}
\emph{All processors that have an integer division instruction also
have an integer multiplication instruction.}
Phrased
differently, no processor has an integer division instruction but no
integer multiplication instruction.
\end{enumerate}
Enumerated items
(\ref{enum:crat0:shqq0:pis0:01:01a}) through
(\ref{enum:crat0:shqq0:pis0:01:03a}) above lead to the following conclusions.
\begin{enumerate}
\item $h/2^q$ rational linear approximation is likely to be implementable
more efficiently on most processors than $h/k$ rational linear approximation.
(\emph{Rationale:} shift instruction(s) or accessing a
memory address with an offset
is
likely to be more economical than division, particularly if $k$ would exceed
the native
operand size of the processor.)
\item $h/2^q$ rational linear approximation is likely to be a more useful
technique than $2^q/k$ rational linear approximation.
(\emph{Rationale:} the generally high cost of division compared to
multiplication, and the existence of processors that possess a multiplication
instruction but no division instruction.)
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Design Procedure For \protect\mbox{\protect$h/2^q$} Rational Linear Approximations]
{Design Procedure For \protect\mbox{\protect\boldmath$h/2^q$} Rational Linear Approximation}
%Subsection tag: dph0
\label{crat0:shqq0:dph0}
An $h/2^q$ rational linear approximation is parameterized by:
\begin{itemize}
\item The unsigned or signed nature of $h$ and $x$. (Rational linear approximations
may involve either signed or unsigned domains and ranges. Furthermore,
signed integers may be maintained using either 2's-complement
or sign-magnitude representation, and the processor instruction set
may or may not directly support signed multiplication.)
\item $r_I$, the real number we wish to approximate by $r_A = h/2^q$.
\item $x_{MAX}$, the maximum possible value of the input argument $x$. (Typically,
software contains a test to clip the output if $x > x_{MAX}$.)
\item $w_h$, the width in bits allowed for $h$. (Typically, $w_h$ is
the maximum operand size of a machine multiplication instruction.)
\item $w_r$, the width in bits allowed for the result $hx$. (Typically,
$w_r$ is the maximum result size of a machine multiplication instruction.)
\item The rounding mode when choosing $h$ (and thus effectively $r_A$)
based on $r_I$. It is common to choose the
closest value,
$r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$
or
$r_A=\lceil r_I 2^q - 1/2 \rceil/2^q$,
but other choices are possible.
\item The rounding mode for the result (i.e. the choice of $z$ in
Eq. \ref{eq:crat0:sint0:01}).
\end{itemize}
This section develops a design procedure for $h/2^q$ rational linear
approximations with the most typical set of assumptions: unsigned arithmetic,
$r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$,
and $z=0$. Design procedures for other scenarios are presented as exercises.
By definition, $h$ is constrained in two ways:
\begin{equation}
\label{eq:crat0:shqq0:dph0:00}
h \leq 2^{w_h} - 1
\end{equation}
\noindent{}and
\begin{equation}
\label{eq:crat0:shqq0:dph0:01}
h \leq \frac{2^{w_r} - 1}{x_{MAX}} .
\end{equation}
\noindent{}(\ref{eq:crat0:shqq0:dph0:00}) comes directly from the
requirement that $h$ fit in $w_h$ bits.
(\ref{eq:crat0:shqq0:dph0:01}) comes directly from the requirement
that $hx$ fit in $w_r$ bits.
(\ref{eq:crat0:shqq0:dph0:00}) and (\ref{eq:crat0:shqq0:dph0:01})
may be combined to form one inequality:
\begin{equation}
\label{eq:crat0:shqq0:dph0:02}
h \leq \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) .
\end{equation}
If $q$ is known, the choice of $h$ that will be made so as to minimize
$|r_A-r_I| = |h/2^q - r_I|$ is
\begin{equation}
\label{eq:crat0:shqq0:dph0:03}
h=\left\lfloor r_I 2^q + \frac{1}{2} \right\rfloor .
\end{equation}
\noindent{}It is required that the choice of $h$ specified by
(\ref{eq:crat0:shqq0:dph0:03}) meet
(\ref{eq:crat0:shqq0:dph0:02}). Making the most pessimistic
assumption about the rounding of $h$ and substituting into
(\ref{eq:crat0:shqq0:dph0:02}) leads to
\begin{equation}
\label{eq:crat0:shqq0:dph0:04}
r_I 2^q + \frac{1}{2}
\leq
\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) .
\end{equation}
\noindent{}Isolating $q$ in (\ref{eq:crat0:shqq0:dph0:04})
yields
\begin{equation}
\label{eq:crat0:shqq0:dph0:05}
2^q
\leq
\frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}
{r_I}.
\end{equation}
\noindent{}Solving
(\ref{eq:crat0:shqq0:dph0:05})
for maximum value of $q$ that meets the constraint yields
\begin{equation}
\label{eq:crat0:shqq0:dph0:06}
q=
\left\lfloor
{
\log_2
\left(
{
\frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I}
}
\right)
}
\right\rfloor .
\end{equation}
\noindent{}(\ref{eq:crat0:shqq0:dph0:06})
can be rewritten for easier calculation using most calculators (which do
not allow the direct evaluation of base-2 logarithms):
\begin{equation}
\label{eq:crat0:shqq0:dph0:07}
q=
\left\lfloor
\frac
{
{
\ln
\left(
{
\frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I}
}
\right)
}
}
{\ln 2}
\right\rfloor .
\end{equation}
\noindent{}Once $q$ is established using (\ref{eq:crat0:shqq0:dph0:07}),
$h$ can be calculated using (\ref{eq:crat0:shqq0:dph0:03}).
In embedded control work (as well as in operating system internals),
$h/2^q$ rational linear approximations are often used in conjunction with
tabulated constants or calibratable parameters
where each constant or calibratable parameter may vary over a range of
$[0, r_I]$, and where $r_I$ is the value used in the design procedure
presented above. In these applications, the values of $h$ are
tabulated, but $q$ is invariant (usually hard-coded)
and is chosen at design time based on the upper bound $r_I$
of the interval $[0, r_I]$ in which each tabulated constant or calibratable
parameter will fall. With $q$ fixed,
$r_A$ can be adjusted in steps of $1/2^q$.
If $r_I$ is invariant, a final design step may be to reduce the rational
number $h/2^q$ by dividing some or all occurrences of 2 as a factor from both the
numerator and denominator. With some processors and in some applications, this
may save execution time by reducing the number of shift instructions that
must be executed, reducing the execution time of the shift instructions
that are executed, or allowing shifting via offset addressing.
For example, on a byte-addressible machine, if the design procedure
yields $h=608$ and $q=10$, it may be desirable to divide both $h$ and $2^q$ by 4 to
yield $h=152$ and $q=8$, as this allows the shift by 8 to be done by fetching
alternate bytes (rather than by actual shifting). In other applications, it may
be desirable to remove \emph{all} occurrences of 2 as a prime factor
from $h$.
For an invariant $r_I$, a suitable design procedure is:
\begin{enumerate}
\item Choose $q$ using (\ref{eq:crat0:shqq0:dph0:07}).
\item With $q$ fixed, choose $h$ using (\ref{eq:crat0:shqq0:dph0:03}).
\item If economies can be achieved on the target processor,
examine the possibility of removing some or all occurrences
of 2 as a prime factor from $h$ and decreasing $q$.
\end{enumerate}
For tabulated or calibratable constants in the
interval $[0,r_I]$, a suitable design procedure is to use the
procedure presented immediately above but without the third step.
Each tabulated value of $h$ is chosen using (\ref{eq:crat0:shqq0:dph0:03}).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Design Procedure For \protect\mbox{\protect$2^q/k$} Rational Linear Approximations]
{Design Procedure For \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
%Subsection tag: dpk0
\label{crat0:shqq0:dpk0}
TBD.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Piecewise Rational Linear Approximation}
%Section tag: PWI0
\label{crat0:spwi0}
TBD.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section[Frequency Division And Rational Counting]
{Frequency Division And Rational Counting Techniques}
%Section tag: FDV0
\label{crat0:sfdv0}
\index{frequency division}\index{rational counting}\index{counting}%
Often, software must ``divide down'' an execution rate. For example,
an interrupt service routine may be scheduled by hardware every
10ms, but may perform useful processing only every 50ms. This requires
that the ISR maintain a counter and only perform useful processing
every fifth invocation. This section deals with counting strategies
used to achieve invocation frequency division and other similar results.
Frequency division and
rational counting techniques presented in this section find application
primarily in the following scenarios:
\begin{itemize}
\item ISRs and other software components which must divide down
their invocation rate.
\item Pulse counting and scaling from encoders and other
similar systems.
\item The correction of inaccuracies in timebases (such as crystals
which oscillate at a frequency different than the
nominal rate).
\end{itemize}
Because the techniques presented must be usable with inexpensive
microcontrollers, such techniques must meet these constraints:
\begin{enumerate}
\item \label{enum:01:crat0:sfdv0:econex}
The counting techniques must be economical to execute on
an inexpensive microcontroller.
\item \label{enum:01:crat0:sfdv0:econcccalc}
An inexpensive microcontroller must be capable of calculating any
constants used as limits in counting (i.e. it cannot necessarily
be assumed that a more powerful computer calculates these constants,
and it cannot be assumed that these limits do not change on the fly).
\end{enumerate}
In this section, we analyze the behavior of several types of
rational counting algorithms, supplied as Algorithms
\ref{alg:crat0:sfdv0:01a}
through
\ref{alg:crat0:sfdv0:02a}.
\begin{algorithm}
\begin{verbatim}
/* The constants K1 through K4, which parameterize the */
/* counting behavior, are assumed assigned elsewhere in */
/* the code. The solution is analyzed in terms of the */
/* parameters K1 through K4. */
/* */
/* We also place the following restrictions on K1 through */
/* K4: */
/* K1 : K1 <= K3 - K2. */
/* K2 : K4 > K2 > 0. */
/* K3 : No restrictions. */
/* K4 : K4 > K2 > 0. */
void base_rate_sub(void)
{
static int state = K1;
state += K2;
if (state >= K3)
{
state -= K4;
A();
}
}
\end{verbatim}
\caption{Rational Counting Algorithm For $K_2/K_4 < 1$}
\label{alg:crat0:sfdv0:01a}
\end{algorithm}
\begin{algorithm}
\begin{verbatim}
/* The constants K1 through K4, which parameterize the */
/* counting behavior, are assumed assigned elsewhere in */
/* the code. The solution is analyzed in terms of the */
/* parameters K1 through K4. */
/* */
/* We also place the following restrictions on K1 through */
/* K4: */
/* K1 : K1 <= K3 - K2. */
/* K2 : K2 > 0. */
/* K3 : No restrictions. */
/* K4 : K4 > 0. */
void base_rate_sub(void)
{
static int state = K1;
state += K2;
while (state >= K3)
{
state -= K4;
A();
}
}
\end{verbatim}
\caption{Rational Counting Algorithm For $K_2/K_4 \geq 1$}
\label{alg:crat0:sfdv0:01b}
\end{algorithm}
\begin{algorithm}
\begin{verbatim}
/* The constants K1, K2, and K4, which parameterize the */
/* counting behavior, are assumed assigned elsewhere in */
/* the code. The solution is analyzed in terms of the */
/* parameters K1 through K4. */
/* */
/* We also place the following restrictions on K1, K2, */
/* and K4: */
/* K1 : K1 >= 0. */
/* K2 : K4 > K2 > 0. */
/* K4 : K4 > K2 > 0. */
/* */
/* Special thanks to Chuck B. Falconer (of the */
/* comp.arch.embedded newsgroup) for this rational */
/* counting algorithm. */
/* */
/* Note below that the test against K3 does not exist, */
/* instead a test against zero is used, which many */
/* machine instruction sets will do as part of the */
/* subtraction (but perhaps this needs to be coded in */
/* A/L). This saves machine code and also eliminates */
/* one unnecessary degree of freedom (K3). */
void base_rate_sub(void)
{
static int state = K1;
if ((state -= K2) < 0)
{
state += K4;
A();
}
}
\end{verbatim}
\caption{Zero-Test Rational Counting Algorithm For $K_2/K_4 < 1$}
\label{alg:crat0:sfdv0:01c}
\end{algorithm}
\begin{algorithm}
\begin{verbatim}
;Special thanks to John Larkin (of the comp.arch.embedded
;newsgroup) for this rational counting algorithm.
;
;This is the TMS-370C8 assembly-language version of the
;algorithm. The algorithm is parameterized solely by
;K1 and K2, with no restrictions on their values, because
;the values are naturally constrained by the data types.
;K1, which is the initial value of "state", is assumed
;assigned elsewhere. The snippet shown here uses only
;K2.
MOV state, A ;Get "state".
ADD #K2, A ;Increase by K2. Carry flag
;will be set if rollover to or
;past zero.
PUSH ST ;Save carry flag.
MOV A, state ;Move new value back.
POP ST ;Restore carry flag.
JNC done ;If didn't roll, don't run sub.
CALL A_SUBROUTINE ;Run sub.
done:
/* This is the 'C' version of the algorithm. It is not */
/* as easy or efficient in 'C' to detect rollover. */
void base_rate_sub(void)
{
static unsigned int state = K1;
unsigned int old_state;
old_state = state;
state += K2;
if (state < old_state)
{
A();
}
}
\end{verbatim}
\caption{$2^q$ Rollover Rational Counting Algorithm}
\label{alg:crat0:sfdv0:01d}
\end{algorithm}
\begin{algorithm}
\begin{verbatim}
/* The constants K1 through K4, which parameterize the */
/* counting behavior, are assumed assigned elsewhere in */
/* the code. The solution is analyzed in terms of the */
/* parameters K1 through K4. */
/* */
/* We also place the following restrictions on K1 through */
/* K4: */
/* K1 : K1 <= K3. */
/* K2 : K2 > 0. */
/* K3 : No restrictions. */
/* K4 : K4 > 0. */
void base_rate_sub(void)
{
static unsigned int state = K1;
if (state >= K3)
{
state -= K4;
A();
}
else
{
state += K2;
B();
}
}
\end{verbatim}
\caption{Rational Counting Algorithm With \texttt{else} Clause}
\label{alg:crat0:sfdv0:02a}
\end{algorithm}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01a}]
{Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01a}}
%Section tag: PRC0
\label{crat0:sfdv0:sprc0}
Algorithm \ref{alg:crat0:sfdv0:01a}
is used frequently in microcontroller
software. A base rate subroutine\footnote{For brevity, we usually
call this just the \emph{base subroutine}.} (named ``\texttt{base\_rate\_sub()}''
in the algorithm) is called at a periodic rate, and subroutine
``\texttt{A()}'' is called at a lesser rate.
We are interested in determining the relationships between the rates
as a function of $K_1$, $K_2$, $K_3$, and $K_4$; and we are interested
in developing other properties.
Notationally when analyzing rational counting algorithms, we agree
that $state_n$ denotes the value of the \texttt{state} variable
after the $n$th invocation and before the $n+1$'th invocation
of the base rate subroutine.
Using this convention with Algorithm \ref{alg:crat0:sfdv0:01a},
$state_0 = K_1$.\footnote{Algorithm \ref{alg:crat0:sfdv0:01a}
requires a knowledge of
`C' to fully understand. The \texttt{static} keyword ensures that the
variable \texttt{state} is initialized only once, at the time the program
is loaded. \texttt{state} is \emph{not} initialized each time the
base subroutine runs.}
We can first easily derive the number of initial invocations of
the base subroutine before ``\texttt{A()}'' is called for the first
time.
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc0:01}
$N_{STARTUP}$, the number of invocations of the base subroutine
in Algorithm \ref{alg:crat0:sfdv0:01a} before ``\texttt{A()}'' is called
for the first time, is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:01:01}
N_{STARTUP} =
\left\lceil
{
\frac{-K_1 - K_2 + K_3}{K_2}
}
\right\rceil .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
The value of \texttt{state} after the $n$th invocation
is $state_n = K_1 + n K_2$. In order for the test in the
\texttt{if()} statement not to be met, we require that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:01:02}
K_1 + n K_2 < K_3
\end{equation}
\noindent{}or equivalently that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:01:03}
n < \frac{K_3 - K_1}{K_2} .
\end{equation}
Solving (\ref{eq:lem:crat0:sfdv0:sprc0:01:03}) for the largest
value of $n \in \vworkintset$ which still meets the criterion
yields (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}). Note that
the derivation of (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}) requires
that the restrictions on $K_1$ through $K_4$ documented in
Algorithm \ref{alg:crat0:sfdv0:01a} be met.
\end{vworklemmaproof}
\begin{vworklemmaparsection}{Remarks}
Note that if one chooses $K_1 > K_3 - K_2$ (in contradiction to the
restrictions in Algorithm \ref{alg:crat0:sfdv0:01a}), it is possible
to devise a counting scheme (and results analogous to this lemma) where
``\texttt{A()}'' is run a number of times before it is
\emph{not} run for the first time. The construction of an analogous
lemma is the topic of Exercise \ref{exe:crat0:sexe0:01}.
\end{vworklemmaparsection}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc0:02}
Let $N_I$ be the number of times the Algorithm
\ref{alg:crat0:sfdv0:01a} base subroutine
is called, let $N_O$ be the number of times the
``\texttt{A()}'' subroutine is called, let
$f_I$ be the frequency of invocation of the
Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let
$f_O$ be the frequency of invocation of
``\texttt{A()}''. Provided the constraints
on $K_1$ through $K_4$ documented in
Algorithm \ref{alg:crat0:sfdv0:01a} are met,
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:01}
\lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
=
\frac{f_O}{f_I}
=
\frac{K_2}{K_4} .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
(\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) indicates that once
the initial delay (determined by $K_1$ and $K_3$) has finished,
$N_O/N_I$ will converge on a steady-state value of
$K_2/K_4$.
Assume that $K_1=0$ and $K_3=K_4$. The
conditional subtraction then calculates
$state \bmod K_4$. After the $n$th
invocation of the base subroutine, the value
of \texttt{state} will be
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:02}
state_n|_{K_1=0, K_3=K_4} = n K_2 \bmod K_4 .
\end{equation}
Assume that for two distinct values of
$n \in \vworkintsetnonneg$, $n_1$ and $n_2$,
the value of the \texttt{state} variable is the same:
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:03}
n_1 K_2 \bmod K_4 = n_2 K_2 \bmod K_4.
\end{equation}
Then
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:04}
(n_2 - n_1) K_2 = i K_4, \; \exists i \in \vworkintsetpos .
\end{equation}
However, we have no knowledge of whether $K_2$ and $K_4$ are
coprime (they are not required to be). We may rewrite
(\ref{eq:lem:crat0:sfdv0:sprc0:02:04}) equivalently as
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:05}
(n_2 - n_1) \frac{K_2}{\gcd(K_2, K_4)} = i \frac{K_4}{\gcd(K_2, K_4)},
\; \exists i \in \vworkintsetpos
\end{equation}
where of course by definition
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:06}
\gcd \left( { \frac{K_2}{\gcd(K_2, K_4)}, \frac{K_4}{\gcd(K_2, K_4)} } \right) = 1.
\end{equation}
In order to satisfy (\ref{eq:lem:crat0:sfdv0:sprc0:02:05}),
$n_2 - n_1$ must contain all of the prime factors of
$K_4/\gcd(K_2,K_4)$ in at least the same multiplicities,
and it follows that the set of values
of $n_2-n_1$ that satisfies
(\ref{eq:lem:crat0:sfdv0:sprc0:02:03}) is
precisely the set of multiples of $K_4/\gcd(K_2,K_4)$:
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:07}
n_2 - n_1 = j \frac{K_4}{\gcd(K_2, K_4)}, \; \exists j \in \vworkintsetpos .
\end{equation}
Examining (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}), it can
also be seen that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:02:08}
\gcd(K_2, K_4) \vworkdivides (n K_2 \bmod K_4),
\end{equation}
and so
\begin{eqnarray}
\label{eq:lem:crat0:sfdv0:sprc0:02:09}
& n K_2 \bmod K_4 \in & \\
\nonumber
& \{ 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , K_4 - \gcd(K_2, K_4) \} , &
\end{eqnarray}
a set which contains exactly $K_4/\gcd(K_2, K_4)$ elements.
Thus we've established by the pigeonhole principle
that the sequence of the
values of the variable \texttt{state}
specified by (\ref{eq:lem:crat0:sfdv0:sprc0:02:02})
repeats perfectly with periodicity $K_4/\gcd(K_2, K_4)$,
and we've established that in one period, every element of the set
specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) appears exactly
once. (However, we have not specified the order in which the
elements appear, but this is not important for this lemma. In general
the elements appear out of the order shown in
Eq. \ref{eq:lem:crat0:sfdv0:sprc0:02:09}.)
To establish the frequency with which the test against
$K_4$ is met, note that if $state_n + K_2 \geq K_4$, then
\begin{eqnarray}
\label{eq:lem:crat0:sfdv0:sprc0:02:10}
& \displaystyle{state_n \in \left\{ \frac{K_4-K_2}{\gcd(K_2,K_4)} \gcd(K_2, K_4), \right.} & \\
\nonumber & \displaystyle{\left. \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4), \ldots ,
K_4 - \gcd(K_2, K_4)\right\} ,} &
\end{eqnarray}
which has a cardinality $K_2/K_4$ that of the set in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Since the
\texttt{state} variable cycles through the set in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) with perfect periodicity and since
$K_2/K_4$ of the set elements lead to the \texttt{if()} statement
test being
met, (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) is also met as
$N_I\rightarrow\infty$.
Note that if $K_1 \neq 0$, it simply changes the startup
behavior of the rational counting. So long as $K_2 < K_4$,
Algorithm \ref{alg:crat0:sfdv0:01a} will reach a steady state where
(\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) holds.
Note that if $K_3 \neq K_4$, it simply ``shifts'' the sets
specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09})
and (\ref{eq:lem:crat0:sfdv0:sprc0:02:10}), but
(\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) still holds.
The lemma has thus been proved
for every case. (We have neglected to give
the formal proof as required by the definition of a limit that
for any arbitrarily small error $\epsilon$, a
finite $N_I$ can be found so that
the error is at or below $\epsilon$; however the skeptical reader
is encouraged to complete Exercise \ref{exe:crat0:sexe0:02}.)
\end{vworklemmaproof}
\begin{vworklemmaparsection}{Remarks}
It is possible to view the long-term accuracy of
Algorithm \ref{alg:crat0:sfdv0:01a} in terms of a limit, as is done in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:01}). However, it is also
possible to observe that $K_1$ and $K_3$ set a delay until
the counting algorithm reaches steady state.
With $K_3=K_4$, the attainment of
steady state is characterized by the \texttt{state} variable
being assigned for the first time to one of the values in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Once in steady state,
the algorithm cycles with perfect periodic behavior through all of the
$K_4/\gcd(K_2,K_4)$ elements in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:09}), but not necessarily in
the order shown in the equation.
During this period of length $K_4/\gcd(K_2,K_4)$,
exactly $K_2/\gcd(K_2,K_4)$ invocations of the base
subroutine result in
subroutine ``\texttt{A()}'' being run, and exactly
$(K_4-K_2)/\gcd(K_2,K_4)$ do not. Thus, after reaching steady-state the
algorithm has \emph{perfect} accuracy if one considers periods of
length $K_4/\gcd(K_2,K_4)$.
\end{vworklemmaparsection}
%\vworklemmafooter{}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc0:04}
If $K_3=K_4$, $K_1=0$, and
$\gcd(K_2, K_4)=1$\footnote{\label{footnote:lem:crat0:sfdv0:sprc0:04:01}If
$\gcd(K_2, K_4) > 1$, then by Theorem
\cprizeroxrefhyphen\ref{thm:cpri0:ppn0:00a} the largest
value that $n K_2 \bmod K_4$ can attain is
$K_4-\gcd(K_2, K_4)$ and the interval in
(\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is correspondingly
smaller. (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is
technically correct but not as conservative as possible.
This is a minor point and we do not dwell on it.}, the error between
the approximation to $N_O$ implemented by
Algorithm \ref{alg:crat0:sfdv0:01a} and the ``ideal'' mapping is always
in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:04:01}
\left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
\end{equation}
and no algorithm can be constructed to
confine the error to a smaller interval.
\end{vworklemmastatement}
\begin{vworklemmaproof}
With $K_1=0$ and $K_3 = K_4$, it can be verified analytically that
the total number of times the function ``\texttt{A()}'' has been
invoked up to and including the $n$th invocation of the base subroutine
is
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:04:02}
N_O = \left\lfloor \frac{n K_2}{K_4} \right\rfloor .
\end{equation}
On the other hand, the ``ideal'' number of invocations, which
we denote $\overline{N_O}$, is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:04:03}
\overline{N_O} = \frac{n K_2}{K_4} .
\end{equation}
Quantization of the rational number in (\ref{eq:lem:crat0:sfdv0:sprc0:04:02})
can introduce an error of up to $-(K_4-1)/K_4$, therefore
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:04:04}
N_O - \overline{N_O} =
\left\lfloor \frac{n K_2}{K_4} \right\rfloor - \frac{n K_2}{K_4}
\in \left[ - \frac{K_4 - 1}{K_4} , 0 \right] .
\end{equation}
This proves the error bound for Algorithm \ref{alg:crat0:sfdv0:01a}.
The proof that there can be no better algorithm is the topic
of Exercise \ref{exe:crat0:sexe0:06}.
\end{vworklemmaproof}
\begin{vworklemmaparsection}{Remarks}
Algorithm \ref{alg:crat0:sfdv0:01a} is \emph{optimal} in the
sense that no algorithm can achieve a tighter error
bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). As
demonstrated in Exercises \ref{exe:crat0:sexe0:04}
and \ref{exe:crat0:sexe0:05}, $K_1 \neq 0$ can be chosen
to shift the interval in (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}), but
the span of the interval cannot be reduced.
\end{vworklemmaparsection}
\vworklemmafooter{}
Lemmas \ref{lem:crat0:sfdv0:sprc0:02}
and \ref{lem:crat0:sfdv0:sprc0:04} have demonstrated that the ratio of
counts $N_O/N_I$ will asymptotically
approach $K_2/K_4$
(i.e. the long-term accuracy of Algorithm \ref{alg:crat0:sfdv0:01a}
is \emph{perfect}).
However,
for many applications it is also desirable to have a lack of
``bursty'' behavior. We demonstrate the lack of bursty
behavior in the following lemma.
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc0:03}
For Algorithm \ref{alg:crat0:sfdv0:01a}, once steady
state has been achieved, the number of consecutive
base subroutine invocations during which subroutine
``\texttt{A()}'' is executed is always in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:01}
\left\{
\left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor ,
\left\lceil \frac{K_2}{K_4 - K_2} \right\rceil
\right\} \cap \vworkintsetpos,
\end{equation}
which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$,
or two integers otherwise.
Once steady state has been achieved, the number of
consecutive base function invocations during which
subroutine ``\texttt{A()}'' is not executed is
always in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:02}
\left\{
\left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor ,
\left\lceil \frac{K_4-K_2}{K_2} \right\rceil
\right\} \cap \vworkintsetpos,
\end{equation}
which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$,
or two integers otherwise.
\end{vworklemmastatement}
\begin{vworklemmaproof}
As before in Lemma \ref{lem:crat0:sfdv0:sprc0:02}
for convenience and without
loss of generality, assume $K_3=K_4$ and
$K_1=0$. Then after a transient period
determined by $K_1$ and $K_3$, the \texttt{state}
variable will be assigned one of the values in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) and cycle through
those values in an unestablished order but with perfect
periodicity. To accomplish this proof, we must establish
something about the order in which the \texttt{state} variable attains
the values in the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}).
We can partition the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09})
into two sets; the set of values of \texttt{state} for which if the
base subroutine is invoked with \texttt{state} in this set, subroutine
``\texttt{A()}'' will not be invoked (we call this set $\phi_1$),
and the set of values of \texttt{state} for which if the
base subroutine is invoked with \texttt{state} in this set, subroutine
``\texttt{A()}'' will be invoked (we call this set $\phi_2$).
$\phi_1$ and $\phi_2$ are identified below.
\begin{eqnarray}
\label{eq:lem:crat0:sfdv0:sprc0:03:03}
& \phi_1 = & \\
\nonumber &
\displaystyle{\left\{
0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots ,
\left(\frac{K_4-K_2}{\gcd(K_2,K_4)} - 1 \right) \gcd(K_2, K_4)
\right\}} &
\end{eqnarray}
\begin{eqnarray}
\label{eq:lem:crat0:sfdv0:sprc0:03:04}
& \displaystyle{
\phi_2 = \left\{\left(\frac{K_4-K_2}{\gcd(K_2,K_4)}\right) \gcd(K_2, K_4),\right.} & \\
\nonumber & \displaystyle{\left.
\left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4) ,
\ldots ,
K_4 - \gcd(K_2, K_4)
\right\}} &
\end{eqnarray}
We can also make the following four additional useful observations
about $\phi_1$ and $\phi_2$. Note that
(\ref{eq:lem:crat0:sfdv0:sprc0:03:07}) and
(\ref{eq:lem:crat0:sfdv0:sprc0:03:08}) become equality
if $\gcd(K_2, K_4) = 1$.
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:05}
n(\phi_1) = \frac{K_4 - K_2}{\gcd(K_2, K_4)}
\end{equation}
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:06}
n(\phi_2) = \frac{K_2}{\gcd(K_2, K_4)}
\end{equation}
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:07}
\phi_1 \subseteq \{ 0, 1, \ldots , K_4 - K_2 - 1 \}
\end{equation}
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:08}
\phi_2 \subseteq \{K_4 - K_2, \ldots , K_4 - 1 \}
\end{equation}
We first prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
If $state_n \in \phi_2$ at the time the base function
is invoked, then
``\texttt{A()}'' will be invoked. We also know that
since $state_n \in \phi_2$, $state_n + K_2 \geq K_4$, so
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc0:03:09}
state_{n+1} \;\; =|_{state_n \in \phi_2} \;\; state_n - (K_4 - K_2) .
\end{equation}
Thus so long as $state_n \in \phi_2$, $state_{n+1} < state_n$
as specified above in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}).
With each invocation of the base subroutine, \texttt{state} will
``walk downward'' through $\phi_2$. It can
also be observed that when \texttt{state} drops below the smallest
element of $\phi_2$, the next value of \texttt{state} will
be in $\phi_1$.
Note also that although the downward walk specified in
(\ref{eq:lem:crat0:sfdv0:sprc0:03:09}) walks downward in absolute steps
of $K_4-K_2$, this corresponds to $(K_4-K_2) / \gcd(K_2, K_4)$
\emph{elements} of $\phi_2$, since the elements of $\phi_2$ are
separated by $\gcd(K_2, K_4)$.
Given the ``downward walk'' specified in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}),
the only question to be answered is how many consecutive values of
\texttt{state}, separated by $K_4-K_2$ (or $(K_4-K_2)/\gcd(K_2, K_4)$ elements),
can ``fit'' into
$\phi_2$. Considering that $n(\phi_2) = K_2/\gcd(K_2, K_4)$
(Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:06}) and that the
downward step represents $(K_4-K_2)/\gcd(K_2, K_4)$ set elements,
(\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) comes immediately by
a graphical argument.
We now prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}).
This can be proved using exactly the same arguments
as for (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}), but
considering the upward walk through $\phi_1$ rather
than the downward walk through $\phi_2$.
As with Lemma \ref{lem:crat0:sfdv0:sprc0:02},
note that the choices of $K_1$ and $K_3$ do not
materially affect the proof above. $K_1$ and
$K_3$ only set a delay until the rational counting
algorithm reaches steady state. $K_3$ only shifts
the sets $\phi_1$ and $\phi_2$.
\end{vworklemmaproof}
\begin{vworklemmaparsection}{Remark \#1}
This lemma proves an \emph{extremely} important property for the
usability of Algorithm \ref{alg:crat0:sfdv0:01a}. It says that once
steady state has been reached, the variability in the number of consecutive
times ``\texttt{A()}'' is run or not run is at most one count.
\end{vworklemmaparsection}
\begin{vworklemmaparsection}{Remark \#2}
It is probably also possible to construct a rational counting algorithm
so that the number of consecutive times ``\texttt{A()}'' is run is constant,
but the algorithm achieves long-term accuracy by varying only the number
of consecutive times ``\texttt{A()}'' is not run (or vice-versa), but this
is not done here.
\end{vworklemmaparsection}
\begin{vworklemmaparsection}{Remark \#3}
There is no requirement that $K_2$ and $K_4$ be coprime. In fact, as
demonstrated later, it may be advantageous to choose a large $K_2$ and
$K_4$ to approximate a simple ratio so that very fine adjustments can be
made. For example, if the ideal ratio is 1/2, it may be desirable
in some applications to
choose $K_2$=1,000 and $K_4$=2,000 so that fine adjustments can be made
by slightly perturbing $K_2$ or $K_4$. One might adjust 1,000/2,000 downward
to 999/2,000 or upward to 1,001/2,000 by modifying $K_2$
(both very fine adjustments).
\end{vworklemmaparsection}
\begin{vworklemmaparsection}{Remark \#4}
The most common choice of $K_1$ in practice is 0. If $K_1=0$ is chosen,
it can be shown that the number of initial invocations of the
base subroutine is in the set identified in
(\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
(See Exercise \ref{exe:crat0:sexe0:07}.)
\end{vworklemmaparsection}
\vworklemmafooter{}
For microcontroller work, it is considered
a desirable property that software components be resilient
to state upset
(see Section \chgrzeroxrefhyphen\ref{chgr0:sdda0:srob0}).
It can be observed that Algorithm \ref{alg:crat0:sfdv0:01a} will
exhibit very anomalous behavior if \texttt{state} is upset to a very negative
value. One possible correction to this shortcoming is illustrated
in Figure \ref{fig:crat0:sfdv0:sprc0:01}. Other possible
corrections are the topic of Exercise \ref{exe:crat0:sexe0:08}.
\begin{figure}
\begin{verbatim}
/* The constants K1 through K4, which parameterize the */
/* counting behavior, are assumed assigned elsewhere in */
/* the code. The solution is analyzed in terms of the */
/* parameters K1 through K4. */
/* */
/* We also place the following restrictions on K1 through */
/* K4: */
/* K1 : K1 <= K3 - K2. */
/* K2 : K4 > K2 > 0. */
/* K3 : No restrictions. */
/* K4 : K4 > K2 > 0. */
void base_rate_func(void)
{
static int state = K1;
state += K2;
if ((state < K1) || (state >= K3))
{
state -= K4;
A();
}
}
\end{verbatim}
\caption{Algorithm \ref{alg:crat0:sfdv0:01a} With State Upset Shortcoming
Corrected}
\label{fig:crat0:sfdv0:sprc0:01}
\end{figure}
\begin{vworkexamplestatement}
\label{ex:crat0:sfdv0:sprc0:01}
Determine the behavior of Algorithm \ref{alg:crat0:sfdv0:01a} with
$K_1=0$, $K_2=30$, and $K_3=K_4=50$.
\end{vworkexamplestatement}
\begin{vworkexampleparsection}{Solution}
We first predict the behavior, and then trace the algorithm to
verify whether the predictions are accurate.
We make the following predictions:
\begin{itemize}
\item The steady state sequence of invocations of ``\texttt{A()}'' will
be periodic with period
$K_4/\gcd(K_2, K_4) = 50/10 = 5$, as described
in Lemma \ref{lem:crat0:sfdv0:sprc0:02}.
\item The number of initial invocations of the
base subroutine in which ``\texttt{A()}''
is not run will be
$\lceil (K_4 - K_2) / K_2 \rceil = \lceil 2/3 \rceil = 1$,
as described in Remark \#4 of
Lemma \ref{lem:crat0:sfdv0:sprc0:03} and in the solution to
Exercise \ref{exe:crat0:sexe0:07}.
\item In steady state, the number of consecutive invocations of the
base subroutine during which ``\texttt{A()}''
is not executed will always be 1, as
described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:02} of
Lemma \ref{lem:crat0:sfdv0:sprc0:03}.
(Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:02}
yields \
$\{ \lfloor 20/30 \rfloor , \lceil 20/30 \rceil \} \cap \vworkintsetpos %
= \{ 0,1 \} \cap \{1, 2, \ldots \} = \{ 1 \}$.)
\item In steady state, the number of consecutive invocations of the
base subroutine during which ``\texttt{A()}''
is executed will always be 1 or 2, as
described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:01} of
Lemma \ref{lem:crat0:sfdv0:sprc0:03}.
(Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:01}
yields \
$\{ \lfloor 30/20 \rfloor , \lceil 30/20 \rceil \} \cap \vworkintsetpos %
= \{ 1,2 \} \cap \{1, 2, \ldots \} = \{ 1,2 \}$.)
\item The rational counting algorithm will have
perfect long-term accuracy.
\end{itemize}
We can verify the predictions above by tracing the behavior of
Algorithm \ref{alg:crat0:sfdv0:01a}. We adopt the convention
that $A_n = 1$ if subroutine ``\texttt{A()}'' is invoked during
the $n$th invocation of the base subroutine.
Table \ref{tbl:crat0:sfdv0:sprc0:01}
contains the results of tracing Algorithm \ref{alg:crat0:sfdv0:01a}
with $K_1=0$, $K_2=30$, and $K_3=K_4=50$.
\begin{table}
\caption{Trace Of Algorithm \ref{alg:crat0:sfdv0:01a} With
$K_1=0$, $K_2=30$, And $K_3=K_4=50$ (Example \ref{ex:crat0:sfdv0:sprc0:01})}
\label{tbl:crat0:sfdv0:sprc0:01}
\begin{center}
\begin{tabular}{|c|c|c|}
\hline
Index ($n$) & $state_n$ & $A_n$ \\
\hline
\hline
0 & 0 & N/A \\
\hline
1 & 30 & 0 \\
\hline
2 & 10 & 1 \\
\hline
3 & 40 & 0 \\
\hline
4 & 20 & 1 \\
\hline
5 & 0 & 1 \\
\hline
6 & 30 & 0 \\
\hline
7 & 10 & 1 \\
\hline
8 & 40 & 0 \\
\hline
9 & 20 & 1 \\
\hline
10 & 0 & 1 \\
\hline
\end{tabular}
\end{center}
\end{table}
It can be verfied from the table that all of the
predicted properties are exhibited by the
algorithm.
\end{vworkexampleparsection}
\vworkexamplefooter{}
A second characteristic of Algorithm \ref{alg:crat0:sfdv0:01a}
that should be analyzed carefully is the behavior
of the algorithm if parameters $K_2$ and $K_4$ are adjusted
``on the fly''. ``On-the-fly'' adjustment
raises the following concerns. We assume for convenience
that $K_1=0$ and $K_3=K_4$.
\begin{enumerate}
\item \label{enum:crat0:sfdv0:sprc0:01:01}
\textbf{Critical section protocol:} if the
rational counting algorithm is implemented in a process which
is asynchronous to the process which desires to change
$K_2$ and $K_4$, what precautions must be taken?
\item \label{enum:crat0:sfdv0:sprc0:01:02}
\textbf{Anomalous behavior:} will the rational
counting algorithm behave in a \emph{very} unexpected way
if $K_2$ and $K_4$ are changed on the fly?
\item \label{enum:crat0:sfdv0:sprc0:01:03}
\textbf{Preservation of accuracy:} even if the behavior
exhibited is not \emph{extremely} anomalous, how should
$K_2$ and $K_4$ be modified on the fly so as to preserve the
maximum accuracy?
\end{enumerate}
\textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:02}):} It can be observed
with Algorithm \ref{alg:crat0:sfdv0:01a} that neither increasing
nor decreasing $K_2$ nor $K_4$ on the fly
will lead to \emph{highly} anomalous
behavior. Each invocation of the algorithm will map
\texttt{state} back into the set identified in
(\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Thus on-the-fly changes
to $K_2$ and $K_4$ will establish the rational counting algorithm
immediately into steady-state behavior, and the result will not be
\emph{highly} anomalous if such on-the-fly changes are not
made very often.
\textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:03}):} It can be deduced
from
(\ref{eq:lem:crat0:sfdv0:sprc0:04:02}),
(\ref{eq:lem:crat0:sfdv0:sprc0:04:03}), and
(\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) that the value of the
\texttt{state} variable in Algorithm \ref{alg:crat0:sfdv0:01a}
satisfies the relationship
\begin{equation}
\label{eq:crat0:sfdv0:sprc0:01}
\overline{N_O} - N_O = \frac{state}{K_4} ;
\end{equation}
\noindent{}in other words, the \texttt{state} variable
contains the remainder of an effective division by $K_4$
and thus maintains the fractional part of $\overline{N_O}$.
Altering $K_4$ on the fly to a new value
(say, $\overline{K_4}$) may be problematic, because
to preserve the current fractional part
of $\overline{N_O}$, one must adjust it for
the new denominator $\overline{K_4}$. This requires
solving the equation
\begin{equation}
\label{eq:crat0:sfdv0:sprc0:02}
\frac{state}{K_4} = \frac{n}{\;\;\overline{K_4}\;\;}
\end{equation}
\noindent{}for $n$ which must be an integer to avoid
loss of information. In general,
this would require that $K_4 \vworkdivides \overline{K_4}$,
a constraint which would be rarely met. Thus, for high-precision
applications where a new rational counting rate should become effective
seamlessly, the best strategy would seem to be to modify $K_2$ only.
It can be verified that modifying $K_2$ on the fly accomplishes
a perfect rate transition.
\textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:01}):} In microcontroller work,
ordinal data types often represent machine-native data types. In such cases,
it may be possible for one process to set $K_2$ or $K_4$
for another process that is asynchronous with respect to it by relying
on the atomicity of machine instructions (i.e. without formal mutual
exclusion protocol). However, in other cases where the ordinal data types
of $K_2$ or $K_4$ are larger than can be accomodated by
a single machine instruction or where $K_2$ and $K_4$ must be modified
together atomically, mutual exclusion protocol should be used to
prevent anomalous behavior due to race conditions (see
Exercise \ref{exe:crat0:sexe0:14}).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01b}]
{Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01b}}
%Section tag: PRC1
\label{crat0:sfdv0:sprc1}
Algorithm \ref{alg:crat0:sfdv0:01a}
has the disadvantage that it requires $K_2/K_4 < 1$ (i.e. it can only
decrease frequency, but never increase frequency). This deficiency
can be corrected by using
Algorithm \ref{alg:crat0:sfdv0:01b}.
Note that Algorithm \ref{alg:crat0:sfdv0:01b} will properly deal with $K_2$ and
$K_4$ chosen such that $0 < K_2/K_4 < \infty$.
The most common reason that one may want a counting algorithm
that will correctly handle
$K_2/K_4 \geq 1$ is to conveniently handle $K_2/K_4 \approx 1$.
In practice, $K_2/K_4$ may represent a quantity that is
normally very close to
1 but may also be slightly less than or slightly greater than 1.
For example, one may use $K_2/K_4 \approx 1$ to correct for a
crystal or a resonator which deviates slightly from its nominal
frequency. We illustrate this with the following example.
\begin{vworkexamplestatement}
\label{ex:crat0:sfdv0:sprc1:01}
A microcontroller software load keeps time via an interrupt
service routine that runs every 1ms, but this frequency may be
off by as much as 1 part in 10,000 due to variations in
crystal or resonator manufacture. The interrupt service routine
updates a counter which represents the number of milliseconds elapsed since
the software load was reset. Devise a rational counting strategy
based on Algorithm \ref{alg:crat0:sfdv0:01b}
which will allow the time accuracy to be trimmed to within
one second per year or less by adjusting only $K_4$, and implement the counting strategy
in software.
\end{vworkexamplestatement}
\begin{vworkexampleparsection}{Solution}
$K_2/K_4$ will be nominally very close to 1 ($K_2 \approx K_4$).
If we assume that each year has 365.2422\footnote{The period of the earth's
rotation about the sun is not an integral number of days, which is why the
rules for leap years exist. Ironically, the assignment of leap years is itself
a problem very similar to the rational counting problems discussed in this chapter.} days
($\approx$ 31,556,926 seconds), then choosing
$K_2 \approx K_4 = 31,556,926$ will yield satisfactory results.
If we may need to compensate for up to 1 part in 10,000 of crystal or resonator
inaccuracy, we may need to adjust $K_2$ as low as 0.9999 $\times$ 31,556,926 $\approx$
31,553,770 (to compensate for a fast
crystal or resonator) or as
high as 1.0001 $\times$ 31,556,926
$\approx$ 31,560,082
(to compensate for a slow crystal or resonator). Choosing
$K_4 = 31,556,926$ yields the convenient relationship that each
count in $K_2$ corresponds to one second per year.
\begin{figure}
\begin{verbatim}
/* The constants K1 through K4, which parameterize the */
/* counting behavior, are assumed assigned elsewhere in */
/* the code. */
/* */
/* The variable time_count below is the number of milli- */
/* seconds since the software was reset. */
int time_count = 0;
/* It is assumed that the base rate subroutine below is */
/* called every millisecond (or, at least what should be */
/* every millisecond of the crystal or resonator were */
/* perfect). */
void base_rate_sub(void)
{
static int state = K1;
state += K2;
while (state >= K3)
{
state -= K4;
time_count++;
}
}
\end{verbatim}
\caption{Algorithm \ref{alg:crat0:sfdv0:01b} Applied To Timekeeping
(Example \ref{ex:crat0:sfdv0:sprc1:01})}
\label{fig:ex:crat0:sfdv0:sprc1:01:01}
\end{figure}
Figure \ref{fig:ex:crat0:sfdv0:sprc1:01:01} provides an illustration
of Algorithm \ref{alg:crat0:sfdv0:01b} applied in this scenario.
We assume that $K_4$ contains the constant value 31,556,926
and that $K_2$ is modified about this value either downwards or upwards
to trim the timekeeping. Note that Algorithm \ref{alg:crat0:sfdv0:01b} will correctly
handle $K_2 \geq K_4$.
Also note in the implementation illustrated in Figure
\ref{fig:ex:crat0:sfdv0:sprc1:01:01} that large integers (27 bits or more)
are required. (See also Exercise \ref{exe:crat0:sexe0:09}).
\end{vworkexampleparsection}
\vworkexamplefooter{}
It may not be obvious whether Algorithm \ref{alg:crat0:sfdv0:01b} has the
same or similar desirable properties as Algorithm \ref{alg:crat0:sfdv0:01a}
presented
in Lemmas
\ref{lem:crat0:sfdv0:sprc0:01},
\ref{lem:crat0:sfdv0:sprc0:02},
\ref{lem:crat0:sfdv0:sprc0:04},
and
\ref{lem:crat0:sfdv0:sprc0:03}.
Algorithm \ref{alg:crat0:sfdv0:01b} does have these desirable
properties, and these properties are presented as
Lemmas \ref{lem:crat0:sfdv0:sprc1:01},
\ref{lem:crat0:sfdv0:sprc1:02},
\ref{lem:crat0:sfdv0:sprc1:03}, and
\ref{lem:crat0:sfdv0:sprc1:04}.
The proofs of these lemmas are identical or very similar to the proofs
of Lemmas
\ref{lem:crat0:sfdv0:sprc0:01},
\ref{lem:crat0:sfdv0:sprc0:02},
\ref{lem:crat0:sfdv0:sprc0:04},
and
\ref{lem:crat0:sfdv0:sprc0:03};
and so these proofs when not identical are presented as exercises.
Note that Algorithm \ref{alg:crat0:sfdv0:01b} behaves identically to
Algorithm \ref{alg:crat0:sfdv0:01a} when $K_2 < K_4$, and the
case of $K_2=K_4$ is trivial, so in general only
the behavior when $K_2 > K_4$ remains to be proved.
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc1:01}
$N_{STARTUP}$, the number of invocations of the base subroutine
in Algorithm \ref{alg:crat0:sfdv0:01b} before ``\texttt{A()}'' is called
for the first time, is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc1:01:01}
N_{STARTUP} =
\left\lceil
{
\frac{-K_1 - K_2 + K_3}{K_2}
}
\right\rceil .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof is identical to the proof of Lemma
\ref{lem:crat0:sfdv0:sprc0:01}.
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc1:02}
Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01b}
base subroutine
is called, let $N_O$ be the number of times the
``\texttt{A()}'' subroutine is called, let
$f_I$ be the frequency of invocation of the
Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let
$f_O$ be the frequency of invocation of
``\texttt{A()}''.
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc1:02:01}
\lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
=
\frac{f_O}{f_I}
=
\frac{K_2}{K_4} .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
See Exercise \ref{exe:crat0:sexe0:10}.
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc1:03}
If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$\footnote{See also
footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between
the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b}
and the ``ideal'' mapping is always
in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc1:03:01}
\left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
\end{equation}
and no algorithm can be constructed to
confine the error to a smaller interval.
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof is identical to the proof of Lemma \ref{lem:crat0:sfdv0:sprc0:04}.
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc1:04}
For Algorithm \ref{alg:crat0:sfdv0:01b}
with
$K_2 \geq K_4$, once steady
state has been achieved (see Exercise
\ref{exe:crat0:sexe0:13}), each invocation of the
base subroutine will result in
a number of invocations of
``\texttt{A()}'' which is in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc1:04:01}
\left\{
\left\lfloor \frac{K_2}{K_4} \right\rfloor ,
\left\lceil \frac{K_2}{K_4} \right\rceil
\right\},
\end{equation}
which contains one integer if $K_4 \vworkdivides K_2$,
or two integers otherwise. With $K_2 < K_4$,
the behavior will be as specified in Lemma
\ref{lem:crat0:sfdv0:sprc0:03}.
\end{vworklemmastatement}
\begin{vworklemmaproof}
See Exercise \ref{exe:crat0:sexe0:12}.
\end{vworklemmaproof}
\begin{vworklemmaparsection}{Remark}
Note that Lemma \ref{lem:crat0:sfdv0:sprc0:03}
and this lemma specify different aspects of behavior,
which is why (\ref{eq:lem:crat0:sfdv0:sprc0:03:01})
and (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}) take
different forms than
(\ref{eq:lem:crat0:sfdv0:sprc1:04:01}).
Lemma \ref{lem:crat0:sfdv0:sprc0:03} specifies the number of consecutive
invocations of the base subroutine for which ``\texttt{A()}''
will be run, but with $K_2 \geq K_4$ it does not make sense to
specify behavior in this way since ``\texttt{A()}'' will be run
on \emph{every} invocation of the base subroutine. This lemma specifies
the number of times ``\texttt{A()}'' will be run on a \emph{single}
invocation of the base subroutine (which is not meaningful if
$K_2 < K_4$ since the result will always be 0 or 1).
\end{vworklemmaparsection}
%\vworklemmafooter{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01c}]
{Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01c}}
%Section tag: PRX0
\label{crat0:sfdv0:sprx0}
Algorithm \ref{alg:crat0:sfdv0:01c}\footnote{Algorithm \ref{alg:crat0:sfdv0:01c}
was contributed in March, 2003
by Chuck B. Falconer \cite{bibref:i:chuckbfalconer}
via the
\texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
newsgroup.}
is a variant of Algorithm \ref{alg:crat0:sfdv0:01a}
which has one fewer
degrees of freedom than Algorithms \ref{alg:crat0:sfdv0:01a}
and \ref{alg:crat0:sfdv0:01b} and can be implemented
more efficiently under most instruction sets. Algorithm \ref{alg:crat0:sfdv0:01c}
is superior to Algorithms \ref{alg:crat0:sfdv0:01a}
and \ref{alg:crat0:sfdv0:01b}
from a computational efficiency
point of view, but is less intuitive.
The superiority in computational efficiency of Algorithm \ref{alg:crat0:sfdv0:01c}
comes from the possibility of using an implicit test against zero
(rather than an explicit
test against $K_3$, as is found in Algorithms \ref{alg:crat0:sfdv0:01a}
and \ref{alg:crat0:sfdv0:01b}).
Many machine instruction sets automatically set flags to indicate a negative
result when the
subtraction of $K_2$ is performed, thus often allowing a conditional branch
without an additional instruction. Whether an instruction will be saved in
the code of Figure \ref{fig:crat0:sfdv0:01c} depends on the sophistication
of the `C' compiler, but of course if the algorithm were coded in
assembly-language an instruction could be saved on most processors.
The properties of rational counting Algorithm \ref{alg:crat0:sfdv0:01c} are nearly
identical to those of Algorithm \ref{alg:crat0:sfdv0:01a},
and we prove the important properties
now.
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx0:01}
$N_{STARTUP}$, the number of invocations of the base subroutine
in Algorithm \ref{alg:crat0:sfdv0:01c} before ``\texttt{A()}'' is called
for the first time, is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:01:01}
N_{STARTUP} =
\left\lfloor
{
\frac{K_1}{K_2}
}
\right\rfloor .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
The value of \texttt{state} when tested against
zero in the \texttt{if()} statement during the $n$th invocation
of the base subroutine is $K_1 - n K_2$. In order for the test
not to be met on the $n$th invocation
of the base subroutine, we require that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:01:02}
K_1 - n K_2 \geq 0
\end{equation}
\noindent{}or equivalently that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:01:03}
n \leq \frac{K_1}{K_2} .
\end{equation}
Solving (\ref{eq:lem:crat0:sfdv0:sprx0:01:03}) for the
largest value of $n \in \vworkintset$ which still meets the criterion
yields (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}). Note that
the derivation of (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}) requires
that the restrictions on $K_1$, $K_2$, and $K_3$ documented in
Figure \ref{fig:crat0:sfdv0:01c} be met.
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx0:02}
Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01c}
base subroutine
is called, let $N_O$ be the number of times the
``\texttt{A()}'' subroutine is called, let
$f_I$ be the frequency of invocation of the
Algorithm \ref{alg:crat0:sfdv0:01a}
base subroutine, and let
$f_O$ be the frequency of invocation of
``\texttt{A()}''. Provided the constraints
on $K_1$, $K_2$, and $K_3$ documented in
Figure \ref{fig:crat0:sfdv0:01c} are met,
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:02:01}
\lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
=
\frac{f_O}{f_I}
=
\frac{K_2}{K_4} .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
(\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) indicates that once
an initial delay (determined by $K_1$) has finished,
$N_O/N_I$ will converge on a steady-state value of
$K_2/K_4$.
The most straightforward way to analyze Algorithm \ref{alg:crat0:sfdv0:01c}
is to show how an algorithm already
understood (Algorithm \ref{alg:crat0:sfdv0:01a})
can be transformed to
Algorithm \ref{alg:crat0:sfdv0:01c}
in a way where the analysis of Algorithm \ref{alg:crat0:sfdv0:01a}
also applies to Algorithm \ref{alg:crat0:sfdv0:01c}.
Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01} shows
how such a transformation can be performed in
four steps.
\begin{figure}
(a) Algorithm \ref{alg:crat0:sfdv0:01a} unchanged.
$state_{a,n} \in \{0, 1, \ldots, K_4 - 1 \}$.
\begin{verbatim}
state += K2;
if (state >= K4)
{
state -= K4;
A();
}
\end{verbatim}
(b) ``\texttt{>=}'' changed to ``\texttt{>}''. $state_{b,n} \in \{1, 2, \ldots, K_4 \}$,
$state_{b,n} = state_{a,n} + 1$.
\begin{verbatim}
state += K2;
if (state > K4)
{
state -= K4;
A();
}
\end{verbatim}
(c) Test against $K_4$ changed to test against zero.
$state_{c,n} \in \{-K_4 + 1, -K_4 + 2, \ldots, 0 \}$,
$state_{c,n} = state_{b,n} - K_4$.
\begin{verbatim}
state += K2;
if (state > 0)
{
state -= K4;
A();
}
\end{verbatim}
(d) Sign inversion.
$state_{d,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$,
$state_{d,n} = - state_{c,n}$.
\begin{verbatim}
state -= K2;
if (state < 0)
{
state += K4;
A();
}
\end{verbatim}
(e) `C' expression rearrangement.
$state_{e,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$,
$state_{e,n} = state_{d,n}$.
\begin{verbatim}
if ((state -= K2) < 0)
{
state += K4;
A();
}
\end{verbatim}
\caption{4-Step Transformation Of Algorithm \ref{alg:crat0:sfdv0:01a}
To Algorithm \ref{alg:crat0:sfdv0:01c}}
\label{fig:lem:crat0:sfdv0:sprx0:02:01}
\end{figure}
In Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}, each of the
four steps required to transform from Algorithm \ref{alg:crat0:sfdv0:01a} to
Algorithm \ref{alg:crat0:sfdv0:01c} includes an equation to transform the
\texttt{state} variable. Combining all of these
transformations yields
\begin{eqnarray}
\label{eq:lem:crat0:sfdv0:sprx0:02:02}
state_{e,n} & = & K_4 - 1 - state_{a,n} \\
\label{eq:lem:crat0:sfdv0:sprx0:02:03}
state_{a,n} & = & K_4 - 1 - state_{e,n}
\end{eqnarray}
We thus see that Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(a)
(corresponding to Algorithm \ref{alg:crat0:sfdv0:01a}) and
Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(e)
(corresponding to Algorithm \ref{alg:crat0:sfdv0:01c}) have
\texttt{state} semantics which involve the same range
but a reversed order. (\ref{eq:lem:crat0:sfdv0:sprx0:02:01})
follows directly from this observation and from
Lemma \ref{lem:crat0:sfdv0:sprc0:02}.
\end{vworklemmaproof}
%\vworklemmafooter{}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx0:03}
If $K_1=0$ and $\gcd(K_2, K_4)=1$\footnote{See also
footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between
the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01c}
and the ``ideal'' mapping is always
in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:03:01}
\left[ 0, \frac{K_4 - 1}{K_4} \right] ,
\end{equation}
and no algorithm can be constructed to
confine the error to a smaller interval.
\end{vworklemmastatement}
\begin{vworklemmaproof}
Using the duality illustrated by
(\ref{eq:lem:crat0:sfdv0:sprx0:02:02}) and
(\ref{eq:lem:crat0:sfdv0:sprx0:02:03}),
starting Algorithm \ref{alg:crat0:sfdv0:01c} with
$state_0=0$ will yield a dual state vector
with respect to starting Algorithm \ref{alg:crat0:sfdv0:01a} with
$state_0=K_4-1$. Thus,
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:03:02}
N_O = \left\lfloor \frac{n K_2 + K_4 - 1}{K_4} \right\rfloor .
\end{equation}
Using this altered value of $N_O$ in (\ref{eq:lem:crat0:sfdv0:sprc0:04:04})
leads directly to (\ref{eq:lem:crat0:sfdv0:sprx0:03:01}).
The proof that there can be no better algorithm is identical
to the same proof for Lemma \ref{lem:crat0:sfdv0:sprc0:04} (Exercise \ref{exe:crat0:sexe0:06}).
\end{vworklemmaproof}
%\vworklemmafooter{}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx0:04}
For Algorithm \ref{alg:crat0:sfdv0:01c}, once steady
state has been achieved, the number of consecutive
base subroutine invocations during which subroutine
``\texttt{A()}'' is executed is always in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:04:01}
\left\{
\left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor ,
\left\lceil \frac{K_2}{K_4 - K_2} \right\rceil
\right\} \cap \vworkintsetpos,
\end{equation}
which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$,
or two integers otherwise.
Once steady state has been achieved, the number of
consecutive base function invocations during which
subroutine ``\texttt{A()}'' is not executed is
always in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx0:04:02}
\left\{
\left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor ,
\left\lceil \frac{K_4-K_2}{K_2} \right\rceil
\right\} \cap \vworkintsetpos,
\end{equation}
which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$,
or two integers otherwise.
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof comes directly from the duality between algorithm
Algorithms \ref{alg:crat0:sfdv0:01a}
and \ref{alg:crat0:sfdv0:01c} established in the
proof of Lemma \ref{lem:crat0:sfdv0:sprx0:01}, so that the results
from Lemma \ref{lem:crat0:sfdv0:sprc0:03} apply without modification.
\end{vworklemmaproof}
\vworklemmafooter{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01d}]
{Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01d}}
%Section tag: PRX1
\label{crat0:sfdv0:sprx1}
Algorithm \ref{alg:crat0:sfdv0:01d}\footnote{Algorithm \ref{alg:crat0:sfdv0:01d}
was contributed in March, 2003
by John Larkin \cite{bibref:i:johnlarkin}
via the
\texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
newsgroup.}
(Figure \ref{fig:crat0:sfdv0:01d}) is a further
economization of Algorithms \ref{alg:crat0:sfdv0:01a}
through \ref{alg:crat0:sfdv0:01c} that can be made by eliminating
the addition or subtraction of $K_4$ and test against $K_3$
and instead using the
inherent machine integer size of $W$ bits to perform
arithmetic modulo $2^W$. Thus, effectively, Algorithm \ref{alg:crat0:sfdv0:01d}
is equivalent to Algorithm \ref{alg:crat0:sfdv0:01a} with
$K_4 = K_3 = 2^W$.
Figure \ref{fig:crat0:sfdv0:01d} shows both
assembly-language (Texas Instruments TMS-370C8) and
`C' implementations of the algorithm. The assembly-language
version uses the carry flag of the processor and thus
is \emph{very} efficient. Because `C' does not have access
to the processor flags, the 'C' version is less efficient.
The ``less than'' comparison when
using unsigned integers is equivalent to a rollover test.
It is easy to see from the figure that Algorithm \ref{alg:crat0:sfdv0:01d}
is equivalent in all
respects to Algorithm \ref{alg:crat0:sfdv0:01a} with
$K_3 = K_4$ fixed at $2^W$. It is not necessary to enforce any constraints
on $K_2$ because $K_2 < K_3 = K_4 = 2^W$ due to the inherent size of
a machine integer. Note that unlike Algorithms \ref{alg:crat0:sfdv0:01a}
through \ref{alg:crat0:sfdv0:01c} which allow $K_2$ and $K_4$ to be chosen independently
and from the Farey series of appropriate order, Algorithm \ref{alg:crat0:sfdv0:01c}
only allows
$K_2/K_4$ of the form $K_2/2^W$.
The properties below follow immediately
from the properties of Algorithm \ref{alg:crat0:sfdv0:01a}.
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx1:01}
$N_{STARTUP}$, the number of invocations of the base subroutine
in Algorithm \ref{alg:crat0:sfdv0:01d} before ``\texttt{A()}'' is called
for the first time, is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:01:01}
N_{STARTUP} =
\left\lfloor
{
\frac{2^W - K_1 - 1}{K_2}
}
\right\rfloor .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
The value of \texttt{state} after the $n$th invocation
is $state_n = K_1 + n K_2$. In order for the test in the
\texttt{if()} statement not to be met, we require that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:01:02}
K_1 + n K_2 \leq 2^W - 1
\end{equation}
\noindent{}or equivalently that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:01:03}
n \leq \frac{2^W - K_1 - 1}{K_2} .
\end{equation}
Solving (\ref{eq:lem:crat0:sfdv0:sprx1:01:03}) for the largest
value of $n \in \vworkintset$ which still meets the criterion
yields (\ref{eq:lem:crat0:sfdv0:sprx1:01:01}).
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx1:02}
Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine
is called, let $N_O$ be the number of times the
``\texttt{A()}'' subroutine is called, let
$f_I$ be the frequency of invocation of the
Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine, and let
$f_O$ be the frequency of invocation of
``\texttt{A()}''. Then
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:02:01}
\lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
=
\frac{f_O}{f_I}
=
\frac{K_2}{2^W} ,
\end{equation}
where $W$ is the number of bits in a machine unsigned integer.
Note that $K_2 < 2^W$ since $K_2 \in \{ 0, 1, \ldots , 2^W-1 \}$.
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof is identical to the proof of
Lemma \ref{lem:crat0:sfdv0:sprc0:02} with $K_3=K_4=2^W$.
Note that Algorithm \ref{alg:crat0:sfdv0:01a} calculates $n K_2 \bmod K_4$ by
subtraction, whereas Algorithm \ref{alg:crat0:sfdv0:01d} calculates
$n K_2 \bmod 2^W$ by the properties of a $W$-bit counter
which is allowed to roll over.
\end{vworklemmaproof}
%\vworklemmafooter{}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx1:03}
If $\gcd(K_2, 2^W)=1$\footnote{See also footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01}
in this chapter. Note also that in this context the condition $\gcd(K_2, 2^W)=1$
is equivalent to the condition that $K_2$ be odd.}, the error between
the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01d}
and the ``ideal'' mapping is always
in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:03:01}
\left[ - \frac{2^W - 1}{2^W} , 0 \right] ,
\end{equation}
and no algorithm can be constructed to
confine the error to a smaller interval.
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof is identical to the proof of Lemma
\ref{lem:crat0:sfdv0:sprc0:04} with $K_4 = 2^W$.
\end{vworklemmaproof}
%\vworklemmafooter{}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprx1:04}
For Algorithm \ref{alg:crat0:sfdv0:01d}
(Figure \ref{fig:crat0:sfdv0:01d}), once steady
state has been achieved, the number of consecutive
base subroutine invocations during which subroutine
``\texttt{A()}'' is executed is always in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:04:01}
\left\{
\left\lfloor \frac{K_2}{2^W - K_2} \right\rfloor ,
\left\lceil \frac{K_2}{2^W - K_2} \right\rceil
\right\} \cap \vworkintsetpos,
\end{equation}
which contains one integer if $K_2/2^W \leq 1/2$ or $(2^W-K_2) \vworkdivides K_2$,
or two integers otherwise.
Once steady state has been achieved, the number of
consecutive base function invocations during which
subroutine ``\texttt{A()}'' is not executed is
always in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprx1:04:02}
\left\{
\left\lfloor \frac{2^W-K_2}{K_2} \right\rfloor ,
\left\lceil \frac{2^W-K_2}{K_2} \right\rceil
\right\} \cap \vworkintsetpos,
\end{equation}
which contains one integer if $K_2/2^W \geq 1/2$ or $K_2 \vworkdivides 2^W$,
or two integers otherwise.
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof is identical to the proof of Lemma
\ref{lem:crat0:sfdv0:sprc0:03} with $K_4 = 2^W$.
\end{vworklemmaproof}
\vworklemmafooter{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:02a}]
{Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:02a}}
%Section tag: PRC2
\label{crat0:sfdv0:sprc2}
Another useful rational counting algorithm is Algorithm \ref{alg:crat0:sfdv0:02a}.
At first glance, it may appear that Algorithm \ref{alg:crat0:sfdv0:02a}
is qualitatively
different than Algorithms \ref{alg:crat0:sfdv0:01a}
and \ref{alg:crat0:sfdv0:01b}.
However, as the following lemmas demonstrate, Algorithm \ref{alg:crat0:sfdv0:02a}
can be easily rearranged to be in the form
of Algorithm \ref{alg:crat0:sfdv0:01a}.
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc2:01}
$N_{STARTUP}$, the number of invocations of the base subroutine
in Algorithm \ref{alg:crat0:sfdv0:02a} before ``\texttt{A()}'' is called
for the first time, is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:01:01}
N_{STARTUP} =
\left\lceil
{
\frac{K_3 - K_1}{K_2}
}
\right\rceil .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
The value of \texttt{state} after the $n$th invocation
is $K_1 + n K_2$. In order for the test in the
\texttt{if()} statement to be met on the $n+1$'th invocation
of the base subroutine, we require that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:01:02}
K_1 + n K_2 \geq K_3
\end{equation}
\noindent{}or equivalently that
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:01:03}
n \geq \frac{K_3 - K_1}{K_2} .
\end{equation}
Solving (\ref{eq:lem:crat0:sfdv0:sprc2:01:03}) for the smallest
value of $n \in \vworkintset$ which still meets the criterion
yields (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}). Note that
the derivation of (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}) requires
that the restrictions on $K_1$ through $K_4$ documented in
Figure \ref{fig:crat0:sfdv0:02a} be met.
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc2:02}
Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine
is called, let $N_{OA}$ be the number of times the
``\texttt{A()}'' subroutine is called, let
$f_I$ be the frequency of invocation of the
Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine, and let
$f_{OA}$ be the frequency of invocation of
``\texttt{A()}''. Then, the proportion of times the
``\texttt{A()}'' subroutine is called is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:02:01}
\lim_{N_I\rightarrow\infty}\frac{N_{OA}}{N_I}
=
\frac{f_{OA}}{f_I}
=
\frac{K_2}{K_4 + K_2} ,
\end{equation}
and the proportion of times the ``\texttt{B()}'' subroutine is called
is given by
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:02:02}
\lim_{N_I\rightarrow\infty}\frac{N_{OB}}{N_I}
=
\frac{f_{OB}}{f_I}
=
1 - \frac{f_{OA}}{f_I}
=
\frac{K_4}{K_4 + K_2} .
\end{equation}
\end{vworklemmastatement}
\begin{vworklemmaproof}
As in Lemma \ref{} and without
loss of generality, we assume for analytic
convenience that $K_1=0$ and $K_3=K_4$. Note that
$K_1$ and $K_3$ influence only the transient startup
behavior of the algorithm.
It can be observed from the algorithm that once steady
state is achieved, \texttt{state} will be confined to the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:02:10}
state \in \{ 0, 1, \ldots , K_4 + K_2 - 1 \} .
\end{equation}
It is certainly possible to use results from
number theory and analyze which values in the
set (\ref{eq:lem:crat0:sfdv0:sprc2:02:10}) can be
attained and the order in which they can be attained.
However, an easier approach is to observe that
Algorithm \ref{alg:crat0:sfdv0:02a}
can be rearranged to take the form of
rational counting Algorithm \ref{alg:crat0:sfdv0:01a}.
This rearranged
algorithm is presented as
Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}. Note that the
algorithm is rearranged only for easier analysis.
\begin{figure}
\begin{verbatim}
void base_rate_sub(void)
{
static unsigned int state = K1;
state += K2;
if (state >= (K4 + K2))
{
state -= (K4 + K2);
A();
}
else
{
B();
}
}
\end{verbatim}
\caption{Algorithm \ref{alg:crat0:sfdv0:02a} Modified To Resemble Algorithm \ref{alg:crat0:sfdv0:01a}
(Proof Of Lemma \ref{lem:crat0:sfdv0:sprc2:02})}
\label{fig:lem:crat0:sfdv0:sprc2:02:01}
\end{figure}
In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, the
statement ``\texttt{state += K2}'' has been removed from the
\texttt{else} clause and placed above the \texttt{if()} statement,
and other constants have been adjusted accordingly.
It can be observed that the figure
is structurally identical to rational counting algorithm, except for the
\texttt{else} clause (which does not affect the counting behavior) and
the specific constants for testing and incrementation.
In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, as contrasted with
Algorithm \ref{alg:crat0:sfdv0:01a}, ``$K_4 + K_2$'' takes the
place of $K_4$. $\gcd(K_2, K_4 + K_2) = \gcd(K_2, K_4)$
(see Lemma \cprizeroxrefhyphen\ref{lem:cpri0:gcd0:01}), so the
results from
\end{vworklemmaproof}
\begin{vworklemmastatement}
\label{lem:crat0:sfdv0:sprc2:03}
If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$, the error between
the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b}
and the ``ideal'' mapping is always
in the set
\begin{equation}
\label{eq:lem:crat0:sfdv0:sprc2:03:01}
\left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
\end{equation}
and no algorithm can be constructed to
confine the error to a smaller interval.
\end{vworklemmastatement}
\begin{vworklemmaproof}
The proof is identical to Lemma \ref{lem:crat0:sfdv0:sprc0:04}.
\end{vworklemmaproof}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Bresenham's Line Algorithm}
%Section tag: BLA0
\label{crat0:sbla0}
\index{Bresenham's line algorithm}\emph{Bresenham's line algorithm} is a
very efficient algorithm for drawing lines on devices that have
a rectangular array of pixels which can be individually illuminated.
Bresenham's line algorithm is efficient for small microcontrollers
because it relies only
on integer addition, subtraction, shifting, and comparison.
Bresenham's line algorithm is presented for two reasons:
\begin{itemize}
\item The algorithm is useful for drawing lines on LCD
displays and other devices typically controlled by
microcontrollers.
\item The algorithm is an [extremely optimized] application
of the rational
counting algorithms presented in this chapter.
\end{itemize}
\begin{figure}
\begin{center}
\begin{huge}
Figure Space Reserved
\end{huge}
\end{center}
\caption{Raster Grid For Development Of Bresenham's Line Algorithm}
\label{fig:crat0:sbla0:01}
\end{figure}
Assume that we wish to draw a line from $(0,0)$ to $(x_f, y_f)$ on
a raster device (Figure \ref{fig:crat0:sbla0:01}). For simplicity of
development, assume that $y_f \leq x_f$ (i.e. that the slope $m \leq 1$).
For each value of $x \in \vworkintset$, the ideal value of $y$ is given
by
\begin{equation}
\label{eq:crat0:sbla0:01}
y = mx = \frac{y_f}{x_f} x = \frac{y_f x}{x_f} .
\end{equation}
\noindent{}However, on a raster device, we must usually
choose an inexact pixel to illuminate, since it is typically
rare that $x_f \vworkdivides y_f x$. If
$x_f \vworkdivides y_f x$, then the ideal value of $y$ is
an integer, and we choose to illuminate
$(x, (y_f x)/x_f)$. However, if $x_f \vworknotdivides y_f x$,
then we must choose either a pixel with the same y-coordinate
as the previous pixel (we call this choice `D') or the pixel
with a y-coordinate one greater than the previous pixel (we
call this choice `U').
The fractional part of the quotient
$(y_f x) / x_f$ indicates whether D or U is closer to the ideal line.
If $y_f x \bmod x_f \geq x_f/2$, we choose U, otherwise we choose D
(note that the decision to choose U in the equality case is arbitrary).
Using the rational approximation techniques presented in
Section \ref{crat0:sfdv0}, it is straightforward to
develop an algorithm, which is presented as the code
in Figure \ref{fig:crat0:sbla0:02}.
Note that this code will only work if $m = y_f/x_f \leq 1$.
\begin{figure}
\begin{verbatim}
/* Draws a line from (0,0) to (x_f,y_f) on a raster */
/* device. */
void bresenham_line(int x_f, int y_f)
{
int d=0; /* The modulo counter. */
int x=0, y=0;
/* x- and y-coordinates currently being */
/* evaluated. */
int d_old; /* Remembers previous value of d. */
plotpoint(0,0); /* Plot initial point. */
while (x <= x_f)
{
d_old = d;
d += y_f;
if (d >= x_f)
d -= x_f;
x++;
if (
(
(d == 0) && (d_old < x_f/2)
)
||
(
(d >= x_f/2)
&&
((d_old < x_f/2) || (d_old >= d))
)
)
y++;
plotpoint(x,y);
}
}
\end{verbatim}
\caption{First Attempt At A Raster Device Line Algorithm
Using Rational Counting Techniques}
\label{fig:crat0:sbla0:02}
\end{figure}
There are a few efficiency refinements that can be made to
the code in Figure \ref{fig:crat0:sbla0:02}, but overall
it is a very efficient algorithm. Note that
nearly all compilers will handle the integer
division by two using a shift
operation rather than a division.
We can however substantially simplify and economize the code of
Figure \ref{fig:crat0:sbla0:02} by using the technique
presented in Figures \ref{fig:crat0:sfdv0:fab0:03} and
\ref{fig:crat0:sfdv0:fab0:04}, and this improved code is
presented as Figure \ref{fig:crat0:sbla0:03}.
\begin{figure}
\begin{verbatim}
/* Draws a line from (0,0) to (x_f,y_f) on a raster */
/* device. */
void bresenham_line(int x_f, int y_f)
{
int d=y_f; /* Position of the ideal line minus */
/* the position of the line we are */
/* drawing, in units of 1/x_f. The */
/* initialization value is y_f because */
/* the algorithm is looking one pixel */
/* ahead in the x direction, so we */
/* begin at x=1. */
int x=0, y=0;
/* x- and y-coordinates currently being */
/* evaluated. */
plotpoint(0,0); /* Plot initial point. */
while (x <= x_f)
{
x++; /* We move to the right regardless. */
if (d >= x_f/2)
{
/* The "U" choice. We must jump up a pixel */
/* to keep up with the ideal line. */
d += (y_f - x_f);
y++; /* Jump up a pixel. */
}
else /* d < x_f/2 */
{
/* The "D" choice. Distance is not large */
/* enough to jump up a pixel. */
d += y_f;
}
plotpoint(x,y);
}
}
\end{verbatim}
\caption{Second Attempt At A Raster Device Line Algorithm
Using Rational Counting Techniques}
\label{fig:crat0:sbla0:03}
\end{figure}
In order to understand the code of Figure \ref{fig:crat0:sbla0:03},
it is helpful to view the problem in an alternate way.
For any $x \in \vworkintset$, let
$d$ be the distance between the position of the ideal line
(characterized by $y = y_f x / x_f$) and
the actual pixel which will be illuminated. It is easy to
observe that:
\begin{itemize}
\item When drawing a raster line, if one proceeds from
$(x, y)$ to $(x+1, y)$ (i.e. makes the ``D'' choice),
$d$ will increase by $y_f/x_f$.
\item When drawing a raster line, if one proceeds from
$(x,y)$ to $(x+1, y+1)$ (i.e. makes the ``U'' choice),
$d$ will increase by $(y_f - x_f)/x_f$. (The increase
of $y_f/x_f$ comes about because the ideal line proceeds
upward from $x$ to $x+1$, while the decrease of $x_f/x_f = 1$
comes about because the line being drawn jumps upward by one
unit, thus tending to ``catch'' the ideal line.)
\end{itemize}
The code of Figure \ref{fig:crat0:sbla0:03} implements the
two observations above in a straightforward way. $d$ is maintained
in units of $1/x_f$, and when ``U'' is chosen over ``D'' whenever
the gap between the ideal line and the current row of pixels
being drawn becomes too large.
The code in Figure \ref{fig:crat0:sbla0:03} does however contain logical
and performance problems which should be corrected:
\begin{itemize}
\item The test of $d$ against $x_f/2$ will perform as intended.
For example, if $d=2$ and $x_f=5$, the test
``\texttt{d >= x\_f/2}'' in the code will evaluate true
although the actual condition is false. To correct this
defect, the units of $d$ should be changed from
$1/x_f$ to $1/(2 x_f)$.
\item The quantity $y_f - x_f$ is calculated repeatedly. This
calculation should be moved out of the \emph{while()} loop.
\item The test against $x_f$ may be more economical if changed to
a test against 0 (but this requires a different initialization
assignment for $d$).
\end{itemize}
Figure \ref{fig:crat0:sbla0:04} corrects these defects
from Figure \ref{fig:crat0:sbla0:03}.
Figure \ref{fig:crat0:sbla0:04} is essentially the Bresenham
line algorithm, except that it only draws starting from the
origin and will only draw a line with a slope
$m = y_f/x_f \leq 1$.
\begin{figure}
\begin{verbatim}
/* Draws a line from (0,0) to (x_f,y_f) on a raster */
/* device. */
void bresenham_line(int x_f, int y_f)
{
int d = 2 * y_f - x_f;
/* Position of the ideal line minus */
/* the position of the line we are */
/* drawing, in units of 1/(2 * x_f). */
/* Initialization value of 2 * y_f is */
/* because algorithm is looking one */
/* pixel ahead. Value of -x_f is from */
/* shifting the midpoint test (the */
/* "if" statement below) downward to a */
/* test against zero. */
int dD = 2 * y_f;
int dU = dD - x_f;
/* Amounts to add to d if "D" and "U" */
/* pixels are chosen, respectively. */
/* Calculated here outside of loop. */
int x=0, y=0;
/* x- and y-coordinates currently being */
/* evaluated. */
plotpoint(0,0); /* Plot initial point. */
while (x <= x_f)
{
x++; /* We move to the right regardless. */
if (d >= 0)
{
/* The "U" choice. We must jump up a pixel */
/* to keep up with the ideal line. */
d += dU;
y++; /* Jump up a pixel. */
}
else /* d < 0 */
{
/* The "D" choice. Distance is not large */
/* enough to jump up a pixel. */
d += dD;
}
plotpoint(x,y);
}
}
\end{verbatim}
\caption{Third Attempt At A Raster Device Line Algorithm
Using Rational Counting Techniques}
\label{fig:crat0:sbla0:04}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Authors And Acknowledgements}
%Section tag: ACK0
This chapter was primarily written by
\index{Ashley, David T.} David T. Ashley
\cite{bibref:i:daveashley}.
We would like to gratefully acknowledge the assistance of
\index{Falconer, Chuck B.} Chuck B. Falconer \cite{bibref:i:chuckbfalconer},
\index{Hoffmann, Klaus} Klaus Hoffmann \cite{bibref:i:klaushoffmann},
\index{Larkin, John} John Larkin \cite{bibref:i:johnlarkin},
\index{Smith, Thad} Thad Smith \cite{bibref:i:thadsmith},
and
\index{Voipio, Tauno} Tauno Voipio \cite{bibref:i:taunovoipio}
for insight into rational counting approaches, contributed via the
\texttt{sci.math} \cite{bibref:n:scimathnewsgroup}
and
\texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
newsgroups.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Exercises}
%Section tag: EXE0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation]
{\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:a01}
Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03})
and (\ref{eq:crat0:shqq0:dph0:07}) if $r_A$ is chosen
without rounding, i.e.
$h=\lfloor r_I 2^q \rfloor$ and therefore
$r_A=\lfloor r_I 2^q \rfloor/2^q$.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:a02}
Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03})
and (\ref{eq:crat0:shqq0:dph0:07}) if
$z$ is chosen for rounding with the midpoint case rounded
down, i.e. $z=2^{q-1}-1$, and applied as in
(\ref{eq:crat0:sint0:01}).
\end{vworkexercisestatement}
\vworkexercisefooter{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Rational Counting}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:01}
For Algorithm \ref{alg:crat0:sfdv0:01a},
assume that one chooses $K_1 > K_3 - K_2$ (in contradiction to the
restrictions in Figure \ref{fig:crat0:sfdv0:01a}).
Derive a result similar to Lemma \ref{lem:crat0:sfdv0:sprc0:01}
for the number of base subroutine invocations in which
``\texttt{A()}'' is run before it is
\emph{not} run for the first time.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:02}
This will be the $\epsilon$ lemma proof.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:03}
Rederive appropriate results similar to
Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
$\gcd(K_2, K_4) > 1$.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:04}
Rederive appropriate results similar to
Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
$K_1 \neq 0$.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:05}
Rederive appropriate results similar to
Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
$\gcd(K_2, K_4) > 1$ and $K_1 \neq 0$.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:06}
For Lemma \ref{lem:crat0:sfdv0:sprc0:04},
complete the missing proof:
show that if $\gcd(K_2, K_4) = 1$, no algorithm can
lead to a tighter bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}).
\textbf{Hint:} start with the observation
that with
$\gcd(K_2, K_4) = 1$, $n K_2 \bmod K_4$ will attain every value in
the set $\{ 0, \ldots , K_4-1 \}$.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:07}
For Lemma \ref{lem:crat0:sfdv0:sprc0:03},
show that if $K_1=0$, the number of initial invocations
of the base subroutine before ``\texttt{A()}'' is first
called is in the set specified in
(\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:08}
Develop other techniques to correct the state upset vulnerability
of Algorithm \ref{alg:crat0:sfdv0:01a} besides
the technique illustrated in
Figure \ref{fig:crat0:sfdv0:sprc0:01}.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:09}
Show for Example \ref{ex:crat0:sfdv0:sprc1:01} that integers of at least
27 bits are required.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:10}
Prove Lemma \ref{lem:crat0:sfdv0:sprc1:02}.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:12}
Prove Lemma \ref{lem:crat0:sfdv0:sprc1:04}.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:13}
Define the term \emph{steady state} as used in
Lemma \ref{lem:crat0:sfdv0:sprc1:04} in terms of
set membership of the \texttt{state} variable.
\end{vworkexercisestatement}
\vworkexercisefooter{}
\begin{vworkexercisestatement}
\label{exe:crat0:sexe0:14}
For Algorithm \ref{alg:crat0:sfdv0:01a}, devise examples of anomalous behavior due to
race conditions that may occur if $K_2$ and/or $K_4$ are set in a process
which is asynchronous with respect to the process which implements the
rational counting algorithm if mutual exclusion protocol is not
implemented.
\end{vworkexercisestatement}
\vworkexercisefooter{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\vfill
\noindent\begin{figure}[!b]
\noindent\rule[-0.25in]{\textwidth}{1pt}
\begin{tiny}
\begin{verbatim}
$HeadURL$
$Revision$
$Date$
$Author$
\end{verbatim}
\end{tiny}
\noindent\rule[0.25in]{\textwidth}{1pt}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%End of file C_RAT0.TEX