/[dtapublic]/pubs/books/ucbka/trunk/c_rat0/c_rat0.tex
ViewVC logotype

Annotation of /pubs/books/ucbka/trunk/c_rat0/c_rat0.tex

Parent Directory Parent Directory | Revision Log Revision Log


Revision 278 - (hide annotations) (download) (as text)
Wed Aug 14 23:10:36 2019 UTC (3 years, 5 months ago) by dashley
File MIME type: application/x-tex
File size: 105593 byte(s)
Change keyword substitution (migration from cvs to svn).
1 dashley 140 %$Header$
2    
3     \chapter{Rational Linear Approximation}
4    
5     \label{crat0}
6    
7     \beginchapterquote{``Die ganzen Zahlen hat der liebe Gott gemacht,
8     alles andere ist Menschenwerk.''\footnote{German
9     language: God made the integers; everything
10     else was made by man.}}
11     {Leopold Kronecker}
12    
13     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
14     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
15     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
16     \section{Introduction}
17     %Section tag: INT0
18     \label{crat0:sint0}
19    
20     In this chapter, we consider practical applications of
21     rational approximation.
22     Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}
23     have presented algorithms for finding
24     the closest rational numbers to an arbitrary real number,
25     subject to constraints on the numerator and denominator.
26     The basis of these algorithms is complex and comes from number theory, and so
27     these algorithms and their basis have been presented in separate chapters.
28    
29     In Section \ref{crat0:srla0}, rational linear approximation itself
30     and associated error bounds are presented. By \emph{rational linear
31     approximation} we mean simply the approximation of a line
32     $y = r_I x$ ($y, r_I, x \in \vworkrealset$) by a line
33    
34     \begin{equation}
35     \label{eq:crat0:sint0:01}
36     y = \left\lfloor
37     \frac{h \lfloor x \rfloor + z}{k}
38     \right\rfloor ,
39     \end{equation}
40    
41     \noindent{}where we choose $h/k \approx r_I$ and optionally choose $z$ to
42     shift the error introduced. Note that (\ref{eq:crat0:sint0:01}) is
43     very economical for microcontroller instruction sets, since only integer
44     arithmetic is required. We may choose $h/k$ from a Farey series (see
45     Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}), or
46     we may choose a ratio $h/2^q$ so that the division in (\ref{eq:crat0:sint0:01})
47     can be implemented
48     by a bitwise right shift.
49    
50     Section \ref{crat0:srla0} discusses linear rational approximation
51     in general, with a special eye on error analysis.
52    
53     Section \ref{crat0:spwi0} discusses piecewise linear rational approximation,
54     which is the approximation of a curve or complex mapping by a
55     number of joined line segments.
56    
57     Section \ref{crat0:sfdv0} discusses frequency division and rational counting.
58     Such techniques share the same mathematical framework as rational linear
59     approximation, and as with rational linear approximation the ratio
60     involved may be chosen from a Farey series or with a denominator of $2^q$, depending
61     on the algorithm employed.
62    
63     Section \ref{crat0:sbla0} discusses Bresenham's classic line algorithm,
64     which is a practical application of rational linear approximation.
65    
66     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
67     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
68     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
69     \section{Rational Linear Approximation}
70     %Section tag: RLA0
71     \label{crat0:srla0}
72    
73     It occurs frequently in embedded software design that one wishes to
74     implement a linear scaling from a domain to a range of the form
75    
76     \begin{equation}
77     \label{eq:crat0:srla0:01}
78     f(x) = r_I x ,
79     \end{equation}
80    
81     \noindent{}where $r_I$ is the \emph{ideal}
82    
83    
84     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
85     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
86     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
87     \subsection{Model Functions}
88     %Section tag: mfu0
89     \label{crat0:srla0:smfu0}
90    
91     In general, we seek to approximate the ideal function
92    
93    
94     \noindent{}by some less ideal function where
95    
96     \begin{itemize}
97     \item $r_A \neq r_I$, although we seek to choose $r_A \approx r_I$.
98     \item The input to the function, $x$, may already contain
99     quantization error.
100     \item Although $r_I x \in \vworkrealsetnonneg$, we must choose
101     an integer as the function output.
102     \end{itemize}
103    
104     In modeling quantization error, we use the floor function\index{floor function}
105     ($\lfloor\cdot\rfloor$)
106     for algebraic simplicity. The floor function precisely
107     describes the behavior of integer division instructions (where
108     remainders are discarded), but may not describe other sources of
109     quantization, such as quantization that occurs in A/D conversion.
110     However, techniques identical to those presented in this
111     section may be used when quantization is not best described
112     by the floor function, and these results are left to the reader.
113    
114     Traditionally, because addition of integers is an inexpensive
115     machine operation, a parameter $z \in \vworkintset$ may optionally
116     be added to the product $hx$ in order to round or otherwise
117     shift the result.
118    
119     If $x$ is assumed to be without error, the ideal function is
120     given by (\ref{eq:crat0:srla0:smfu0:01}), whereas the function
121     that can be economically implemented is
122    
123     \begin{equation}
124     \label{eq:crat0:srla0:smfu0:02}
125     g(x) = \left\lfloor \frac{hx + z}{k} \right\rfloor
126     =
127     \left\lfloor r_A x + \frac{z}{k} \right\rfloor .
128     \end{equation}
129    
130     If, on the other hand, $x$ may be already quantized,
131     the function that can actually be implemented is
132    
133     \begin{equation}
134     \label{eq:crat0:srla0:smfu0:03}
135     h(x) = \left\lfloor \frac{h \lfloor x \rfloor + z}{k} \right\rfloor
136     =
137     \left\lfloor r_A \lfloor x \rfloor + \frac{z}{k} \right\rfloor .
138     \end{equation}
139    
140    
141    
142     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
143     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
144     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
145     \section[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation]
146     {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
147     %Section tag: HQQ0
148     \label{crat0:shqq0}
149    
150     \index{h/2q@$h/2^q$ rational linear approximation}
151     \index{rational linear approximation!h/2q@$h/2^q$}
152     The algorithms presented in
153     Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}
154     will always provide the rational number $h/k$ closest to
155     an arbitrary real number $r_I$ subject to the constraints
156     $h \leq h_{MAX}$ and $k \leq k_{MAX}$.
157    
158     However, because shifting in order
159     to implement multiplication or division by a power of 2
160     is at least as fast (and often \emph{much} faster)
161     on all processors as arbitrary multiplication or division,
162     and because not all processors have multiplication and division instructions,
163     it is worthwhile to examine choosing $h/k$ so that either $h$ or $k$ are
164     powers of 2.
165    
166     There are thus three rational linear approximation techniques to be
167     examined:
168    
169     \begin{enumerate}
170     \item \emph{$h/k$ rational linear approximation}, in which an arbitrary
171     $h \leq h_{MAX}$ and an arbitrary $k \leq k_{MAX}$ are used,
172     with $r_A = h/k$. $h$ and $k$ can be chosen using the algorithms
173     presented in Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}.
174     Implementation of this technique would most often involve a single integer
175     multiplication instruction to form the product $hx$, followed by an optional single
176     addition instruction to form the sum $hx+z$, and then
177     followed by by a single division instruction
178     to form the quotient $\lfloor (hx+z)/k \rfloor$. Implementation may also less commonly involve
179     multiplication, addition, and division of operands too large to be processed
180     with single machine instructions.
181     \item \emph{$h/2^q$ rational linear approximation}, in which an arbitrary
182     $h \leq h_{MAX}$ and an integral power of two $k=2^q$ are used, with
183     $r_A = h/2^q$.
184     Implementation of this technique would most often involve a single integer
185     multiplication instruction to form the product $hx$, followed by an optional single
186     addition instruction to form the sum $hx+z$, and then
187     followed by right shift instruction(s)
188     to form the quotient $\lfloor (hx+z)/2^q \rfloor$. Implementation may also less commonly involve
189     multiplication, addition, and right shift of operands too large to be processed
190     with single machine instructions.
191     \item \emph{$2^q/k$ rational linear approximation}, in which an integral
192     power of two $h=2^q$ and an arbitrary $k \leq k_{MAX}$ are used, with
193     $r_A = 2^q/k$.
194     Implementation of this technique would most often involve left shift
195     instruction(s) to form the product $2^qx$, followed by an optional single
196     addition instruction to form the sum $2^qx+z$, and then
197     followed by a single division instruction to form
198     the quotient $\lfloor (2^qx+z)/k \rfloor$. Implementation may also less
199     commonly involve
200     left shift, addition, and division of operands too large to be processed
201     with single machine instructions.
202     \end{enumerate}
203    
204     We use the nomenclature ``\emph{$h/k$ rational linear approximation}'',
205     ``\emph{$h/2^q$ rational linear approximation}'', and
206     ``\emph{$2^q/k$ rational linear approximation}'' to identify the three
207     techniques enumerated above.
208    
209    
210     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
211     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
212     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
213     \subsection{Integer Arithmetic and Processor Instruction Set Characteristics}
214     %Subsection tag: pis0
215     \label{crat0:shqq0:pis0}
216    
217     The following observations about integer arithmetic and about processors
218     used in embedded control can be made:
219    
220     \begin{enumerate}
221     \item \label{enum:crat0:shqq0:pis0:01:01a}
222     \emph{Shifting is the fastest method of integer multiplication or division
223     (by $2^q$ only),
224     followed by utilization of the processor multiplication or division instructions (for arbitrary
225     operands),
226     followed by software implementation of multiplication or division (for arbitrary operands).}
227     Relative costs vary depending on the processor, but the monotonic
228     ordering always holds.
229     $h/2^q$ and $2^q/k$ rational linear
230     approximation are thus worthy of investigation. (Note also that in many practical
231     applications of $h/2^q$ and $2^q/k$ rational linear approximation,
232     the required shift is performed by
233     addressing the operand with an offset,
234     and so has no cost.)
235     \item \label{enum:crat0:shqq0:pis0:01:01b}
236     \emph{Shifting is $O(N)$ (where $N$ is the number of bits in the argument),
237     but both
238     multiplication and division are $O(N^2)$ for
239     practical\footnote{\index{Karatsuba multiplication}Karatsuba
240     multiplication, for example, is
241     $O(N^{\log_2 3}) \approx O(N^{1.58}) \ll O(N^2)$. However, Karatsuba
242     multiplication cannot be applied economically to the small
243     operands that typically occur in embedded control work. It would
244     be rare in embedded control applications
245     for the length of a multiplication operand to exceed four
246     times the length that is accommodated by a machine instruction; and this
247     is far below the threshold at which Karatsuba multiplication is
248     economical. Thus, for all intents and purposes in embedded control work,
249     multiplication is $O(N^2)$.} operands (where
250     $N$ is the number of bits in each
251     operand).} It follows that $2^q/k$ and $h/2^q$ rational
252     linear approximation
253     will scale to large operands better than $h/k$ rational linear approximation.
254     \item \label{enum:crat0:shqq0:pis0:01:02a}
255     \emph{Integer division instructions take as long or longer than
256     integer multiplication instructions.} In designing digital logic
257     to implement basic integer arithmetic, division is the operation most difficult
258     to perform economically.\footnote{For some processors, the penalty is extreme.
259     For example, on the NEC V850 (a RISC processor),
260     a division requires 36 clock cycles,
261     whereas multiplication, addition, and subtraction each effectively
262     require 1 clock cycle.}
263     It follows that multiplication using operands that exceed the machine's word size
264     is often far less expensive than division using operands that exceed the
265     machine's word size.
266     \item \label{enum:crat0:shqq0:pis0:01:03a}
267     \emph{All processors that have an integer division instruction also
268     have an integer multiplication instruction.}
269     Phrased
270     differently, no processor has an integer division instruction but no
271     integer multiplication instruction.
272     \end{enumerate}
273    
274     Enumerated items
275     (\ref{enum:crat0:shqq0:pis0:01:01a}) through
276     (\ref{enum:crat0:shqq0:pis0:01:03a}) above lead to the following conclusions.
277    
278     \begin{enumerate}
279     \item $h/2^q$ rational linear approximation is likely to be implementable
280     more efficiently on most processors than $h/k$ rational linear approximation.
281     (\emph{Rationale:} shift instruction(s) or accessing a
282     memory address with an offset
283     is
284     likely to be more economical than division, particularly if $k$ would exceed
285     the native
286     operand size of the processor.)
287     \item $h/2^q$ rational linear approximation is likely to be a more useful
288     technique than $2^q/k$ rational linear approximation.
289     (\emph{Rationale:} the generally high cost of division compared to
290     multiplication, and the existence of processors that possess a multiplication
291     instruction but no division instruction.)
292     \end{enumerate}
293    
294    
295     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
296     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
297     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
298     \subsection[Design Procedure For \protect\mbox{\protect$h/2^q$} Rational Linear Approximations]
299     {Design Procedure For \protect\mbox{\protect\boldmath$h/2^q$} Rational Linear Approximation}
300     %Subsection tag: dph0
301     \label{crat0:shqq0:dph0}
302    
303     An $h/2^q$ rational linear approximation is parameterized by:
304    
305     \begin{itemize}
306     \item The unsigned or signed nature of $h$ and $x$. (Rational linear approximations
307     may involve either signed or unsigned domains and ranges. Furthermore,
308     signed integers may be maintained using either 2's-complement
309     or sign-magnitude representation, and the processor instruction set
310     may or may not directly support signed multiplication.)
311     \item $r_I$, the real number we wish to approximate by $r_A = h/2^q$.
312     \item $x_{MAX}$, the maximum possible value of the input argument $x$. (Typically,
313     software contains a test to clip the output if $x > x_{MAX}$.)
314     \item $w_h$, the width in bits allowed for $h$. (Typically, $w_h$ is
315     the maximum operand size of a machine multiplication instruction.)
316     \item $w_r$, the width in bits allowed for the result $hx$. (Typically,
317     $w_r$ is the maximum result size of a machine multiplication instruction.)
318     \item The rounding mode when choosing $h$ (and thus effectively $r_A$)
319     based on $r_I$. It is common to choose the
320     closest value,
321     $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$
322     or
323     $r_A=\lceil r_I 2^q - 1/2 \rceil/2^q$,
324     but other choices are possible.
325     \item The rounding mode for the result (i.e. the choice of $z$ in
326     Eq. \ref{eq:crat0:sint0:01}).
327     \end{itemize}
328    
329     This section develops a design procedure for $h/2^q$ rational linear
330     approximations with the most typical set of assumptions: unsigned arithmetic,
331     $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$,
332     and $z=0$. Design procedures for other scenarios are presented as exercises.
333    
334     By definition, $h$ is constrained in two ways:
335    
336     \begin{equation}
337     \label{eq:crat0:shqq0:dph0:00}
338     h \leq 2^{w_h} - 1
339     \end{equation}
340    
341     \noindent{}and
342    
343     \begin{equation}
344     \label{eq:crat0:shqq0:dph0:01}
345     h \leq \frac{2^{w_r} - 1}{x_{MAX}} .
346     \end{equation}
347    
348     \noindent{}(\ref{eq:crat0:shqq0:dph0:00}) comes directly from the
349     requirement that $h$ fit in $w_h$ bits.
350     (\ref{eq:crat0:shqq0:dph0:01}) comes directly from the requirement
351     that $hx$ fit in $w_r$ bits.
352     (\ref{eq:crat0:shqq0:dph0:00}) and (\ref{eq:crat0:shqq0:dph0:01})
353     may be combined to form one inequality:
354    
355     \begin{equation}
356     \label{eq:crat0:shqq0:dph0:02}
357     h \leq \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) .
358     \end{equation}
359    
360     If $q$ is known, the choice of $h$ that will be made so as to minimize
361     $|r_A-r_I| = |h/2^q - r_I|$ is
362    
363     \begin{equation}
364     \label{eq:crat0:shqq0:dph0:03}
365     h=\left\lfloor r_I 2^q + \frac{1}{2} \right\rfloor .
366     \end{equation}
367    
368     \noindent{}It is required that the choice of $h$ specified by
369     (\ref{eq:crat0:shqq0:dph0:03}) meet
370     (\ref{eq:crat0:shqq0:dph0:02}). Making the most pessimistic
371     assumption about the rounding of $h$ and substituting into
372     (\ref{eq:crat0:shqq0:dph0:02}) leads to
373    
374     \begin{equation}
375     \label{eq:crat0:shqq0:dph0:04}
376     r_I 2^q + \frac{1}{2}
377     \leq
378     \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) .
379     \end{equation}
380    
381     \noindent{}Isolating $q$ in (\ref{eq:crat0:shqq0:dph0:04})
382     yields
383    
384     \begin{equation}
385     \label{eq:crat0:shqq0:dph0:05}
386     2^q
387     \leq
388     \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}
389     {r_I}.
390     \end{equation}
391    
392     \noindent{}Solving
393     (\ref{eq:crat0:shqq0:dph0:05})
394     for maximum value of $q$ that meets the constraint yields
395    
396     \begin{equation}
397     \label{eq:crat0:shqq0:dph0:06}
398     q=
399     \left\lfloor
400     {
401     \log_2
402     \left(
403     {
404     \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I}
405     }
406     \right)
407     }
408     \right\rfloor .
409     \end{equation}
410    
411     \noindent{}(\ref{eq:crat0:shqq0:dph0:06})
412     can be rewritten for easier calculation using most calculators (which do
413     not allow the direct evaluation of base-2 logarithms):
414    
415     \begin{equation}
416     \label{eq:crat0:shqq0:dph0:07}
417     q=
418     \left\lfloor
419     \frac
420     {
421     {
422     \ln
423     \left(
424     {
425     \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I}
426     }
427     \right)
428     }
429     }
430     {\ln 2}
431     \right\rfloor .
432     \end{equation}
433    
434     \noindent{}Once $q$ is established using (\ref{eq:crat0:shqq0:dph0:07}),
435     $h$ can be calculated using (\ref{eq:crat0:shqq0:dph0:03}).
436    
437     In embedded control work (as well as in operating system internals),
438     $h/2^q$ rational linear approximations are often used in conjunction with
439     tabulated constants or calibratable parameters
440     where each constant or calibratable parameter may vary over a range of
441     $[0, r_I]$, and where $r_I$ is the value used in the design procedure
442     presented above. In these applications, the values of $h$ are
443     tabulated, but $q$ is invariant (usually hard-coded)
444     and is chosen at design time based on the upper bound $r_I$
445     of the interval $[0, r_I]$ in which each tabulated constant or calibratable
446     parameter will fall. With $q$ fixed,
447     $r_A$ can be adjusted in steps of $1/2^q$.
448    
449     If $r_I$ is invariant, a final design step may be to reduce the rational
450     number $h/2^q$ by dividing some or all occurrences of 2 as a factor from both the
451     numerator and denominator. With some processors and in some applications, this
452     may save execution time by reducing the number of shift instructions that
453     must be executed, reducing the execution time of the shift instructions
454     that are executed, or allowing shifting via offset addressing.
455     For example, on a byte-addressible machine, if the design procedure
456     yields $h=608$ and $q=10$, it may be desirable to divide both $h$ and $2^q$ by 4 to
457     yield $h=152$ and $q=8$, as this allows the shift by 8 to be done by fetching
458     alternate bytes (rather than by actual shifting). In other applications, it may
459     be desirable to remove \emph{all} occurrences of 2 as a prime factor
460     from $h$.
461    
462     For an invariant $r_I$, a suitable design procedure is:
463    
464     \begin{enumerate}
465     \item Choose $q$ using (\ref{eq:crat0:shqq0:dph0:07}).
466     \item With $q$ fixed, choose $h$ using (\ref{eq:crat0:shqq0:dph0:03}).
467     \item If economies can be achieved on the target processor,
468     examine the possibility of removing some or all occurrences
469     of 2 as a prime factor from $h$ and decreasing $q$.
470     \end{enumerate}
471    
472     For tabulated or calibratable constants in the
473     interval $[0,r_I]$, a suitable design procedure is to use the
474     procedure presented immediately above but without the third step.
475     Each tabulated value of $h$ is chosen using (\ref{eq:crat0:shqq0:dph0:03}).
476    
477     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
478     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
479     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
480     \subsection[Design Procedure For \protect\mbox{\protect$2^q/k$} Rational Linear Approximations]
481     {Design Procedure For \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
482     %Subsection tag: dpk0
483     \label{crat0:shqq0:dpk0}
484    
485     TBD.
486    
487     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
488     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
489     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
490     \section{Piecewise Rational Linear Approximation}
491     %Section tag: PWI0
492     \label{crat0:spwi0}
493    
494     TBD.
495    
496     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
497     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
498     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
499     \section[Frequency Division And Rational Counting]
500     {Frequency Division And Rational Counting Techniques}
501     %Section tag: FDV0
502     \label{crat0:sfdv0}
503    
504     \index{frequency division}\index{rational counting}\index{counting}%
505     Often, software must ``divide down'' an execution rate. For example,
506     an interrupt service routine may be scheduled by hardware every
507     10ms, but may perform useful processing only every 50ms. This requires
508     that the ISR maintain a counter and only perform useful processing
509     every fifth invocation. This section deals with counting strategies
510     used to achieve invocation frequency division and other similar results.
511    
512     Frequency division and
513     rational counting techniques presented in this section find application
514     primarily in the following scenarios:
515    
516     \begin{itemize}
517     \item ISRs and other software components which must divide down
518     their invocation rate.
519     \item Pulse counting and scaling from encoders and other
520     similar systems.
521     \item The correction of inaccuracies in timebases (such as crystals
522     which oscillate at a frequency different than the
523     nominal rate).
524     \end{itemize}
525    
526     Because the techniques presented must be usable with inexpensive
527     microcontrollers, such techniques must meet these constraints:
528    
529     \begin{enumerate}
530     \item \label{enum:01:crat0:sfdv0:econex}
531     The counting techniques must be economical to execute on
532     an inexpensive microcontroller.
533     \item \label{enum:01:crat0:sfdv0:econcccalc}
534     An inexpensive microcontroller must be capable of calculating any
535     constants used as limits in counting (i.e. it cannot necessarily
536     be assumed that a more powerful computer calculates these constants,
537     and it cannot be assumed that these limits do not change on the fly).
538     \end{enumerate}
539    
540     In this section, we analyze the behavior of several types of
541     rational counting algorithms, supplied as Algorithms
542     \ref{alg:crat0:sfdv0:01a}
543     through
544     \ref{alg:crat0:sfdv0:02a}.
545    
546     \begin{algorithm}
547     \begin{verbatim}
548     /* The constants K1 through K4, which parameterize the */
549     /* counting behavior, are assumed assigned elsewhere in */
550     /* the code. The solution is analyzed in terms of the */
551     /* parameters K1 through K4. */
552     /* */
553     /* We also place the following restrictions on K1 through */
554     /* K4: */
555     /* K1 : K1 <= K3 - K2. */
556     /* K2 : K4 > K2 > 0. */
557     /* K3 : No restrictions. */
558     /* K4 : K4 > K2 > 0. */
559    
560     void base_rate_sub(void)
561     {
562     static int state = K1;
563    
564     state += K2;
565    
566     if (state >= K3)
567     {
568     state -= K4;
569     A();
570     }
571     }
572     \end{verbatim}
573     \caption{Rational Counting Algorithm For $K_2/K_4 < 1$}
574     \label{alg:crat0:sfdv0:01a}
575     \end{algorithm}
576    
577     \begin{algorithm}
578     \begin{verbatim}
579     /* The constants K1 through K4, which parameterize the */
580     /* counting behavior, are assumed assigned elsewhere in */
581     /* the code. The solution is analyzed in terms of the */
582     /* parameters K1 through K4. */
583     /* */
584     /* We also place the following restrictions on K1 through */
585     /* K4: */
586     /* K1 : K1 <= K3 - K2. */
587     /* K2 : K2 > 0. */
588     /* K3 : No restrictions. */
589     /* K4 : K4 > 0. */
590    
591     void base_rate_sub(void)
592     {
593     static int state = K1;
594    
595     state += K2;
596    
597     while (state >= K3)
598     {
599     state -= K4;
600     A();
601     }
602     }
603     \end{verbatim}
604     \caption{Rational Counting Algorithm For $K_2/K_4 \geq 1$}
605     \label{alg:crat0:sfdv0:01b}
606     \end{algorithm}
607    
608     \begin{algorithm}
609     \begin{verbatim}
610     /* The constants K1, K2, and K4, which parameterize the */
611     /* counting behavior, are assumed assigned elsewhere in */
612     /* the code. The solution is analyzed in terms of the */
613     /* parameters K1 through K4. */
614     /* */
615     /* We also place the following restrictions on K1, K2, */
616     /* and K4: */
617     /* K1 : K1 >= 0. */
618     /* K2 : K4 > K2 > 0. */
619     /* K4 : K4 > K2 > 0. */
620     /* */
621     /* Special thanks to Chuck B. Falconer (of the */
622     /* comp.arch.embedded newsgroup) for this rational */
623     /* counting algorithm. */
624     /* */
625     /* Note below that the test against K3 does not exist, */
626     /* instead a test against zero is used, which many */
627     /* machine instruction sets will do as part of the */
628     /* subtraction (but perhaps this needs to be coded in */
629     /* A/L). This saves machine code and also eliminates */
630     /* one unnecessary degree of freedom (K3). */
631    
632     void base_rate_sub(void)
633     {
634     static int state = K1;
635    
636     if ((state -= K2) < 0)
637     {
638     state += K4;
639     A();
640     }
641     }
642     \end{verbatim}
643     \caption{Zero-Test Rational Counting Algorithm For $K_2/K_4 < 1$}
644     \label{alg:crat0:sfdv0:01c}
645     \end{algorithm}
646    
647     \begin{algorithm}
648     \begin{verbatim}
649     ;Special thanks to John Larkin (of the comp.arch.embedded
650     ;newsgroup) for this rational counting algorithm.
651     ;
652     ;This is the TMS-370C8 assembly-language version of the
653     ;algorithm. The algorithm is parameterized solely by
654     ;K1 and K2, with no restrictions on their values, because
655     ;the values are naturally constrained by the data types.
656     ;K1, which is the initial value of "state", is assumed
657     ;assigned elsewhere. The snippet shown here uses only
658     ;K2.
659     MOV state, A ;Get "state".
660     ADD #K2, A ;Increase by K2. Carry flag
661     ;will be set if rollover to or
662     ;past zero.
663     PUSH ST ;Save carry flag.
664     MOV A, state ;Move new value back.
665     POP ST ;Restore carry flag.
666     JNC done ;If didn't roll, don't run sub.
667     CALL A_SUBROUTINE ;Run sub.
668     done:
669    
670     /* This is the 'C' version of the algorithm. It is not */
671     /* as easy or efficient in 'C' to detect rollover. */
672    
673     void base_rate_sub(void)
674     {
675     static unsigned int state = K1;
676     unsigned int old_state;
677    
678     old_state = state;
679     state += K2;
680     if (state < old_state)
681     {
682     A();
683     }
684     }
685     \end{verbatim}
686     \caption{$2^q$ Rollover Rational Counting Algorithm}
687     \label{alg:crat0:sfdv0:01d}
688     \end{algorithm}
689    
690     \begin{algorithm}
691     \begin{verbatim}
692     /* The constants K1 through K4, which parameterize the */
693     /* counting behavior, are assumed assigned elsewhere in */
694     /* the code. The solution is analyzed in terms of the */
695     /* parameters K1 through K4. */
696     /* */
697     /* We also place the following restrictions on K1 through */
698     /* K4: */
699     /* K1 : K1 <= K3. */
700     /* K2 : K2 > 0. */
701     /* K3 : No restrictions. */
702     /* K4 : K4 > 0. */
703    
704     void base_rate_sub(void)
705     {
706     static unsigned int state = K1;
707    
708     if (state >= K3)
709     {
710     state -= K4;
711     A();
712     }
713     else
714     {
715     state += K2;
716     B();
717     }
718     }
719     \end{verbatim}
720     \caption{Rational Counting Algorithm With \texttt{else} Clause}
721     \label{alg:crat0:sfdv0:02a}
722     \end{algorithm}
723    
724    
725     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
726     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
727     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
728     \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01a}]
729     {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01a}}
730     %Section tag: PRC0
731     \label{crat0:sfdv0:sprc0}
732    
733     Algorithm \ref{alg:crat0:sfdv0:01a}
734     is used frequently in microcontroller
735     software. A base rate subroutine\footnote{For brevity, we usually
736     call this just the \emph{base subroutine}.} (named ``\texttt{base\_rate\_sub()}''
737     in the algorithm) is called at a periodic rate, and subroutine
738     ``\texttt{A()}'' is called at a lesser rate.
739     We are interested in determining the relationships between the rates
740     as a function of $K_1$, $K_2$, $K_3$, and $K_4$; and we are interested
741     in developing other properties.
742    
743     Notationally when analyzing rational counting algorithms, we agree
744     that $state_n$ denotes the value of the \texttt{state} variable
745     after the $n$th invocation and before the $n+1$'th invocation
746     of the base rate subroutine.
747     Using this convention with Algorithm \ref{alg:crat0:sfdv0:01a},
748     $state_0 = K_1$.\footnote{Algorithm \ref{alg:crat0:sfdv0:01a}
749     requires a knowledge of
750     `C' to fully understand. The \texttt{static} keyword ensures that the
751     variable \texttt{state} is initialized only once, at the time the program
752     is loaded. \texttt{state} is \emph{not} initialized each time the
753     base subroutine runs.}
754    
755     We can first easily derive the number of initial invocations of
756     the base subroutine before ``\texttt{A()}'' is called for the first
757     time.
758    
759     \begin{vworklemmastatement}
760     \label{lem:crat0:sfdv0:sprc0:01}
761     $N_{STARTUP}$, the number of invocations of the base subroutine
762     in Algorithm \ref{alg:crat0:sfdv0:01a} before ``\texttt{A()}'' is called
763     for the first time, is given by
764    
765     \begin{equation}
766     \label{eq:lem:crat0:sfdv0:sprc0:01:01}
767     N_{STARTUP} =
768     \left\lceil
769     {
770     \frac{-K_1 - K_2 + K_3}{K_2}
771     }
772     \right\rceil .
773     \end{equation}
774     \end{vworklemmastatement}
775     \begin{vworklemmaproof}
776     The value of \texttt{state} after the $n$th invocation
777     is $state_n = K_1 + n K_2$. In order for the test in the
778     \texttt{if()} statement not to be met, we require that
779    
780     \begin{equation}
781     \label{eq:lem:crat0:sfdv0:sprc0:01:02}
782     K_1 + n K_2 < K_3
783     \end{equation}
784    
785     \noindent{}or equivalently that
786    
787     \begin{equation}
788     \label{eq:lem:crat0:sfdv0:sprc0:01:03}
789     n < \frac{K_3 - K_1}{K_2} .
790     \end{equation}
791    
792     Solving (\ref{eq:lem:crat0:sfdv0:sprc0:01:03}) for the largest
793     value of $n \in \vworkintset$ which still meets the criterion
794     yields (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}). Note that
795     the derivation of (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}) requires
796     that the restrictions on $K_1$ through $K_4$ documented in
797     Algorithm \ref{alg:crat0:sfdv0:01a} be met.
798     \end{vworklemmaproof}
799     \begin{vworklemmaparsection}{Remarks}
800     Note that if one chooses $K_1 > K_3 - K_2$ (in contradiction to the
801     restrictions in Algorithm \ref{alg:crat0:sfdv0:01a}), it is possible
802     to devise a counting scheme (and results analogous to this lemma) where
803     ``\texttt{A()}'' is run a number of times before it is
804     \emph{not} run for the first time. The construction of an analogous
805     lemma is the topic of Exercise \ref{exe:crat0:sexe0:01}.
806     \end{vworklemmaparsection}
807    
808     \begin{vworklemmastatement}
809     \label{lem:crat0:sfdv0:sprc0:02}
810     Let $N_I$ be the number of times the Algorithm
811     \ref{alg:crat0:sfdv0:01a} base subroutine
812     is called, let $N_O$ be the number of times the
813     ``\texttt{A()}'' subroutine is called, let
814     $f_I$ be the frequency of invocation of the
815     Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let
816     $f_O$ be the frequency of invocation of
817     ``\texttt{A()}''. Provided the constraints
818     on $K_1$ through $K_4$ documented in
819     Algorithm \ref{alg:crat0:sfdv0:01a} are met,
820    
821     \begin{equation}
822     \label{eq:lem:crat0:sfdv0:sprc0:02:01}
823     \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
824     =
825     \frac{f_O}{f_I}
826     =
827     \frac{K_2}{K_4} .
828     \end{equation}
829     \end{vworklemmastatement}
830     \begin{vworklemmaproof}
831     (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) indicates that once
832     the initial delay (determined by $K_1$ and $K_3$) has finished,
833     $N_O/N_I$ will converge on a steady-state value of
834     $K_2/K_4$.
835    
836     Assume that $K_1=0$ and $K_3=K_4$. The
837     conditional subtraction then calculates
838     $state \bmod K_4$. After the $n$th
839     invocation of the base subroutine, the value
840     of \texttt{state} will be
841    
842     \begin{equation}
843     \label{eq:lem:crat0:sfdv0:sprc0:02:02}
844     state_n|_{K_1=0, K_3=K_4} = n K_2 \bmod K_4 .
845     \end{equation}
846    
847     Assume that for two distinct values of
848     $n \in \vworkintsetnonneg$, $n_1$ and $n_2$,
849     the value of the \texttt{state} variable is the same:
850    
851     \begin{equation}
852     \label{eq:lem:crat0:sfdv0:sprc0:02:03}
853     n_1 K_2 \bmod K_4 = n_2 K_2 \bmod K_4.
854     \end{equation}
855    
856     Then
857    
858     \begin{equation}
859     \label{eq:lem:crat0:sfdv0:sprc0:02:04}
860     (n_2 - n_1) K_2 = i K_4, \; \exists i \in \vworkintsetpos .
861     \end{equation}
862    
863     However, we have no knowledge of whether $K_2$ and $K_4$ are
864     coprime (they are not required to be). We may rewrite
865     (\ref{eq:lem:crat0:sfdv0:sprc0:02:04}) equivalently as
866    
867     \begin{equation}
868     \label{eq:lem:crat0:sfdv0:sprc0:02:05}
869     (n_2 - n_1) \frac{K_2}{\gcd(K_2, K_4)} = i \frac{K_4}{\gcd(K_2, K_4)},
870     \; \exists i \in \vworkintsetpos
871     \end{equation}
872    
873     where of course by definition
874    
875     \begin{equation}
876     \label{eq:lem:crat0:sfdv0:sprc0:02:06}
877     \gcd \left( { \frac{K_2}{\gcd(K_2, K_4)}, \frac{K_4}{\gcd(K_2, K_4)} } \right) = 1.
878     \end{equation}
879    
880     In order to satisfy (\ref{eq:lem:crat0:sfdv0:sprc0:02:05}),
881     $n_2 - n_1$ must contain all of the prime factors of
882     $K_4/\gcd(K_2,K_4)$ in at least the same multiplicities,
883     and it follows that the set of values
884     of $n_2-n_1$ that satisfies
885     (\ref{eq:lem:crat0:sfdv0:sprc0:02:03}) is
886     precisely the set of multiples of $K_4/\gcd(K_2,K_4)$:
887    
888     \begin{equation}
889     \label{eq:lem:crat0:sfdv0:sprc0:02:07}
890     n_2 - n_1 = j \frac{K_4}{\gcd(K_2, K_4)}, \; \exists j \in \vworkintsetpos .
891     \end{equation}
892    
893     Examining (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}), it can
894     also be seen that
895    
896     \begin{equation}
897     \label{eq:lem:crat0:sfdv0:sprc0:02:08}
898     \gcd(K_2, K_4) \vworkdivides (n K_2 \bmod K_4),
899     \end{equation}
900    
901     and so
902    
903     \begin{eqnarray}
904     \label{eq:lem:crat0:sfdv0:sprc0:02:09}
905     & n K_2 \bmod K_4 \in & \\
906     \nonumber
907     & \{ 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , K_4 - \gcd(K_2, K_4) \} , &
908     \end{eqnarray}
909    
910     a set which contains exactly $K_4/\gcd(K_2, K_4)$ elements.
911    
912     Thus we've established by the pigeonhole principle
913     that the sequence of the
914     values of the variable \texttt{state}
915     specified by (\ref{eq:lem:crat0:sfdv0:sprc0:02:02})
916     repeats perfectly with periodicity $K_4/\gcd(K_2, K_4)$,
917     and we've established that in one period, every element of the set
918     specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) appears exactly
919     once. (However, we have not specified the order in which the
920     elements appear, but this is not important for this lemma. In general
921     the elements appear out of the order shown in
922     Eq. \ref{eq:lem:crat0:sfdv0:sprc0:02:09}.)
923    
924     To establish the frequency with which the test against
925     $K_4$ is met, note that if $state_n + K_2 \geq K_4$, then
926    
927     \begin{eqnarray}
928     \label{eq:lem:crat0:sfdv0:sprc0:02:10}
929     & \displaystyle{state_n \in \left\{ \frac{K_4-K_2}{\gcd(K_2,K_4)} \gcd(K_2, K_4), \right.} & \\
930     \nonumber & \displaystyle{\left. \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4), \ldots ,
931     K_4 - \gcd(K_2, K_4)\right\} ,} &
932     \end{eqnarray}
933    
934     which has a cardinality $K_2/K_4$ that of the set in
935     (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Since the
936     \texttt{state} variable cycles through the set in
937     (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) with perfect periodicity and since
938     $K_2/K_4$ of the set elements lead to the \texttt{if()} statement
939     test being
940     met, (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) is also met as
941     $N_I\rightarrow\infty$.
942    
943     Note that if $K_1 \neq 0$, it simply changes the startup
944     behavior of the rational counting. So long as $K_2 < K_4$,
945     Algorithm \ref{alg:crat0:sfdv0:01a} will reach a steady state where
946     (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) holds.
947     Note that if $K_3 \neq K_4$, it simply ``shifts'' the sets
948     specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09})
949     and (\ref{eq:lem:crat0:sfdv0:sprc0:02:10}), but
950     (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) still holds.
951     The lemma has thus been proved
952     for every case. (We have neglected to give
953     the formal proof as required by the definition of a limit that
954     for any arbitrarily small error $\epsilon$, a
955     finite $N_I$ can be found so that
956     the error is at or below $\epsilon$; however the skeptical reader
957     is encouraged to complete Exercise \ref{exe:crat0:sexe0:02}.)
958     \end{vworklemmaproof}
959     \begin{vworklemmaparsection}{Remarks}
960     It is possible to view the long-term accuracy of
961     Algorithm \ref{alg:crat0:sfdv0:01a} in terms of a limit, as is done in
962     (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}). However, it is also
963     possible to observe that $K_1$ and $K_3$ set a delay until
964     the counting algorithm reaches steady state.
965     With $K_3=K_4$, the attainment of
966     steady state is characterized by the \texttt{state} variable
967     being assigned for the first time to one of the values in
968     (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Once in steady state,
969     the algorithm cycles with perfect periodic behavior through all of the
970     $K_4/\gcd(K_2,K_4)$ elements in
971     (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}), but not necessarily in
972     the order shown in the equation.
973     During this period of length $K_4/\gcd(K_2,K_4)$,
974     exactly $K_2/\gcd(K_2,K_4)$ invocations of the base
975     subroutine result in
976     subroutine ``\texttt{A()}'' being run, and exactly
977     $(K_4-K_2)/\gcd(K_2,K_4)$ do not. Thus, after reaching steady-state the
978     algorithm has \emph{perfect} accuracy if one considers periods of
979     length $K_4/\gcd(K_2,K_4)$.
980     \end{vworklemmaparsection}
981     %\vworklemmafooter{}
982    
983     \begin{vworklemmastatement}
984     \label{lem:crat0:sfdv0:sprc0:04}
985     If $K_3=K_4$, $K_1=0$, and
986     $\gcd(K_2, K_4)=1$\footnote{\label{footnote:lem:crat0:sfdv0:sprc0:04:01}If
987     $\gcd(K_2, K_4) > 1$, then by Theorem
988     \cprizeroxrefhyphen\ref{thm:cpri0:ppn0:00a} the largest
989     value that $n K_2 \bmod K_4$ can attain is
990     $K_4-\gcd(K_2, K_4)$ and the interval in
991     (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is correspondingly
992     smaller. (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is
993     technically correct but not as conservative as possible.
994     This is a minor point and we do not dwell on it.}, the error between
995     the approximation to $N_O$ implemented by
996     Algorithm \ref{alg:crat0:sfdv0:01a} and the ``ideal'' mapping is always
997     in the set
998    
999     \begin{equation}
1000     \label{eq:lem:crat0:sfdv0:sprc0:04:01}
1001     \left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
1002     \end{equation}
1003    
1004     and no algorithm can be constructed to
1005     confine the error to a smaller interval.
1006     \end{vworklemmastatement}
1007     \begin{vworklemmaproof}
1008     With $K_1=0$ and $K_3 = K_4$, it can be verified analytically that
1009     the total number of times the function ``\texttt{A()}'' has been
1010     invoked up to and including the $n$th invocation of the base subroutine
1011     is
1012    
1013     \begin{equation}
1014     \label{eq:lem:crat0:sfdv0:sprc0:04:02}
1015     N_O = \left\lfloor \frac{n K_2}{K_4} \right\rfloor .
1016     \end{equation}
1017    
1018     On the other hand, the ``ideal'' number of invocations, which
1019     we denote $\overline{N_O}$, is given by
1020    
1021     \begin{equation}
1022     \label{eq:lem:crat0:sfdv0:sprc0:04:03}
1023     \overline{N_O} = \frac{n K_2}{K_4} .
1024     \end{equation}
1025    
1026     Quantization of the rational number in (\ref{eq:lem:crat0:sfdv0:sprc0:04:02})
1027     can introduce an error of up to $-(K_4-1)/K_4$, therefore
1028    
1029     \begin{equation}
1030     \label{eq:lem:crat0:sfdv0:sprc0:04:04}
1031     N_O - \overline{N_O} =
1032     \left\lfloor \frac{n K_2}{K_4} \right\rfloor - \frac{n K_2}{K_4}
1033     \in \left[ - \frac{K_4 - 1}{K_4} , 0 \right] .
1034     \end{equation}
1035    
1036     This proves the error bound for Algorithm \ref{alg:crat0:sfdv0:01a}.
1037     The proof that there can be no better algorithm is the topic
1038     of Exercise \ref{exe:crat0:sexe0:06}.
1039     \end{vworklemmaproof}
1040     \begin{vworklemmaparsection}{Remarks}
1041     Algorithm \ref{alg:crat0:sfdv0:01a} is \emph{optimal} in the
1042     sense that no algorithm can achieve a tighter error
1043     bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). As
1044     demonstrated in Exercises \ref{exe:crat0:sexe0:04}
1045     and \ref{exe:crat0:sexe0:05}, $K_1 \neq 0$ can be chosen
1046     to shift the interval in (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}), but
1047     the span of the interval cannot be reduced.
1048     \end{vworklemmaparsection}
1049     \vworklemmafooter{}
1050    
1051     Lemmas \ref{lem:crat0:sfdv0:sprc0:02}
1052     and \ref{lem:crat0:sfdv0:sprc0:04} have demonstrated that the ratio of
1053     counts $N_O/N_I$ will asymptotically
1054     approach $K_2/K_4$
1055     (i.e. the long-term accuracy of Algorithm \ref{alg:crat0:sfdv0:01a}
1056     is \emph{perfect}).
1057     However,
1058     for many applications it is also desirable to have a lack of
1059     ``bursty'' behavior. We demonstrate the lack of bursty
1060     behavior in the following lemma.
1061    
1062     \begin{vworklemmastatement}
1063     \label{lem:crat0:sfdv0:sprc0:03}
1064     For Algorithm \ref{alg:crat0:sfdv0:01a}, once steady
1065     state has been achieved, the number of consecutive
1066     base subroutine invocations during which subroutine
1067     ``\texttt{A()}'' is executed is always in the set
1068    
1069     \begin{equation}
1070     \label{eq:lem:crat0:sfdv0:sprc0:03:01}
1071     \left\{
1072     \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor ,
1073     \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil
1074     \right\} \cap \vworkintsetpos,
1075     \end{equation}
1076    
1077     which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$,
1078     or two integers otherwise.
1079    
1080     Once steady state has been achieved, the number of
1081     consecutive base function invocations during which
1082     subroutine ``\texttt{A()}'' is not executed is
1083     always in the set
1084    
1085     \begin{equation}
1086     \label{eq:lem:crat0:sfdv0:sprc0:03:02}
1087     \left\{
1088     \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor ,
1089     \left\lceil \frac{K_4-K_2}{K_2} \right\rceil
1090     \right\} \cap \vworkintsetpos,
1091     \end{equation}
1092    
1093     which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$,
1094     or two integers otherwise.
1095     \end{vworklemmastatement}
1096     \begin{vworklemmaproof}
1097     As before in Lemma \ref{lem:crat0:sfdv0:sprc0:02}
1098     for convenience and without
1099     loss of generality, assume $K_3=K_4$ and
1100     $K_1=0$. Then after a transient period
1101     determined by $K_1$ and $K_3$, the \texttt{state}
1102     variable will be assigned one of the values in
1103     (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) and cycle through
1104     those values in an unestablished order but with perfect
1105     periodicity. To accomplish this proof, we must establish
1106     something about the order in which the \texttt{state} variable attains
1107     the values in the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}).
1108    
1109     We can partition the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09})
1110     into two sets; the set of values of \texttt{state} for which if the
1111     base subroutine is invoked with \texttt{state} in this set, subroutine
1112     ``\texttt{A()}'' will not be invoked (we call this set $\phi_1$),
1113     and the set of values of \texttt{state} for which if the
1114     base subroutine is invoked with \texttt{state} in this set, subroutine
1115     ``\texttt{A()}'' will be invoked (we call this set $\phi_2$).
1116     $\phi_1$ and $\phi_2$ are identified below.
1117    
1118     \begin{eqnarray}
1119     \label{eq:lem:crat0:sfdv0:sprc0:03:03}
1120     & \phi_1 = & \\
1121     \nonumber &
1122     \displaystyle{\left\{
1123     0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots ,
1124     \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} - 1 \right) \gcd(K_2, K_4)
1125     \right\}} &
1126     \end{eqnarray}
1127    
1128     \begin{eqnarray}
1129     \label{eq:lem:crat0:sfdv0:sprc0:03:04}
1130     & \displaystyle{
1131     \phi_2 = \left\{\left(\frac{K_4-K_2}{\gcd(K_2,K_4)}\right) \gcd(K_2, K_4),\right.} & \\
1132     \nonumber & \displaystyle{\left.
1133     \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4) ,
1134     \ldots ,
1135     K_4 - \gcd(K_2, K_4)
1136     \right\}} &
1137     \end{eqnarray}
1138    
1139     We can also make the following four additional useful observations
1140     about $\phi_1$ and $\phi_2$. Note that
1141     (\ref{eq:lem:crat0:sfdv0:sprc0:03:07}) and
1142     (\ref{eq:lem:crat0:sfdv0:sprc0:03:08}) become equality
1143     if $\gcd(K_2, K_4) = 1$.
1144    
1145     \begin{equation}
1146     \label{eq:lem:crat0:sfdv0:sprc0:03:05}
1147     n(\phi_1) = \frac{K_4 - K_2}{\gcd(K_2, K_4)}
1148     \end{equation}
1149    
1150     \begin{equation}
1151     \label{eq:lem:crat0:sfdv0:sprc0:03:06}
1152     n(\phi_2) = \frac{K_2}{\gcd(K_2, K_4)}
1153     \end{equation}
1154    
1155     \begin{equation}
1156     \label{eq:lem:crat0:sfdv0:sprc0:03:07}
1157     \phi_1 \subseteq \{ 0, 1, \ldots , K_4 - K_2 - 1 \}
1158     \end{equation}
1159    
1160     \begin{equation}
1161     \label{eq:lem:crat0:sfdv0:sprc0:03:08}
1162     \phi_2 \subseteq \{K_4 - K_2, \ldots , K_4 - 1 \}
1163     \end{equation}
1164    
1165     We first prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
1166     If $state_n \in \phi_2$ at the time the base function
1167     is invoked, then
1168     ``\texttt{A()}'' will be invoked. We also know that
1169     since $state_n \in \phi_2$, $state_n + K_2 \geq K_4$, so
1170    
1171     \begin{equation}
1172     \label{eq:lem:crat0:sfdv0:sprc0:03:09}
1173     state_{n+1} \;\; =|_{state_n \in \phi_2} \;\; state_n - (K_4 - K_2) .
1174     \end{equation}
1175    
1176     Thus so long as $state_n \in \phi_2$, $state_{n+1} < state_n$
1177     as specified above in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}).
1178     With each invocation of the base subroutine, \texttt{state} will
1179     ``walk downward'' through $\phi_2$. It can
1180     also be observed that when \texttt{state} drops below the smallest
1181     element of $\phi_2$, the next value of \texttt{state} will
1182     be in $\phi_1$.
1183    
1184     Note also that although the downward walk specified in
1185     (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}) walks downward in absolute steps
1186     of $K_4-K_2$, this corresponds to $(K_4-K_2) / \gcd(K_2, K_4)$
1187     \emph{elements} of $\phi_2$, since the elements of $\phi_2$ are
1188     separated by $\gcd(K_2, K_4)$.
1189    
1190     Given the ``downward walk'' specified in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}),
1191     the only question to be answered is how many consecutive values of
1192     \texttt{state}, separated by $K_4-K_2$ (or $(K_4-K_2)/\gcd(K_2, K_4)$ elements),
1193     can ``fit'' into
1194     $\phi_2$. Considering that $n(\phi_2) = K_2/\gcd(K_2, K_4)$
1195     (Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:06}) and that the
1196     downward step represents $(K_4-K_2)/\gcd(K_2, K_4)$ set elements,
1197     (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) comes immediately by
1198     a graphical argument.
1199    
1200     We now prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}).
1201     This can be proved using exactly the same arguments
1202     as for (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}), but
1203     considering the upward walk through $\phi_1$ rather
1204     than the downward walk through $\phi_2$.
1205    
1206     As with Lemma \ref{lem:crat0:sfdv0:sprc0:02},
1207     note that the choices of $K_1$ and $K_3$ do not
1208     materially affect the proof above. $K_1$ and
1209     $K_3$ only set a delay until the rational counting
1210     algorithm reaches steady state. $K_3$ only shifts
1211     the sets $\phi_1$ and $\phi_2$.
1212     \end{vworklemmaproof}
1213     \begin{vworklemmaparsection}{Remark \#1}
1214     This lemma proves an \emph{extremely} important property for the
1215     usability of Algorithm \ref{alg:crat0:sfdv0:01a}. It says that once
1216     steady state has been reached, the variability in the number of consecutive
1217     times ``\texttt{A()}'' is run or not run is at most one count.
1218     \end{vworklemmaparsection}
1219     \begin{vworklemmaparsection}{Remark \#2}
1220     It is probably also possible to construct a rational counting algorithm
1221     so that the number of consecutive times ``\texttt{A()}'' is run is constant,
1222     but the algorithm achieves long-term accuracy by varying only the number
1223     of consecutive times ``\texttt{A()}'' is not run (or vice-versa), but this
1224     is not done here.
1225     \end{vworklemmaparsection}
1226     \begin{vworklemmaparsection}{Remark \#3}
1227     There is no requirement that $K_2$ and $K_4$ be coprime. In fact, as
1228     demonstrated later, it may be advantageous to choose a large $K_2$ and
1229     $K_4$ to approximate a simple ratio so that very fine adjustments can be
1230     made. For example, if the ideal ratio is 1/2, it may be desirable
1231     in some applications to
1232     choose $K_2$=1,000 and $K_4$=2,000 so that fine adjustments can be made
1233     by slightly perturbing $K_2$ or $K_4$. One might adjust 1,000/2,000 downward
1234     to 999/2,000 or upward to 1,001/2,000 by modifying $K_2$
1235     (both very fine adjustments).
1236     \end{vworklemmaparsection}
1237     \begin{vworklemmaparsection}{Remark \#4}
1238     The most common choice of $K_1$ in practice is 0. If $K_1=0$ is chosen,
1239     it can be shown that the number of initial invocations of the
1240     base subroutine is in the set identified in
1241     (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
1242     (See Exercise \ref{exe:crat0:sexe0:07}.)
1243     \end{vworklemmaparsection}
1244     \vworklemmafooter{}
1245    
1246     For microcontroller work, it is considered
1247     a desirable property that software components be resilient
1248     to state upset
1249     (see Section \chgrzeroxrefhyphen\ref{chgr0:sdda0:srob0}).
1250     It can be observed that Algorithm \ref{alg:crat0:sfdv0:01a} will
1251     exhibit very anomalous behavior if \texttt{state} is upset to a very negative
1252     value. One possible correction to this shortcoming is illustrated
1253     in Figure \ref{fig:crat0:sfdv0:sprc0:01}. Other possible
1254     corrections are the topic of Exercise \ref{exe:crat0:sexe0:08}.
1255    
1256     \begin{figure}
1257     \begin{verbatim}
1258     /* The constants K1 through K4, which parameterize the */
1259     /* counting behavior, are assumed assigned elsewhere in */
1260     /* the code. The solution is analyzed in terms of the */
1261     /* parameters K1 through K4. */
1262     /* */
1263     /* We also place the following restrictions on K1 through */
1264     /* K4: */
1265     /* K1 : K1 <= K3 - K2. */
1266     /* K2 : K4 > K2 > 0. */
1267     /* K3 : No restrictions. */
1268     /* K4 : K4 > K2 > 0. */
1269    
1270     void base_rate_func(void)
1271     {
1272     static int state = K1;
1273    
1274     state += K2;
1275    
1276     if ((state < K1) || (state >= K3))
1277     {
1278     state -= K4;
1279     A();
1280     }
1281     }
1282     \end{verbatim}
1283     \caption{Algorithm \ref{alg:crat0:sfdv0:01a} With State Upset Shortcoming
1284     Corrected}
1285     \label{fig:crat0:sfdv0:sprc0:01}
1286     \end{figure}
1287    
1288     \begin{vworkexamplestatement}
1289     \label{ex:crat0:sfdv0:sprc0:01}
1290     Determine the behavior of Algorithm \ref{alg:crat0:sfdv0:01a} with
1291     $K_1=0$, $K_2=30$, and $K_3=K_4=50$.
1292     \end{vworkexamplestatement}
1293     \begin{vworkexampleparsection}{Solution}
1294     We first predict the behavior, and then trace the algorithm to
1295     verify whether the predictions are accurate.
1296    
1297     We make the following predictions:
1298    
1299     \begin{itemize}
1300     \item The steady state sequence of invocations of ``\texttt{A()}'' will
1301     be periodic with period
1302     $K_4/\gcd(K_2, K_4) = 50/10 = 5$, as described
1303     in Lemma \ref{lem:crat0:sfdv0:sprc0:02}.
1304     \item The number of initial invocations of the
1305     base subroutine in which ``\texttt{A()}''
1306     is not run will be
1307     $\lceil (K_4 - K_2) / K_2 \rceil = \lceil 2/3 \rceil = 1$,
1308     as described in Remark \#4 of
1309     Lemma \ref{lem:crat0:sfdv0:sprc0:03} and in the solution to
1310     Exercise \ref{exe:crat0:sexe0:07}.
1311     \item In steady state, the number of consecutive invocations of the
1312     base subroutine during which ``\texttt{A()}''
1313     is not executed will always be 1, as
1314     described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:02} of
1315     Lemma \ref{lem:crat0:sfdv0:sprc0:03}.
1316     (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:02}
1317     yields \
1318     $\{ \lfloor 20/30 \rfloor , \lceil 20/30 \rceil \} \cap \vworkintsetpos %
1319     = \{ 0,1 \} \cap \{1, 2, \ldots \} = \{ 1 \}$.)
1320     \item In steady state, the number of consecutive invocations of the
1321     base subroutine during which ``\texttt{A()}''
1322     is executed will always be 1 or 2, as
1323     described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:01} of
1324     Lemma \ref{lem:crat0:sfdv0:sprc0:03}.
1325     (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:01}
1326     yields \
1327     $\{ \lfloor 30/20 \rfloor , \lceil 30/20 \rceil \} \cap \vworkintsetpos %
1328     = \{ 1,2 \} \cap \{1, 2, \ldots \} = \{ 1,2 \}$.)
1329     \item The rational counting algorithm will have
1330     perfect long-term accuracy.
1331     \end{itemize}
1332    
1333     We can verify the predictions above by tracing the behavior of
1334     Algorithm \ref{alg:crat0:sfdv0:01a}. We adopt the convention
1335     that $A_n = 1$ if subroutine ``\texttt{A()}'' is invoked during
1336     the $n$th invocation of the base subroutine.
1337     Table \ref{tbl:crat0:sfdv0:sprc0:01}
1338     contains the results of tracing Algorithm \ref{alg:crat0:sfdv0:01a}
1339     with $K_1=0$, $K_2=30$, and $K_3=K_4=50$.
1340    
1341     \begin{table}
1342     \caption{Trace Of Algorithm \ref{alg:crat0:sfdv0:01a} With
1343     $K_1=0$, $K_2=30$, And $K_3=K_4=50$ (Example \ref{ex:crat0:sfdv0:sprc0:01})}
1344     \label{tbl:crat0:sfdv0:sprc0:01}
1345     \begin{center}
1346     \begin{tabular}{|c|c|c|}
1347     \hline
1348     Index ($n$) & $state_n$ & $A_n$ \\
1349     \hline
1350     \hline
1351     0 & 0 & N/A \\
1352     \hline
1353     1 & 30 & 0 \\
1354     \hline
1355     2 & 10 & 1 \\
1356     \hline
1357     3 & 40 & 0 \\
1358     \hline
1359     4 & 20 & 1 \\
1360     \hline
1361     5 & 0 & 1 \\
1362     \hline
1363     6 & 30 & 0 \\
1364     \hline
1365     7 & 10 & 1 \\
1366     \hline
1367     8 & 40 & 0 \\
1368     \hline
1369     9 & 20 & 1 \\
1370     \hline
1371     10 & 0 & 1 \\
1372     \hline
1373     \end{tabular}
1374     \end{center}
1375     \end{table}
1376    
1377     It can be verfied from the table that all of the
1378     predicted properties are exhibited by the
1379     algorithm.
1380     \end{vworkexampleparsection}
1381     \vworkexamplefooter{}
1382    
1383     A second characteristic of Algorithm \ref{alg:crat0:sfdv0:01a}
1384     that should be analyzed carefully is the behavior
1385     of the algorithm if parameters $K_2$ and $K_4$ are adjusted
1386     ``on the fly''. ``On-the-fly'' adjustment
1387     raises the following concerns. We assume for convenience
1388     that $K_1=0$ and $K_3=K_4$.
1389    
1390     \begin{enumerate}
1391     \item \label{enum:crat0:sfdv0:sprc0:01:01}
1392     \textbf{Critical section protocol:} if the
1393     rational counting algorithm is implemented in a process which
1394     is asynchronous to the process which desires to change
1395     $K_2$ and $K_4$, what precautions must be taken?
1396     \item \label{enum:crat0:sfdv0:sprc0:01:02}
1397     \textbf{Anomalous behavior:} will the rational
1398     counting algorithm behave in a \emph{very} unexpected way
1399     if $K_2$ and $K_4$ are changed on the fly?
1400     \item \label{enum:crat0:sfdv0:sprc0:01:03}
1401     \textbf{Preservation of accuracy:} even if the behavior
1402     exhibited is not \emph{extremely} anomalous, how should
1403     $K_2$ and $K_4$ be modified on the fly so as to preserve the
1404     maximum accuracy?
1405     \end{enumerate}
1406    
1407     \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:02}):} It can be observed
1408     with Algorithm \ref{alg:crat0:sfdv0:01a} that neither increasing
1409     nor decreasing $K_2$ nor $K_4$ on the fly
1410     will lead to \emph{highly} anomalous
1411     behavior. Each invocation of the algorithm will map
1412     \texttt{state} back into the set identified in
1413     (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Thus on-the-fly changes
1414     to $K_2$ and $K_4$ will establish the rational counting algorithm
1415     immediately into steady-state behavior, and the result will not be
1416     \emph{highly} anomalous if such on-the-fly changes are not
1417     made very often.
1418    
1419     \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:03}):} It can be deduced
1420     from
1421     (\ref{eq:lem:crat0:sfdv0:sprc0:04:02}),
1422     (\ref{eq:lem:crat0:sfdv0:sprc0:04:03}), and
1423     (\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) that the value of the
1424     \texttt{state} variable in Algorithm \ref{alg:crat0:sfdv0:01a}
1425     satisfies the relationship
1426    
1427     \begin{equation}
1428     \label{eq:crat0:sfdv0:sprc0:01}
1429     \overline{N_O} - N_O = \frac{state}{K_4} ;
1430     \end{equation}
1431    
1432     \noindent{}in other words, the \texttt{state} variable
1433     contains the remainder of an effective division by $K_4$
1434     and thus maintains the fractional part of $\overline{N_O}$.
1435     Altering $K_4$ on the fly to a new value
1436     (say, $\overline{K_4}$) may be problematic, because
1437     to preserve the current fractional part
1438     of $\overline{N_O}$, one must adjust it for
1439     the new denominator $\overline{K_4}$. This requires
1440     solving the equation
1441    
1442     \begin{equation}
1443     \label{eq:crat0:sfdv0:sprc0:02}
1444     \frac{state}{K_4} = \frac{n}{\;\;\overline{K_4}\;\;}
1445     \end{equation}
1446    
1447     \noindent{}for $n$ which must be an integer to avoid
1448     loss of information. In general,
1449     this would require that $K_4 \vworkdivides \overline{K_4}$,
1450     a constraint which would be rarely met. Thus, for high-precision
1451     applications where a new rational counting rate should become effective
1452     seamlessly, the best strategy would seem to be to modify $K_2$ only.
1453     It can be verified that modifying $K_2$ on the fly accomplishes
1454     a perfect rate transition.
1455    
1456     \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:01}):} In microcontroller work,
1457     ordinal data types often represent machine-native data types. In such cases,
1458     it may be possible for one process to set $K_2$ or $K_4$
1459     for another process that is asynchronous with respect to it by relying
1460     on the atomicity of machine instructions (i.e. without formal mutual
1461     exclusion protocol). However, in other cases where the ordinal data types
1462     of $K_2$ or $K_4$ are larger than can be accomodated by
1463     a single machine instruction or where $K_2$ and $K_4$ must be modified
1464     together atomically, mutual exclusion protocol should be used to
1465     prevent anomalous behavior due to race conditions (see
1466     Exercise \ref{exe:crat0:sexe0:14}).
1467    
1468     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1469     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1470     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1471     \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01b}]
1472     {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01b}}
1473     %Section tag: PRC1
1474     \label{crat0:sfdv0:sprc1}
1475    
1476     Algorithm \ref{alg:crat0:sfdv0:01a}
1477     has the disadvantage that it requires $K_2/K_4 < 1$ (i.e. it can only
1478     decrease frequency, but never increase frequency). This deficiency
1479     can be corrected by using
1480     Algorithm \ref{alg:crat0:sfdv0:01b}.
1481    
1482     Note that Algorithm \ref{alg:crat0:sfdv0:01b} will properly deal with $K_2$ and
1483     $K_4$ chosen such that $0 < K_2/K_4 < \infty$.
1484    
1485     The most common reason that one may want a counting algorithm
1486     that will correctly handle
1487     $K_2/K_4 \geq 1$ is to conveniently handle $K_2/K_4 \approx 1$.
1488     In practice, $K_2/K_4$ may represent a quantity that is
1489     normally very close to
1490     1 but may also be slightly less than or slightly greater than 1.
1491     For example, one may use $K_2/K_4 \approx 1$ to correct for a
1492     crystal or a resonator which deviates slightly from its nominal
1493     frequency. We illustrate this with the following example.
1494    
1495     \begin{vworkexamplestatement}
1496     \label{ex:crat0:sfdv0:sprc1:01}
1497     A microcontroller software load keeps time via an interrupt
1498     service routine that runs every 1ms, but this frequency may be
1499     off by as much as 1 part in 10,000 due to variations in
1500     crystal or resonator manufacture. The interrupt service routine
1501     updates a counter which represents the number of milliseconds elapsed since
1502     the software load was reset. Devise a rational counting strategy
1503     based on Algorithm \ref{alg:crat0:sfdv0:01b}
1504     which will allow the time accuracy to be trimmed to within
1505     one second per year or less by adjusting only $K_4$, and implement the counting strategy
1506     in software.
1507     \end{vworkexamplestatement}
1508     \begin{vworkexampleparsection}{Solution}
1509     $K_2/K_4$ will be nominally very close to 1 ($K_2 \approx K_4$).
1510     If we assume that each year has 365.2422\footnote{The period of the earth's
1511     rotation about the sun is not an integral number of days, which is why the
1512     rules for leap years exist. Ironically, the assignment of leap years is itself
1513     a problem very similar to the rational counting problems discussed in this chapter.} days
1514     ($\approx$ 31,556,926 seconds), then choosing
1515     $K_2 \approx K_4 = 31,556,926$ will yield satisfactory results.
1516     If we may need to compensate for up to 1 part in 10,000 of crystal or resonator
1517     inaccuracy, we may need to adjust $K_2$ as low as 0.9999 $\times$ 31,556,926 $\approx$
1518     31,553,770 (to compensate for a fast
1519     crystal or resonator) or as
1520     high as 1.0001 $\times$ 31,556,926
1521     $\approx$ 31,560,082
1522     (to compensate for a slow crystal or resonator). Choosing
1523     $K_4 = 31,556,926$ yields the convenient relationship that each
1524     count in $K_2$ corresponds to one second per year.
1525    
1526     \begin{figure}
1527     \begin{verbatim}
1528     /* The constants K1 through K4, which parameterize the */
1529     /* counting behavior, are assumed assigned elsewhere in */
1530     /* the code. */
1531     /* */
1532     /* The variable time_count below is the number of milli- */
1533     /* seconds since the software was reset. */
1534     int time_count = 0;
1535    
1536     /* It is assumed that the base rate subroutine below is */
1537     /* called every millisecond (or, at least what should be */
1538     /* every millisecond of the crystal or resonator were */
1539     /* perfect). */
1540    
1541     void base_rate_sub(void)
1542     {
1543     static int state = K1;
1544    
1545     state += K2;
1546    
1547     while (state >= K3)
1548     {
1549     state -= K4;
1550     time_count++;
1551     }
1552     }
1553     \end{verbatim}
1554     \caption{Algorithm \ref{alg:crat0:sfdv0:01b} Applied To Timekeeping
1555     (Example \ref{ex:crat0:sfdv0:sprc1:01})}
1556     \label{fig:ex:crat0:sfdv0:sprc1:01:01}
1557     \end{figure}
1558    
1559     Figure \ref{fig:ex:crat0:sfdv0:sprc1:01:01} provides an illustration
1560     of Algorithm \ref{alg:crat0:sfdv0:01b} applied in this scenario.
1561     We assume that $K_4$ contains the constant value 31,556,926
1562     and that $K_2$ is modified about this value either downwards or upwards
1563     to trim the timekeeping. Note that Algorithm \ref{alg:crat0:sfdv0:01b} will correctly
1564     handle $K_2 \geq K_4$.
1565    
1566     Also note in the implementation illustrated in Figure
1567     \ref{fig:ex:crat0:sfdv0:sprc1:01:01} that large integers (27 bits or more)
1568     are required. (See also Exercise \ref{exe:crat0:sexe0:09}).
1569     \end{vworkexampleparsection}
1570     \vworkexamplefooter{}
1571    
1572     It may not be obvious whether Algorithm \ref{alg:crat0:sfdv0:01b} has the
1573     same or similar desirable properties as Algorithm \ref{alg:crat0:sfdv0:01a}
1574     presented
1575     in Lemmas
1576     \ref{lem:crat0:sfdv0:sprc0:01},
1577     \ref{lem:crat0:sfdv0:sprc0:02},
1578     \ref{lem:crat0:sfdv0:sprc0:04},
1579     and
1580     \ref{lem:crat0:sfdv0:sprc0:03}.
1581     Algorithm \ref{alg:crat0:sfdv0:01b} does have these desirable
1582     properties, and these properties are presented as
1583     Lemmas \ref{lem:crat0:sfdv0:sprc1:01},
1584     \ref{lem:crat0:sfdv0:sprc1:02},
1585     \ref{lem:crat0:sfdv0:sprc1:03}, and
1586     \ref{lem:crat0:sfdv0:sprc1:04}.
1587     The proofs of these lemmas are identical or very similar to the proofs
1588     of Lemmas
1589     \ref{lem:crat0:sfdv0:sprc0:01},
1590     \ref{lem:crat0:sfdv0:sprc0:02},
1591     \ref{lem:crat0:sfdv0:sprc0:04},
1592     and
1593     \ref{lem:crat0:sfdv0:sprc0:03};
1594     and so these proofs when not identical are presented as exercises.
1595     Note that Algorithm \ref{alg:crat0:sfdv0:01b} behaves identically to
1596     Algorithm \ref{alg:crat0:sfdv0:01a} when $K_2 < K_4$, and the
1597     case of $K_2=K_4$ is trivial, so in general only
1598     the behavior when $K_2 > K_4$ remains to be proved.
1599    
1600     \begin{vworklemmastatement}
1601     \label{lem:crat0:sfdv0:sprc1:01}
1602     $N_{STARTUP}$, the number of invocations of the base subroutine
1603     in Algorithm \ref{alg:crat0:sfdv0:01b} before ``\texttt{A()}'' is called
1604     for the first time, is given by
1605    
1606     \begin{equation}
1607     \label{eq:lem:crat0:sfdv0:sprc1:01:01}
1608     N_{STARTUP} =
1609     \left\lceil
1610     {
1611     \frac{-K_1 - K_2 + K_3}{K_2}
1612     }
1613     \right\rceil .
1614     \end{equation}
1615     \end{vworklemmastatement}
1616     \begin{vworklemmaproof}
1617     The proof is identical to the proof of Lemma
1618     \ref{lem:crat0:sfdv0:sprc0:01}.
1619     \end{vworklemmaproof}
1620    
1621    
1622     \begin{vworklemmastatement}
1623     \label{lem:crat0:sfdv0:sprc1:02}
1624     Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01b}
1625     base subroutine
1626     is called, let $N_O$ be the number of times the
1627     ``\texttt{A()}'' subroutine is called, let
1628     $f_I$ be the frequency of invocation of the
1629     Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let
1630     $f_O$ be the frequency of invocation of
1631     ``\texttt{A()}''.
1632    
1633     \begin{equation}
1634     \label{eq:lem:crat0:sfdv0:sprc1:02:01}
1635     \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
1636     =
1637     \frac{f_O}{f_I}
1638     =
1639     \frac{K_2}{K_4} .
1640     \end{equation}
1641     \end{vworklemmastatement}
1642     \begin{vworklemmaproof}
1643     See Exercise \ref{exe:crat0:sexe0:10}.
1644     \end{vworklemmaproof}
1645    
1646     \begin{vworklemmastatement}
1647     \label{lem:crat0:sfdv0:sprc1:03}
1648     If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$\footnote{See also
1649     footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between
1650     the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b}
1651     and the ``ideal'' mapping is always
1652     in the set
1653    
1654     \begin{equation}
1655     \label{eq:lem:crat0:sfdv0:sprc1:03:01}
1656     \left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
1657     \end{equation}
1658    
1659     and no algorithm can be constructed to
1660     confine the error to a smaller interval.
1661     \end{vworklemmastatement}
1662     \begin{vworklemmaproof}
1663     The proof is identical to the proof of Lemma \ref{lem:crat0:sfdv0:sprc0:04}.
1664     \end{vworklemmaproof}
1665    
1666     \begin{vworklemmastatement}
1667     \label{lem:crat0:sfdv0:sprc1:04}
1668     For Algorithm \ref{alg:crat0:sfdv0:01b}
1669     with
1670     $K_2 \geq K_4$, once steady
1671     state has been achieved (see Exercise
1672     \ref{exe:crat0:sexe0:13}), each invocation of the
1673     base subroutine will result in
1674     a number of invocations of
1675     ``\texttt{A()}'' which is in the set
1676    
1677     \begin{equation}
1678     \label{eq:lem:crat0:sfdv0:sprc1:04:01}
1679     \left\{
1680     \left\lfloor \frac{K_2}{K_4} \right\rfloor ,
1681     \left\lceil \frac{K_2}{K_4} \right\rceil
1682     \right\},
1683     \end{equation}
1684    
1685     which contains one integer if $K_4 \vworkdivides K_2$,
1686     or two integers otherwise. With $K_2 < K_4$,
1687     the behavior will be as specified in Lemma
1688     \ref{lem:crat0:sfdv0:sprc0:03}.
1689     \end{vworklemmastatement}
1690     \begin{vworklemmaproof}
1691     See Exercise \ref{exe:crat0:sexe0:12}.
1692     \end{vworklemmaproof}
1693     \begin{vworklemmaparsection}{Remark}
1694     Note that Lemma \ref{lem:crat0:sfdv0:sprc0:03}
1695     and this lemma specify different aspects of behavior,
1696     which is why (\ref{eq:lem:crat0:sfdv0:sprc0:03:01})
1697     and (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}) take
1698     different forms than
1699     (\ref{eq:lem:crat0:sfdv0:sprc1:04:01}).
1700     Lemma \ref{lem:crat0:sfdv0:sprc0:03} specifies the number of consecutive
1701     invocations of the base subroutine for which ``\texttt{A()}''
1702     will be run, but with $K_2 \geq K_4$ it does not make sense to
1703     specify behavior in this way since ``\texttt{A()}'' will be run
1704     on \emph{every} invocation of the base subroutine. This lemma specifies
1705     the number of times ``\texttt{A()}'' will be run on a \emph{single}
1706     invocation of the base subroutine (which is not meaningful if
1707     $K_2 < K_4$ since the result will always be 0 or 1).
1708     \end{vworklemmaparsection}
1709     %\vworklemmafooter{}
1710    
1711    
1712     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1713     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1714     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1715     \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01c}]
1716     {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01c}}
1717     %Section tag: PRX0
1718     \label{crat0:sfdv0:sprx0}
1719    
1720     Algorithm \ref{alg:crat0:sfdv0:01c}\footnote{Algorithm \ref{alg:crat0:sfdv0:01c}
1721     was contributed in March, 2003
1722     by Chuck B. Falconer \cite{bibref:i:chuckbfalconer}
1723     via the
1724     \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
1725     newsgroup.}
1726     is a variant of Algorithm \ref{alg:crat0:sfdv0:01a}
1727     which has one fewer
1728     degrees of freedom than Algorithms \ref{alg:crat0:sfdv0:01a}
1729     and \ref{alg:crat0:sfdv0:01b} and can be implemented
1730     more efficiently under most instruction sets. Algorithm \ref{alg:crat0:sfdv0:01c}
1731     is superior to Algorithms \ref{alg:crat0:sfdv0:01a}
1732     and \ref{alg:crat0:sfdv0:01b}
1733     from a computational efficiency
1734     point of view, but is less intuitive.
1735    
1736     The superiority in computational efficiency of Algorithm \ref{alg:crat0:sfdv0:01c}
1737     comes from the possibility of using an implicit test against zero
1738     (rather than an explicit
1739     test against $K_3$, as is found in Algorithms \ref{alg:crat0:sfdv0:01a}
1740     and \ref{alg:crat0:sfdv0:01b}).
1741     Many machine instruction sets automatically set flags to indicate a negative
1742     result when the
1743     subtraction of $K_2$ is performed, thus often allowing a conditional branch
1744     without an additional instruction. Whether an instruction will be saved in
1745     the code of Figure \ref{fig:crat0:sfdv0:01c} depends on the sophistication
1746     of the `C' compiler, but of course if the algorithm were coded in
1747     assembly-language an instruction could be saved on most processors.
1748    
1749     The properties of rational counting Algorithm \ref{alg:crat0:sfdv0:01c} are nearly
1750     identical to those of Algorithm \ref{alg:crat0:sfdv0:01a},
1751     and we prove the important properties
1752     now.
1753    
1754     \begin{vworklemmastatement}
1755     \label{lem:crat0:sfdv0:sprx0:01}
1756     $N_{STARTUP}$, the number of invocations of the base subroutine
1757     in Algorithm \ref{alg:crat0:sfdv0:01c} before ``\texttt{A()}'' is called
1758     for the first time, is given by
1759    
1760     \begin{equation}
1761     \label{eq:lem:crat0:sfdv0:sprx0:01:01}
1762     N_{STARTUP} =
1763     \left\lfloor
1764     {
1765     \frac{K_1}{K_2}
1766     }
1767     \right\rfloor .
1768     \end{equation}
1769     \end{vworklemmastatement}
1770     \begin{vworklemmaproof}
1771     The value of \texttt{state} when tested against
1772     zero in the \texttt{if()} statement during the $n$th invocation
1773     of the base subroutine is $K_1 - n K_2$. In order for the test
1774     not to be met on the $n$th invocation
1775     of the base subroutine, we require that
1776    
1777     \begin{equation}
1778     \label{eq:lem:crat0:sfdv0:sprx0:01:02}
1779     K_1 - n K_2 \geq 0
1780     \end{equation}
1781    
1782     \noindent{}or equivalently that
1783    
1784     \begin{equation}
1785     \label{eq:lem:crat0:sfdv0:sprx0:01:03}
1786     n \leq \frac{K_1}{K_2} .
1787     \end{equation}
1788    
1789     Solving (\ref{eq:lem:crat0:sfdv0:sprx0:01:03}) for the
1790     largest value of $n \in \vworkintset$ which still meets the criterion
1791     yields (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}). Note that
1792     the derivation of (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}) requires
1793     that the restrictions on $K_1$, $K_2$, and $K_3$ documented in
1794     Figure \ref{fig:crat0:sfdv0:01c} be met.
1795     \end{vworklemmaproof}
1796    
1797     \begin{vworklemmastatement}
1798     \label{lem:crat0:sfdv0:sprx0:02}
1799     Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01c}
1800     base subroutine
1801     is called, let $N_O$ be the number of times the
1802     ``\texttt{A()}'' subroutine is called, let
1803     $f_I$ be the frequency of invocation of the
1804     Algorithm \ref{alg:crat0:sfdv0:01a}
1805     base subroutine, and let
1806     $f_O$ be the frequency of invocation of
1807     ``\texttt{A()}''. Provided the constraints
1808     on $K_1$, $K_2$, and $K_3$ documented in
1809     Figure \ref{fig:crat0:sfdv0:01c} are met,
1810    
1811     \begin{equation}
1812     \label{eq:lem:crat0:sfdv0:sprx0:02:01}
1813     \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
1814     =
1815     \frac{f_O}{f_I}
1816     =
1817     \frac{K_2}{K_4} .
1818     \end{equation}
1819     \end{vworklemmastatement}
1820     \begin{vworklemmaproof}
1821     (\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) indicates that once
1822     an initial delay (determined by $K_1$) has finished,
1823     $N_O/N_I$ will converge on a steady-state value of
1824     $K_2/K_4$.
1825    
1826     The most straightforward way to analyze Algorithm \ref{alg:crat0:sfdv0:01c}
1827     is to show how an algorithm already
1828     understood (Algorithm \ref{alg:crat0:sfdv0:01a})
1829     can be transformed to
1830     Algorithm \ref{alg:crat0:sfdv0:01c}
1831     in a way where the analysis of Algorithm \ref{alg:crat0:sfdv0:01a}
1832     also applies to Algorithm \ref{alg:crat0:sfdv0:01c}.
1833     Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01} shows
1834     how such a transformation can be performed in
1835     four steps.
1836    
1837     \begin{figure}
1838     (a) Algorithm \ref{alg:crat0:sfdv0:01a} unchanged.
1839     $state_{a,n} \in \{0, 1, \ldots, K_4 - 1 \}$.
1840     \begin{verbatim}
1841     state += K2;
1842     if (state >= K4)
1843     {
1844     state -= K4;
1845     A();
1846     }
1847     \end{verbatim}
1848     (b) ``\texttt{>=}'' changed to ``\texttt{>}''. $state_{b,n} \in \{1, 2, \ldots, K_4 \}$,
1849     $state_{b,n} = state_{a,n} + 1$.
1850     \begin{verbatim}
1851     state += K2;
1852     if (state > K4)
1853     {
1854     state -= K4;
1855     A();
1856     }
1857     \end{verbatim}
1858     (c) Test against $K_4$ changed to test against zero.
1859     $state_{c,n} \in \{-K_4 + 1, -K_4 + 2, \ldots, 0 \}$,
1860     $state_{c,n} = state_{b,n} - K_4$.
1861     \begin{verbatim}
1862     state += K2;
1863     if (state > 0)
1864     {
1865     state -= K4;
1866     A();
1867     }
1868     \end{verbatim}
1869     (d) Sign inversion.
1870     $state_{d,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$,
1871     $state_{d,n} = - state_{c,n}$.
1872     \begin{verbatim}
1873     state -= K2;
1874     if (state < 0)
1875     {
1876     state += K4;
1877     A();
1878     }
1879     \end{verbatim}
1880     (e) `C' expression rearrangement.
1881     $state_{e,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$,
1882     $state_{e,n} = state_{d,n}$.
1883     \begin{verbatim}
1884     if ((state -= K2) < 0)
1885     {
1886     state += K4;
1887     A();
1888     }
1889     \end{verbatim}
1890     \caption{4-Step Transformation Of Algorithm \ref{alg:crat0:sfdv0:01a}
1891     To Algorithm \ref{alg:crat0:sfdv0:01c}}
1892     \label{fig:lem:crat0:sfdv0:sprx0:02:01}
1893     \end{figure}
1894    
1895     In Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}, each of the
1896     four steps required to transform from Algorithm \ref{alg:crat0:sfdv0:01a} to
1897     Algorithm \ref{alg:crat0:sfdv0:01c} includes an equation to transform the
1898     \texttt{state} variable. Combining all of these
1899     transformations yields
1900    
1901     \begin{eqnarray}
1902     \label{eq:lem:crat0:sfdv0:sprx0:02:02}
1903     state_{e,n} & = & K_4 - 1 - state_{a,n} \\
1904     \label{eq:lem:crat0:sfdv0:sprx0:02:03}
1905     state_{a,n} & = & K_4 - 1 - state_{e,n}
1906     \end{eqnarray}
1907    
1908     We thus see that Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(a)
1909     (corresponding to Algorithm \ref{alg:crat0:sfdv0:01a}) and
1910     Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(e)
1911     (corresponding to Algorithm \ref{alg:crat0:sfdv0:01c}) have
1912     \texttt{state} semantics which involve the same range
1913     but a reversed order. (\ref{eq:lem:crat0:sfdv0:sprx0:02:01})
1914     follows directly from this observation and from
1915     Lemma \ref{lem:crat0:sfdv0:sprc0:02}.
1916     \end{vworklemmaproof}
1917     %\vworklemmafooter{}
1918    
1919     \begin{vworklemmastatement}
1920     \label{lem:crat0:sfdv0:sprx0:03}
1921     If $K_1=0$ and $\gcd(K_2, K_4)=1$\footnote{See also
1922     footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between
1923     the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01c}
1924     and the ``ideal'' mapping is always
1925     in the set
1926    
1927     \begin{equation}
1928     \label{eq:lem:crat0:sfdv0:sprx0:03:01}
1929     \left[ 0, \frac{K_4 - 1}{K_4} \right] ,
1930     \end{equation}
1931    
1932     and no algorithm can be constructed to
1933     confine the error to a smaller interval.
1934     \end{vworklemmastatement}
1935     \begin{vworklemmaproof}
1936     Using the duality illustrated by
1937     (\ref{eq:lem:crat0:sfdv0:sprx0:02:02}) and
1938     (\ref{eq:lem:crat0:sfdv0:sprx0:02:03}),
1939     starting Algorithm \ref{alg:crat0:sfdv0:01c} with
1940     $state_0=0$ will yield a dual state vector
1941     with respect to starting Algorithm \ref{alg:crat0:sfdv0:01a} with
1942     $state_0=K_4-1$. Thus,
1943    
1944     \begin{equation}
1945     \label{eq:lem:crat0:sfdv0:sprx0:03:02}
1946     N_O = \left\lfloor \frac{n K_2 + K_4 - 1}{K_4} \right\rfloor .
1947     \end{equation}
1948    
1949     Using this altered value of $N_O$ in (\ref{eq:lem:crat0:sfdv0:sprc0:04:04})
1950     leads directly to (\ref{eq:lem:crat0:sfdv0:sprx0:03:01}).
1951    
1952     The proof that there can be no better algorithm is identical
1953     to the same proof for Lemma \ref{lem:crat0:sfdv0:sprc0:04} (Exercise \ref{exe:crat0:sexe0:06}).
1954     \end{vworklemmaproof}
1955     %\vworklemmafooter{}
1956    
1957     \begin{vworklemmastatement}
1958     \label{lem:crat0:sfdv0:sprx0:04}
1959     For Algorithm \ref{alg:crat0:sfdv0:01c}, once steady
1960     state has been achieved, the number of consecutive
1961     base subroutine invocations during which subroutine
1962     ``\texttt{A()}'' is executed is always in the set
1963    
1964     \begin{equation}
1965     \label{eq:lem:crat0:sfdv0:sprx0:04:01}
1966     \left\{
1967     \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor ,
1968     \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil
1969     \right\} \cap \vworkintsetpos,
1970     \end{equation}
1971    
1972     which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$,
1973     or two integers otherwise.
1974    
1975     Once steady state has been achieved, the number of
1976     consecutive base function invocations during which
1977     subroutine ``\texttt{A()}'' is not executed is
1978     always in the set
1979    
1980     \begin{equation}
1981     \label{eq:lem:crat0:sfdv0:sprx0:04:02}
1982     \left\{
1983     \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor ,
1984     \left\lceil \frac{K_4-K_2}{K_2} \right\rceil
1985     \right\} \cap \vworkintsetpos,
1986     \end{equation}
1987    
1988     which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$,
1989     or two integers otherwise.
1990     \end{vworklemmastatement}
1991     \begin{vworklemmaproof}
1992     The proof comes directly from the duality between algorithm
1993     Algorithms \ref{alg:crat0:sfdv0:01a}
1994     and \ref{alg:crat0:sfdv0:01c} established in the
1995     proof of Lemma \ref{lem:crat0:sfdv0:sprx0:01}, so that the results
1996     from Lemma \ref{lem:crat0:sfdv0:sprc0:03} apply without modification.
1997     \end{vworklemmaproof}
1998     \vworklemmafooter{}
1999    
2000    
2001     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2002     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2003     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2004     \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01d}]
2005     {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01d}}
2006     %Section tag: PRX1
2007     \label{crat0:sfdv0:sprx1}
2008    
2009     Algorithm \ref{alg:crat0:sfdv0:01d}\footnote{Algorithm \ref{alg:crat0:sfdv0:01d}
2010     was contributed in March, 2003
2011     by John Larkin \cite{bibref:i:johnlarkin}
2012     via the
2013     \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
2014     newsgroup.}
2015     (Figure \ref{fig:crat0:sfdv0:01d}) is a further
2016     economization of Algorithms \ref{alg:crat0:sfdv0:01a}
2017     through \ref{alg:crat0:sfdv0:01c} that can be made by eliminating
2018     the addition or subtraction of $K_4$ and test against $K_3$
2019     and instead using the
2020     inherent machine integer size of $W$ bits to perform
2021     arithmetic modulo $2^W$. Thus, effectively, Algorithm \ref{alg:crat0:sfdv0:01d}
2022     is equivalent to Algorithm \ref{alg:crat0:sfdv0:01a} with
2023     $K_4 = K_3 = 2^W$.
2024    
2025     Figure \ref{fig:crat0:sfdv0:01d} shows both
2026     assembly-language (Texas Instruments TMS-370C8) and
2027     `C' implementations of the algorithm. The assembly-language
2028     version uses the carry flag of the processor and thus
2029     is \emph{very} efficient. Because `C' does not have access
2030     to the processor flags, the 'C' version is less efficient.
2031     The ``less than'' comparison when
2032     using unsigned integers is equivalent to a rollover test.
2033    
2034     It is easy to see from the figure that Algorithm \ref{alg:crat0:sfdv0:01d}
2035     is equivalent in all
2036     respects to Algorithm \ref{alg:crat0:sfdv0:01a} with
2037     $K_3 = K_4$ fixed at $2^W$. It is not necessary to enforce any constraints
2038     on $K_2$ because $K_2 < K_3 = K_4 = 2^W$ due to the inherent size of
2039     a machine integer. Note that unlike Algorithms \ref{alg:crat0:sfdv0:01a}
2040     through \ref{alg:crat0:sfdv0:01c} which allow $K_2$ and $K_4$ to be chosen independently
2041     and from the Farey series of appropriate order, Algorithm \ref{alg:crat0:sfdv0:01c}
2042     only allows
2043     $K_2/K_4$ of the form $K_2/2^W$.
2044    
2045     The properties below follow immediately
2046     from the properties of Algorithm \ref{alg:crat0:sfdv0:01a}.
2047    
2048     \begin{vworklemmastatement}
2049     \label{lem:crat0:sfdv0:sprx1:01}
2050     $N_{STARTUP}$, the number of invocations of the base subroutine
2051     in Algorithm \ref{alg:crat0:sfdv0:01d} before ``\texttt{A()}'' is called
2052     for the first time, is given by
2053    
2054     \begin{equation}
2055     \label{eq:lem:crat0:sfdv0:sprx1:01:01}
2056     N_{STARTUP} =
2057     \left\lfloor
2058     {
2059     \frac{2^W - K_1 - 1}{K_2}
2060     }
2061     \right\rfloor .
2062     \end{equation}
2063     \end{vworklemmastatement}
2064     \begin{vworklemmaproof}
2065     The value of \texttt{state} after the $n$th invocation
2066     is $state_n = K_1 + n K_2$. In order for the test in the
2067     \texttt{if()} statement not to be met, we require that
2068    
2069     \begin{equation}
2070     \label{eq:lem:crat0:sfdv0:sprx1:01:02}
2071     K_1 + n K_2 \leq 2^W - 1
2072     \end{equation}
2073    
2074     \noindent{}or equivalently that
2075    
2076     \begin{equation}
2077     \label{eq:lem:crat0:sfdv0:sprx1:01:03}
2078     n \leq \frac{2^W - K_1 - 1}{K_2} .
2079     \end{equation}
2080    
2081     Solving (\ref{eq:lem:crat0:sfdv0:sprx1:01:03}) for the largest
2082     value of $n \in \vworkintset$ which still meets the criterion
2083     yields (\ref{eq:lem:crat0:sfdv0:sprx1:01:01}).
2084     \end{vworklemmaproof}
2085    
2086     \begin{vworklemmastatement}
2087     \label{lem:crat0:sfdv0:sprx1:02}
2088     Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine
2089     is called, let $N_O$ be the number of times the
2090     ``\texttt{A()}'' subroutine is called, let
2091     $f_I$ be the frequency of invocation of the
2092     Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine, and let
2093     $f_O$ be the frequency of invocation of
2094     ``\texttt{A()}''. Then
2095    
2096     \begin{equation}
2097     \label{eq:lem:crat0:sfdv0:sprx1:02:01}
2098     \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
2099     =
2100     \frac{f_O}{f_I}
2101     =
2102     \frac{K_2}{2^W} ,
2103     \end{equation}
2104    
2105     where $W$ is the number of bits in a machine unsigned integer.
2106     Note that $K_2 < 2^W$ since $K_2 \in \{ 0, 1, \ldots , 2^W-1 \}$.
2107     \end{vworklemmastatement}
2108     \begin{vworklemmaproof}
2109     The proof is identical to the proof of
2110     Lemma \ref{lem:crat0:sfdv0:sprc0:02} with $K_3=K_4=2^W$.
2111     Note that Algorithm \ref{alg:crat0:sfdv0:01a} calculates $n K_2 \bmod K_4$ by
2112     subtraction, whereas Algorithm \ref{alg:crat0:sfdv0:01d} calculates
2113     $n K_2 \bmod 2^W$ by the properties of a $W$-bit counter
2114     which is allowed to roll over.
2115     \end{vworklemmaproof}
2116     %\vworklemmafooter{}
2117    
2118    
2119     \begin{vworklemmastatement}
2120     \label{lem:crat0:sfdv0:sprx1:03}
2121     If $\gcd(K_2, 2^W)=1$\footnote{See also footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01}
2122     in this chapter. Note also that in this context the condition $\gcd(K_2, 2^W)=1$
2123     is equivalent to the condition that $K_2$ be odd.}, the error between
2124     the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01d}
2125     and the ``ideal'' mapping is always
2126     in the set
2127    
2128     \begin{equation}
2129     \label{eq:lem:crat0:sfdv0:sprx1:03:01}
2130     \left[ - \frac{2^W - 1}{2^W} , 0 \right] ,
2131     \end{equation}
2132    
2133     and no algorithm can be constructed to
2134     confine the error to a smaller interval.
2135     \end{vworklemmastatement}
2136     \begin{vworklemmaproof}
2137     The proof is identical to the proof of Lemma
2138     \ref{lem:crat0:sfdv0:sprc0:04} with $K_4 = 2^W$.
2139     \end{vworklemmaproof}
2140     %\vworklemmafooter{}
2141    
2142     \begin{vworklemmastatement}
2143     \label{lem:crat0:sfdv0:sprx1:04}
2144     For Algorithm \ref{alg:crat0:sfdv0:01d}
2145     (Figure \ref{fig:crat0:sfdv0:01d}), once steady
2146     state has been achieved, the number of consecutive
2147     base subroutine invocations during which subroutine
2148     ``\texttt{A()}'' is executed is always in the set
2149    
2150     \begin{equation}
2151     \label{eq:lem:crat0:sfdv0:sprx1:04:01}
2152     \left\{
2153     \left\lfloor \frac{K_2}{2^W - K_2} \right\rfloor ,
2154     \left\lceil \frac{K_2}{2^W - K_2} \right\rceil
2155     \right\} \cap \vworkintsetpos,
2156     \end{equation}
2157    
2158     which contains one integer if $K_2/2^W \leq 1/2$ or $(2^W-K_2) \vworkdivides K_2$,
2159     or two integers otherwise.
2160    
2161     Once steady state has been achieved, the number of
2162     consecutive base function invocations during which
2163     subroutine ``\texttt{A()}'' is not executed is
2164     always in the set
2165    
2166     \begin{equation}
2167     \label{eq:lem:crat0:sfdv0:sprx1:04:02}
2168     \left\{
2169     \left\lfloor \frac{2^W-K_2}{K_2} \right\rfloor ,
2170     \left\lceil \frac{2^W-K_2}{K_2} \right\rceil
2171     \right\} \cap \vworkintsetpos,
2172     \end{equation}
2173    
2174     which contains one integer if $K_2/2^W \geq 1/2$ or $K_2 \vworkdivides 2^W$,
2175     or two integers otherwise.
2176     \end{vworklemmastatement}
2177     \begin{vworklemmaproof}
2178     The proof is identical to the proof of Lemma
2179     \ref{lem:crat0:sfdv0:sprc0:03} with $K_4 = 2^W$.
2180     \end{vworklemmaproof}
2181     \vworklemmafooter{}
2182    
2183    
2184     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2185     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2186     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2187     \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:02a}]
2188     {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:02a}}
2189     %Section tag: PRC2
2190     \label{crat0:sfdv0:sprc2}
2191    
2192     Another useful rational counting algorithm is Algorithm \ref{alg:crat0:sfdv0:02a}.
2193     At first glance, it may appear that Algorithm \ref{alg:crat0:sfdv0:02a}
2194     is qualitatively
2195     different than Algorithms \ref{alg:crat0:sfdv0:01a}
2196     and \ref{alg:crat0:sfdv0:01b}.
2197     However, as the following lemmas demonstrate, Algorithm \ref{alg:crat0:sfdv0:02a}
2198     can be easily rearranged to be in the form
2199     of Algorithm \ref{alg:crat0:sfdv0:01a}.
2200    
2201     \begin{vworklemmastatement}
2202     \label{lem:crat0:sfdv0:sprc2:01}
2203     $N_{STARTUP}$, the number of invocations of the base subroutine
2204     in Algorithm \ref{alg:crat0:sfdv0:02a} before ``\texttt{A()}'' is called
2205     for the first time, is given by
2206    
2207     \begin{equation}
2208     \label{eq:lem:crat0:sfdv0:sprc2:01:01}
2209     N_{STARTUP} =
2210     \left\lceil
2211     {
2212     \frac{K_3 - K_1}{K_2}
2213     }
2214     \right\rceil .
2215     \end{equation}
2216     \end{vworklemmastatement}
2217     \begin{vworklemmaproof}
2218     The value of \texttt{state} after the $n$th invocation
2219     is $K_1 + n K_2$. In order for the test in the
2220     \texttt{if()} statement to be met on the $n+1$'th invocation
2221     of the base subroutine, we require that
2222    
2223     \begin{equation}
2224     \label{eq:lem:crat0:sfdv0:sprc2:01:02}
2225     K_1 + n K_2 \geq K_3
2226     \end{equation}
2227    
2228     \noindent{}or equivalently that
2229    
2230     \begin{equation}
2231     \label{eq:lem:crat0:sfdv0:sprc2:01:03}
2232     n \geq \frac{K_3 - K_1}{K_2} .
2233     \end{equation}
2234    
2235     Solving (\ref{eq:lem:crat0:sfdv0:sprc2:01:03}) for the smallest
2236     value of $n \in \vworkintset$ which still meets the criterion
2237     yields (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}). Note that
2238     the derivation of (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}) requires
2239     that the restrictions on $K_1$ through $K_4$ documented in
2240     Figure \ref{fig:crat0:sfdv0:02a} be met.
2241     \end{vworklemmaproof}
2242    
2243     \begin{vworklemmastatement}
2244     \label{lem:crat0:sfdv0:sprc2:02}
2245     Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine
2246     is called, let $N_{OA}$ be the number of times the
2247     ``\texttt{A()}'' subroutine is called, let
2248     $f_I$ be the frequency of invocation of the
2249     Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine, and let
2250     $f_{OA}$ be the frequency of invocation of
2251     ``\texttt{A()}''. Then, the proportion of times the
2252     ``\texttt{A()}'' subroutine is called is given by
2253    
2254     \begin{equation}
2255     \label{eq:lem:crat0:sfdv0:sprc2:02:01}
2256     \lim_{N_I\rightarrow\infty}\frac{N_{OA}}{N_I}
2257     =
2258     \frac{f_{OA}}{f_I}
2259     =
2260     \frac{K_2}{K_4 + K_2} ,
2261     \end{equation}
2262    
2263     and the proportion of times the ``\texttt{B()}'' subroutine is called
2264     is given by
2265    
2266     \begin{equation}
2267     \label{eq:lem:crat0:sfdv0:sprc2:02:02}
2268     \lim_{N_I\rightarrow\infty}\frac{N_{OB}}{N_I}
2269     =
2270     \frac{f_{OB}}{f_I}
2271     =
2272     1 - \frac{f_{OA}}{f_I}
2273     =
2274     \frac{K_4}{K_4 + K_2} .
2275     \end{equation}
2276     \end{vworklemmastatement}
2277     \begin{vworklemmaproof}
2278     As in Lemma \ref{} and without
2279     loss of generality, we assume for analytic
2280     convenience that $K_1=0$ and $K_3=K_4$. Note that
2281     $K_1$ and $K_3$ influence only the transient startup
2282     behavior of the algorithm.
2283    
2284     It can be observed from the algorithm that once steady
2285     state is achieved, \texttt{state} will be confined to the set
2286    
2287     \begin{equation}
2288     \label{eq:lem:crat0:sfdv0:sprc2:02:10}
2289     state \in \{ 0, 1, \ldots , K_4 + K_2 - 1 \} .
2290     \end{equation}
2291    
2292     It is certainly possible to use results from
2293     number theory and analyze which values in the
2294     set (\ref{eq:lem:crat0:sfdv0:sprc2:02:10}) can be
2295     attained and the order in which they can be attained.
2296     However, an easier approach is to observe that
2297     Algorithm \ref{alg:crat0:sfdv0:02a}
2298     can be rearranged to take the form of
2299     rational counting Algorithm \ref{alg:crat0:sfdv0:01a}.
2300     This rearranged
2301     algorithm is presented as
2302     Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}. Note that the
2303     algorithm is rearranged only for easier analysis.
2304    
2305     \begin{figure}
2306     \begin{verbatim}
2307     void base_rate_sub(void)
2308     {
2309     static unsigned int state = K1;
2310    
2311     state += K2;
2312    
2313     if (state >= (K4 + K2))
2314     {
2315     state -= (K4 + K2);
2316     A();
2317     }
2318     else
2319     {
2320     B();
2321     }
2322     }
2323     \end{verbatim}
2324     \caption{Algorithm \ref{alg:crat0:sfdv0:02a} Modified To Resemble Algorithm \ref{alg:crat0:sfdv0:01a}
2325     (Proof Of Lemma \ref{lem:crat0:sfdv0:sprc2:02})}
2326     \label{fig:lem:crat0:sfdv0:sprc2:02:01}
2327     \end{figure}
2328    
2329     In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, the
2330     statement ``\texttt{state += K2}'' has been removed from the
2331     \texttt{else} clause and placed above the \texttt{if()} statement,
2332     and other constants have been adjusted accordingly.
2333     It can be observed that the figure
2334     is structurally identical to rational counting algorithm, except for the
2335     \texttt{else} clause (which does not affect the counting behavior) and
2336     the specific constants for testing and incrementation.
2337    
2338     In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, as contrasted with
2339     Algorithm \ref{alg:crat0:sfdv0:01a}, ``$K_4 + K_2$'' takes the
2340     place of $K_4$. $\gcd(K_2, K_4 + K_2) = \gcd(K_2, K_4)$
2341     (see Lemma \cprizeroxrefhyphen\ref{lem:cpri0:gcd0:01}), so the
2342     results from
2343     \end{vworklemmaproof}
2344    
2345     \begin{vworklemmastatement}
2346     \label{lem:crat0:sfdv0:sprc2:03}
2347     If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$, the error between
2348     the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b}
2349     and the ``ideal'' mapping is always
2350     in the set
2351    
2352     \begin{equation}
2353     \label{eq:lem:crat0:sfdv0:sprc2:03:01}
2354     \left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
2355     \end{equation}
2356    
2357     and no algorithm can be constructed to
2358     confine the error to a smaller interval.
2359     \end{vworklemmastatement}
2360     \begin{vworklemmaproof}
2361     The proof is identical to Lemma \ref{lem:crat0:sfdv0:sprc0:04}.
2362     \end{vworklemmaproof}
2363    
2364    
2365    
2366    
2367    
2368    
2369    
2370     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2371     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2372     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2373     \section{Bresenham's Line Algorithm}
2374     %Section tag: BLA0
2375     \label{crat0:sbla0}
2376    
2377     \index{Bresenham's line algorithm}\emph{Bresenham's line algorithm} is a
2378     very efficient algorithm for drawing lines on devices that have
2379     a rectangular array of pixels which can be individually illuminated.
2380     Bresenham's line algorithm is efficient for small microcontrollers
2381     because it relies only
2382     on integer addition, subtraction, shifting, and comparison.
2383    
2384     Bresenham's line algorithm is presented for two reasons:
2385    
2386     \begin{itemize}
2387     \item The algorithm is useful for drawing lines on LCD
2388     displays and other devices typically controlled by
2389     microcontrollers.
2390     \item The algorithm is an [extremely optimized] application
2391     of the rational
2392     counting algorithms presented in this chapter.
2393     \end{itemize}
2394    
2395     \begin{figure}
2396     \begin{center}
2397     \begin{huge}
2398     Figure Space Reserved
2399     \end{huge}
2400     \end{center}
2401     \caption{Raster Grid For Development Of Bresenham's Line Algorithm}
2402     \label{fig:crat0:sbla0:01}
2403     \end{figure}
2404    
2405     Assume that we wish to draw a line from $(0,0)$ to $(x_f, y_f)$ on
2406     a raster device (Figure \ref{fig:crat0:sbla0:01}). For simplicity of
2407     development, assume that $y_f \leq x_f$ (i.e. that the slope $m \leq 1$).
2408    
2409     For each value of $x \in \vworkintset$, the ideal value of $y$ is given
2410     by
2411    
2412     \begin{equation}
2413     \label{eq:crat0:sbla0:01}
2414     y = mx = \frac{y_f}{x_f} x = \frac{y_f x}{x_f} .
2415     \end{equation}
2416    
2417     \noindent{}However, on a raster device, we must usually
2418     choose an inexact pixel to illuminate, since it is typically
2419     rare that $x_f \vworkdivides y_f x$. If
2420     $x_f \vworkdivides y_f x$, then the ideal value of $y$ is
2421     an integer, and we choose to illuminate
2422     $(x, (y_f x)/x_f)$. However, if $x_f \vworknotdivides y_f x$,
2423     then we must choose either a pixel with the same y-coordinate
2424     as the previous pixel (we call this choice `D') or the pixel
2425     with a y-coordinate one greater than the previous pixel (we
2426     call this choice `U').
2427     The fractional part of the quotient
2428     $(y_f x) / x_f$ indicates whether D or U is closer to the ideal line.
2429     If $y_f x \bmod x_f \geq x_f/2$, we choose U, otherwise we choose D
2430     (note that the decision to choose U in the equality case is arbitrary).
2431    
2432     Using the rational approximation techniques presented in
2433     Section \ref{crat0:sfdv0}, it is straightforward to
2434     develop an algorithm, which is presented as the code
2435     in Figure \ref{fig:crat0:sbla0:02}.
2436     Note that this code will only work if $m = y_f/x_f \leq 1$.
2437    
2438     \begin{figure}
2439     \begin{verbatim}
2440     /* Draws a line from (0,0) to (x_f,y_f) on a raster */
2441     /* device. */
2442    
2443     void bresenham_line(int x_f, int y_f)
2444     {
2445     int d=0; /* The modulo counter. */
2446     int x=0, y=0;
2447     /* x- and y-coordinates currently being */
2448     /* evaluated. */
2449     int d_old; /* Remembers previous value of d. */
2450    
2451     plotpoint(0,0); /* Plot initial point. */
2452     while (x <= x_f)
2453     {
2454     d_old = d;
2455     d += y_f;
2456     if (d >= x_f)
2457     d -= x_f;
2458     x++;
2459     if (
2460     (
2461     (d == 0) && (d_old < x_f/2)
2462     )
2463     ||
2464     (
2465     (d >= x_f/2)
2466     &&
2467     ((d_old < x_f/2) || (d_old >= d))
2468     )
2469     )
2470     y++;
2471     plotpoint(x,y);
2472     }
2473     }
2474     \end{verbatim}
2475     \caption{First Attempt At A Raster Device Line Algorithm
2476     Using Rational Counting Techniques}
2477     \label{fig:crat0:sbla0:02}
2478     \end{figure}
2479    
2480     There are a few efficiency refinements that can be made to
2481     the code in Figure \ref{fig:crat0:sbla0:02}, but overall
2482     it is a very efficient algorithm. Note that
2483     nearly all compilers will handle the integer
2484     division by two using a shift
2485     operation rather than a division.
2486    
2487     We can however substantially simplify and economize the code of
2488     Figure \ref{fig:crat0:sbla0:02} by using the technique
2489     presented in Figures \ref{fig:crat0:sfdv0:fab0:03} and
2490     \ref{fig:crat0:sfdv0:fab0:04}, and this improved code is
2491     presented as Figure \ref{fig:crat0:sbla0:03}.
2492    
2493     \begin{figure}
2494     \begin{verbatim}
2495     /* Draws a line from (0,0) to (x_f,y_f) on a raster */
2496     /* device. */
2497    
2498     void bresenham_line(int x_f, int y_f)
2499     {
2500     int d=y_f; /* Position of the ideal line minus */
2501     /* the position of the line we are */
2502     /* drawing, in units of 1/x_f. The */
2503     /* initialization value is y_f because */
2504     /* the algorithm is looking one pixel */
2505     /* ahead in the x direction, so we */
2506     /* begin at x=1. */
2507     int x=0, y=0;
2508     /* x- and y-coordinates currently being */
2509     /* evaluated. */
2510     plotpoint(0,0); /* Plot initial point. */
2511     while (x <= x_f)
2512     {
2513     x++; /* We move to the right regardless. */
2514     if (d >= x_f/2)
2515     {
2516     /* The "U" choice. We must jump up a pixel */
2517     /* to keep up with the ideal line. */
2518     d += (y_f - x_f);
2519     y++; /* Jump up a pixel. */
2520     }
2521     else /* d < x_f/2 */
2522     {
2523     /* The "D" choice. Distance is not large */
2524     /* enough to jump up a pixel. */
2525     d += y_f;
2526     }
2527     plotpoint(x,y);
2528     }
2529     }
2530     \end{verbatim}
2531     \caption{Second Attempt At A Raster Device Line Algorithm
2532     Using Rational Counting Techniques}
2533     \label{fig:crat0:sbla0:03}
2534     \end{figure}
2535    
2536     In order to understand the code of Figure \ref{fig:crat0:sbla0:03},
2537     it is helpful to view the problem in an alternate way.
2538     For any $x \in \vworkintset$, let
2539     $d$ be the distance between the position of the ideal line
2540     (characterized by $y = y_f x / x_f$) and
2541     the actual pixel which will be illuminated. It is easy to
2542     observe that:
2543    
2544     \begin{itemize}
2545     \item When drawing a raster line, if one proceeds from
2546     $(x, y)$ to $(x+1, y)$ (i.e. makes the ``D'' choice),
2547     $d$ will increase by $y_f/x_f$.
2548     \item When drawing a raster line, if one proceeds from
2549     $(x,y)$ to $(x+1, y+1)$ (i.e. makes the ``U'' choice),
2550     $d$ will increase by $(y_f - x_f)/x_f$. (The increase
2551     of $y_f/x_f$ comes about because the ideal line proceeds
2552     upward from $x$ to $x+1$, while the decrease of $x_f/x_f = 1$
2553     comes about because the line being drawn jumps upward by one
2554     unit, thus tending to ``catch'' the ideal line.)
2555     \end{itemize}
2556    
2557     The code of Figure \ref{fig:crat0:sbla0:03} implements the
2558     two observations above in a straightforward way. $d$ is maintained
2559     in units of $1/x_f$, and when ``U'' is chosen over ``D'' whenever
2560     the gap between the ideal line and the current row of pixels
2561     being drawn becomes too large.
2562    
2563     The code in Figure \ref{fig:crat0:sbla0:03} does however contain logical
2564     and performance problems which should be corrected:
2565    
2566     \begin{itemize}
2567     \item The test of $d$ against $x_f/2$ will perform as intended.
2568     For example, if $d=2$ and $x_f=5$, the test
2569     ``\texttt{d >= x\_f/2}'' in the code will evaluate true
2570     although the actual condition is false. To correct this
2571     defect, the units of $d$ should be changed from
2572     $1/x_f$ to $1/(2 x_f)$.
2573     \item The quantity $y_f - x_f$ is calculated repeatedly. This
2574     calculation should be moved out of the \emph{while()} loop.
2575     \item The test against $x_f$ may be more economical if changed to
2576     a test against 0 (but this requires a different initialization
2577     assignment for $d$).
2578     \end{itemize}
2579    
2580     Figure \ref{fig:crat0:sbla0:04} corrects these defects
2581     from Figure \ref{fig:crat0:sbla0:03}.
2582     Figure \ref{fig:crat0:sbla0:04} is essentially the Bresenham
2583     line algorithm, except that it only draws starting from the
2584     origin and will only draw a line with a slope
2585     $m = y_f/x_f \leq 1$.
2586    
2587     \begin{figure}
2588     \begin{verbatim}
2589     /* Draws a line from (0,0) to (x_f,y_f) on a raster */
2590     /* device. */
2591    
2592     void bresenham_line(int x_f, int y_f)
2593     {
2594     int d = 2 * y_f - x_f;
2595     /* Position of the ideal line minus */
2596     /* the position of the line we are */
2597     /* drawing, in units of 1/(2 * x_f). */
2598     /* Initialization value of 2 * y_f is */
2599     /* because algorithm is looking one */
2600     /* pixel ahead. Value of -x_f is from */
2601     /* shifting the midpoint test (the */
2602     /* "if" statement below) downward to a */
2603     /* test against zero. */
2604     int dD = 2 * y_f;
2605     int dU = dD - x_f;
2606     /* Amounts to add to d if "D" and "U" */
2607     /* pixels are chosen, respectively. */
2608     /* Calculated here outside of loop. */
2609     int x=0, y=0;
2610     /* x- and y-coordinates currently being */
2611     /* evaluated. */
2612     plotpoint(0,0); /* Plot initial point. */
2613     while (x <= x_f)
2614     {
2615     x++; /* We move to the right regardless. */
2616     if (d >= 0)
2617     {
2618     /* The "U" choice. We must jump up a pixel */
2619     /* to keep up with the ideal line. */
2620     d += dU;
2621     y++; /* Jump up a pixel. */
2622     }
2623     else /* d < 0 */
2624     {
2625     /* The "D" choice. Distance is not large */
2626     /* enough to jump up a pixel. */
2627     d += dD;
2628     }
2629     plotpoint(x,y);
2630     }
2631     }
2632     \end{verbatim}
2633     \caption{Third Attempt At A Raster Device Line Algorithm
2634     Using Rational Counting Techniques}
2635     \label{fig:crat0:sbla0:04}
2636     \end{figure}
2637    
2638    
2639     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2640     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2641     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2642     \section{Authors And Acknowledgements}
2643     %Section tag: ACK0
2644     This chapter was primarily written by
2645     \index{Ashley, David T.} David T. Ashley
2646     \cite{bibref:i:daveashley}.
2647    
2648     We would like to gratefully acknowledge the assistance of
2649     \index{Falconer, Chuck B.} Chuck B. Falconer \cite{bibref:i:chuckbfalconer},
2650     \index{Hoffmann, Klaus} Klaus Hoffmann \cite{bibref:i:klaushoffmann},
2651     \index{Larkin, John} John Larkin \cite{bibref:i:johnlarkin},
2652     \index{Smith, Thad} Thad Smith \cite{bibref:i:thadsmith},
2653     and
2654     \index{Voipio, Tauno} Tauno Voipio \cite{bibref:i:taunovoipio}
2655     for insight into rational counting approaches, contributed via the
2656     \texttt{sci.math} \cite{bibref:n:scimathnewsgroup}
2657     and
2658     \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
2659     newsgroups.
2660    
2661    
2662     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2663     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2664     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2665     \section{Exercises}
2666     %Section tag: EXE0
2667    
2668     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2669     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2670     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2671     \subsection[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation]
2672     {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
2673    
2674     \begin{vworkexercisestatement}
2675     \label{exe:crat0:sexe0:a01}
2676     Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03})
2677     and (\ref{eq:crat0:shqq0:dph0:07}) if $r_A$ is chosen
2678     without rounding, i.e.
2679     $h=\lfloor r_I 2^q \rfloor$ and therefore
2680     $r_A=\lfloor r_I 2^q \rfloor/2^q$.
2681     \end{vworkexercisestatement}
2682     \vworkexercisefooter{}
2683    
2684     \begin{vworkexercisestatement}
2685     \label{exe:crat0:sexe0:a02}
2686     Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03})
2687     and (\ref{eq:crat0:shqq0:dph0:07}) if
2688     $z$ is chosen for rounding with the midpoint case rounded
2689     down, i.e. $z=2^{q-1}-1$, and applied as in
2690     (\ref{eq:crat0:sint0:01}).
2691     \end{vworkexercisestatement}
2692     \vworkexercisefooter{}
2693    
2694     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2695     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2696     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2697     \subsection{Rational Counting}
2698    
2699    
2700     \begin{vworkexercisestatement}
2701     \label{exe:crat0:sexe0:01}
2702     For Algorithm \ref{alg:crat0:sfdv0:01a},
2703     assume that one chooses $K_1 > K_3 - K_2$ (in contradiction to the
2704     restrictions in Figure \ref{fig:crat0:sfdv0:01a}).
2705     Derive a result similar to Lemma \ref{lem:crat0:sfdv0:sprc0:01}
2706     for the number of base subroutine invocations in which
2707     ``\texttt{A()}'' is run before it is
2708     \emph{not} run for the first time.
2709     \end{vworkexercisestatement}
2710     \vworkexercisefooter{}
2711    
2712     \begin{vworkexercisestatement}
2713     \label{exe:crat0:sexe0:02}
2714     This will be the $\epsilon$ lemma proof.
2715     \end{vworkexercisestatement}
2716     \vworkexercisefooter{}
2717    
2718     \begin{vworkexercisestatement}
2719     \label{exe:crat0:sexe0:03}
2720     Rederive appropriate results similar to
2721     Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
2722     $\gcd(K_2, K_4) > 1$.
2723     \end{vworkexercisestatement}
2724     \vworkexercisefooter{}
2725    
2726     \begin{vworkexercisestatement}
2727     \label{exe:crat0:sexe0:04}
2728     Rederive appropriate results similar to
2729     Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
2730     $K_1 \neq 0$.
2731     \end{vworkexercisestatement}
2732     \vworkexercisefooter{}
2733    
2734     \begin{vworkexercisestatement}
2735     \label{exe:crat0:sexe0:05}
2736     Rederive appropriate results similar to
2737     Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
2738     $\gcd(K_2, K_4) > 1$ and $K_1 \neq 0$.
2739     \end{vworkexercisestatement}
2740     \vworkexercisefooter{}
2741    
2742     \begin{vworkexercisestatement}
2743     \label{exe:crat0:sexe0:06}
2744     For Lemma \ref{lem:crat0:sfdv0:sprc0:04},
2745     complete the missing proof:
2746     show that if $\gcd(K_2, K_4) = 1$, no algorithm can
2747     lead to a tighter bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}).
2748     \textbf{Hint:} start with the observation
2749     that with
2750     $\gcd(K_2, K_4) = 1$, $n K_2 \bmod K_4$ will attain every value in
2751     the set $\{ 0, \ldots , K_4-1 \}$.
2752     \end{vworkexercisestatement}
2753     \vworkexercisefooter{}
2754    
2755     \begin{vworkexercisestatement}
2756     \label{exe:crat0:sexe0:07}
2757     For Lemma \ref{lem:crat0:sfdv0:sprc0:03},
2758     show that if $K_1=0$, the number of initial invocations
2759     of the base subroutine before ``\texttt{A()}'' is first
2760     called is in the set specified in
2761     (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
2762     \end{vworkexercisestatement}
2763     \vworkexercisefooter{}
2764    
2765     \begin{vworkexercisestatement}
2766     \label{exe:crat0:sexe0:08}
2767     Develop other techniques to correct the state upset vulnerability
2768     of Algorithm \ref{alg:crat0:sfdv0:01a} besides
2769     the technique illustrated in
2770     Figure \ref{fig:crat0:sfdv0:sprc0:01}.
2771     \end{vworkexercisestatement}
2772     \vworkexercisefooter{}
2773    
2774     \begin{vworkexercisestatement}
2775     \label{exe:crat0:sexe0:09}
2776     Show for Example \ref{ex:crat0:sfdv0:sprc1:01} that integers of at least
2777     27 bits are required.
2778     \end{vworkexercisestatement}
2779     \vworkexercisefooter{}
2780    
2781     \begin{vworkexercisestatement}
2782     \label{exe:crat0:sexe0:10}
2783     Prove Lemma \ref{lem:crat0:sfdv0:sprc1:02}.
2784     \end{vworkexercisestatement}
2785     \vworkexercisefooter{}
2786    
2787     \begin{vworkexercisestatement}
2788     \label{exe:crat0:sexe0:12}
2789     Prove Lemma \ref{lem:crat0:sfdv0:sprc1:04}.
2790     \end{vworkexercisestatement}
2791     \vworkexercisefooter{}
2792    
2793     \begin{vworkexercisestatement}
2794     \label{exe:crat0:sexe0:13}
2795     Define the term \emph{steady state} as used in
2796     Lemma \ref{lem:crat0:sfdv0:sprc1:04} in terms of
2797     set membership of the \texttt{state} variable.
2798     \end{vworkexercisestatement}
2799     \vworkexercisefooter{}
2800    
2801     \begin{vworkexercisestatement}
2802     \label{exe:crat0:sexe0:14}
2803     For Algorithm \ref{alg:crat0:sfdv0:01a}, devise examples of anomalous behavior due to
2804     race conditions that may occur if $K_2$ and/or $K_4$ are set in a process
2805     which is asynchronous with respect to the process which implements the
2806     rational counting algorithm if mutual exclusion protocol is not
2807     implemented.
2808     \end{vworkexercisestatement}
2809     \vworkexercisefooter{}
2810    
2811     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2812     \vfill
2813     \noindent\begin{figure}[!b]
2814     \noindent\rule[-0.25in]{\textwidth}{1pt}
2815     \begin{tiny}
2816     \begin{verbatim}
2817 dashley 278 $HeadURL$
2818     $Revision$
2819     $Date$
2820     $Author$
2821 dashley 140 \end{verbatim}
2822     \end{tiny}
2823     \noindent\rule[0.25in]{\textwidth}{1pt}
2824     \end{figure}
2825 dashley 278 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2826 dashley 140 %
2827     %End of file C_RAT0.TEX

Properties

Name Value
svn:eol-style native
svn:keywords Author Date Id Revision URL Header

dashley@gmail.com
ViewVC Help
Powered by ViewVC 1.1.25