/[dtapublic]/pubs/books/ucbka/trunk/c_rat0/c_rat0.tex
ViewVC logotype

Contents of /pubs/books/ucbka/trunk/c_rat0/c_rat0.tex

Parent Directory Parent Directory | Revision Log Revision Log


Revision 6 - (show annotations) (download) (as text)
Fri Oct 7 01:36:46 2016 UTC (8 years, 1 month ago) by dashley
File MIME type: application/x-tex
File size: 111163 byte(s)
Initial commit after migrating from CVS.
1 %$Header: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_rat0/c_rat0.tex,v 1.28 2004/02/22 19:27:48 dtashley Exp $
2
3 \chapter{Rational Linear Approximation}
4
5 \label{crat0}
6
7 \beginchapterquote{``Die ganzen Zahlen hat der liebe Gott gemacht,
8 alles andere ist Menschenwerk.''\footnote{German
9 language: God made the integers; everything
10 else was made by man.}}
11 {Leopold Kronecker}
12
13 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
14 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
15 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
16 \section{Introduction}
17 %Section tag: INT0
18 \label{crat0:sint0}
19
20 In this chapter, we consider practical applications of
21 rational approximation.
22 Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}
23 have presented algorithms for finding
24 the closest rational numbers to an arbitrary real number,
25 subject to constraints on the numerator and denominator.
26 The basis of these algorithms is complex and comes from number theory, and so
27 these algorithms and their basis have been presented in separate chapters.
28
29 In Section \ref{crat0:srla0}, rational linear approximation itself
30 and associated error bounds are presented. By \emph{rational linear
31 approximation} we mean simply the approximation of a line
32 $y = r_I x$ ($y, r_I, x \in \vworkrealset$) by a line
33
34 \begin{equation}
35 \label{eq:crat0:sint0:01}
36 y = \left\lfloor
37 \frac{h \lfloor x \rfloor + z}{k}
38 \right\rfloor ,
39 \end{equation}
40
41 \noindent{}where we choose $h/k \approx r_I$ and optionally choose $z$ to
42 shift the error introduced. Note that (\ref{eq:crat0:sint0:01}) is
43 very economical for microcontroller instruction sets, since only integer
44 arithmetic is required. We may choose $h/k$ from a Farey series (see
45 Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}), or
46 we may choose a ratio $h/2^q$ so that the division in (\ref{eq:crat0:sint0:01})
47 can be implemented
48 by a bitwise right shift.
49
50 Section \ref{crat0:srla0} discusses linear rational approximation
51 in general, with a special eye on error analysis.
52
53 Section \ref{crat0:spwi0} discusses piecewise linear rational approximation,
54 which is the approximation of a curve or complex mapping by a
55 number of joined line segments.
56
57 Section \ref{crat0:sfdv0} discusses frequency division and rational counting.
58 Such techniques share the same mathematical framework as rational linear
59 approximation, and as with rational linear approximation the ratio
60 involved may be chosen from a Farey series or with a denominator of $2^q$, depending
61 on the algorithm employed.
62
63 Section \ref{crat0:sbla0} discusses Bresenham's classic line algorithm,
64 which is a practical application of rational linear approximation.
65
66 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
67 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
68 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
69 \section{Rational Linear Approximation}
70 %Section tag: RLA0
71 \label{crat0:srla0}
72
73 It occurs frequently in embedded software design that one wishes to
74 implement a linear scaling from a domain to a range of the form
75
76 \begin{equation}
77 \label{eq:crat0:srla0:01}
78 f(x) = r_I x ,
79 \end{equation}
80
81 \noindent{}where $r_I$ is the \emph{ideal}
82
83
84 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
85 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
86 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
87 \subsection{Model Functions}
88 %Section tag: mfu0
89 \label{crat0:srla0:smfu0}
90
91 In general, we seek to approximate the ideal function
92
93
94 \noindent{}by some less ideal function where
95
96 \begin{itemize}
97 \item $r_A \neq r_I$, although we seek to choose $r_A \approx r_I$.
98 \item The input to the function, $x$, may already contain
99 quantization error.
100 \item Although $r_I x \in \vworkrealsetnonneg$, we must choose
101 an integer as the function output.
102 \end{itemize}
103
104 In modeling quantization error, we use the floor function\index{floor function}
105 ($\lfloor\cdot\rfloor$)
106 for algebraic simplicity. The floor function precisely
107 describes the behavior of integer division instructions (where
108 remainders are discarded), but may not describe other sources of
109 quantization, such as quantization that occurs in A/D conversion.
110 However, techniques identical to those presented in this
111 section may be used when quantization is not best described
112 by the floor function, and these results are left to the reader.
113
114 Traditionally, because addition of integers is an inexpensive
115 machine operation, a parameter $z \in \vworkintset$ may optionally
116 be added to the product $hx$ in order to round or otherwise
117 shift the result.
118
119 If $x$ is assumed to be without error, the ideal function is
120 given by (\ref{eq:crat0:srla0:smfu0:01}), whereas the function
121 that can be economically implemented is
122
123 \begin{equation}
124 \label{eq:crat0:srla0:smfu0:02}
125 g(x) = \left\lfloor \frac{hx + z}{k} \right\rfloor
126 =
127 \left\lfloor r_A x + \frac{z}{k} \right\rfloor .
128 \end{equation}
129
130 If, on the other hand, $x$ may be already quantized,
131 the function that can actually be implemented is
132
133 \begin{equation}
134 \label{eq:crat0:srla0:smfu0:03}
135 h(x) = \left\lfloor \frac{h \lfloor x \rfloor + z}{k} \right\rfloor
136 =
137 \left\lfloor r_A \lfloor x \rfloor + \frac{z}{k} \right\rfloor .
138 \end{equation}
139
140
141
142 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
143 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
144 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
145 \section[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation]
146 {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
147 %Section tag: HQQ0
148 \label{crat0:shqq0}
149
150 \index{h/2q@$h/2^q$ rational linear approximation}
151 \index{rational linear approximation!h/2q@$h/2^q$}
152 The algorithms presented in
153 Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}
154 will always provide the rational number $h/k$ closest to
155 an arbitrary real number $r_I$ subject to the constraints
156 $h \leq h_{MAX}$ and $k \leq k_{MAX}$.
157
158 However, because shifting in order
159 to implement multiplication or division by a power of 2
160 is at least as fast (and often \emph{much} faster)
161 on all processors as arbitrary multiplication or division,
162 and because not all processors have multiplication and division instructions,
163 it is worthwhile to examine choosing $h/k$ so that either $h$ or $k$ are
164 powers of 2.
165
166 There are thus three rational linear approximation techniques to be
167 examined:
168
169 \begin{enumerate}
170 \item \emph{$h/k$ rational linear approximation}, in which an arbitrary
171 $h \leq h_{MAX}$ and an arbitrary $k \leq k_{MAX}$ are used,
172 with $r_A = h/k$. $h$ and $k$ can be chosen using the algorithms
173 presented in Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}.
174 Implementation of this technique would most often involve a single integer
175 multiplication instruction to form the product $hx$, followed by an optional single
176 addition instruction to form the sum $hx+z$, and then
177 followed by by a single division instruction
178 to form the quotient $\lfloor (hx+z)/k \rfloor$. Implementation may also less commonly involve
179 multiplication, addition, and division of operands too large to be processed
180 with single machine instructions.
181 \item \emph{$h/2^q$ rational linear approximation}, in which an arbitrary
182 $h \leq h_{MAX}$ and an integral power of two $k=2^q$ are used, with
183 $r_A = h/2^q$.
184 Implementation of this technique would most often involve a single integer
185 multiplication instruction to form the product $hx$, followed by an optional single
186 addition instruction to form the sum $hx+z$, and then
187 followed by right shift instruction(s)
188 to form the quotient $\lfloor (hx+z)/2^q \rfloor$. Implementation may also less commonly involve
189 multiplication, addition, and right shift of operands too large to be processed
190 with single machine instructions.
191 \item \emph{$2^q/k$ rational linear approximation}, in which an integral
192 power of two $h=2^q$ and an arbitrary $k \leq k_{MAX}$ are used, with
193 $r_A = 2^q/k$.
194 Implementation of this technique would most often involve left shift
195 instruction(s) to form the product $2^qx$, followed by an optional single
196 addition instruction to form the sum $2^qx+z$, and then
197 followed by a single division instruction to form
198 the quotient $\lfloor (2^qx+z)/k \rfloor$. Implementation may also less
199 commonly involve
200 left shift, addition, and division of operands too large to be processed
201 with single machine instructions.
202 \end{enumerate}
203
204 We use the nomenclature ``\emph{$h/k$ rational linear approximation}'',
205 ``\emph{$h/2^q$ rational linear approximation}'', and
206 ``\emph{$2^q/k$ rational linear approximation}'' to identify the three
207 techniques enumerated above.
208
209
210 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
211 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
212 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
213 \subsection{Integer Arithmetic and Processor Instruction Set Characteristics}
214 %Subsection tag: pis0
215 \label{crat0:shqq0:pis0}
216
217 The following observations about integer arithmetic and about processors
218 used in embedded control can be made:
219
220 \begin{enumerate}
221 \item \label{enum:crat0:shqq0:pis0:01:01a}
222 \emph{Shifting is the fastest method of integer multiplication or division
223 (by $2^q$ only),
224 followed by utilization of the processor multiplication or division instructions (for arbitrary
225 operands),
226 followed by software implementation of multiplication or division (for arbitrary operands).}
227 Relative costs vary depending on the processor, but the monotonic
228 ordering always holds.
229 $h/2^q$ and $2^q/k$ rational linear
230 approximation are thus worthy of investigation. (Note also that in many practical
231 applications of $h/2^q$ and $2^q/k$ rational linear approximation,
232 the required shift is performed by
233 addressing the operand with an offset,
234 and so has no cost.)
235 \item \label{enum:crat0:shqq0:pis0:01:01b}
236 \emph{Shifting is $O(N)$ (where $N$ is the number of bits in the argument),
237 but both
238 multiplication and division are $O(N^2)$ for
239 practical\footnote{\index{Karatsuba multiplication}Karatsuba
240 multiplication, for example, is
241 $O(N^{\log_2 3}) \approx O(N^{1.58}) \ll O(N^2)$. However, Karatsuba
242 multiplication cannot be applied economically to the small
243 operands that typically occur in embedded control work. It would
244 be rare in embedded control applications
245 for the length of a multiplication operand to exceed four
246 times the length that is accommodated by a machine instruction; and this
247 is far below the threshold at which Karatsuba multiplication is
248 economical. Thus, for all intents and purposes in embedded control work,
249 multiplication is $O(N^2)$.} operands (where
250 $N$ is the number of bits in each
251 operand).} It follows that $2^q/k$ and $h/2^q$ rational
252 linear approximation
253 will scale to large operands better than $h/k$ rational linear approximation.
254 \item \label{enum:crat0:shqq0:pis0:01:02a}
255 \emph{Integer division instructions take as long or longer than
256 integer multiplication instructions.} In designing digital logic
257 to implement basic integer arithmetic, division is the operation most difficult
258 to perform economically.\footnote{For some processors, the penalty is extreme.
259 For example, on the NEC V850 (a RISC processor),
260 a division requires 36 clock cycles,
261 whereas multiplication, addition, and subtraction each effectively
262 require 1 clock cycle.}
263 It follows that multiplication using operands that exceed the machine's word size
264 is often far less expensive than division using operands that exceed the
265 machine's word size.
266 \item \label{enum:crat0:shqq0:pis0:01:03a}
267 \emph{All processors that have an integer division instruction also
268 have an integer multiplication instruction.}
269 Phrased
270 differently, no processor has an integer division instruction but no
271 integer multiplication instruction.
272 \end{enumerate}
273
274 Enumerated items
275 (\ref{enum:crat0:shqq0:pis0:01:01a}) through
276 (\ref{enum:crat0:shqq0:pis0:01:03a}) above lead to the following conclusions.
277
278 \begin{enumerate}
279 \item $h/2^q$ rational linear approximation is likely to be implementable
280 more efficiently on most processors than $h/k$ rational linear approximation.
281 (\emph{Rationale:} shift instruction(s) or accessing a
282 memory address with an offset
283 is
284 likely to be more economical than division, particularly if $k$ would exceed
285 the native
286 operand size of the processor.)
287 \item $h/2^q$ rational linear approximation is likely to be a more useful
288 technique than $2^q/k$ rational linear approximation.
289 (\emph{Rationale:} the generally high cost of division compared to
290 multiplication, and the existence of processors that possess a multiplication
291 instruction but no division instruction.)
292 \end{enumerate}
293
294
295 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
296 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
297 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
298 \subsection[Design Procedure For \protect\mbox{\protect$h/2^q$} Rational Linear Approximations]
299 {Design Procedure For \protect\mbox{\protect\boldmath$h/2^q$} Rational Linear Approximation}
300 %Subsection tag: dph0
301 \label{crat0:shqq0:dph0}
302
303 An $h/2^q$ rational linear approximation is parameterized by:
304
305 \begin{itemize}
306 \item The unsigned or signed nature of $h$ and $x$. (Rational linear approximations
307 may involve either signed or unsigned domains and ranges. Furthermore,
308 signed integers may be maintained using either 2's-complement
309 or sign-magnitude representation, and the processor instruction set
310 may or may not directly support signed multiplication.)
311 \item $r_I$, the real number we wish to approximate by $r_A = h/2^q$.
312 \item $x_{MAX}$, the maximum possible value of the input argument $x$. (Typically,
313 software contains a test to clip the output if $x > x_{MAX}$.)
314 \item $w_h$, the width in bits allowed for $h$. (Typically, $w_h$ is
315 the maximum operand size of a machine multiplication instruction.)
316 \item $w_r$, the width in bits allowed for the result $hx$. (Typically,
317 $w_r$ is the maximum result size of a machine multiplication instruction.)
318 \item The rounding mode when choosing $h$ (and thus effectively $r_A$)
319 based on $r_I$. It is common to choose the
320 closest value,
321 $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$
322 or
323 $r_A=\lceil r_I 2^q - 1/2 \rceil/2^q$,
324 but other choices are possible.
325 \item The rounding mode for the result (i.e. the choice of $z$ in
326 Eq. \ref{eq:crat0:sint0:01}).
327 \end{itemize}
328
329 This section develops a design procedure for $h/2^q$ rational linear
330 approximations with the most typical set of assumptions: unsigned arithmetic,
331 $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$,
332 and $z=0$. Design procedures for other scenarios are presented as exercises.
333
334 By definition, $h$ is constrained in two ways:
335
336 \begin{equation}
337 \label{eq:crat0:shqq0:dph0:00}
338 h \leq 2^{w_h} - 1
339 \end{equation}
340
341 \noindent{}and
342
343 \begin{equation}
344 \label{eq:crat0:shqq0:dph0:01}
345 h \leq \frac{2^{w_r} - 1}{x_{MAX}} .
346 \end{equation}
347
348 \noindent{}(\ref{eq:crat0:shqq0:dph0:00}) comes directly from the
349 requirement that $h$ fit in $w_h$ bits.
350 (\ref{eq:crat0:shqq0:dph0:01}) comes directly from the requirement
351 that $hx$ fit in $w_r$ bits.
352 (\ref{eq:crat0:shqq0:dph0:00}) and (\ref{eq:crat0:shqq0:dph0:01})
353 may be combined to form one inequality:
354
355 \begin{equation}
356 \label{eq:crat0:shqq0:dph0:02}
357 h \leq \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) .
358 \end{equation}
359
360 If $q$ is known, the choice of $h$ that will be made so as to minimize
361 $|r_A-r_I| = |h/2^q - r_I|$ is
362
363 \begin{equation}
364 \label{eq:crat0:shqq0:dph0:03}
365 h=\left\lfloor r_I 2^q + \frac{1}{2} \right\rfloor .
366 \end{equation}
367
368 \noindent{}It is required that the choice of $h$ specified by
369 (\ref{eq:crat0:shqq0:dph0:03}) meet
370 (\ref{eq:crat0:shqq0:dph0:02}). Making the most pessimistic
371 assumption about the rounding of $h$ and substituting into
372 (\ref{eq:crat0:shqq0:dph0:02}) leads to
373
374 \begin{equation}
375 \label{eq:crat0:shqq0:dph0:04}
376 r_I 2^q + \frac{1}{2}
377 \leq
378 \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) .
379 \end{equation}
380
381 \noindent{}Isolating $q$ in (\ref{eq:crat0:shqq0:dph0:04})
382 yields
383
384 \begin{equation}
385 \label{eq:crat0:shqq0:dph0:05}
386 2^q
387 \leq
388 \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}
389 {r_I}.
390 \end{equation}
391
392 \noindent{}Solving
393 (\ref{eq:crat0:shqq0:dph0:05})
394 for maximum value of $q$ that meets the constraint yields
395
396 \begin{equation}
397 \label{eq:crat0:shqq0:dph0:06}
398 q=
399 \left\lfloor
400 {
401 \log_2
402 \left(
403 {
404 \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I}
405 }
406 \right)
407 }
408 \right\rfloor .
409 \end{equation}
410
411 \noindent{}(\ref{eq:crat0:shqq0:dph0:06})
412 can be rewritten for easier calculation using most calculators (which do
413 not allow the direct evaluation of base-2 logarithms):
414
415 \begin{equation}
416 \label{eq:crat0:shqq0:dph0:07}
417 q=
418 \left\lfloor
419 \frac
420 {
421 {
422 \ln
423 \left(
424 {
425 \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I}
426 }
427 \right)
428 }
429 }
430 {\ln 2}
431 \right\rfloor .
432 \end{equation}
433
434 \noindent{}Once $q$ is established using (\ref{eq:crat0:shqq0:dph0:07}),
435 $h$ can be calculated using (\ref{eq:crat0:shqq0:dph0:03}).
436
437 In embedded control work (as well as in operating system internals),
438 $h/2^q$ rational linear approximations are often used in conjunction with
439 tabulated constants or calibratable parameters
440 where each constant or calibratable parameter may vary over a range of
441 $[0, r_I]$, and where $r_I$ is the value used in the design procedure
442 presented above. In these applications, the values of $h$ are
443 tabulated, but $q$ is invariant (usually hard-coded)
444 and is chosen at design time based on the upper bound $r_I$
445 of the interval $[0, r_I]$ in which each tabulated constant or calibratable
446 parameter will fall. With $q$ fixed,
447 $r_A$ can be adjusted in steps of $1/2^q$.
448
449 If $r_I$ is invariant, a final design step may be to reduce the rational
450 number $h/2^q$ by dividing some or all occurrences of 2 as a factor from both the
451 numerator and denominator. With some processors and in some applications, this
452 may save execution time by reducing the number of shift instructions that
453 must be executed, reducing the execution time of the shift instructions
454 that are executed, or allowing shifting via offset addressing.
455 For example, on a byte-addressible machine, if the design procedure
456 yields $h=608$ and $q=10$, it may be desirable to divide both $h$ and $2^q$ by 4 to
457 yield $h=152$ and $q=8$, as this allows the shift by 8 to be done by fetching
458 alternate bytes (rather than by actual shifting). In other applications, it may
459 be desirable to remove \emph{all} occurrences of 2 as a prime factor
460 from $h$.
461
462 For an invariant $r_I$, a suitable design procedure is:
463
464 \begin{enumerate}
465 \item Choose $q$ using (\ref{eq:crat0:shqq0:dph0:07}).
466 \item With $q$ fixed, choose $h$ using (\ref{eq:crat0:shqq0:dph0:03}).
467 \item If economies can be achieved on the target processor,
468 examine the possibility of removing some or all occurrences
469 of 2 as a prime factor from $h$ and decreasing $q$.
470 \end{enumerate}
471
472 For tabulated or calibratable constants in the
473 interval $[0,r_I]$, a suitable design procedure is to use the
474 procedure presented immediately above but without the third step.
475 Each tabulated value of $h$ is chosen using (\ref{eq:crat0:shqq0:dph0:03}).
476
477 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
478 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
479 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
480 \subsection[Design Procedure For \protect\mbox{\protect$2^q/k$} Rational Linear Approximations]
481 {Design Procedure For \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
482 %Subsection tag: dpk0
483 \label{crat0:shqq0:dpk0}
484
485 TBD.
486
487 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
488 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
489 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
490 \section{Piecewise Rational Linear Approximation}
491 %Section tag: PWI0
492 \label{crat0:spwi0}
493
494 TBD.
495
496 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
497 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
498 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
499 \section[Frequency Division And Rational Counting]
500 {Frequency Division And Rational Counting Techniques}
501 %Section tag: FDV0
502 \label{crat0:sfdv0}
503
504 \index{frequency division}\index{rational counting}\index{counting}%
505 Often, software must ``divide down'' an execution rate. For example,
506 an interrupt service routine may be scheduled by hardware every
507 10ms, but may perform useful processing only every 50ms. This requires
508 that the ISR maintain a counter and only perform useful processing
509 every fifth invocation. This section deals with counting strategies
510 used to achieve invocation frequency division and other similar results.
511
512 Frequency division and
513 rational counting techniques presented in this section find application
514 primarily in the following scenarios:
515
516 \begin{itemize}
517 \item ISRs and other software components which must divide down
518 their invocation rate.
519 \item Pulse counting and scaling from encoders and other
520 similar systems.
521 \item The correction of inaccuracies in timebases (such as crystals
522 which oscillate at a frequency different than the
523 nominal rate).
524 \end{itemize}
525
526 Because the techniques presented must be usable with inexpensive
527 microcontrollers, such techniques must meet these constraints:
528
529 \begin{enumerate}
530 \item \label{enum:01:crat0:sfdv0:econex}
531 The counting techniques must be economical to execute on
532 an inexpensive microcontroller.
533 \item \label{enum:01:crat0:sfdv0:econcccalc}
534 An inexpensive microcontroller must be capable of calculating any
535 constants used as limits in counting (i.e. it cannot necessarily
536 be assumed that a more powerful computer calculates these constants,
537 and it cannot be assumed that these limits do not change on the fly).
538 \end{enumerate}
539
540 In this section, we analyze the behavior of several types of
541 rational counting algorithms, supplied as Algorithms
542 \ref{alg:crat0:sfdv0:01a}
543 through
544 \ref{alg:crat0:sfdv0:02a}.
545
546 \begin{algorithm}
547 \begin{verbatim}
548 /* The constants K1 through K4, which parameterize the */
549 /* counting behavior, are assumed assigned elsewhere in */
550 /* the code. The solution is analyzed in terms of the */
551 /* parameters K1 through K4. */
552 /* */
553 /* We also place the following restrictions on K1 through */
554 /* K4: */
555 /* K1 : K1 <= K3 - K2. */
556 /* K2 : K4 > K2 > 0. */
557 /* K3 : No restrictions. */
558 /* K4 : K4 > K2 > 0. */
559
560 void base_rate_sub(void)
561 {
562 static int state = K1;
563
564 state += K2;
565
566 if (state >= K3)
567 {
568 state -= K4;
569 A();
570 }
571 }
572 \end{verbatim}
573 \caption{Rational Counting Algorithm For $K_2/K_4 < 1$}
574 \label{alg:crat0:sfdv0:01a}
575 \end{algorithm}
576
577 \begin{algorithm}
578 \begin{verbatim}
579 /* The constants K1 through K4, which parameterize the */
580 /* counting behavior, are assumed assigned elsewhere in */
581 /* the code. The solution is analyzed in terms of the */
582 /* parameters K1 through K4. */
583 /* */
584 /* We also place the following restrictions on K1 through */
585 /* K4: */
586 /* K1 : K1 <= K3 - K2. */
587 /* K2 : K2 > 0. */
588 /* K3 : No restrictions. */
589 /* K4 : K4 > 0. */
590
591 void base_rate_sub(void)
592 {
593 static int state = K1;
594
595 state += K2;
596
597 while (state >= K3)
598 {
599 state -= K4;
600 A();
601 }
602 }
603 \end{verbatim}
604 \caption{Rational Counting Algorithm For $K_2/K_4 \geq 1$}
605 \label{alg:crat0:sfdv0:01b}
606 \end{algorithm}
607
608 \begin{algorithm}
609 \begin{verbatim}
610 /* The constants K1, K2, and K4, which parameterize the */
611 /* counting behavior, are assumed assigned elsewhere in */
612 /* the code. The solution is analyzed in terms of the */
613 /* parameters K1 through K4. */
614 /* */
615 /* We also place the following restrictions on K1, K2, */
616 /* and K4: */
617 /* K1 : K1 >= 0. */
618 /* K2 : K4 > K2 > 0. */
619 /* K4 : K4 > K2 > 0. */
620 /* */
621 /* Special thanks to Chuck B. Falconer (of the */
622 /* comp.arch.embedded newsgroup) for this rational */
623 /* counting algorithm. */
624 /* */
625 /* Note below that the test against K3 does not exist, */
626 /* instead a test against zero is used, which many */
627 /* machine instruction sets will do as part of the */
628 /* subtraction (but perhaps this needs to be coded in */
629 /* A/L). This saves machine code and also eliminates */
630 /* one unnecessary degree of freedom (K3). */
631
632 void base_rate_sub(void)
633 {
634 static int state = K1;
635
636 if ((state -= K2) < 0)
637 {
638 state += K4;
639 A();
640 }
641 }
642 \end{verbatim}
643 \caption{Zero-Test Rational Counting Algorithm For $K_2/K_4 < 1$}
644 \label{alg:crat0:sfdv0:01c}
645 \end{algorithm}
646
647 \begin{algorithm}
648 \begin{verbatim}
649 ;Special thanks to John Larkin (of the comp.arch.embedded
650 ;newsgroup) for this rational counting algorithm.
651 ;
652 ;This is the TMS-370C8 assembly-language version of the
653 ;algorithm. The algorithm is parameterized solely by
654 ;K1 and K2, with no restrictions on their values, because
655 ;the values are naturally constrained by the data types.
656 ;K1, which is the initial value of "state", is assumed
657 ;assigned elsewhere. The snippet shown here uses only
658 ;K2.
659 MOV state, A ;Get "state".
660 ADD #K2, A ;Increase by K2. Carry flag
661 ;will be set if rollover to or
662 ;past zero.
663 PUSH ST ;Save carry flag.
664 MOV A, state ;Move new value back.
665 POP ST ;Restore carry flag.
666 JNC done ;If didn't roll, don't run sub.
667 CALL A_SUBROUTINE ;Run sub.
668 done:
669
670 /* This is the 'C' version of the algorithm. It is not */
671 /* as easy or efficient in 'C' to detect rollover. */
672
673 void base_rate_sub(void)
674 {
675 static unsigned int state = K1;
676 unsigned int old_state;
677
678 old_state = state;
679 state += K2;
680 if (state < old_state)
681 {
682 A();
683 }
684 }
685 \end{verbatim}
686 \caption{$2^q$ Rollover Rational Counting Algorithm}
687 \label{alg:crat0:sfdv0:01d}
688 \end{algorithm}
689
690 \begin{algorithm}
691 \begin{verbatim}
692 /* The constants K1 through K4, which parameterize the */
693 /* counting behavior, are assumed assigned elsewhere in */
694 /* the code. The solution is analyzed in terms of the */
695 /* parameters K1 through K4. */
696 /* */
697 /* We also place the following restrictions on K1 through */
698 /* K4: */
699 /* K1 : K1 <= K3. */
700 /* K2 : K2 > 0. */
701 /* K3 : No restrictions. */
702 /* K4 : K4 > 0. */
703
704 void base_rate_sub(void)
705 {
706 static unsigned int state = K1;
707
708 if (state >= K3)
709 {
710 state -= K4;
711 A();
712 }
713 else
714 {
715 state += K2;
716 B();
717 }
718 }
719 \end{verbatim}
720 \caption{Rational Counting Algorithm With \texttt{else} Clause}
721 \label{alg:crat0:sfdv0:02a}
722 \end{algorithm}
723
724
725 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
726 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
727 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
728 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01a}]
729 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01a}}
730 %Section tag: PRC0
731 \label{crat0:sfdv0:sprc0}
732
733 Algorithm \ref{alg:crat0:sfdv0:01a}
734 is used frequently in microcontroller
735 software. A base rate subroutine\footnote{For brevity, we usually
736 call this just the \emph{base subroutine}.} (named ``\texttt{base\_rate\_sub()}''
737 in the algorithm) is called at a periodic rate, and subroutine
738 ``\texttt{A()}'' is called at a lesser rate.
739 We are interested in determining the relationships between the rates
740 as a function of $K_1$, $K_2$, $K_3$, and $K_4$; and we are interested
741 in developing other properties.
742
743 Notationally when analyzing rational counting algorithms, we agree
744 that $state_n$ denotes the value of the \texttt{state} variable
745 after the $n$th invocation and before the $n+1$'th invocation
746 of the base rate subroutine.
747 Using this convention with Algorithm \ref{alg:crat0:sfdv0:01a},
748 $state_0 = K_1$.\footnote{Algorithm \ref{alg:crat0:sfdv0:01a}
749 requires a knowledge of
750 `C' to fully understand. The \texttt{static} keyword ensures that the
751 variable \texttt{state} is initialized only once, at the time the program
752 is loaded. \texttt{state} is \emph{not} initialized each time the
753 base subroutine runs.}
754
755 We can first easily derive the number of initial invocations of
756 the base subroutine before ``\texttt{A()}'' is called for the first
757 time.
758
759 \begin{vworklemmastatement}
760 \label{lem:crat0:sfdv0:sprc0:01}
761 $N_{STARTUP}$, the number of invocations of the base subroutine
762 in Algorithm \ref{alg:crat0:sfdv0:01a} before ``\texttt{A()}'' is called
763 for the first time, is given by
764
765 \begin{equation}
766 \label{eq:lem:crat0:sfdv0:sprc0:01:01}
767 N_{STARTUP} =
768 \left\lceil
769 {
770 \frac{-K_1 - K_2 + K_3}{K_2}
771 }
772 \right\rceil .
773 \end{equation}
774 \end{vworklemmastatement}
775 \begin{vworklemmaproof}
776 The value of \texttt{state} after the $n$th invocation
777 is $state_n = K_1 + n K_2$. In order for the test in the
778 \texttt{if()} statement not to be met, we require that
779
780 \begin{equation}
781 \label{eq:lem:crat0:sfdv0:sprc0:01:02}
782 K_1 + n K_2 < K_3
783 \end{equation}
784
785 \noindent{}or equivalently that
786
787 \begin{equation}
788 \label{eq:lem:crat0:sfdv0:sprc0:01:03}
789 n < \frac{K_3 - K_1}{K_2} .
790 \end{equation}
791
792 Solving (\ref{eq:lem:crat0:sfdv0:sprc0:01:03}) for the largest
793 value of $n \in \vworkintset$ which still meets the criterion
794 yields (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}). Note that
795 the derivation of (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}) requires
796 that the restrictions on $K_1$ through $K_4$ documented in
797 Algorithm \ref{alg:crat0:sfdv0:01a} be met.
798 \end{vworklemmaproof}
799 \begin{vworklemmaparsection}{Remarks}
800 Note that if one chooses $K_1 > K_3 - K_2$ (in contradiction to the
801 restrictions in Algorithm \ref{alg:crat0:sfdv0:01a}), it is possible
802 to devise a counting scheme (and results analogous to this lemma) where
803 ``\texttt{A()}'' is run a number of times before it is
804 \emph{not} run for the first time. The construction of an analogous
805 lemma is the topic of Exercise \ref{exe:crat0:sexe0:01}.
806 \end{vworklemmaparsection}
807
808 \begin{vworklemmastatement}
809 \label{lem:crat0:sfdv0:sprc0:02}
810 Let $N_I$ be the number of times the Algorithm
811 \ref{alg:crat0:sfdv0:01a} base subroutine
812 is called, let $N_O$ be the number of times the
813 ``\texttt{A()}'' subroutine is called, let
814 $f_I$ be the frequency of invocation of the
815 Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let
816 $f_O$ be the frequency of invocation of
817 ``\texttt{A()}''. Provided the constraints
818 on $K_1$ through $K_4$ documented in
819 Algorithm \ref{alg:crat0:sfdv0:01a} are met,
820
821 \begin{equation}
822 \label{eq:lem:crat0:sfdv0:sprc0:02:01}
823 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
824 =
825 \frac{f_O}{f_I}
826 =
827 \frac{K_2}{K_4} .
828 \end{equation}
829 \end{vworklemmastatement}
830 \begin{vworklemmaproof}
831 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) indicates that once
832 the initial delay (determined by $K_1$ and $K_3$) has finished,
833 $N_O/N_I$ will converge on a steady-state value of
834 $K_2/K_4$.
835
836 Assume that $K_1=0$ and $K_3=K_4$. The
837 conditional subtraction then calculates
838 $state \bmod K_4$. After the $n$th
839 invocation of the base subroutine, the value
840 of \texttt{state} will be
841
842 \begin{equation}
843 \label{eq:lem:crat0:sfdv0:sprc0:02:02}
844 state_n|_{K_1=0, K_3=K_4} = n K_2 \bmod K_4 .
845 \end{equation}
846
847 Assume that for two distinct values of
848 $n \in \vworkintsetnonneg$, $n_1$ and $n_2$,
849 the value of the \texttt{state} variable is the same:
850
851 \begin{equation}
852 \label{eq:lem:crat0:sfdv0:sprc0:02:03}
853 n_1 K_2 \bmod K_4 = n_2 K_2 \bmod K_4.
854 \end{equation}
855
856 Then
857
858 \begin{equation}
859 \label{eq:lem:crat0:sfdv0:sprc0:02:04}
860 (n_2 - n_1) K_2 = i K_4, \; \exists i \in \vworkintsetpos .
861 \end{equation}
862
863 However, we have no knowledge of whether $K_2$ and $K_4$ are
864 coprime (they are not required to be). We may rewrite
865 (\ref{eq:lem:crat0:sfdv0:sprc0:02:04}) equivalently as
866
867 \begin{equation}
868 \label{eq:lem:crat0:sfdv0:sprc0:02:05}
869 (n_2 - n_1) \frac{K_2}{\gcd(K_2, K_4)} = i \frac{K_4}{\gcd(K_2, K_4)},
870 \; \exists i \in \vworkintsetpos
871 \end{equation}
872
873 where of course by definition
874
875 \begin{equation}
876 \label{eq:lem:crat0:sfdv0:sprc0:02:06}
877 \gcd \left( { \frac{K_2}{\gcd(K_2, K_4)}, \frac{K_4}{\gcd(K_2, K_4)} } \right) = 1.
878 \end{equation}
879
880 In order to satisfy (\ref{eq:lem:crat0:sfdv0:sprc0:02:05}),
881 $n_2 - n_1$ must contain all of the prime factors of
882 $K_4/\gcd(K_2,K_4)$ in at least the same multiplicities,
883 and it follows that the set of values
884 of $n_2-n_1$ that satisfies
885 (\ref{eq:lem:crat0:sfdv0:sprc0:02:03}) is
886 precisely the set of multiples of $K_4/\gcd(K_2,K_4)$:
887
888 \begin{equation}
889 \label{eq:lem:crat0:sfdv0:sprc0:02:07}
890 n_2 - n_1 = j \frac{K_4}{\gcd(K_2, K_4)}, \; \exists j \in \vworkintsetpos .
891 \end{equation}
892
893 Examining (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}), it can
894 also be seen that
895
896 \begin{equation}
897 \label{eq:lem:crat0:sfdv0:sprc0:02:08}
898 \gcd(K_2, K_4) \vworkdivides (n K_2 \bmod K_4),
899 \end{equation}
900
901 and so
902
903 \begin{eqnarray}
904 \label{eq:lem:crat0:sfdv0:sprc0:02:09}
905 & n K_2 \bmod K_4 \in & \\
906 \nonumber
907 & \{ 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , K_4 - \gcd(K_2, K_4) \} , &
908 \end{eqnarray}
909
910 a set which contains exactly $K_4/\gcd(K_2, K_4)$ elements.
911
912 Thus we've established by the pigeonhole principle
913 that the sequence of the
914 values of the variable \texttt{state}
915 specified by (\ref{eq:lem:crat0:sfdv0:sprc0:02:02})
916 repeats perfectly with periodicity $K_4/\gcd(K_2, K_4)$,
917 and we've established that in one period, every element of the set
918 specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) appears exactly
919 once. (However, we have not specified the order in which the
920 elements appear, but this is not important for this lemma. In general
921 the elements appear out of the order shown in
922 Eq. \ref{eq:lem:crat0:sfdv0:sprc0:02:09}.)
923
924 To establish the frequency with which the test against
925 $K_4$ is met, note that if $state_n + K_2 \geq K_4$, then
926
927 \begin{eqnarray}
928 \label{eq:lem:crat0:sfdv0:sprc0:02:10}
929 & \displaystyle{state_n \in \left\{ \frac{K_4-K_2}{\gcd(K_2,K_4)} \gcd(K_2, K_4), \right.} & \\
930 \nonumber & \displaystyle{\left. \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4), \ldots ,
931 K_4 - \gcd(K_2, K_4)\right\} ,} &
932 \end{eqnarray}
933
934 which has a cardinality $K_2/K_4$ that of the set in
935 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Since the
936 \texttt{state} variable cycles through the set in
937 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) with perfect periodicity and since
938 $K_2/K_4$ of the set elements lead to the \texttt{if()} statement
939 test being
940 met, (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) is also met as
941 $N_I\rightarrow\infty$.
942
943 Note that if $K_1 \neq 0$, it simply changes the startup
944 behavior of the rational counting. So long as $K_2 < K_4$,
945 Algorithm \ref{alg:crat0:sfdv0:01a} will reach a steady state where
946 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) holds.
947 Note that if $K_3 \neq K_4$, it simply ``shifts'' the sets
948 specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09})
949 and (\ref{eq:lem:crat0:sfdv0:sprc0:02:10}), but
950 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) still holds.
951 The lemma has thus been proved
952 for every case. (We have neglected to give
953 the formal proof as required by the definition of a limit that
954 for any arbitrarily small error $\epsilon$, a
955 finite $N_I$ can be found so that
956 the error is at or below $\epsilon$; however the skeptical reader
957 is encouraged to complete Exercise \ref{exe:crat0:sexe0:02}.)
958 \end{vworklemmaproof}
959 \begin{vworklemmaparsection}{Remarks}
960 It is possible to view the long-term accuracy of
961 Algorithm \ref{alg:crat0:sfdv0:01a} in terms of a limit, as is done in
962 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}). However, it is also
963 possible to observe that $K_1$ and $K_3$ set a delay until
964 the counting algorithm reaches steady state.
965 With $K_3=K_4$, the attainment of
966 steady state is characterized by the \texttt{state} variable
967 being assigned for the first time to one of the values in
968 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Once in steady state,
969 the algorithm cycles with perfect periodic behavior through all of the
970 $K_4/\gcd(K_2,K_4)$ elements in
971 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}), but not necessarily in
972 the order shown in the equation.
973 During this period of length $K_4/\gcd(K_2,K_4)$,
974 exactly $K_2/\gcd(K_2,K_4)$ invocations of the base
975 subroutine result in
976 subroutine ``\texttt{A()}'' being run, and exactly
977 $(K_4-K_2)/\gcd(K_2,K_4)$ do not. Thus, after reaching steady-state the
978 algorithm has \emph{perfect} accuracy if one considers periods of
979 length $K_4/\gcd(K_2,K_4)$.
980 \end{vworklemmaparsection}
981 %\vworklemmafooter{}
982
983 \begin{vworklemmastatement}
984 \label{lem:crat0:sfdv0:sprc0:04}
985 If $K_3=K_4$, $K_1=0$, and
986 $\gcd(K_2, K_4)=1$\footnote{\label{footnote:lem:crat0:sfdv0:sprc0:04:01}If
987 $\gcd(K_2, K_4) > 1$, then by Theorem
988 \cprizeroxrefhyphen\ref{thm:cpri0:ppn0:00a} the largest
989 value that $n K_2 \bmod K_4$ can attain is
990 $K_4-\gcd(K_2, K_4)$ and the interval in
991 (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is correspondingly
992 smaller. (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is
993 technically correct but not as conservative as possible.
994 This is a minor point and we do not dwell on it.}, the error between
995 the approximation to $N_O$ implemented by
996 Algorithm \ref{alg:crat0:sfdv0:01a} and the ``ideal'' mapping is always
997 in the set
998
999 \begin{equation}
1000 \label{eq:lem:crat0:sfdv0:sprc0:04:01}
1001 \left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
1002 \end{equation}
1003
1004 and no algorithm can be constructed to
1005 confine the error to a smaller interval.
1006 \end{vworklemmastatement}
1007 \begin{vworklemmaproof}
1008 With $K_1=0$ and $K_3 = K_4$, it can be verified analytically that
1009 the total number of times the function ``\texttt{A()}'' has been
1010 invoked up to and including the $n$th invocation of the base subroutine
1011 is
1012
1013 \begin{equation}
1014 \label{eq:lem:crat0:sfdv0:sprc0:04:02}
1015 N_O = \left\lfloor \frac{n K_2}{K_4} \right\rfloor .
1016 \end{equation}
1017
1018 On the other hand, the ``ideal'' number of invocations, which
1019 we denote $\overline{N_O}$, is given by
1020
1021 \begin{equation}
1022 \label{eq:lem:crat0:sfdv0:sprc0:04:03}
1023 \overline{N_O} = \frac{n K_2}{K_4} .
1024 \end{equation}
1025
1026 Quantization of the rational number in (\ref{eq:lem:crat0:sfdv0:sprc0:04:02})
1027 can introduce an error of up to $-(K_4-1)/K_4$, therefore
1028
1029 \begin{equation}
1030 \label{eq:lem:crat0:sfdv0:sprc0:04:04}
1031 N_O - \overline{N_O} =
1032 \left\lfloor \frac{n K_2}{K_4} \right\rfloor - \frac{n K_2}{K_4}
1033 \in \left[ - \frac{K_4 - 1}{K_4} , 0 \right] .
1034 \end{equation}
1035
1036 This proves the error bound for Algorithm \ref{alg:crat0:sfdv0:01a}.
1037 The proof that there can be no better algorithm is the topic
1038 of Exercise \ref{exe:crat0:sexe0:06}.
1039 \end{vworklemmaproof}
1040 \begin{vworklemmaparsection}{Remarks}
1041 Algorithm \ref{alg:crat0:sfdv0:01a} is \emph{optimal} in the
1042 sense that no algorithm can achieve a tighter error
1043 bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). As
1044 demonstrated in Exercises \ref{exe:crat0:sexe0:04}
1045 and \ref{exe:crat0:sexe0:05}, $K_1 \neq 0$ can be chosen
1046 to shift the interval in (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}), but
1047 the span of the interval cannot be reduced.
1048 \end{vworklemmaparsection}
1049 \vworklemmafooter{}
1050
1051 Lemmas \ref{lem:crat0:sfdv0:sprc0:02}
1052 and \ref{lem:crat0:sfdv0:sprc0:04} have demonstrated that the ratio of
1053 counts $N_O/N_I$ will asymptotically
1054 approach $K_2/K_4$
1055 (i.e. the long-term accuracy of Algorithm \ref{alg:crat0:sfdv0:01a}
1056 is \emph{perfect}).
1057 However,
1058 for many applications it is also desirable to have a lack of
1059 ``bursty'' behavior. We demonstrate the lack of bursty
1060 behavior in the following lemma.
1061
1062 \begin{vworklemmastatement}
1063 \label{lem:crat0:sfdv0:sprc0:03}
1064 For Algorithm \ref{alg:crat0:sfdv0:01a}, once steady
1065 state has been achieved, the number of consecutive
1066 base subroutine invocations during which subroutine
1067 ``\texttt{A()}'' is executed is always in the set
1068
1069 \begin{equation}
1070 \label{eq:lem:crat0:sfdv0:sprc0:03:01}
1071 \left\{
1072 \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor ,
1073 \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil
1074 \right\} \cap \vworkintsetpos,
1075 \end{equation}
1076
1077 which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$,
1078 or two integers otherwise.
1079
1080 Once steady state has been achieved, the number of
1081 consecutive base function invocations during which
1082 subroutine ``\texttt{A()}'' is not executed is
1083 always in the set
1084
1085 \begin{equation}
1086 \label{eq:lem:crat0:sfdv0:sprc0:03:02}
1087 \left\{
1088 \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor ,
1089 \left\lceil \frac{K_4-K_2}{K_2} \right\rceil
1090 \right\} \cap \vworkintsetpos,
1091 \end{equation}
1092
1093 which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$,
1094 or two integers otherwise.
1095 \end{vworklemmastatement}
1096 \begin{vworklemmaproof}
1097 As before in Lemma \ref{lem:crat0:sfdv0:sprc0:02}
1098 for convenience and without
1099 loss of generality, assume $K_3=K_4$ and
1100 $K_1=0$. Then after a transient period
1101 determined by $K_1$ and $K_3$, the \texttt{state}
1102 variable will be assigned one of the values in
1103 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) and cycle through
1104 those values in an unestablished order but with perfect
1105 periodicity. To accomplish this proof, we must establish
1106 something about the order in which the \texttt{state} variable attains
1107 the values in the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}).
1108
1109 We can partition the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09})
1110 into two sets; the set of values of \texttt{state} for which if the
1111 base subroutine is invoked with \texttt{state} in this set, subroutine
1112 ``\texttt{A()}'' will not be invoked (we call this set $\phi_1$),
1113 and the set of values of \texttt{state} for which if the
1114 base subroutine is invoked with \texttt{state} in this set, subroutine
1115 ``\texttt{A()}'' will be invoked (we call this set $\phi_2$).
1116 $\phi_1$ and $\phi_2$ are identified below.
1117
1118 \begin{eqnarray}
1119 \label{eq:lem:crat0:sfdv0:sprc0:03:03}
1120 & \phi_1 = & \\
1121 \nonumber &
1122 \displaystyle{\left\{
1123 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots ,
1124 \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} - 1 \right) \gcd(K_2, K_4)
1125 \right\}} &
1126 \end{eqnarray}
1127
1128 \begin{eqnarray}
1129 \label{eq:lem:crat0:sfdv0:sprc0:03:04}
1130 & \displaystyle{
1131 \phi_2 = \left\{\left(\frac{K_4-K_2}{\gcd(K_2,K_4)}\right) \gcd(K_2, K_4),\right.} & \\
1132 \nonumber & \displaystyle{\left.
1133 \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4) ,
1134 \ldots ,
1135 K_4 - \gcd(K_2, K_4)
1136 \right\}} &
1137 \end{eqnarray}
1138
1139 We can also make the following four additional useful observations
1140 about $\phi_1$ and $\phi_2$. Note that
1141 (\ref{eq:lem:crat0:sfdv0:sprc0:03:07}) and
1142 (\ref{eq:lem:crat0:sfdv0:sprc0:03:08}) become equality
1143 if $\gcd(K_2, K_4) = 1$.
1144
1145 \begin{equation}
1146 \label{eq:lem:crat0:sfdv0:sprc0:03:05}
1147 n(\phi_1) = \frac{K_4 - K_2}{\gcd(K_2, K_4)}
1148 \end{equation}
1149
1150 \begin{equation}
1151 \label{eq:lem:crat0:sfdv0:sprc0:03:06}
1152 n(\phi_2) = \frac{K_2}{\gcd(K_2, K_4)}
1153 \end{equation}
1154
1155 \begin{equation}
1156 \label{eq:lem:crat0:sfdv0:sprc0:03:07}
1157 \phi_1 \subseteq \{ 0, 1, \ldots , K_4 - K_2 - 1 \}
1158 \end{equation}
1159
1160 \begin{equation}
1161 \label{eq:lem:crat0:sfdv0:sprc0:03:08}
1162 \phi_2 \subseteq \{K_4 - K_2, \ldots , K_4 - 1 \}
1163 \end{equation}
1164
1165 We first prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
1166 If $state_n \in \phi_2$ at the time the base function
1167 is invoked, then
1168 ``\texttt{A()}'' will be invoked. We also know that
1169 since $state_n \in \phi_2$, $state_n + K_2 \geq K_4$, so
1170
1171 \begin{equation}
1172 \label{eq:lem:crat0:sfdv0:sprc0:03:09}
1173 state_{n+1} \;\; =|_{state_n \in \phi_2} \;\; state_n - (K_4 - K_2) .
1174 \end{equation}
1175
1176 Thus so long as $state_n \in \phi_2$, $state_{n+1} < state_n$
1177 as specified above in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}).
1178 With each invocation of the base subroutine, \texttt{state} will
1179 ``walk downward'' through $\phi_2$. It can
1180 also be observed that when \texttt{state} drops below the smallest
1181 element of $\phi_2$, the next value of \texttt{state} will
1182 be in $\phi_1$.
1183
1184 Note also that although the downward walk specified in
1185 (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}) walks downward in absolute steps
1186 of $K_4-K_2$, this corresponds to $(K_4-K_2) / \gcd(K_2, K_4)$
1187 \emph{elements} of $\phi_2$, since the elements of $\phi_2$ are
1188 separated by $\gcd(K_2, K_4)$.
1189
1190 Given the ``downward walk'' specified in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}),
1191 the only question to be answered is how many consecutive values of
1192 \texttt{state}, separated by $K_4-K_2$ (or $(K_4-K_2)/\gcd(K_2, K_4)$ elements),
1193 can ``fit'' into
1194 $\phi_2$. Considering that $n(\phi_2) = K_2/\gcd(K_2, K_4)$
1195 (Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:06}) and that the
1196 downward step represents $(K_4-K_2)/\gcd(K_2, K_4)$ set elements,
1197 (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) comes immediately by
1198 a graphical argument.
1199
1200 We now prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}).
1201 This can be proved using exactly the same arguments
1202 as for (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}), but
1203 considering the upward walk through $\phi_1$ rather
1204 than the downward walk through $\phi_2$.
1205
1206 As with Lemma \ref{lem:crat0:sfdv0:sprc0:02},
1207 note that the choices of $K_1$ and $K_3$ do not
1208 materially affect the proof above. $K_1$ and
1209 $K_3$ only set a delay until the rational counting
1210 algorithm reaches steady state. $K_3$ only shifts
1211 the sets $\phi_1$ and $\phi_2$.
1212 \end{vworklemmaproof}
1213 \begin{vworklemmaparsection}{Remark \#1}
1214 This lemma proves an \emph{extremely} important property for the
1215 usability of Algorithm \ref{alg:crat0:sfdv0:01a}. It says that once
1216 steady state has been reached, the variability in the number of consecutive
1217 times ``\texttt{A()}'' is run or not run is at most one count.
1218 \end{vworklemmaparsection}
1219 \begin{vworklemmaparsection}{Remark \#2}
1220 It is probably also possible to construct a rational counting algorithm
1221 so that the number of consecutive times ``\texttt{A()}'' is run is constant,
1222 but the algorithm achieves long-term accuracy by varying only the number
1223 of consecutive times ``\texttt{A()}'' is not run (or vice-versa), but this
1224 is not done here.
1225 \end{vworklemmaparsection}
1226 \begin{vworklemmaparsection}{Remark \#3}
1227 There is no requirement that $K_2$ and $K_4$ be coprime. In fact, as
1228 demonstrated later, it may be advantageous to choose a large $K_2$ and
1229 $K_4$ to approximate a simple ratio so that very fine adjustments can be
1230 made. For example, if the ideal ratio is 1/2, it may be desirable
1231 in some applications to
1232 choose $K_2$=1,000 and $K_4$=2,000 so that fine adjustments can be made
1233 by slightly perturbing $K_2$ or $K_4$. One might adjust 1,000/2,000 downward
1234 to 999/2,000 or upward to 1,001/2,000 by modifying $K_2$
1235 (both very fine adjustments).
1236 \end{vworklemmaparsection}
1237 \begin{vworklemmaparsection}{Remark \#4}
1238 The most common choice of $K_1$ in practice is 0. If $K_1=0$ is chosen,
1239 it can be shown that the number of initial invocations of the
1240 base subroutine is in the set identified in
1241 (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
1242 (See Exercise \ref{exe:crat0:sexe0:07}.)
1243 \end{vworklemmaparsection}
1244 \vworklemmafooter{}
1245
1246 For microcontroller work, it is considered
1247 a desirable property that software components be resilient
1248 to state upset
1249 (see Section \chgrzeroxrefhyphen\ref{chgr0:sdda0:srob0}).
1250 It can be observed that Algorithm \ref{alg:crat0:sfdv0:01a} will
1251 exhibit very anomalous behavior if \texttt{state} is upset to a very negative
1252 value. One possible correction to this shortcoming is illustrated
1253 in Figure \ref{fig:crat0:sfdv0:sprc0:01}. Other possible
1254 corrections are the topic of Exercise \ref{exe:crat0:sexe0:08}.
1255
1256 \begin{figure}
1257 \begin{verbatim}
1258 /* The constants K1 through K4, which parameterize the */
1259 /* counting behavior, are assumed assigned elsewhere in */
1260 /* the code. The solution is analyzed in terms of the */
1261 /* parameters K1 through K4. */
1262 /* */
1263 /* We also place the following restrictions on K1 through */
1264 /* K4: */
1265 /* K1 : K1 <= K3 - K2. */
1266 /* K2 : K4 > K2 > 0. */
1267 /* K3 : No restrictions. */
1268 /* K4 : K4 > K2 > 0. */
1269
1270 void base_rate_func(void)
1271 {
1272 static int state = K1;
1273
1274 state += K2;
1275
1276 if ((state < K1) || (state >= K3))
1277 {
1278 state -= K4;
1279 A();
1280 }
1281 }
1282 \end{verbatim}
1283 \caption{Algorithm \ref{alg:crat0:sfdv0:01a} With State Upset Shortcoming
1284 Corrected}
1285 \label{fig:crat0:sfdv0:sprc0:01}
1286 \end{figure}
1287
1288 \begin{vworkexamplestatement}
1289 \label{ex:crat0:sfdv0:sprc0:01}
1290 Determine the behavior of Algorithm \ref{alg:crat0:sfdv0:01a} with
1291 $K_1=0$, $K_2=30$, and $K_3=K_4=50$.
1292 \end{vworkexamplestatement}
1293 \begin{vworkexampleparsection}{Solution}
1294 We first predict the behavior, and then trace the algorithm to
1295 verify whether the predictions are accurate.
1296
1297 We make the following predictions:
1298
1299 \begin{itemize}
1300 \item The steady state sequence of invocations of ``\texttt{A()}'' will
1301 be periodic with period
1302 $K_4/\gcd(K_2, K_4) = 50/10 = 5$, as described
1303 in Lemma \ref{lem:crat0:sfdv0:sprc0:02}.
1304 \item The number of initial invocations of the
1305 base subroutine in which ``\texttt{A()}''
1306 is not run will be
1307 $\lceil (K_4 - K_2) / K_2 \rceil = \lceil 2/3 \rceil = 1$,
1308 as described in Remark \#4 of
1309 Lemma \ref{lem:crat0:sfdv0:sprc0:03} and in the solution to
1310 Exercise \ref{exe:crat0:sexe0:07}.
1311 \item In steady state, the number of consecutive invocations of the
1312 base subroutine during which ``\texttt{A()}''
1313 is not executed will always be 1, as
1314 described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:02} of
1315 Lemma \ref{lem:crat0:sfdv0:sprc0:03}.
1316 (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:02}
1317 yields \
1318 $\{ \lfloor 20/30 \rfloor , \lceil 20/30 \rceil \} \cap \vworkintsetpos %
1319 = \{ 0,1 \} \cap \{1, 2, \ldots \} = \{ 1 \}$.)
1320 \item In steady state, the number of consecutive invocations of the
1321 base subroutine during which ``\texttt{A()}''
1322 is executed will always be 1 or 2, as
1323 described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:01} of
1324 Lemma \ref{lem:crat0:sfdv0:sprc0:03}.
1325 (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:01}
1326 yields \
1327 $\{ \lfloor 30/20 \rfloor , \lceil 30/20 \rceil \} \cap \vworkintsetpos %
1328 = \{ 1,2 \} \cap \{1, 2, \ldots \} = \{ 1,2 \}$.)
1329 \item The rational counting algorithm will have
1330 perfect long-term accuracy.
1331 \end{itemize}
1332
1333 We can verify the predictions above by tracing the behavior of
1334 Algorithm \ref{alg:crat0:sfdv0:01a}. We adopt the convention
1335 that $A_n = 1$ if subroutine ``\texttt{A()}'' is invoked during
1336 the $n$th invocation of the base subroutine.
1337 Table \ref{tbl:crat0:sfdv0:sprc0:01}
1338 contains the results of tracing Algorithm \ref{alg:crat0:sfdv0:01a}
1339 with $K_1=0$, $K_2=30$, and $K_3=K_4=50$.
1340
1341 \begin{table}
1342 \caption{Trace Of Algorithm \ref{alg:crat0:sfdv0:01a} With
1343 $K_1=0$, $K_2=30$, And $K_3=K_4=50$ (Example \ref{ex:crat0:sfdv0:sprc0:01})}
1344 \label{tbl:crat0:sfdv0:sprc0:01}
1345 \begin{center}
1346 \begin{tabular}{|c|c|c|}
1347 \hline
1348 Index ($n$) & $state_n$ & $A_n$ \\
1349 \hline
1350 \hline
1351 0 & 0 & N/A \\
1352 \hline
1353 1 & 30 & 0 \\
1354 \hline
1355 2 & 10 & 1 \\
1356 \hline
1357 3 & 40 & 0 \\
1358 \hline
1359 4 & 20 & 1 \\
1360 \hline
1361 5 & 0 & 1 \\
1362 \hline
1363 6 & 30 & 0 \\
1364 \hline
1365 7 & 10 & 1 \\
1366 \hline
1367 8 & 40 & 0 \\
1368 \hline
1369 9 & 20 & 1 \\
1370 \hline
1371 10 & 0 & 1 \\
1372 \hline
1373 \end{tabular}
1374 \end{center}
1375 \end{table}
1376
1377 It can be verfied from the table that all of the
1378 predicted properties are exhibited by the
1379 algorithm.
1380 \end{vworkexampleparsection}
1381 \vworkexamplefooter{}
1382
1383 A second characteristic of Algorithm \ref{alg:crat0:sfdv0:01a}
1384 that should be analyzed carefully is the behavior
1385 of the algorithm if parameters $K_2$ and $K_4$ are adjusted
1386 ``on the fly''. ``On-the-fly'' adjustment
1387 raises the following concerns. We assume for convenience
1388 that $K_1=0$ and $K_3=K_4$.
1389
1390 \begin{enumerate}
1391 \item \label{enum:crat0:sfdv0:sprc0:01:01}
1392 \textbf{Critical section protocol:} if the
1393 rational counting algorithm is implemented in a process which
1394 is asynchronous to the process which desires to change
1395 $K_2$ and $K_4$, what precautions must be taken?
1396 \item \label{enum:crat0:sfdv0:sprc0:01:02}
1397 \textbf{Anomalous behavior:} will the rational
1398 counting algorithm behave in a \emph{very} unexpected way
1399 if $K_2$ and $K_4$ are changed on the fly?
1400 \item \label{enum:crat0:sfdv0:sprc0:01:03}
1401 \textbf{Preservation of accuracy:} even if the behavior
1402 exhibited is not \emph{extremely} anomalous, how should
1403 $K_2$ and $K_4$ be modified on the fly so as to preserve the
1404 maximum accuracy?
1405 \end{enumerate}
1406
1407 \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:02}):} It can be observed
1408 with Algorithm \ref{alg:crat0:sfdv0:01a} that neither increasing
1409 nor decreasing $K_2$ nor $K_4$ on the fly
1410 will lead to \emph{highly} anomalous
1411 behavior. Each invocation of the algorithm will map
1412 \texttt{state} back into the set identified in
1413 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Thus on-the-fly changes
1414 to $K_2$ and $K_4$ will establish the rational counting algorithm
1415 immediately into steady-state behavior, and the result will not be
1416 \emph{highly} anomalous if such on-the-fly changes are not
1417 made very often.
1418
1419 \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:03}):} It can be deduced
1420 from
1421 (\ref{eq:lem:crat0:sfdv0:sprc0:04:02}),
1422 (\ref{eq:lem:crat0:sfdv0:sprc0:04:03}), and
1423 (\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) that the value of the
1424 \texttt{state} variable in Algorithm \ref{alg:crat0:sfdv0:01a}
1425 satisfies the relationship
1426
1427 \begin{equation}
1428 \label{eq:crat0:sfdv0:sprc0:01}
1429 \overline{N_O} - N_O = \frac{state}{K_4} ;
1430 \end{equation}
1431
1432 \noindent{}in other words, the \texttt{state} variable
1433 contains the remainder of an effective division by $K_4$
1434 and thus maintains the fractional part of $\overline{N_O}$.
1435 Altering $K_4$ on the fly to a new value
1436 (say, $\overline{K_4}$) may be problematic, because
1437 to preserve the current fractional part
1438 of $\overline{N_O}$, one must adjust it for
1439 the new denominator $\overline{K_4}$. This requires
1440 solving the equation
1441
1442 \begin{equation}
1443 \label{eq:crat0:sfdv0:sprc0:02}
1444 \frac{state}{K_4} = \frac{n}{\;\;\overline{K_4}\;\;}
1445 \end{equation}
1446
1447 \noindent{}for $n$ which must be an integer to avoid
1448 loss of information. In general,
1449 this would require that $K_4 \vworkdivides \overline{K_4}$,
1450 a constraint which would be rarely met. Thus, for high-precision
1451 applications where a new rational counting rate should become effective
1452 seamlessly, the best strategy would seem to be to modify $K_2$ only.
1453 It can be verified that modifying $K_2$ on the fly accomplishes
1454 a perfect rate transition.
1455
1456 \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:01}):} In microcontroller work,
1457 ordinal data types often represent machine-native data types. In such cases,
1458 it may be possible for one process to set $K_2$ or $K_4$
1459 for another process that is asynchronous with respect to it by relying
1460 on the atomicity of machine instructions (i.e. without formal mutual
1461 exclusion protocol). However, in other cases where the ordinal data types
1462 of $K_2$ or $K_4$ are larger than can be accomodated by
1463 a single machine instruction or where $K_2$ and $K_4$ must be modified
1464 together atomically, mutual exclusion protocol should be used to
1465 prevent anomalous behavior due to race conditions (see
1466 Exercise \ref{exe:crat0:sexe0:14}).
1467
1468 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1469 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1470 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1471 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01b}]
1472 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01b}}
1473 %Section tag: PRC1
1474 \label{crat0:sfdv0:sprc1}
1475
1476 Algorithm \ref{alg:crat0:sfdv0:01a}
1477 has the disadvantage that it requires $K_2/K_4 < 1$ (i.e. it can only
1478 decrease frequency, but never increase frequency). This deficiency
1479 can be corrected by using
1480 Algorithm \ref{alg:crat0:sfdv0:01b}.
1481
1482 Note that Algorithm \ref{alg:crat0:sfdv0:01b} will properly deal with $K_2$ and
1483 $K_4$ chosen such that $0 < K_2/K_4 < \infty$.
1484
1485 The most common reason that one may want a counting algorithm
1486 that will correctly handle
1487 $K_2/K_4 \geq 1$ is to conveniently handle $K_2/K_4 \approx 1$.
1488 In practice, $K_2/K_4$ may represent a quantity that is
1489 normally very close to
1490 1 but may also be slightly less than or slightly greater than 1.
1491 For example, one may use $K_2/K_4 \approx 1$ to correct for a
1492 crystal or a resonator which deviates slightly from its nominal
1493 frequency. We illustrate this with the following example.
1494
1495 \begin{vworkexamplestatement}
1496 \label{ex:crat0:sfdv0:sprc1:01}
1497 A microcontroller software load keeps time via an interrupt
1498 service routine that runs every 1ms, but this frequency may be
1499 off by as much as 1 part in 10,000 due to variations in
1500 crystal or resonator manufacture. The interrupt service routine
1501 updates a counter which represents the number of milliseconds elapsed since
1502 the software load was reset. Devise a rational counting strategy
1503 based on Algorithm \ref{alg:crat0:sfdv0:01b}
1504 which will allow the time accuracy to be trimmed to within
1505 one second per year or less by adjusting only $K_4$, and implement the counting strategy
1506 in software.
1507 \end{vworkexamplestatement}
1508 \begin{vworkexampleparsection}{Solution}
1509 $K_2/K_4$ will be nominally very close to 1 ($K_2 \approx K_4$).
1510 If we assume that each year has 365.2422\footnote{The period of the earth's
1511 rotation about the sun is not an integral number of days, which is why the
1512 rules for leap years exist. Ironically, the assignment of leap years is itself
1513 a problem very similar to the rational counting problems discussed in this chapter.} days
1514 ($\approx$ 31,556,926 seconds), then choosing
1515 $K_2 \approx K_4 = 31,556,926$ will yield satisfactory results.
1516 If we may need to compensate for up to 1 part in 10,000 of crystal or resonator
1517 inaccuracy, we may need to adjust $K_2$ as low as 0.9999 $\times$ 31,556,926 $\approx$
1518 31,553,770 (to compensate for a fast
1519 crystal or resonator) or as
1520 high as 1.0001 $\times$ 31,556,926
1521 $\approx$ 31,560,082
1522 (to compensate for a slow crystal or resonator). Choosing
1523 $K_4 = 31,556,926$ yields the convenient relationship that each
1524 count in $K_2$ corresponds to one second per year.
1525
1526 \begin{figure}
1527 \begin{verbatim}
1528 /* The constants K1 through K4, which parameterize the */
1529 /* counting behavior, are assumed assigned elsewhere in */
1530 /* the code. */
1531 /* */
1532 /* The variable time_count below is the number of milli- */
1533 /* seconds since the software was reset. */
1534 int time_count = 0;
1535
1536 /* It is assumed that the base rate subroutine below is */
1537 /* called every millisecond (or, at least what should be */
1538 /* every millisecond of the crystal or resonator were */
1539 /* perfect). */
1540
1541 void base_rate_sub(void)
1542 {
1543 static int state = K1;
1544
1545 state += K2;
1546
1547 while (state >= K3)
1548 {
1549 state -= K4;
1550 time_count++;
1551 }
1552 }
1553 \end{verbatim}
1554 \caption{Algorithm \ref{alg:crat0:sfdv0:01b} Applied To Timekeeping
1555 (Example \ref{ex:crat0:sfdv0:sprc1:01})}
1556 \label{fig:ex:crat0:sfdv0:sprc1:01:01}
1557 \end{figure}
1558
1559 Figure \ref{fig:ex:crat0:sfdv0:sprc1:01:01} provides an illustration
1560 of Algorithm \ref{alg:crat0:sfdv0:01b} applied in this scenario.
1561 We assume that $K_4$ contains the constant value 31,556,926
1562 and that $K_2$ is modified about this value either downwards or upwards
1563 to trim the timekeeping. Note that Algorithm \ref{alg:crat0:sfdv0:01b} will correctly
1564 handle $K_2 \geq K_4$.
1565
1566 Also note in the implementation illustrated in Figure
1567 \ref{fig:ex:crat0:sfdv0:sprc1:01:01} that large integers (27 bits or more)
1568 are required. (See also Exercise \ref{exe:crat0:sexe0:09}).
1569 \end{vworkexampleparsection}
1570 \vworkexamplefooter{}
1571
1572 It may not be obvious whether Algorithm \ref{alg:crat0:sfdv0:01b} has the
1573 same or similar desirable properties as Algorithm \ref{alg:crat0:sfdv0:01a}
1574 presented
1575 in Lemmas
1576 \ref{lem:crat0:sfdv0:sprc0:01},
1577 \ref{lem:crat0:sfdv0:sprc0:02},
1578 \ref{lem:crat0:sfdv0:sprc0:04},
1579 and
1580 \ref{lem:crat0:sfdv0:sprc0:03}.
1581 Algorithm \ref{alg:crat0:sfdv0:01b} does have these desirable
1582 properties, and these properties are presented as
1583 Lemmas \ref{lem:crat0:sfdv0:sprc1:01},
1584 \ref{lem:crat0:sfdv0:sprc1:02},
1585 \ref{lem:crat0:sfdv0:sprc1:03}, and
1586 \ref{lem:crat0:sfdv0:sprc1:04}.
1587 The proofs of these lemmas are identical or very similar to the proofs
1588 of Lemmas
1589 \ref{lem:crat0:sfdv0:sprc0:01},
1590 \ref{lem:crat0:sfdv0:sprc0:02},
1591 \ref{lem:crat0:sfdv0:sprc0:04},
1592 and
1593 \ref{lem:crat0:sfdv0:sprc0:03};
1594 and so these proofs when not identical are presented as exercises.
1595 Note that Algorithm \ref{alg:crat0:sfdv0:01b} behaves identically to
1596 Algorithm \ref{alg:crat0:sfdv0:01a} when $K_2 < K_4$, and the
1597 case of $K_2=K_4$ is trivial, so in general only
1598 the behavior when $K_2 > K_4$ remains to be proved.
1599
1600 \begin{vworklemmastatement}
1601 \label{lem:crat0:sfdv0:sprc1:01}
1602 $N_{STARTUP}$, the number of invocations of the base subroutine
1603 in Algorithm \ref{alg:crat0:sfdv0:01b} before ``\texttt{A()}'' is called
1604 for the first time, is given by
1605
1606 \begin{equation}
1607 \label{eq:lem:crat0:sfdv0:sprc1:01:01}
1608 N_{STARTUP} =
1609 \left\lceil
1610 {
1611 \frac{-K_1 - K_2 + K_3}{K_2}
1612 }
1613 \right\rceil .
1614 \end{equation}
1615 \end{vworklemmastatement}
1616 \begin{vworklemmaproof}
1617 The proof is identical to the proof of Lemma
1618 \ref{lem:crat0:sfdv0:sprc0:01}.
1619 \end{vworklemmaproof}
1620
1621
1622 \begin{vworklemmastatement}
1623 \label{lem:crat0:sfdv0:sprc1:02}
1624 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01b}
1625 base subroutine
1626 is called, let $N_O$ be the number of times the
1627 ``\texttt{A()}'' subroutine is called, let
1628 $f_I$ be the frequency of invocation of the
1629 Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let
1630 $f_O$ be the frequency of invocation of
1631 ``\texttt{A()}''.
1632
1633 \begin{equation}
1634 \label{eq:lem:crat0:sfdv0:sprc1:02:01}
1635 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
1636 =
1637 \frac{f_O}{f_I}
1638 =
1639 \frac{K_2}{K_4} .
1640 \end{equation}
1641 \end{vworklemmastatement}
1642 \begin{vworklemmaproof}
1643 See Exercise \ref{exe:crat0:sexe0:10}.
1644 \end{vworklemmaproof}
1645
1646 \begin{vworklemmastatement}
1647 \label{lem:crat0:sfdv0:sprc1:03}
1648 If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$\footnote{See also
1649 footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between
1650 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b}
1651 and the ``ideal'' mapping is always
1652 in the set
1653
1654 \begin{equation}
1655 \label{eq:lem:crat0:sfdv0:sprc1:03:01}
1656 \left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
1657 \end{equation}
1658
1659 and no algorithm can be constructed to
1660 confine the error to a smaller interval.
1661 \end{vworklemmastatement}
1662 \begin{vworklemmaproof}
1663 The proof is identical to the proof of Lemma \ref{lem:crat0:sfdv0:sprc0:04}.
1664 \end{vworklemmaproof}
1665
1666 \begin{vworklemmastatement}
1667 \label{lem:crat0:sfdv0:sprc1:04}
1668 For Algorithm \ref{alg:crat0:sfdv0:01b}
1669 with
1670 $K_2 \geq K_4$, once steady
1671 state has been achieved (see Exercise
1672 \ref{exe:crat0:sexe0:13}), each invocation of the
1673 base subroutine will result in
1674 a number of invocations of
1675 ``\texttt{A()}'' which is in the set
1676
1677 \begin{equation}
1678 \label{eq:lem:crat0:sfdv0:sprc1:04:01}
1679 \left\{
1680 \left\lfloor \frac{K_2}{K_4} \right\rfloor ,
1681 \left\lceil \frac{K_2}{K_4} \right\rceil
1682 \right\},
1683 \end{equation}
1684
1685 which contains one integer if $K_4 \vworkdivides K_2$,
1686 or two integers otherwise. With $K_2 < K_4$,
1687 the behavior will be as specified in Lemma
1688 \ref{lem:crat0:sfdv0:sprc0:03}.
1689 \end{vworklemmastatement}
1690 \begin{vworklemmaproof}
1691 See Exercise \ref{exe:crat0:sexe0:12}.
1692 \end{vworklemmaproof}
1693 \begin{vworklemmaparsection}{Remark}
1694 Note that Lemma \ref{lem:crat0:sfdv0:sprc0:03}
1695 and this lemma specify different aspects of behavior,
1696 which is why (\ref{eq:lem:crat0:sfdv0:sprc0:03:01})
1697 and (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}) take
1698 different forms than
1699 (\ref{eq:lem:crat0:sfdv0:sprc1:04:01}).
1700 Lemma \ref{lem:crat0:sfdv0:sprc0:03} specifies the number of consecutive
1701 invocations of the base subroutine for which ``\texttt{A()}''
1702 will be run, but with $K_2 \geq K_4$ it does not make sense to
1703 specify behavior in this way since ``\texttt{A()}'' will be run
1704 on \emph{every} invocation of the base subroutine. This lemma specifies
1705 the number of times ``\texttt{A()}'' will be run on a \emph{single}
1706 invocation of the base subroutine (which is not meaningful if
1707 $K_2 < K_4$ since the result will always be 0 or 1).
1708 \end{vworklemmaparsection}
1709 %\vworklemmafooter{}
1710
1711
1712 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1713 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1714 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1715 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01c}]
1716 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01c}}
1717 %Section tag: PRX0
1718 \label{crat0:sfdv0:sprx0}
1719
1720 Algorithm \ref{alg:crat0:sfdv0:01c}\footnote{Algorithm \ref{alg:crat0:sfdv0:01c}
1721 was contributed in March, 2003
1722 by Chuck B. Falconer \cite{bibref:i:chuckbfalconer}
1723 via the
1724 \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
1725 newsgroup.}
1726 is a variant of Algorithm \ref{alg:crat0:sfdv0:01a}
1727 which has one fewer
1728 degrees of freedom than Algorithms \ref{alg:crat0:sfdv0:01a}
1729 and \ref{alg:crat0:sfdv0:01b} and can be implemented
1730 more efficiently under most instruction sets. Algorithm \ref{alg:crat0:sfdv0:01c}
1731 is superior to Algorithms \ref{alg:crat0:sfdv0:01a}
1732 and \ref{alg:crat0:sfdv0:01b}
1733 from a computational efficiency
1734 point of view, but is less intuitive.
1735
1736 The superiority in computational efficiency of Algorithm \ref{alg:crat0:sfdv0:01c}
1737 comes from the possibility of using an implicit test against zero
1738 (rather than an explicit
1739 test against $K_3$, as is found in Algorithms \ref{alg:crat0:sfdv0:01a}
1740 and \ref{alg:crat0:sfdv0:01b}).
1741 Many machine instruction sets automatically set flags to indicate a negative
1742 result when the
1743 subtraction of $K_2$ is performed, thus often allowing a conditional branch
1744 without an additional instruction. Whether an instruction will be saved in
1745 the code of Figure \ref{fig:crat0:sfdv0:01c} depends on the sophistication
1746 of the `C' compiler, but of course if the algorithm were coded in
1747 assembly-language an instruction could be saved on most processors.
1748
1749 The properties of rational counting Algorithm \ref{alg:crat0:sfdv0:01c} are nearly
1750 identical to those of Algorithm \ref{alg:crat0:sfdv0:01a},
1751 and we prove the important properties
1752 now.
1753
1754 \begin{vworklemmastatement}
1755 \label{lem:crat0:sfdv0:sprx0:01}
1756 $N_{STARTUP}$, the number of invocations of the base subroutine
1757 in Algorithm \ref{alg:crat0:sfdv0:01c} before ``\texttt{A()}'' is called
1758 for the first time, is given by
1759
1760 \begin{equation}
1761 \label{eq:lem:crat0:sfdv0:sprx0:01:01}
1762 N_{STARTUP} =
1763 \left\lfloor
1764 {
1765 \frac{K_1}{K_2}
1766 }
1767 \right\rfloor .
1768 \end{equation}
1769 \end{vworklemmastatement}
1770 \begin{vworklemmaproof}
1771 The value of \texttt{state} when tested against
1772 zero in the \texttt{if()} statement during the $n$th invocation
1773 of the base subroutine is $K_1 - n K_2$. In order for the test
1774 not to be met on the $n$th invocation
1775 of the base subroutine, we require that
1776
1777 \begin{equation}
1778 \label{eq:lem:crat0:sfdv0:sprx0:01:02}
1779 K_1 - n K_2 \geq 0
1780 \end{equation}
1781
1782 \noindent{}or equivalently that
1783
1784 \begin{equation}
1785 \label{eq:lem:crat0:sfdv0:sprx0:01:03}
1786 n \leq \frac{K_1}{K_2} .
1787 \end{equation}
1788
1789 Solving (\ref{eq:lem:crat0:sfdv0:sprx0:01:03}) for the
1790 largest value of $n \in \vworkintset$ which still meets the criterion
1791 yields (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}). Note that
1792 the derivation of (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}) requires
1793 that the restrictions on $K_1$, $K_2$, and $K_3$ documented in
1794 Figure \ref{fig:crat0:sfdv0:01c} be met.
1795 \end{vworklemmaproof}
1796
1797 \begin{vworklemmastatement}
1798 \label{lem:crat0:sfdv0:sprx0:02}
1799 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01c}
1800 base subroutine
1801 is called, let $N_O$ be the number of times the
1802 ``\texttt{A()}'' subroutine is called, let
1803 $f_I$ be the frequency of invocation of the
1804 Algorithm \ref{alg:crat0:sfdv0:01a}
1805 base subroutine, and let
1806 $f_O$ be the frequency of invocation of
1807 ``\texttt{A()}''. Provided the constraints
1808 on $K_1$, $K_2$, and $K_3$ documented in
1809 Figure \ref{fig:crat0:sfdv0:01c} are met,
1810
1811 \begin{equation}
1812 \label{eq:lem:crat0:sfdv0:sprx0:02:01}
1813 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
1814 =
1815 \frac{f_O}{f_I}
1816 =
1817 \frac{K_2}{K_4} .
1818 \end{equation}
1819 \end{vworklemmastatement}
1820 \begin{vworklemmaproof}
1821 (\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) indicates that once
1822 an initial delay (determined by $K_1$) has finished,
1823 $N_O/N_I$ will converge on a steady-state value of
1824 $K_2/K_4$.
1825
1826 The most straightforward way to analyze Algorithm \ref{alg:crat0:sfdv0:01c}
1827 is to show how an algorithm already
1828 understood (Algorithm \ref{alg:crat0:sfdv0:01a})
1829 can be transformed to
1830 Algorithm \ref{alg:crat0:sfdv0:01c}
1831 in a way where the analysis of Algorithm \ref{alg:crat0:sfdv0:01a}
1832 also applies to Algorithm \ref{alg:crat0:sfdv0:01c}.
1833 Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01} shows
1834 how such a transformation can be performed in
1835 four steps.
1836
1837 \begin{figure}
1838 (a) Algorithm \ref{alg:crat0:sfdv0:01a} unchanged.
1839 $state_{a,n} \in \{0, 1, \ldots, K_4 - 1 \}$.
1840 \begin{verbatim}
1841 state += K2;
1842 if (state >= K4)
1843 {
1844 state -= K4;
1845 A();
1846 }
1847 \end{verbatim}
1848 (b) ``\texttt{>=}'' changed to ``\texttt{>}''. $state_{b,n} \in \{1, 2, \ldots, K_4 \}$,
1849 $state_{b,n} = state_{a,n} + 1$.
1850 \begin{verbatim}
1851 state += K2;
1852 if (state > K4)
1853 {
1854 state -= K4;
1855 A();
1856 }
1857 \end{verbatim}
1858 (c) Test against $K_4$ changed to test against zero.
1859 $state_{c,n} \in \{-K_4 + 1, -K_4 + 2, \ldots, 0 \}$,
1860 $state_{c,n} = state_{b,n} - K_4$.
1861 \begin{verbatim}
1862 state += K2;
1863 if (state > 0)
1864 {
1865 state -= K4;
1866 A();
1867 }
1868 \end{verbatim}
1869 (d) Sign inversion.
1870 $state_{d,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$,
1871 $state_{d,n} = - state_{c,n}$.
1872 \begin{verbatim}
1873 state -= K2;
1874 if (state < 0)
1875 {
1876 state += K4;
1877 A();
1878 }
1879 \end{verbatim}
1880 (e) `C' expression rearrangement.
1881 $state_{e,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$,
1882 $state_{e,n} = state_{d,n}$.
1883 \begin{verbatim}
1884 if ((state -= K2) < 0)
1885 {
1886 state += K4;
1887 A();
1888 }
1889 \end{verbatim}
1890 \caption{4-Step Transformation Of Algorithm \ref{alg:crat0:sfdv0:01a}
1891 To Algorithm \ref{alg:crat0:sfdv0:01c}}
1892 \label{fig:lem:crat0:sfdv0:sprx0:02:01}
1893 \end{figure}
1894
1895 In Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}, each of the
1896 four steps required to transform from Algorithm \ref{alg:crat0:sfdv0:01a} to
1897 Algorithm \ref{alg:crat0:sfdv0:01c} includes an equation to transform the
1898 \texttt{state} variable. Combining all of these
1899 transformations yields
1900
1901 \begin{eqnarray}
1902 \label{eq:lem:crat0:sfdv0:sprx0:02:02}
1903 state_{e,n} & = & K_4 - 1 - state_{a,n} \\
1904 \label{eq:lem:crat0:sfdv0:sprx0:02:03}
1905 state_{a,n} & = & K_4 - 1 - state_{e,n}
1906 \end{eqnarray}
1907
1908 We thus see that Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(a)
1909 (corresponding to Algorithm \ref{alg:crat0:sfdv0:01a}) and
1910 Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(e)
1911 (corresponding to Algorithm \ref{alg:crat0:sfdv0:01c}) have
1912 \texttt{state} semantics which involve the same range
1913 but a reversed order. (\ref{eq:lem:crat0:sfdv0:sprx0:02:01})
1914 follows directly from this observation and from
1915 Lemma \ref{lem:crat0:sfdv0:sprc0:02}.
1916 \end{vworklemmaproof}
1917 %\vworklemmafooter{}
1918
1919 \begin{vworklemmastatement}
1920 \label{lem:crat0:sfdv0:sprx0:03}
1921 If $K_1=0$ and $\gcd(K_2, K_4)=1$\footnote{See also
1922 footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between
1923 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01c}
1924 and the ``ideal'' mapping is always
1925 in the set
1926
1927 \begin{equation}
1928 \label{eq:lem:crat0:sfdv0:sprx0:03:01}
1929 \left[ 0, \frac{K_4 - 1}{K_4} \right] ,
1930 \end{equation}
1931
1932 and no algorithm can be constructed to
1933 confine the error to a smaller interval.
1934 \end{vworklemmastatement}
1935 \begin{vworklemmaproof}
1936 Using the duality illustrated by
1937 (\ref{eq:lem:crat0:sfdv0:sprx0:02:02}) and
1938 (\ref{eq:lem:crat0:sfdv0:sprx0:02:03}),
1939 starting Algorithm \ref{alg:crat0:sfdv0:01c} with
1940 $state_0=0$ will yield a dual state vector
1941 with respect to starting Algorithm \ref{alg:crat0:sfdv0:01a} with
1942 $state_0=K_4-1$. Thus,
1943
1944 \begin{equation}
1945 \label{eq:lem:crat0:sfdv0:sprx0:03:02}
1946 N_O = \left\lfloor \frac{n K_2 + K_4 - 1}{K_4} \right\rfloor .
1947 \end{equation}
1948
1949 Using this altered value of $N_O$ in (\ref{eq:lem:crat0:sfdv0:sprc0:04:04})
1950 leads directly to (\ref{eq:lem:crat0:sfdv0:sprx0:03:01}).
1951
1952 The proof that there can be no better algorithm is identical
1953 to the same proof for Lemma \ref{lem:crat0:sfdv0:sprc0:04} (Exercise \ref{exe:crat0:sexe0:06}).
1954 \end{vworklemmaproof}
1955 %\vworklemmafooter{}
1956
1957 \begin{vworklemmastatement}
1958 \label{lem:crat0:sfdv0:sprx0:04}
1959 For Algorithm \ref{alg:crat0:sfdv0:01c}, once steady
1960 state has been achieved, the number of consecutive
1961 base subroutine invocations during which subroutine
1962 ``\texttt{A()}'' is executed is always in the set
1963
1964 \begin{equation}
1965 \label{eq:lem:crat0:sfdv0:sprx0:04:01}
1966 \left\{
1967 \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor ,
1968 \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil
1969 \right\} \cap \vworkintsetpos,
1970 \end{equation}
1971
1972 which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$,
1973 or two integers otherwise.
1974
1975 Once steady state has been achieved, the number of
1976 consecutive base function invocations during which
1977 subroutine ``\texttt{A()}'' is not executed is
1978 always in the set
1979
1980 \begin{equation}
1981 \label{eq:lem:crat0:sfdv0:sprx0:04:02}
1982 \left\{
1983 \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor ,
1984 \left\lceil \frac{K_4-K_2}{K_2} \right\rceil
1985 \right\} \cap \vworkintsetpos,
1986 \end{equation}
1987
1988 which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$,
1989 or two integers otherwise.
1990 \end{vworklemmastatement}
1991 \begin{vworklemmaproof}
1992 The proof comes directly from the duality between algorithm
1993 Algorithms \ref{alg:crat0:sfdv0:01a}
1994 and \ref{alg:crat0:sfdv0:01c} established in the
1995 proof of Lemma \ref{lem:crat0:sfdv0:sprx0:01}, so that the results
1996 from Lemma \ref{lem:crat0:sfdv0:sprc0:03} apply without modification.
1997 \end{vworklemmaproof}
1998 \vworklemmafooter{}
1999
2000
2001 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2002 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2004 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01d}]
2005 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01d}}
2006 %Section tag: PRX1
2007 \label{crat0:sfdv0:sprx1}
2008
2009 Algorithm \ref{alg:crat0:sfdv0:01d}\footnote{Algorithm \ref{alg:crat0:sfdv0:01d}
2010 was contributed in March, 2003
2011 by John Larkin \cite{bibref:i:johnlarkin}
2012 via the
2013 \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
2014 newsgroup.}
2015 (Figure \ref{fig:crat0:sfdv0:01d}) is a further
2016 economization of Algorithms \ref{alg:crat0:sfdv0:01a}
2017 through \ref{alg:crat0:sfdv0:01c} that can be made by eliminating
2018 the addition or subtraction of $K_4$ and test against $K_3$
2019 and instead using the
2020 inherent machine integer size of $W$ bits to perform
2021 arithmetic modulo $2^W$. Thus, effectively, Algorithm \ref{alg:crat0:sfdv0:01d}
2022 is equivalent to Algorithm \ref{alg:crat0:sfdv0:01a} with
2023 $K_4 = K_3 = 2^W$.
2024
2025 Figure \ref{fig:crat0:sfdv0:01d} shows both
2026 assembly-language (Texas Instruments TMS-370C8) and
2027 `C' implementations of the algorithm. The assembly-language
2028 version uses the carry flag of the processor and thus
2029 is \emph{very} efficient. Because `C' does not have access
2030 to the processor flags, the 'C' version is less efficient.
2031 The ``less than'' comparison when
2032 using unsigned integers is equivalent to a rollover test.
2033
2034 It is easy to see from the figure that Algorithm \ref{alg:crat0:sfdv0:01d}
2035 is equivalent in all
2036 respects to Algorithm \ref{alg:crat0:sfdv0:01a} with
2037 $K_3 = K_4$ fixed at $2^W$. It is not necessary to enforce any constraints
2038 on $K_2$ because $K_2 < K_3 = K_4 = 2^W$ due to the inherent size of
2039 a machine integer. Note that unlike Algorithms \ref{alg:crat0:sfdv0:01a}
2040 through \ref{alg:crat0:sfdv0:01c} which allow $K_2$ and $K_4$ to be chosen independently
2041 and from the Farey series of appropriate order, Algorithm \ref{alg:crat0:sfdv0:01c}
2042 only allows
2043 $K_2/K_4$ of the form $K_2/2^W$.
2044
2045 The properties below follow immediately
2046 from the properties of Algorithm \ref{alg:crat0:sfdv0:01a}.
2047
2048 \begin{vworklemmastatement}
2049 \label{lem:crat0:sfdv0:sprx1:01}
2050 $N_{STARTUP}$, the number of invocations of the base subroutine
2051 in Algorithm \ref{alg:crat0:sfdv0:01d} before ``\texttt{A()}'' is called
2052 for the first time, is given by
2053
2054 \begin{equation}
2055 \label{eq:lem:crat0:sfdv0:sprx1:01:01}
2056 N_{STARTUP} =
2057 \left\lfloor
2058 {
2059 \frac{2^W - K_1 - 1}{K_2}
2060 }
2061 \right\rfloor .
2062 \end{equation}
2063 \end{vworklemmastatement}
2064 \begin{vworklemmaproof}
2065 The value of \texttt{state} after the $n$th invocation
2066 is $state_n = K_1 + n K_2$. In order for the test in the
2067 \texttt{if()} statement not to be met, we require that
2068
2069 \begin{equation}
2070 \label{eq:lem:crat0:sfdv0:sprx1:01:02}
2071 K_1 + n K_2 \leq 2^W - 1
2072 \end{equation}
2073
2074 \noindent{}or equivalently that
2075
2076 \begin{equation}
2077 \label{eq:lem:crat0:sfdv0:sprx1:01:03}
2078 n \leq \frac{2^W - K_1 - 1}{K_2} .
2079 \end{equation}
2080
2081 Solving (\ref{eq:lem:crat0:sfdv0:sprx1:01:03}) for the largest
2082 value of $n \in \vworkintset$ which still meets the criterion
2083 yields (\ref{eq:lem:crat0:sfdv0:sprx1:01:01}).
2084 \end{vworklemmaproof}
2085
2086 \begin{vworklemmastatement}
2087 \label{lem:crat0:sfdv0:sprx1:02}
2088 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine
2089 is called, let $N_O$ be the number of times the
2090 ``\texttt{A()}'' subroutine is called, let
2091 $f_I$ be the frequency of invocation of the
2092 Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine, and let
2093 $f_O$ be the frequency of invocation of
2094 ``\texttt{A()}''. Then
2095
2096 \begin{equation}
2097 \label{eq:lem:crat0:sfdv0:sprx1:02:01}
2098 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I}
2099 =
2100 \frac{f_O}{f_I}
2101 =
2102 \frac{K_2}{2^W} ,
2103 \end{equation}
2104
2105 where $W$ is the number of bits in a machine unsigned integer.
2106 Note that $K_2 < 2^W$ since $K_2 \in \{ 0, 1, \ldots , 2^W-1 \}$.
2107 \end{vworklemmastatement}
2108 \begin{vworklemmaproof}
2109 The proof is identical to the proof of
2110 Lemma \ref{lem:crat0:sfdv0:sprc0:02} with $K_3=K_4=2^W$.
2111 Note that Algorithm \ref{alg:crat0:sfdv0:01a} calculates $n K_2 \bmod K_4$ by
2112 subtraction, whereas Algorithm \ref{alg:crat0:sfdv0:01d} calculates
2113 $n K_2 \bmod 2^W$ by the properties of a $W$-bit counter
2114 which is allowed to roll over.
2115 \end{vworklemmaproof}
2116 %\vworklemmafooter{}
2117
2118
2119 \begin{vworklemmastatement}
2120 \label{lem:crat0:sfdv0:sprx1:03}
2121 If $\gcd(K_2, 2^W)=1$\footnote{See also footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01}
2122 in this chapter. Note also that in this context the condition $\gcd(K_2, 2^W)=1$
2123 is equivalent to the condition that $K_2$ be odd.}, the error between
2124 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01d}
2125 and the ``ideal'' mapping is always
2126 in the set
2127
2128 \begin{equation}
2129 \label{eq:lem:crat0:sfdv0:sprx1:03:01}
2130 \left[ - \frac{2^W - 1}{2^W} , 0 \right] ,
2131 \end{equation}
2132
2133 and no algorithm can be constructed to
2134 confine the error to a smaller interval.
2135 \end{vworklemmastatement}
2136 \begin{vworklemmaproof}
2137 The proof is identical to the proof of Lemma
2138 \ref{lem:crat0:sfdv0:sprc0:04} with $K_4 = 2^W$.
2139 \end{vworklemmaproof}
2140 %\vworklemmafooter{}
2141
2142 \begin{vworklemmastatement}
2143 \label{lem:crat0:sfdv0:sprx1:04}
2144 For Algorithm \ref{alg:crat0:sfdv0:01d}
2145 (Figure \ref{fig:crat0:sfdv0:01d}), once steady
2146 state has been achieved, the number of consecutive
2147 base subroutine invocations during which subroutine
2148 ``\texttt{A()}'' is executed is always in the set
2149
2150 \begin{equation}
2151 \label{eq:lem:crat0:sfdv0:sprx1:04:01}
2152 \left\{
2153 \left\lfloor \frac{K_2}{2^W - K_2} \right\rfloor ,
2154 \left\lceil \frac{K_2}{2^W - K_2} \right\rceil
2155 \right\} \cap \vworkintsetpos,
2156 \end{equation}
2157
2158 which contains one integer if $K_2/2^W \leq 1/2$ or $(2^W-K_2) \vworkdivides K_2$,
2159 or two integers otherwise.
2160
2161 Once steady state has been achieved, the number of
2162 consecutive base function invocations during which
2163 subroutine ``\texttt{A()}'' is not executed is
2164 always in the set
2165
2166 \begin{equation}
2167 \label{eq:lem:crat0:sfdv0:sprx1:04:02}
2168 \left\{
2169 \left\lfloor \frac{2^W-K_2}{K_2} \right\rfloor ,
2170 \left\lceil \frac{2^W-K_2}{K_2} \right\rceil
2171 \right\} \cap \vworkintsetpos,
2172 \end{equation}
2173
2174 which contains one integer if $K_2/2^W \geq 1/2$ or $K_2 \vworkdivides 2^W$,
2175 or two integers otherwise.
2176 \end{vworklemmastatement}
2177 \begin{vworklemmaproof}
2178 The proof is identical to the proof of Lemma
2179 \ref{lem:crat0:sfdv0:sprc0:03} with $K_4 = 2^W$.
2180 \end{vworklemmaproof}
2181 \vworklemmafooter{}
2182
2183
2184 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2185 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2186 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2187 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:02a}]
2188 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:02a}}
2189 %Section tag: PRC2
2190 \label{crat0:sfdv0:sprc2}
2191
2192 Another useful rational counting algorithm is Algorithm \ref{alg:crat0:sfdv0:02a}.
2193 At first glance, it may appear that Algorithm \ref{alg:crat0:sfdv0:02a}
2194 is qualitatively
2195 different than Algorithms \ref{alg:crat0:sfdv0:01a}
2196 and \ref{alg:crat0:sfdv0:01b}.
2197 However, as the following lemmas demonstrate, Algorithm \ref{alg:crat0:sfdv0:02a}
2198 can be easily rearranged to be in the form
2199 of Algorithm \ref{alg:crat0:sfdv0:01a}.
2200
2201 \begin{vworklemmastatement}
2202 \label{lem:crat0:sfdv0:sprc2:01}
2203 $N_{STARTUP}$, the number of invocations of the base subroutine
2204 in Algorithm \ref{alg:crat0:sfdv0:02a} before ``\texttt{A()}'' is called
2205 for the first time, is given by
2206
2207 \begin{equation}
2208 \label{eq:lem:crat0:sfdv0:sprc2:01:01}
2209 N_{STARTUP} =
2210 \left\lceil
2211 {
2212 \frac{K_3 - K_1}{K_2}
2213 }
2214 \right\rceil .
2215 \end{equation}
2216 \end{vworklemmastatement}
2217 \begin{vworklemmaproof}
2218 The value of \texttt{state} after the $n$th invocation
2219 is $K_1 + n K_2$. In order for the test in the
2220 \texttt{if()} statement to be met on the $n+1$'th invocation
2221 of the base subroutine, we require that
2222
2223 \begin{equation}
2224 \label{eq:lem:crat0:sfdv0:sprc2:01:02}
2225 K_1 + n K_2 \geq K_3
2226 \end{equation}
2227
2228 \noindent{}or equivalently that
2229
2230 \begin{equation}
2231 \label{eq:lem:crat0:sfdv0:sprc2:01:03}
2232 n \geq \frac{K_3 - K_1}{K_2} .
2233 \end{equation}
2234
2235 Solving (\ref{eq:lem:crat0:sfdv0:sprc2:01:03}) for the smallest
2236 value of $n \in \vworkintset$ which still meets the criterion
2237 yields (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}). Note that
2238 the derivation of (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}) requires
2239 that the restrictions on $K_1$ through $K_4$ documented in
2240 Figure \ref{fig:crat0:sfdv0:02a} be met.
2241 \end{vworklemmaproof}
2242
2243 \begin{vworklemmastatement}
2244 \label{lem:crat0:sfdv0:sprc2:02}
2245 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine
2246 is called, let $N_{OA}$ be the number of times the
2247 ``\texttt{A()}'' subroutine is called, let
2248 $f_I$ be the frequency of invocation of the
2249 Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine, and let
2250 $f_{OA}$ be the frequency of invocation of
2251 ``\texttt{A()}''. Then, the proportion of times the
2252 ``\texttt{A()}'' subroutine is called is given by
2253
2254 \begin{equation}
2255 \label{eq:lem:crat0:sfdv0:sprc2:02:01}
2256 \lim_{N_I\rightarrow\infty}\frac{N_{OA}}{N_I}
2257 =
2258 \frac{f_{OA}}{f_I}
2259 =
2260 \frac{K_2}{K_4 + K_2} ,
2261 \end{equation}
2262
2263 and the proportion of times the ``\texttt{B()}'' subroutine is called
2264 is given by
2265
2266 \begin{equation}
2267 \label{eq:lem:crat0:sfdv0:sprc2:02:02}
2268 \lim_{N_I\rightarrow\infty}\frac{N_{OB}}{N_I}
2269 =
2270 \frac{f_{OB}}{f_I}
2271 =
2272 1 - \frac{f_{OA}}{f_I}
2273 =
2274 \frac{K_4}{K_4 + K_2} .
2275 \end{equation}
2276 \end{vworklemmastatement}
2277 \begin{vworklemmaproof}
2278 As in Lemma \ref{} and without
2279 loss of generality, we assume for analytic
2280 convenience that $K_1=0$ and $K_3=K_4$. Note that
2281 $K_1$ and $K_3$ influence only the transient startup
2282 behavior of the algorithm.
2283
2284 It can be observed from the algorithm that once steady
2285 state is achieved, \texttt{state} will be confined to the set
2286
2287 \begin{equation}
2288 \label{eq:lem:crat0:sfdv0:sprc2:02:10}
2289 state \in \{ 0, 1, \ldots , K_4 + K_2 - 1 \} .
2290 \end{equation}
2291
2292 It is certainly possible to use results from
2293 number theory and analyze which values in the
2294 set (\ref{eq:lem:crat0:sfdv0:sprc2:02:10}) can be
2295 attained and the order in which they can be attained.
2296 However, an easier approach is to observe that
2297 Algorithm \ref{alg:crat0:sfdv0:02a}
2298 can be rearranged to take the form of
2299 rational counting Algorithm \ref{alg:crat0:sfdv0:01a}.
2300 This rearranged
2301 algorithm is presented as
2302 Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}. Note that the
2303 algorithm is rearranged only for easier analysis.
2304
2305 \begin{figure}
2306 \begin{verbatim}
2307 void base_rate_sub(void)
2308 {
2309 static unsigned int state = K1;
2310
2311 state += K2;
2312
2313 if (state >= (K4 + K2))
2314 {
2315 state -= (K4 + K2);
2316 A();
2317 }
2318 else
2319 {
2320 B();
2321 }
2322 }
2323 \end{verbatim}
2324 \caption{Algorithm \ref{alg:crat0:sfdv0:02a} Modified To Resemble Algorithm \ref{alg:crat0:sfdv0:01a}
2325 (Proof Of Lemma \ref{lem:crat0:sfdv0:sprc2:02})}
2326 \label{fig:lem:crat0:sfdv0:sprc2:02:01}
2327 \end{figure}
2328
2329 In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, the
2330 statement ``\texttt{state += K2}'' has been removed from the
2331 \texttt{else} clause and placed above the \texttt{if()} statement,
2332 and other constants have been adjusted accordingly.
2333 It can be observed that the figure
2334 is structurally identical to rational counting algorithm, except for the
2335 \texttt{else} clause (which does not affect the counting behavior) and
2336 the specific constants for testing and incrementation.
2337
2338 In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, as contrasted with
2339 Algorithm \ref{alg:crat0:sfdv0:01a}, ``$K_4 + K_2$'' takes the
2340 place of $K_4$. $\gcd(K_2, K_4 + K_2) = \gcd(K_2, K_4)$
2341 (see Lemma \cprizeroxrefhyphen\ref{lem:cpri0:gcd0:01}), so the
2342 results from
2343 \end{vworklemmaproof}
2344
2345 \begin{vworklemmastatement}
2346 \label{lem:crat0:sfdv0:sprc2:03}
2347 If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$, the error between
2348 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b}
2349 and the ``ideal'' mapping is always
2350 in the set
2351
2352 \begin{equation}
2353 \label{eq:lem:crat0:sfdv0:sprc2:03:01}
2354 \left[ - \frac{K_4 - 1}{K_4} , 0 \right] ,
2355 \end{equation}
2356
2357 and no algorithm can be constructed to
2358 confine the error to a smaller interval.
2359 \end{vworklemmastatement}
2360 \begin{vworklemmaproof}
2361 The proof is identical to Lemma \ref{lem:crat0:sfdv0:sprc0:04}.
2362 \end{vworklemmaproof}
2363
2364
2365
2366
2367
2368
2369
2370 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2371 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2372 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2373 \section{Bresenham's Line Algorithm}
2374 %Section tag: BLA0
2375 \label{crat0:sbla0}
2376
2377 \index{Bresenham's line algorithm}\emph{Bresenham's line algorithm} is a
2378 very efficient algorithm for drawing lines on devices that have
2379 a rectangular array of pixels which can be individually illuminated.
2380 Bresenham's line algorithm is efficient for small microcontrollers
2381 because it relies only
2382 on integer addition, subtraction, shifting, and comparison.
2383
2384 Bresenham's line algorithm is presented for two reasons:
2385
2386 \begin{itemize}
2387 \item The algorithm is useful for drawing lines on LCD
2388 displays and other devices typically controlled by
2389 microcontrollers.
2390 \item The algorithm is an [extremely optimized] application
2391 of the rational
2392 counting algorithms presented in this chapter.
2393 \end{itemize}
2394
2395 \begin{figure}
2396 \begin{center}
2397 \begin{huge}
2398 Figure Space Reserved
2399 \end{huge}
2400 \end{center}
2401 \caption{Raster Grid For Development Of Bresenham's Line Algorithm}
2402 \label{fig:crat0:sbla0:01}
2403 \end{figure}
2404
2405 Assume that we wish to draw a line from $(0,0)$ to $(x_f, y_f)$ on
2406 a raster device (Figure \ref{fig:crat0:sbla0:01}). For simplicity of
2407 development, assume that $y_f \leq x_f$ (i.e. that the slope $m \leq 1$).
2408
2409 For each value of $x \in \vworkintset$, the ideal value of $y$ is given
2410 by
2411
2412 \begin{equation}
2413 \label{eq:crat0:sbla0:01}
2414 y = mx = \frac{y_f}{x_f} x = \frac{y_f x}{x_f} .
2415 \end{equation}
2416
2417 \noindent{}However, on a raster device, we must usually
2418 choose an inexact pixel to illuminate, since it is typically
2419 rare that $x_f \vworkdivides y_f x$. If
2420 $x_f \vworkdivides y_f x$, then the ideal value of $y$ is
2421 an integer, and we choose to illuminate
2422 $(x, (y_f x)/x_f)$. However, if $x_f \vworknotdivides y_f x$,
2423 then we must choose either a pixel with the same y-coordinate
2424 as the previous pixel (we call this choice `D') or the pixel
2425 with a y-coordinate one greater than the previous pixel (we
2426 call this choice `U').
2427 The fractional part of the quotient
2428 $(y_f x) / x_f$ indicates whether D or U is closer to the ideal line.
2429 If $y_f x \bmod x_f \geq x_f/2$, we choose U, otherwise we choose D
2430 (note that the decision to choose U in the equality case is arbitrary).
2431
2432 Using the rational approximation techniques presented in
2433 Section \ref{crat0:sfdv0}, it is straightforward to
2434 develop an algorithm, which is presented as the code
2435 in Figure \ref{fig:crat0:sbla0:02}.
2436 Note that this code will only work if $m = y_f/x_f \leq 1$.
2437
2438 \begin{figure}
2439 \begin{verbatim}
2440 /* Draws a line from (0,0) to (x_f,y_f) on a raster */
2441 /* device. */
2442
2443 void bresenham_line(int x_f, int y_f)
2444 {
2445 int d=0; /* The modulo counter. */
2446 int x=0, y=0;
2447 /* x- and y-coordinates currently being */
2448 /* evaluated. */
2449 int d_old; /* Remembers previous value of d. */
2450
2451 plotpoint(0,0); /* Plot initial point. */
2452 while (x <= x_f)
2453 {
2454 d_old = d;
2455 d += y_f;
2456 if (d >= x_f)
2457 d -= x_f;
2458 x++;
2459 if (
2460 (
2461 (d == 0) && (d_old < x_f/2)
2462 )
2463 ||
2464 (
2465 (d >= x_f/2)
2466 &&
2467 ((d_old < x_f/2) || (d_old >= d))
2468 )
2469 )
2470 y++;
2471 plotpoint(x,y);
2472 }
2473 }
2474 \end{verbatim}
2475 \caption{First Attempt At A Raster Device Line Algorithm
2476 Using Rational Counting Techniques}
2477 \label{fig:crat0:sbla0:02}
2478 \end{figure}
2479
2480 There are a few efficiency refinements that can be made to
2481 the code in Figure \ref{fig:crat0:sbla0:02}, but overall
2482 it is a very efficient algorithm. Note that
2483 nearly all compilers will handle the integer
2484 division by two using a shift
2485 operation rather than a division.
2486
2487 We can however substantially simplify and economize the code of
2488 Figure \ref{fig:crat0:sbla0:02} by using the technique
2489 presented in Figures \ref{fig:crat0:sfdv0:fab0:03} and
2490 \ref{fig:crat0:sfdv0:fab0:04}, and this improved code is
2491 presented as Figure \ref{fig:crat0:sbla0:03}.
2492
2493 \begin{figure}
2494 \begin{verbatim}
2495 /* Draws a line from (0,0) to (x_f,y_f) on a raster */
2496 /* device. */
2497
2498 void bresenham_line(int x_f, int y_f)
2499 {
2500 int d=y_f; /* Position of the ideal line minus */
2501 /* the position of the line we are */
2502 /* drawing, in units of 1/x_f. The */
2503 /* initialization value is y_f because */
2504 /* the algorithm is looking one pixel */
2505 /* ahead in the x direction, so we */
2506 /* begin at x=1. */
2507 int x=0, y=0;
2508 /* x- and y-coordinates currently being */
2509 /* evaluated. */
2510 plotpoint(0,0); /* Plot initial point. */
2511 while (x <= x_f)
2512 {
2513 x++; /* We move to the right regardless. */
2514 if (d >= x_f/2)
2515 {
2516 /* The "U" choice. We must jump up a pixel */
2517 /* to keep up with the ideal line. */
2518 d += (y_f - x_f);
2519 y++; /* Jump up a pixel. */
2520 }
2521 else /* d < x_f/2 */
2522 {
2523 /* The "D" choice. Distance is not large */
2524 /* enough to jump up a pixel. */
2525 d += y_f;
2526 }
2527 plotpoint(x,y);
2528 }
2529 }
2530 \end{verbatim}
2531 \caption{Second Attempt At A Raster Device Line Algorithm
2532 Using Rational Counting Techniques}
2533 \label{fig:crat0:sbla0:03}
2534 \end{figure}
2535
2536 In order to understand the code of Figure \ref{fig:crat0:sbla0:03},
2537 it is helpful to view the problem in an alternate way.
2538 For any $x \in \vworkintset$, let
2539 $d$ be the distance between the position of the ideal line
2540 (characterized by $y = y_f x / x_f$) and
2541 the actual pixel which will be illuminated. It is easy to
2542 observe that:
2543
2544 \begin{itemize}
2545 \item When drawing a raster line, if one proceeds from
2546 $(x, y)$ to $(x+1, y)$ (i.e. makes the ``D'' choice),
2547 $d$ will increase by $y_f/x_f$.
2548 \item When drawing a raster line, if one proceeds from
2549 $(x,y)$ to $(x+1, y+1)$ (i.e. makes the ``U'' choice),
2550 $d$ will increase by $(y_f - x_f)/x_f$. (The increase
2551 of $y_f/x_f$ comes about because the ideal line proceeds
2552 upward from $x$ to $x+1$, while the decrease of $x_f/x_f = 1$
2553 comes about because the line being drawn jumps upward by one
2554 unit, thus tending to ``catch'' the ideal line.)
2555 \end{itemize}
2556
2557 The code of Figure \ref{fig:crat0:sbla0:03} implements the
2558 two observations above in a straightforward way. $d$ is maintained
2559 in units of $1/x_f$, and when ``U'' is chosen over ``D'' whenever
2560 the gap between the ideal line and the current row of pixels
2561 being drawn becomes too large.
2562
2563 The code in Figure \ref{fig:crat0:sbla0:03} does however contain logical
2564 and performance problems which should be corrected:
2565
2566 \begin{itemize}
2567 \item The test of $d$ against $x_f/2$ will perform as intended.
2568 For example, if $d=2$ and $x_f=5$, the test
2569 ``\texttt{d >= x\_f/2}'' in the code will evaluate true
2570 although the actual condition is false. To correct this
2571 defect, the units of $d$ should be changed from
2572 $1/x_f$ to $1/(2 x_f)$.
2573 \item The quantity $y_f - x_f$ is calculated repeatedly. This
2574 calculation should be moved out of the \emph{while()} loop.
2575 \item The test against $x_f$ may be more economical if changed to
2576 a test against 0 (but this requires a different initialization
2577 assignment for $d$).
2578 \end{itemize}
2579
2580 Figure \ref{fig:crat0:sbla0:04} corrects these defects
2581 from Figure \ref{fig:crat0:sbla0:03}.
2582 Figure \ref{fig:crat0:sbla0:04} is essentially the Bresenham
2583 line algorithm, except that it only draws starting from the
2584 origin and will only draw a line with a slope
2585 $m = y_f/x_f \leq 1$.
2586
2587 \begin{figure}
2588 \begin{verbatim}
2589 /* Draws a line from (0,0) to (x_f,y_f) on a raster */
2590 /* device. */
2591
2592 void bresenham_line(int x_f, int y_f)
2593 {
2594 int d = 2 * y_f - x_f;
2595 /* Position of the ideal line minus */
2596 /* the position of the line we are */
2597 /* drawing, in units of 1/(2 * x_f). */
2598 /* Initialization value of 2 * y_f is */
2599 /* because algorithm is looking one */
2600 /* pixel ahead. Value of -x_f is from */
2601 /* shifting the midpoint test (the */
2602 /* "if" statement below) downward to a */
2603 /* test against zero. */
2604 int dD = 2 * y_f;
2605 int dU = dD - x_f;
2606 /* Amounts to add to d if "D" and "U" */
2607 /* pixels are chosen, respectively. */
2608 /* Calculated here outside of loop. */
2609 int x=0, y=0;
2610 /* x- and y-coordinates currently being */
2611 /* evaluated. */
2612 plotpoint(0,0); /* Plot initial point. */
2613 while (x <= x_f)
2614 {
2615 x++; /* We move to the right regardless. */
2616 if (d >= 0)
2617 {
2618 /* The "U" choice. We must jump up a pixel */
2619 /* to keep up with the ideal line. */
2620 d += dU;
2621 y++; /* Jump up a pixel. */
2622 }
2623 else /* d < 0 */
2624 {
2625 /* The "D" choice. Distance is not large */
2626 /* enough to jump up a pixel. */
2627 d += dD;
2628 }
2629 plotpoint(x,y);
2630 }
2631 }
2632 \end{verbatim}
2633 \caption{Third Attempt At A Raster Device Line Algorithm
2634 Using Rational Counting Techniques}
2635 \label{fig:crat0:sbla0:04}
2636 \end{figure}
2637
2638
2639 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2640 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2641 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2642 \section{Authors And Acknowledgements}
2643 %Section tag: ACK0
2644 This chapter was primarily written by
2645 \index{Ashley, David T.} David T. Ashley
2646 \cite{bibref:i:daveashley}.
2647
2648 We would like to gratefully acknowledge the assistance of
2649 \index{Falconer, Chuck B.} Chuck B. Falconer \cite{bibref:i:chuckbfalconer},
2650 \index{Hoffmann, Klaus} Klaus Hoffmann \cite{bibref:i:klaushoffmann},
2651 \index{Larkin, John} John Larkin \cite{bibref:i:johnlarkin},
2652 \index{Smith, Thad} Thad Smith \cite{bibref:i:thadsmith},
2653 and
2654 \index{Voipio, Tauno} Tauno Voipio \cite{bibref:i:taunovoipio}
2655 for insight into rational counting approaches, contributed via the
2656 \texttt{sci.math} \cite{bibref:n:scimathnewsgroup}
2657 and
2658 \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded}
2659 newsgroups.
2660
2661
2662 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2663 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2664 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2665 \section{Exercises}
2666 %Section tag: EXE0
2667
2668 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2669 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2670 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2671 \subsection[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation]
2672 {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation}
2673
2674 \begin{vworkexercisestatement}
2675 \label{exe:crat0:sexe0:a01}
2676 Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03})
2677 and (\ref{eq:crat0:shqq0:dph0:07}) if $r_A$ is chosen
2678 without rounding, i.e.
2679 $h=\lfloor r_I 2^q \rfloor$ and therefore
2680 $r_A=\lfloor r_I 2^q \rfloor/2^q$.
2681 \end{vworkexercisestatement}
2682 \vworkexercisefooter{}
2683
2684 \begin{vworkexercisestatement}
2685 \label{exe:crat0:sexe0:a02}
2686 Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03})
2687 and (\ref{eq:crat0:shqq0:dph0:07}) if
2688 $z$ is chosen for rounding with the midpoint case rounded
2689 down, i.e. $z=2^{q-1}-1$, and applied as in
2690 (\ref{eq:crat0:sint0:01}).
2691 \end{vworkexercisestatement}
2692 \vworkexercisefooter{}
2693
2694 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2695 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2696 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2697 \subsection{Rational Counting}
2698
2699
2700 \begin{vworkexercisestatement}
2701 \label{exe:crat0:sexe0:01}
2702 For Algorithm \ref{alg:crat0:sfdv0:01a},
2703 assume that one chooses $K_1 > K_3 - K_2$ (in contradiction to the
2704 restrictions in Figure \ref{fig:crat0:sfdv0:01a}).
2705 Derive a result similar to Lemma \ref{lem:crat0:sfdv0:sprc0:01}
2706 for the number of base subroutine invocations in which
2707 ``\texttt{A()}'' is run before it is
2708 \emph{not} run for the first time.
2709 \end{vworkexercisestatement}
2710 \vworkexercisefooter{}
2711
2712 \begin{vworkexercisestatement}
2713 \label{exe:crat0:sexe0:02}
2714 This will be the $\epsilon$ lemma proof.
2715 \end{vworkexercisestatement}
2716 \vworkexercisefooter{}
2717
2718 \begin{vworkexercisestatement}
2719 \label{exe:crat0:sexe0:03}
2720 Rederive appropriate results similar to
2721 Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
2722 $\gcd(K_2, K_4) > 1$.
2723 \end{vworkexercisestatement}
2724 \vworkexercisefooter{}
2725
2726 \begin{vworkexercisestatement}
2727 \label{exe:crat0:sexe0:04}
2728 Rederive appropriate results similar to
2729 Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
2730 $K_1 \neq 0$.
2731 \end{vworkexercisestatement}
2732 \vworkexercisefooter{}
2733
2734 \begin{vworkexercisestatement}
2735 \label{exe:crat0:sexe0:05}
2736 Rederive appropriate results similar to
2737 Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where
2738 $\gcd(K_2, K_4) > 1$ and $K_1 \neq 0$.
2739 \end{vworkexercisestatement}
2740 \vworkexercisefooter{}
2741
2742 \begin{vworkexercisestatement}
2743 \label{exe:crat0:sexe0:06}
2744 For Lemma \ref{lem:crat0:sfdv0:sprc0:04},
2745 complete the missing proof:
2746 show that if $\gcd(K_2, K_4) = 1$, no algorithm can
2747 lead to a tighter bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}).
2748 \textbf{Hint:} start with the observation
2749 that with
2750 $\gcd(K_2, K_4) = 1$, $n K_2 \bmod K_4$ will attain every value in
2751 the set $\{ 0, \ldots , K_4-1 \}$.
2752 \end{vworkexercisestatement}
2753 \vworkexercisefooter{}
2754
2755 \begin{vworkexercisestatement}
2756 \label{exe:crat0:sexe0:07}
2757 For Lemma \ref{lem:crat0:sfdv0:sprc0:03},
2758 show that if $K_1=0$, the number of initial invocations
2759 of the base subroutine before ``\texttt{A()}'' is first
2760 called is in the set specified in
2761 (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}).
2762 \end{vworkexercisestatement}
2763 \vworkexercisefooter{}
2764
2765 \begin{vworkexercisestatement}
2766 \label{exe:crat0:sexe0:08}
2767 Develop other techniques to correct the state upset vulnerability
2768 of Algorithm \ref{alg:crat0:sfdv0:01a} besides
2769 the technique illustrated in
2770 Figure \ref{fig:crat0:sfdv0:sprc0:01}.
2771 \end{vworkexercisestatement}
2772 \vworkexercisefooter{}
2773
2774 \begin{vworkexercisestatement}
2775 \label{exe:crat0:sexe0:09}
2776 Show for Example \ref{ex:crat0:sfdv0:sprc1:01} that integers of at least
2777 27 bits are required.
2778 \end{vworkexercisestatement}
2779 \vworkexercisefooter{}
2780
2781 \begin{vworkexercisestatement}
2782 \label{exe:crat0:sexe0:10}
2783 Prove Lemma \ref{lem:crat0:sfdv0:sprc1:02}.
2784 \end{vworkexercisestatement}
2785 \vworkexercisefooter{}
2786
2787 \begin{vworkexercisestatement}
2788 \label{exe:crat0:sexe0:12}
2789 Prove Lemma \ref{lem:crat0:sfdv0:sprc1:04}.
2790 \end{vworkexercisestatement}
2791 \vworkexercisefooter{}
2792
2793 \begin{vworkexercisestatement}
2794 \label{exe:crat0:sexe0:13}
2795 Define the term \emph{steady state} as used in
2796 Lemma \ref{lem:crat0:sfdv0:sprc1:04} in terms of
2797 set membership of the \texttt{state} variable.
2798 \end{vworkexercisestatement}
2799 \vworkexercisefooter{}
2800
2801 \begin{vworkexercisestatement}
2802 \label{exe:crat0:sexe0:14}
2803 For Algorithm \ref{alg:crat0:sfdv0:01a}, devise examples of anomalous behavior due to
2804 race conditions that may occur if $K_2$ and/or $K_4$ are set in a process
2805 which is asynchronous with respect to the process which implements the
2806 rational counting algorithm if mutual exclusion protocol is not
2807 implemented.
2808 \end{vworkexercisestatement}
2809 \vworkexercisefooter{}
2810
2811 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2812 \vfill
2813 \noindent\begin{figure}[!b]
2814 \noindent\rule[-0.25in]{\textwidth}{1pt}
2815 \begin{tiny}
2816 \begin{verbatim}
2817 $RCSfile: c_rat0.tex,v $
2818 $Source: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_rat0/c_rat0.tex,v $
2819 $Revision: 1.28 $
2820 $Author: dtashley $
2821 $Date: 2004/02/22 19:27:48 $
2822 \end{verbatim}
2823 \end{tiny}
2824 \noindent\rule[0.25in]{\textwidth}{1pt}
2825 \end{figure}
2826
2827 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2828 % $Log: c_rat0.tex,v $
2829 % Revision 1.28 2004/02/22 19:27:48 dtashley
2830 % Edits.
2831 %
2832 % Revision 1.27 2004/02/22 15:01:53 dtashley
2833 % Edits.
2834 %
2835 % Revision 1.26 2003/12/06 17:48:49 dtashley
2836 % Final edits before move back to SourceForge.
2837 %
2838 % Revision 1.25 2003/04/08 01:21:16 dtashley
2839 % Checkin after major ripup to mechanism for documenting algorithms.
2840 %
2841 % Revision 1.24 2003/04/07 09:38:23 dtashley
2842 % Safety checkin before major tearup with algorithms.
2843 %
2844 % Revision 1.23 2003/04/04 04:05:40 dtashley
2845 % Safety checkin before another major edit.
2846 %
2847 % Revision 1.22 2003/04/03 19:49:36 dtashley
2848 % Global corrections to typeface of "gcd" made as per Jan-Hinnerk Reichert's
2849 % recommendation.
2850 %
2851 % Revision 1.21 2003/04/03 19:33:13 dtashley
2852 % Substantial edits. Safety checkin. Preparing to make corrections to
2853 % gcd typeface pointed out my Jan-Hinnerk Reichert.
2854 %
2855 % Revision 1.20 2003/04/02 08:21:16 dtashley
2856 % Substantial edits, safety checkin.
2857 %
2858 % Revision 1.19 2003/03/30 05:37:20 dtashley
2859 % Evening safety checkin. Substantial edits.
2860 %
2861 % Revision 1.18 2003/03/28 07:24:16 dtashley
2862 % Safety checkin, substantial edits.
2863 %
2864 % Revision 1.17 2003/03/25 05:31:40 dtashley
2865 % Substantial edits, safety checkin.
2866 %
2867 % Revision 1.16 2003/03/21 06:34:54 dtashley
2868 % Major revisions. Safety checkin.
2869 %
2870 % Revision 1.15 2003/03/18 06:20:48 dtashley
2871 % Substantial edits, safety checkin.
2872 %
2873 % Revision 1.14 2003/03/13 06:28:36 dtashley
2874 % Substantial progress, edits.
2875 %
2876 % Revision 1.13 2003/03/08 04:11:19 dtashley
2877 % Friday evening safety checkin.
2878 %
2879 % Revision 1.12 2003/03/05 02:37:34 dtashley
2880 % Safety checkin before major edits.
2881 %
2882 % Revision 1.11 2003/03/03 23:50:44 dtashley
2883 % Substantial edits. Safety checkin.
2884 %
2885 % Revision 1.10 2002/04/27 00:21:04 dtashley
2886 % Substantial edits--preparing for review.
2887 %
2888 % Revision 1.9 2002/04/26 03:47:22 dtashley
2889 % Substantial edits.
2890 %
2891 % Revision 1.8 2002/04/23 02:58:53 dtashley
2892 % Edits.
2893 %
2894 % Revision 1.7 2002/04/22 07:27:32 dtashley
2895 % Preparing to work on desktop computer again.
2896 %
2897 % Revision 1.6 2002/04/22 04:47:30 dtashley
2898 % Preparing to work on laptop.
2899 %
2900 % Revision 1.5 2002/04/22 02:11:54 dtashley
2901 % Preparing to resume work on desktop.
2902 %
2903 % Revision 1.4 2002/04/22 00:14:56 dtashley
2904 % Edits before resuming work on desktop.
2905 %
2906 % Revision 1.3 2002/04/21 23:05:09 dtashley
2907 % Version control information straightened out.
2908 %
2909 %End of file C_RAT0.TEX

dashley@gmail.com
ViewVC Help
Powered by ViewVC 1.1.25