# Contents of /pubs/books/ucbka/trunk/c_rat0/c_rat0.tex

Wed Aug 14 23:10:36 2019 UTC (3 years, 5 months ago) by dashley
File MIME type: application/x-tex
File size: 105593 byte(s)
Change keyword substitution (migration from cvs to svn).

 1 %$Header$ 2 3 \chapter{Rational Linear Approximation} 4 5 \label{crat0} 6 7 \beginchapterquote{Die ganzen Zahlen hat der liebe Gott gemacht, 8 alles andere ist Menschenwerk.''\footnote{German 9 language: God made the integers; everything 10 else was made by man.}} 11 {Leopold Kronecker} 12 13 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 14 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 15 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 16 \section{Introduction} 17 %Section tag: INT0 18 \label{crat0:sint0} 19 20 In this chapter, we consider practical applications of 21 rational approximation. 22 Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0} 23 have presented algorithms for finding 24 the closest rational numbers to an arbitrary real number, 25 subject to constraints on the numerator and denominator. 26 The basis of these algorithms is complex and comes from number theory, and so 27 these algorithms and their basis have been presented in separate chapters. 28 29 In Section \ref{crat0:srla0}, rational linear approximation itself 30 and associated error bounds are presented. By \emph{rational linear 31 approximation} we mean simply the approximation of a line 32 $y = r_I x$ ($y, r_I, x \in \vworkrealset$) by a line 33 34 35 \label{eq:crat0:sint0:01} 36 y = \left\lfloor 37 \frac{h \lfloor x \rfloor + z}{k} 38 \right\rfloor , 39 40 41 \noindent{}where we choose $h/k \approx r_I$ and optionally choose $z$ to 42 shift the error introduced. Note that (\ref{eq:crat0:sint0:01}) is 43 very economical for microcontroller instruction sets, since only integer 44 arithmetic is required. We may choose $h/k$ from a Farey series (see 45 Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}), or 46 we may choose a ratio $h/2^q$ so that the division in (\ref{eq:crat0:sint0:01}) 47 can be implemented 48 by a bitwise right shift. 49 50 Section \ref{crat0:srla0} discusses linear rational approximation 51 in general, with a special eye on error analysis. 52 53 Section \ref{crat0:spwi0} discusses piecewise linear rational approximation, 54 which is the approximation of a curve or complex mapping by a 55 number of joined line segments. 56 57 Section \ref{crat0:sfdv0} discusses frequency division and rational counting. 58 Such techniques share the same mathematical framework as rational linear 59 approximation, and as with rational linear approximation the ratio 60 involved may be chosen from a Farey series or with a denominator of $2^q$, depending 61 on the algorithm employed. 62 63 Section \ref{crat0:sbla0} discusses Bresenham's classic line algorithm, 64 which is a practical application of rational linear approximation. 65 66 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 67 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 68 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 69 \section{Rational Linear Approximation} 70 %Section tag: RLA0 71 \label{crat0:srla0} 72 73 It occurs frequently in embedded software design that one wishes to 74 implement a linear scaling from a domain to a range of the form 75 76 77 \label{eq:crat0:srla0:01} 78 f(x) = r_I x , 79 80 81 \noindent{}where $r_I$ is the \emph{ideal} 82 83 84 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 85 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 86 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 87 \subsection{Model Functions} 88 %Section tag: mfu0 89 \label{crat0:srla0:smfu0} 90 91 In general, we seek to approximate the ideal function 92 93 94 \noindent{}by some less ideal function where 95 96 \begin{itemize} 97 \item $r_A \neq r_I$, although we seek to choose $r_A \approx r_I$. 98 \item The input to the function, $x$, may already contain 99 quantization error. 100 \item Although $r_I x \in \vworkrealsetnonneg$, we must choose 101 an integer as the function output. 102 \end{itemize} 103 104 In modeling quantization error, we use the floor function\index{floor function} 105 ($\lfloor\cdot\rfloor$) 106 for algebraic simplicity. The floor function precisely 107 describes the behavior of integer division instructions (where 108 remainders are discarded), but may not describe other sources of 109 quantization, such as quantization that occurs in A/D conversion. 110 However, techniques identical to those presented in this 111 section may be used when quantization is not best described 112 by the floor function, and these results are left to the reader. 113 114 Traditionally, because addition of integers is an inexpensive 115 machine operation, a parameter $z \in \vworkintset$ may optionally 116 be added to the product $hx$ in order to round or otherwise 117 shift the result. 118 119 If $x$ is assumed to be without error, the ideal function is 120 given by (\ref{eq:crat0:srla0:smfu0:01}), whereas the function 121 that can be economically implemented is 122 123 124 \label{eq:crat0:srla0:smfu0:02} 125 g(x) = \left\lfloor \frac{hx + z}{k} \right\rfloor 126 = 127 \left\lfloor r_A x + \frac{z}{k} \right\rfloor . 128 129 130 If, on the other hand, $x$ may be already quantized, 131 the function that can actually be implemented is 132 133 134 \label{eq:crat0:srla0:smfu0:03} 135 h(x) = \left\lfloor \frac{h \lfloor x \rfloor + z}{k} \right\rfloor 136 = 137 \left\lfloor r_A \lfloor x \rfloor + \frac{z}{k} \right\rfloor . 138 139 140 141 142 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 143 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 144 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 145 \section[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation] 146 {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation} 147 %Section tag: HQQ0 148 \label{crat0:shqq0} 149 150 \index{h/2q@$h/2^q$ rational linear approximation} 151 \index{rational linear approximation!h/2q@$h/2^q$} 152 The algorithms presented in 153 Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0} 154 will always provide the rational number $h/k$ closest to 155 an arbitrary real number $r_I$ subject to the constraints 156 $h \leq h_{MAX}$ and $k \leq k_{MAX}$. 157 158 However, because shifting in order 159 to implement multiplication or division by a power of 2 160 is at least as fast (and often \emph{much} faster) 161 on all processors as arbitrary multiplication or division, 162 and because not all processors have multiplication and division instructions, 163 it is worthwhile to examine choosing $h/k$ so that either $h$ or $k$ are 164 powers of 2. 165 166 There are thus three rational linear approximation techniques to be 167 examined: 168 169 \begin{enumerate} 170 \item \emph{$h/k$ rational linear approximation}, in which an arbitrary 171 $h \leq h_{MAX}$ and an arbitrary $k \leq k_{MAX}$ are used, 172 with $r_A = h/k$. $h$ and $k$ can be chosen using the algorithms 173 presented in Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}. 174 Implementation of this technique would most often involve a single integer 175 multiplication instruction to form the product $hx$, followed by an optional single 176 addition instruction to form the sum $hx+z$, and then 177 followed by by a single division instruction 178 to form the quotient $\lfloor (hx+z)/k \rfloor$. Implementation may also less commonly involve 179 multiplication, addition, and division of operands too large to be processed 180 with single machine instructions. 181 \item \emph{$h/2^q$ rational linear approximation}, in which an arbitrary 182 $h \leq h_{MAX}$ and an integral power of two $k=2^q$ are used, with 183 $r_A = h/2^q$. 184 Implementation of this technique would most often involve a single integer 185 multiplication instruction to form the product $hx$, followed by an optional single 186 addition instruction to form the sum $hx+z$, and then 187 followed by right shift instruction(s) 188 to form the quotient $\lfloor (hx+z)/2^q \rfloor$. Implementation may also less commonly involve 189 multiplication, addition, and right shift of operands too large to be processed 190 with single machine instructions. 191 \item \emph{$2^q/k$ rational linear approximation}, in which an integral 192 power of two $h=2^q$ and an arbitrary $k \leq k_{MAX}$ are used, with 193 $r_A = 2^q/k$. 194 Implementation of this technique would most often involve left shift 195 instruction(s) to form the product $2^qx$, followed by an optional single 196 addition instruction to form the sum $2^qx+z$, and then 197 followed by a single division instruction to form 198 the quotient $\lfloor (2^qx+z)/k \rfloor$. Implementation may also less 199 commonly involve 200 left shift, addition, and division of operands too large to be processed 201 with single machine instructions. 202 \end{enumerate} 203 204 We use the nomenclature \emph{$h/k$ rational linear approximation}'', 205 \emph{$h/2^q$ rational linear approximation}'', and 206 \emph{$2^q/k$ rational linear approximation}'' to identify the three 207 techniques enumerated above. 208 209 210 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 211 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 212 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 213 \subsection{Integer Arithmetic and Processor Instruction Set Characteristics} 214 %Subsection tag: pis0 215 \label{crat0:shqq0:pis0} 216 217 The following observations about integer arithmetic and about processors 218 used in embedded control can be made: 219 220 \begin{enumerate} 221 \item \label{enum:crat0:shqq0:pis0:01:01a} 222 \emph{Shifting is the fastest method of integer multiplication or division 223 (by $2^q$ only), 224 followed by utilization of the processor multiplication or division instructions (for arbitrary 225 operands), 226 followed by software implementation of multiplication or division (for arbitrary operands).} 227 Relative costs vary depending on the processor, but the monotonic 228 ordering always holds. 229 $h/2^q$ and $2^q/k$ rational linear 230 approximation are thus worthy of investigation. (Note also that in many practical 231 applications of $h/2^q$ and $2^q/k$ rational linear approximation, 232 the required shift is performed by 233 addressing the operand with an offset, 234 and so has no cost.) 235 \item \label{enum:crat0:shqq0:pis0:01:01b} 236 \emph{Shifting is $O(N)$ (where $N$ is the number of bits in the argument), 237 but both 238 multiplication and division are $O(N^2)$ for 239 practical\footnote{\index{Karatsuba multiplication}Karatsuba 240 multiplication, for example, is 241 $O(N^{\log_2 3}) \approx O(N^{1.58}) \ll O(N^2)$. However, Karatsuba 242 multiplication cannot be applied economically to the small 243 operands that typically occur in embedded control work. It would 244 be rare in embedded control applications 245 for the length of a multiplication operand to exceed four 246 times the length that is accommodated by a machine instruction; and this 247 is far below the threshold at which Karatsuba multiplication is 248 economical. Thus, for all intents and purposes in embedded control work, 249 multiplication is $O(N^2)$.} operands (where 250 $N$ is the number of bits in each 251 operand).} It follows that $2^q/k$ and $h/2^q$ rational 252 linear approximation 253 will scale to large operands better than $h/k$ rational linear approximation. 254 \item \label{enum:crat0:shqq0:pis0:01:02a} 255 \emph{Integer division instructions take as long or longer than 256 integer multiplication instructions.} In designing digital logic 257 to implement basic integer arithmetic, division is the operation most difficult 258 to perform economically.\footnote{For some processors, the penalty is extreme. 259 For example, on the NEC V850 (a RISC processor), 260 a division requires 36 clock cycles, 261 whereas multiplication, addition, and subtraction each effectively 262 require 1 clock cycle.} 263 It follows that multiplication using operands that exceed the machine's word size 264 is often far less expensive than division using operands that exceed the 265 machine's word size. 266 \item \label{enum:crat0:shqq0:pis0:01:03a} 267 \emph{All processors that have an integer division instruction also 268 have an integer multiplication instruction.} 269 Phrased 270 differently, no processor has an integer division instruction but no 271 integer multiplication instruction. 272 \end{enumerate} 273 274 Enumerated items 275 (\ref{enum:crat0:shqq0:pis0:01:01a}) through 276 (\ref{enum:crat0:shqq0:pis0:01:03a}) above lead to the following conclusions. 277 278 \begin{enumerate} 279 \item $h/2^q$ rational linear approximation is likely to be implementable 280 more efficiently on most processors than $h/k$ rational linear approximation. 281 (\emph{Rationale:} shift instruction(s) or accessing a 282 memory address with an offset 283 is 284 likely to be more economical than division, particularly if $k$ would exceed 285 the native 286 operand size of the processor.) 287 \item $h/2^q$ rational linear approximation is likely to be a more useful 288 technique than $2^q/k$ rational linear approximation. 289 (\emph{Rationale:} the generally high cost of division compared to 290 multiplication, and the existence of processors that possess a multiplication 291 instruction but no division instruction.) 292 \end{enumerate} 293 294 295 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 296 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 297 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 298 \subsection[Design Procedure For \protect\mbox{\protect$h/2^q$} Rational Linear Approximations] 299 {Design Procedure For \protect\mbox{\protect\boldmath$h/2^q$} Rational Linear Approximation} 300 %Subsection tag: dph0 301 \label{crat0:shqq0:dph0} 302 303 An $h/2^q$ rational linear approximation is parameterized by: 304 305 \begin{itemize} 306 \item The unsigned or signed nature of $h$ and $x$. (Rational linear approximations 307 may involve either signed or unsigned domains and ranges. Furthermore, 308 signed integers may be maintained using either 2's-complement 309 or sign-magnitude representation, and the processor instruction set 310 may or may not directly support signed multiplication.) 311 \item $r_I$, the real number we wish to approximate by $r_A = h/2^q$. 312 \item $x_{MAX}$, the maximum possible value of the input argument $x$. (Typically, 313 software contains a test to clip the output if $x > x_{MAX}$.) 314 \item $w_h$, the width in bits allowed for $h$. (Typically, $w_h$ is 315 the maximum operand size of a machine multiplication instruction.) 316 \item $w_r$, the width in bits allowed for the result $hx$. (Typically, 317 $w_r$ is the maximum result size of a machine multiplication instruction.) 318 \item The rounding mode when choosing $h$ (and thus effectively $r_A$) 319 based on $r_I$. It is common to choose the 320 closest value, 321 $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$ 322 or 323 $r_A=\lceil r_I 2^q - 1/2 \rceil/2^q$, 324 but other choices are possible. 325 \item The rounding mode for the result (i.e. the choice of $z$ in 326 Eq. \ref{eq:crat0:sint0:01}). 327 \end{itemize} 328 329 This section develops a design procedure for $h/2^q$ rational linear 330 approximations with the most typical set of assumptions: unsigned arithmetic, 331 $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$, 332 and $z=0$. Design procedures for other scenarios are presented as exercises. 333 334 By definition, $h$ is constrained in two ways: 335 336 337 \label{eq:crat0:shqq0:dph0:00} 338 h \leq 2^{w_h} - 1 339 340 341 \noindent{}and 342 343 344 \label{eq:crat0:shqq0:dph0:01} 345 h \leq \frac{2^{w_r} - 1}{x_{MAX}} . 346 347 348 \noindent{}(\ref{eq:crat0:shqq0:dph0:00}) comes directly from the 349 requirement that $h$ fit in $w_h$ bits. 350 (\ref{eq:crat0:shqq0:dph0:01}) comes directly from the requirement 351 that $hx$ fit in $w_r$ bits. 352 (\ref{eq:crat0:shqq0:dph0:00}) and (\ref{eq:crat0:shqq0:dph0:01}) 353 may be combined to form one inequality: 354 355 356 \label{eq:crat0:shqq0:dph0:02} 357 h \leq \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) . 358 359 360 If $q$ is known, the choice of $h$ that will be made so as to minimize 361 $|r_A-r_I| = |h/2^q - r_I|$ is 362 363 364 \label{eq:crat0:shqq0:dph0:03} 365 h=\left\lfloor r_I 2^q + \frac{1}{2} \right\rfloor . 366 367 368 \noindent{}It is required that the choice of $h$ specified by 369 (\ref{eq:crat0:shqq0:dph0:03}) meet 370 (\ref{eq:crat0:shqq0:dph0:02}). Making the most pessimistic 371 assumption about the rounding of $h$ and substituting into 372 (\ref{eq:crat0:shqq0:dph0:02}) leads to 373 374 375 \label{eq:crat0:shqq0:dph0:04} 376 r_I 2^q + \frac{1}{2} 377 \leq 378 \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) . 379 380 381 \noindent{}Isolating $q$ in (\ref{eq:crat0:shqq0:dph0:04}) 382 yields 383 384 385 \label{eq:crat0:shqq0:dph0:05} 386 2^q 387 \leq 388 \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}} 389 {r_I}. 390 391 392 \noindent{}Solving 393 (\ref{eq:crat0:shqq0:dph0:05}) 394 for maximum value of $q$ that meets the constraint yields 395 396 397 \label{eq:crat0:shqq0:dph0:06} 398 q= 399 \left\lfloor 400 { 401 \log_2 402 \left( 403 { 404 \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I} 405 } 406 \right) 407 } 408 \right\rfloor . 409 410 411 \noindent{}(\ref{eq:crat0:shqq0:dph0:06}) 412 can be rewritten for easier calculation using most calculators (which do 413 not allow the direct evaluation of base-2 logarithms): 414 415 416 \label{eq:crat0:shqq0:dph0:07} 417 q= 418 \left\lfloor 419 \frac 420 { 421 { 422 \ln 423 \left( 424 { 425 \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I} 426 } 427 \right) 428 } 429 } 430 {\ln 2} 431 \right\rfloor . 432 433 434 \noindent{}Once $q$ is established using (\ref{eq:crat0:shqq0:dph0:07}), 435 $h$ can be calculated using (\ref{eq:crat0:shqq0:dph0:03}). 436 437 In embedded control work (as well as in operating system internals), 438 $h/2^q$ rational linear approximations are often used in conjunction with 439 tabulated constants or calibratable parameters 440 where each constant or calibratable parameter may vary over a range of 441 $[0, r_I]$, and where $r_I$ is the value used in the design procedure 442 presented above. In these applications, the values of $h$ are 443 tabulated, but $q$ is invariant (usually hard-coded) 444 and is chosen at design time based on the upper bound $r_I$ 445 of the interval $[0, r_I]$ in which each tabulated constant or calibratable 446 parameter will fall. With $q$ fixed, 447 $r_A$ can be adjusted in steps of $1/2^q$. 448 449 If $r_I$ is invariant, a final design step may be to reduce the rational 450 number $h/2^q$ by dividing some or all occurrences of 2 as a factor from both the 451 numerator and denominator. With some processors and in some applications, this 452 may save execution time by reducing the number of shift instructions that 453 must be executed, reducing the execution time of the shift instructions 454 that are executed, or allowing shifting via offset addressing. 455 For example, on a byte-addressible machine, if the design procedure 456 yields $h=608$ and $q=10$, it may be desirable to divide both $h$ and $2^q$ by 4 to 457 yield $h=152$ and $q=8$, as this allows the shift by 8 to be done by fetching 458 alternate bytes (rather than by actual shifting). In other applications, it may 459 be desirable to remove \emph{all} occurrences of 2 as a prime factor 460 from $h$. 461 462 For an invariant $r_I$, a suitable design procedure is: 463 464 \begin{enumerate} 465 \item Choose $q$ using (\ref{eq:crat0:shqq0:dph0:07}). 466 \item With $q$ fixed, choose $h$ using (\ref{eq:crat0:shqq0:dph0:03}). 467 \item If economies can be achieved on the target processor, 468 examine the possibility of removing some or all occurrences 469 of 2 as a prime factor from $h$ and decreasing $q$. 470 \end{enumerate} 471 472 For tabulated or calibratable constants in the 473 interval $[0,r_I]$, a suitable design procedure is to use the 474 procedure presented immediately above but without the third step. 475 Each tabulated value of $h$ is chosen using (\ref{eq:crat0:shqq0:dph0:03}). 476 477 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 478 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 479 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 480 \subsection[Design Procedure For \protect\mbox{\protect$2^q/k$} Rational Linear Approximations] 481 {Design Procedure For \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation} 482 %Subsection tag: dpk0 483 \label{crat0:shqq0:dpk0} 484 485 TBD. 486 487 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 488 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 489 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 490 \section{Piecewise Rational Linear Approximation} 491 %Section tag: PWI0 492 \label{crat0:spwi0} 493 494 TBD. 495 496 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 497 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 498 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 499 \section[Frequency Division And Rational Counting] 500 {Frequency Division And Rational Counting Techniques} 501 %Section tag: FDV0 502 \label{crat0:sfdv0} 503 504 \index{frequency division}\index{rational counting}\index{counting}% 505 Often, software must divide down'' an execution rate. For example, 506 an interrupt service routine may be scheduled by hardware every 507 10ms, but may perform useful processing only every 50ms. This requires 508 that the ISR maintain a counter and only perform useful processing 509 every fifth invocation. This section deals with counting strategies 510 used to achieve invocation frequency division and other similar results. 511 512 Frequency division and 513 rational counting techniques presented in this section find application 514 primarily in the following scenarios: 515 516 \begin{itemize} 517 \item ISRs and other software components which must divide down 518 their invocation rate. 519 \item Pulse counting and scaling from encoders and other 520 similar systems. 521 \item The correction of inaccuracies in timebases (such as crystals 522 which oscillate at a frequency different than the 523 nominal rate). 524 \end{itemize} 525 526 Because the techniques presented must be usable with inexpensive 527 microcontrollers, such techniques must meet these constraints: 528 529 \begin{enumerate} 530 \item \label{enum:01:crat0:sfdv0:econex} 531 The counting techniques must be economical to execute on 532 an inexpensive microcontroller. 533 \item \label{enum:01:crat0:sfdv0:econcccalc} 534 An inexpensive microcontroller must be capable of calculating any 535 constants used as limits in counting (i.e. it cannot necessarily 536 be assumed that a more powerful computer calculates these constants, 537 and it cannot be assumed that these limits do not change on the fly). 538 \end{enumerate} 539 540 In this section, we analyze the behavior of several types of 541 rational counting algorithms, supplied as Algorithms 542 \ref{alg:crat0:sfdv0:01a} 543 through 544 \ref{alg:crat0:sfdv0:02a}. 545 546 \begin{algorithm} 547 \begin{verbatim} 548 /* The constants K1 through K4, which parameterize the */ 549 /* counting behavior, are assumed assigned elsewhere in */ 550 /* the code. The solution is analyzed in terms of the */ 551 /* parameters K1 through K4. */ 552 /* */ 553 /* We also place the following restrictions on K1 through */ 554 /* K4: */ 555 /* K1 : K1 <= K3 - K2. */ 556 /* K2 : K4 > K2 > 0. */ 557 /* K3 : No restrictions. */ 558 /* K4 : K4 > K2 > 0. */ 559 560 void base_rate_sub(void) 561 { 562 static int state = K1; 563 564 state += K2; 565 566 if (state >= K3) 567 { 568 state -= K4; 569 A(); 570 } 571 } 572 \end{verbatim} 573 \caption{Rational Counting Algorithm For $K_2/K_4 < 1$} 574 \label{alg:crat0:sfdv0:01a} 575 \end{algorithm} 576 577 \begin{algorithm} 578 \begin{verbatim} 579 /* The constants K1 through K4, which parameterize the */ 580 /* counting behavior, are assumed assigned elsewhere in */ 581 /* the code. The solution is analyzed in terms of the */ 582 /* parameters K1 through K4. */ 583 /* */ 584 /* We also place the following restrictions on K1 through */ 585 /* K4: */ 586 /* K1 : K1 <= K3 - K2. */ 587 /* K2 : K2 > 0. */ 588 /* K3 : No restrictions. */ 589 /* K4 : K4 > 0. */ 590 591 void base_rate_sub(void) 592 { 593 static int state = K1; 594 595 state += K2; 596 597 while (state >= K3) 598 { 599 state -= K4; 600 A(); 601 } 602 } 603 \end{verbatim} 604 \caption{Rational Counting Algorithm For $K_2/K_4 \geq 1$} 605 \label{alg:crat0:sfdv0:01b} 606 \end{algorithm} 607 608 \begin{algorithm} 609 \begin{verbatim} 610 /* The constants K1, K2, and K4, which parameterize the */ 611 /* counting behavior, are assumed assigned elsewhere in */ 612 /* the code. The solution is analyzed in terms of the */ 613 /* parameters K1 through K4. */ 614 /* */ 615 /* We also place the following restrictions on K1, K2, */ 616 /* and K4: */ 617 /* K1 : K1 >= 0. */ 618 /* K2 : K4 > K2 > 0. */ 619 /* K4 : K4 > K2 > 0. */ 620 /* */ 621 /* Special thanks to Chuck B. Falconer (of the */ 622 /* comp.arch.embedded newsgroup) for this rational */ 623 /* counting algorithm. */ 624 /* */ 625 /* Note below that the test against K3 does not exist, */ 626 /* instead a test against zero is used, which many */ 627 /* machine instruction sets will do as part of the */ 628 /* subtraction (but perhaps this needs to be coded in */ 629 /* A/L). This saves machine code and also eliminates */ 630 /* one unnecessary degree of freedom (K3). */ 631 632 void base_rate_sub(void) 633 { 634 static int state = K1; 635 636 if ((state -= K2) < 0) 637 { 638 state += K4; 639 A(); 640 } 641 } 642 \end{verbatim} 643 \caption{Zero-Test Rational Counting Algorithm For $K_2/K_4 < 1$} 644 \label{alg:crat0:sfdv0:01c} 645 \end{algorithm} 646 647 \begin{algorithm} 648 \begin{verbatim} 649 ;Special thanks to John Larkin (of the comp.arch.embedded 650 ;newsgroup) for this rational counting algorithm. 651 ; 652 ;This is the TMS-370C8 assembly-language version of the 653 ;algorithm. The algorithm is parameterized solely by 654 ;K1 and K2, with no restrictions on their values, because 655 ;the values are naturally constrained by the data types. 656 ;K1, which is the initial value of "state", is assumed 657 ;assigned elsewhere. The snippet shown here uses only 658 ;K2. 659 MOV state, A ;Get "state". 660 ADD #K2, A ;Increase by K2. Carry flag 661 ;will be set if rollover to or 662 ;past zero. 663 PUSH ST ;Save carry flag. 664 MOV A, state ;Move new value back. 665 POP ST ;Restore carry flag. 666 JNC done ;If didn't roll, don't run sub. 667 CALL A_SUBROUTINE ;Run sub. 668 done: 669 670 /* This is the 'C' version of the algorithm. It is not */ 671 /* as easy or efficient in 'C' to detect rollover. */ 672 673 void base_rate_sub(void) 674 { 675 static unsigned int state = K1; 676 unsigned int old_state; 677 678 old_state = state; 679 state += K2; 680 if (state < old_state) 681 { 682 A(); 683 } 684 } 685 \end{verbatim} 686 \caption{$2^q$ Rollover Rational Counting Algorithm} 687 \label{alg:crat0:sfdv0:01d} 688 \end{algorithm} 689 690 \begin{algorithm} 691 \begin{verbatim} 692 /* The constants K1 through K4, which parameterize the */ 693 /* counting behavior, are assumed assigned elsewhere in */ 694 /* the code. The solution is analyzed in terms of the */ 695 /* parameters K1 through K4. */ 696 /* */ 697 /* We also place the following restrictions on K1 through */ 698 /* K4: */ 699 /* K1 : K1 <= K3. */ 700 /* K2 : K2 > 0. */ 701 /* K3 : No restrictions. */ 702 /* K4 : K4 > 0. */ 703 704 void base_rate_sub(void) 705 { 706 static unsigned int state = K1; 707 708 if (state >= K3) 709 { 710 state -= K4; 711 A(); 712 } 713 else 714 { 715 state += K2; 716 B(); 717 } 718 } 719 \end{verbatim} 720 \caption{Rational Counting Algorithm With \texttt{else} Clause} 721 \label{alg:crat0:sfdv0:02a} 722 \end{algorithm} 723 724 725 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 726 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 727 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 728 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01a}] 729 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01a}} 730 %Section tag: PRC0 731 \label{crat0:sfdv0:sprc0} 732 733 Algorithm \ref{alg:crat0:sfdv0:01a} 734 is used frequently in microcontroller 735 software. A base rate subroutine\footnote{For brevity, we usually 736 call this just the \emph{base subroutine}.} (named \texttt{base\_rate\_sub()}'' 737 in the algorithm) is called at a periodic rate, and subroutine 738 \texttt{A()}'' is called at a lesser rate. 739 We are interested in determining the relationships between the rates 740 as a function of $K_1$, $K_2$, $K_3$, and $K_4$; and we are interested 741 in developing other properties. 742 743 Notationally when analyzing rational counting algorithms, we agree 744 that $state_n$ denotes the value of the \texttt{state} variable 745 after the $n$th invocation and before the $n+1$'th invocation 746 of the base rate subroutine. 747 Using this convention with Algorithm \ref{alg:crat0:sfdv0:01a}, 748 $state_0 = K_1$.\footnote{Algorithm \ref{alg:crat0:sfdv0:01a} 749 requires a knowledge of 750 C' to fully understand. The \texttt{static} keyword ensures that the 751 variable \texttt{state} is initialized only once, at the time the program 752 is loaded. \texttt{state} is \emph{not} initialized each time the 753 base subroutine runs.} 754 755 We can first easily derive the number of initial invocations of 756 the base subroutine before \texttt{A()}'' is called for the first 757 time. 758 759 \begin{vworklemmastatement} 760 \label{lem:crat0:sfdv0:sprc0:01} 761 $N_{STARTUP}$, the number of invocations of the base subroutine 762 in Algorithm \ref{alg:crat0:sfdv0:01a} before \texttt{A()}'' is called 763 for the first time, is given by 764 765 766 \label{eq:lem:crat0:sfdv0:sprc0:01:01} 767 N_{STARTUP} = 768 \left\lceil 769 { 770 \frac{-K_1 - K_2 + K_3}{K_2} 771 } 772 \right\rceil . 773 774 \end{vworklemmastatement} 775 \begin{vworklemmaproof} 776 The value of \texttt{state} after the $n$th invocation 777 is $state_n = K_1 + n K_2$. In order for the test in the 778 \texttt{if()} statement not to be met, we require that 779 780 781 \label{eq:lem:crat0:sfdv0:sprc0:01:02} 782 K_1 + n K_2 < K_3 783 784 785 \noindent{}or equivalently that 786 787 788 \label{eq:lem:crat0:sfdv0:sprc0:01:03} 789 n < \frac{K_3 - K_1}{K_2} . 790 791 792 Solving (\ref{eq:lem:crat0:sfdv0:sprc0:01:03}) for the largest 793 value of $n \in \vworkintset$ which still meets the criterion 794 yields (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}). Note that 795 the derivation of (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}) requires 796 that the restrictions on $K_1$ through $K_4$ documented in 797 Algorithm \ref{alg:crat0:sfdv0:01a} be met. 798 \end{vworklemmaproof} 799 \begin{vworklemmaparsection}{Remarks} 800 Note that if one chooses $K_1 > K_3 - K_2$ (in contradiction to the 801 restrictions in Algorithm \ref{alg:crat0:sfdv0:01a}), it is possible 802 to devise a counting scheme (and results analogous to this lemma) where 803 \texttt{A()}'' is run a number of times before it is 804 \emph{not} run for the first time. The construction of an analogous 805 lemma is the topic of Exercise \ref{exe:crat0:sexe0:01}. 806 \end{vworklemmaparsection} 807 808 \begin{vworklemmastatement} 809 \label{lem:crat0:sfdv0:sprc0:02} 810 Let $N_I$ be the number of times the Algorithm 811 \ref{alg:crat0:sfdv0:01a} base subroutine 812 is called, let $N_O$ be the number of times the 813 \texttt{A()}'' subroutine is called, let 814 $f_I$ be the frequency of invocation of the 815 Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let 816 $f_O$ be the frequency of invocation of 817 \texttt{A()}''. Provided the constraints 818 on $K_1$ through $K_4$ documented in 819 Algorithm \ref{alg:crat0:sfdv0:01a} are met, 820 821 822 \label{eq:lem:crat0:sfdv0:sprc0:02:01} 823 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} 824 = 825 \frac{f_O}{f_I} 826 = 827 \frac{K_2}{K_4} . 828 829 \end{vworklemmastatement} 830 \begin{vworklemmaproof} 831 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) indicates that once 832 the initial delay (determined by $K_1$ and $K_3$) has finished, 833 $N_O/N_I$ will converge on a steady-state value of 834 $K_2/K_4$. 835 836 Assume that $K_1=0$ and $K_3=K_4$. The 837 conditional subtraction then calculates 838 $state \bmod K_4$. After the $n$th 839 invocation of the base subroutine, the value 840 of \texttt{state} will be 841 842 843 \label{eq:lem:crat0:sfdv0:sprc0:02:02} 844 state_n|_{K_1=0, K_3=K_4} = n K_2 \bmod K_4 . 845 846 847 Assume that for two distinct values of 848 $n \in \vworkintsetnonneg$, $n_1$ and $n_2$, 849 the value of the \texttt{state} variable is the same: 850 851 852 \label{eq:lem:crat0:sfdv0:sprc0:02:03} 853 n_1 K_2 \bmod K_4 = n_2 K_2 \bmod K_4. 854 855 856 Then 857 858 859 \label{eq:lem:crat0:sfdv0:sprc0:02:04} 860 (n_2 - n_1) K_2 = i K_4, \; \exists i \in \vworkintsetpos . 861 862 863 However, we have no knowledge of whether $K_2$ and $K_4$ are 864 coprime (they are not required to be). We may rewrite 865 (\ref{eq:lem:crat0:sfdv0:sprc0:02:04}) equivalently as 866 867 868 \label{eq:lem:crat0:sfdv0:sprc0:02:05} 869 (n_2 - n_1) \frac{K_2}{\gcd(K_2, K_4)} = i \frac{K_4}{\gcd(K_2, K_4)}, 870 \; \exists i \in \vworkintsetpos 871 872 873 where of course by definition 874 875 876 \label{eq:lem:crat0:sfdv0:sprc0:02:06} 877 \gcd \left( { \frac{K_2}{\gcd(K_2, K_4)}, \frac{K_4}{\gcd(K_2, K_4)} } \right) = 1. 878 879 880 In order to satisfy (\ref{eq:lem:crat0:sfdv0:sprc0:02:05}), 881 $n_2 - n_1$ must contain all of the prime factors of 882 $K_4/\gcd(K_2,K_4)$ in at least the same multiplicities, 883 and it follows that the set of values 884 of $n_2-n_1$ that satisfies 885 (\ref{eq:lem:crat0:sfdv0:sprc0:02:03}) is 886 precisely the set of multiples of $K_4/\gcd(K_2,K_4)$: 887 888 889 \label{eq:lem:crat0:sfdv0:sprc0:02:07} 890 n_2 - n_1 = j \frac{K_4}{\gcd(K_2, K_4)}, \; \exists j \in \vworkintsetpos . 891 892 893 Examining (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}), it can 894 also be seen that 895 896 897 \label{eq:lem:crat0:sfdv0:sprc0:02:08} 898 \gcd(K_2, K_4) \vworkdivides (n K_2 \bmod K_4), 899 900 901 and so 902 903 \begin{eqnarray} 904 \label{eq:lem:crat0:sfdv0:sprc0:02:09} 905 & n K_2 \bmod K_4 \in & \\ 906 \nonumber 907 & \{ 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , K_4 - \gcd(K_2, K_4) \} , & 908 \end{eqnarray} 909 910 a set which contains exactly $K_4/\gcd(K_2, K_4)$ elements. 911 912 Thus we've established by the pigeonhole principle 913 that the sequence of the 914 values of the variable \texttt{state} 915 specified by (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}) 916 repeats perfectly with periodicity $K_4/\gcd(K_2, K_4)$, 917 and we've established that in one period, every element of the set 918 specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) appears exactly 919 once. (However, we have not specified the order in which the 920 elements appear, but this is not important for this lemma. In general 921 the elements appear out of the order shown in 922 Eq. \ref{eq:lem:crat0:sfdv0:sprc0:02:09}.) 923 924 To establish the frequency with which the test against 925 $K_4$ is met, note that if $state_n + K_2 \geq K_4$, then 926 927 \begin{eqnarray} 928 \label{eq:lem:crat0:sfdv0:sprc0:02:10} 929 & \displaystyle{state_n \in \left\{ \frac{K_4-K_2}{\gcd(K_2,K_4)} \gcd(K_2, K_4), \right.} & \\ 930 \nonumber & \displaystyle{\left. \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4), \ldots , 931 K_4 - \gcd(K_2, K_4)\right\} ,} & 932 \end{eqnarray} 933 934 which has a cardinality $K_2/K_4$ that of the set in 935 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Since the 936 \texttt{state} variable cycles through the set in 937 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) with perfect periodicity and since 938 $K_2/K_4$ of the set elements lead to the \texttt{if()} statement 939 test being 940 met, (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) is also met as 941 $N_I\rightarrow\infty$. 942 943 Note that if $K_1 \neq 0$, it simply changes the startup 944 behavior of the rational counting. So long as $K_2 < K_4$, 945 Algorithm \ref{alg:crat0:sfdv0:01a} will reach a steady state where 946 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) holds. 947 Note that if $K_3 \neq K_4$, it simply shifts'' the sets 948 specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) 949 and (\ref{eq:lem:crat0:sfdv0:sprc0:02:10}), but 950 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) still holds. 951 The lemma has thus been proved 952 for every case. (We have neglected to give 953 the formal proof as required by the definition of a limit that 954 for any arbitrarily small error $\epsilon$, a 955 finite $N_I$ can be found so that 956 the error is at or below $\epsilon$; however the skeptical reader 957 is encouraged to complete Exercise \ref{exe:crat0:sexe0:02}.) 958 \end{vworklemmaproof} 959 \begin{vworklemmaparsection}{Remarks} 960 It is possible to view the long-term accuracy of 961 Algorithm \ref{alg:crat0:sfdv0:01a} in terms of a limit, as is done in 962 (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}). However, it is also 963 possible to observe that $K_1$ and $K_3$ set a delay until 964 the counting algorithm reaches steady state. 965 With $K_3=K_4$, the attainment of 966 steady state is characterized by the \texttt{state} variable 967 being assigned for the first time to one of the values in 968 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Once in steady state, 969 the algorithm cycles with perfect periodic behavior through all of the 970 $K_4/\gcd(K_2,K_4)$ elements in 971 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}), but not necessarily in 972 the order shown in the equation. 973 During this period of length $K_4/\gcd(K_2,K_4)$, 974 exactly $K_2/\gcd(K_2,K_4)$ invocations of the base 975 subroutine result in 976 subroutine \texttt{A()}'' being run, and exactly 977 $(K_4-K_2)/\gcd(K_2,K_4)$ do not. Thus, after reaching steady-state the 978 algorithm has \emph{perfect} accuracy if one considers periods of 979 length $K_4/\gcd(K_2,K_4)$. 980 \end{vworklemmaparsection} 981 %\vworklemmafooter{} 982 983 \begin{vworklemmastatement} 984 \label{lem:crat0:sfdv0:sprc0:04} 985 If $K_3=K_4$, $K_1=0$, and 986 $\gcd(K_2, K_4)=1$\footnote{\label{footnote:lem:crat0:sfdv0:sprc0:04:01}If 987 $\gcd(K_2, K_4) > 1$, then by Theorem 988 \cprizeroxrefhyphen\ref{thm:cpri0:ppn0:00a} the largest 989 value that $n K_2 \bmod K_4$ can attain is 990 $K_4-\gcd(K_2, K_4)$ and the interval in 991 (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is correspondingly 992 smaller. (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is 993 technically correct but not as conservative as possible. 994 This is a minor point and we do not dwell on it.}, the error between 995 the approximation to $N_O$ implemented by 996 Algorithm \ref{alg:crat0:sfdv0:01a} and the ideal'' mapping is always 997 in the set 998 999 1000 \label{eq:lem:crat0:sfdv0:sprc0:04:01} 1001 \left[ - \frac{K_4 - 1}{K_4} , 0 \right] , 1002 1003 1004 and no algorithm can be constructed to 1005 confine the error to a smaller interval. 1006 \end{vworklemmastatement} 1007 \begin{vworklemmaproof} 1008 With $K_1=0$ and $K_3 = K_4$, it can be verified analytically that 1009 the total number of times the function \texttt{A()}'' has been 1010 invoked up to and including the $n$th invocation of the base subroutine 1011 is 1012 1013 1014 \label{eq:lem:crat0:sfdv0:sprc0:04:02} 1015 N_O = \left\lfloor \frac{n K_2}{K_4} \right\rfloor . 1016 1017 1018 On the other hand, the ideal'' number of invocations, which 1019 we denote $\overline{N_O}$, is given by 1020 1021 1022 \label{eq:lem:crat0:sfdv0:sprc0:04:03} 1023 \overline{N_O} = \frac{n K_2}{K_4} . 1024 1025 1026 Quantization of the rational number in (\ref{eq:lem:crat0:sfdv0:sprc0:04:02}) 1027 can introduce an error of up to $-(K_4-1)/K_4$, therefore 1028 1029 1030 \label{eq:lem:crat0:sfdv0:sprc0:04:04} 1031 N_O - \overline{N_O} = 1032 \left\lfloor \frac{n K_2}{K_4} \right\rfloor - \frac{n K_2}{K_4} 1033 \in \left[ - \frac{K_4 - 1}{K_4} , 0 \right] . 1034 1035 1036 This proves the error bound for Algorithm \ref{alg:crat0:sfdv0:01a}. 1037 The proof that there can be no better algorithm is the topic 1038 of Exercise \ref{exe:crat0:sexe0:06}. 1039 \end{vworklemmaproof} 1040 \begin{vworklemmaparsection}{Remarks} 1041 Algorithm \ref{alg:crat0:sfdv0:01a} is \emph{optimal} in the 1042 sense that no algorithm can achieve a tighter error 1043 bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). As 1044 demonstrated in Exercises \ref{exe:crat0:sexe0:04} 1045 and \ref{exe:crat0:sexe0:05}, $K_1 \neq 0$ can be chosen 1046 to shift the interval in (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}), but 1047 the span of the interval cannot be reduced. 1048 \end{vworklemmaparsection} 1049 \vworklemmafooter{} 1050 1051 Lemmas \ref{lem:crat0:sfdv0:sprc0:02} 1052 and \ref{lem:crat0:sfdv0:sprc0:04} have demonstrated that the ratio of 1053 counts $N_O/N_I$ will asymptotically 1054 approach $K_2/K_4$ 1055 (i.e. the long-term accuracy of Algorithm \ref{alg:crat0:sfdv0:01a} 1056 is \emph{perfect}). 1057 However, 1058 for many applications it is also desirable to have a lack of 1059 bursty'' behavior. We demonstrate the lack of bursty 1060 behavior in the following lemma. 1061 1062 \begin{vworklemmastatement} 1063 \label{lem:crat0:sfdv0:sprc0:03} 1064 For Algorithm \ref{alg:crat0:sfdv0:01a}, once steady 1065 state has been achieved, the number of consecutive 1066 base subroutine invocations during which subroutine 1067 \texttt{A()}'' is executed is always in the set 1068 1069 1070 \label{eq:lem:crat0:sfdv0:sprc0:03:01} 1071 \left\{ 1072 \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor , 1073 \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil 1074 \right\} \cap \vworkintsetpos, 1075 1076 1077 which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$, 1078 or two integers otherwise. 1079 1080 Once steady state has been achieved, the number of 1081 consecutive base function invocations during which 1082 subroutine \texttt{A()}'' is not executed is 1083 always in the set 1084 1085 1086 \label{eq:lem:crat0:sfdv0:sprc0:03:02} 1087 \left\{ 1088 \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor , 1089 \left\lceil \frac{K_4-K_2}{K_2} \right\rceil 1090 \right\} \cap \vworkintsetpos, 1091 1092 1093 which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$, 1094 or two integers otherwise. 1095 \end{vworklemmastatement} 1096 \begin{vworklemmaproof} 1097 As before in Lemma \ref{lem:crat0:sfdv0:sprc0:02} 1098 for convenience and without 1099 loss of generality, assume $K_3=K_4$ and 1100 $K_1=0$. Then after a transient period 1101 determined by $K_1$ and $K_3$, the \texttt{state} 1102 variable will be assigned one of the values in 1103 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) and cycle through 1104 those values in an unestablished order but with perfect 1105 periodicity. To accomplish this proof, we must establish 1106 something about the order in which the \texttt{state} variable attains 1107 the values in the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). 1108 1109 We can partition the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) 1110 into two sets; the set of values of \texttt{state} for which if the 1111 base subroutine is invoked with \texttt{state} in this set, subroutine 1112 \texttt{A()}'' will not be invoked (we call this set $\phi_1$), 1113 and the set of values of \texttt{state} for which if the 1114 base subroutine is invoked with \texttt{state} in this set, subroutine 1115 \texttt{A()}'' will be invoked (we call this set $\phi_2$). 1116 $\phi_1$ and $\phi_2$ are identified below. 1117 1118 \begin{eqnarray} 1119 \label{eq:lem:crat0:sfdv0:sprc0:03:03} 1120 & \phi_1 = & \\ 1121 \nonumber & 1122 \displaystyle{\left\{ 1123 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , 1124 \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} - 1 \right) \gcd(K_2, K_4) 1125 \right\}} & 1126 \end{eqnarray} 1127 1128 \begin{eqnarray} 1129 \label{eq:lem:crat0:sfdv0:sprc0:03:04} 1130 & \displaystyle{ 1131 \phi_2 = \left\{\left(\frac{K_4-K_2}{\gcd(K_2,K_4)}\right) \gcd(K_2, K_4),\right.} & \\ 1132 \nonumber & \displaystyle{\left. 1133 \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4) , 1134 \ldots , 1135 K_4 - \gcd(K_2, K_4) 1136 \right\}} & 1137 \end{eqnarray} 1138 1139 We can also make the following four additional useful observations 1140 about $\phi_1$ and $\phi_2$. Note that 1141 (\ref{eq:lem:crat0:sfdv0:sprc0:03:07}) and 1142 (\ref{eq:lem:crat0:sfdv0:sprc0:03:08}) become equality 1143 if $\gcd(K_2, K_4) = 1$. 1144 1145 1146 \label{eq:lem:crat0:sfdv0:sprc0:03:05} 1147 n(\phi_1) = \frac{K_4 - K_2}{\gcd(K_2, K_4)} 1148 1149 1150 1151 \label{eq:lem:crat0:sfdv0:sprc0:03:06} 1152 n(\phi_2) = \frac{K_2}{\gcd(K_2, K_4)} 1153 1154 1155 1156 \label{eq:lem:crat0:sfdv0:sprc0:03:07} 1157 \phi_1 \subseteq \{ 0, 1, \ldots , K_4 - K_2 - 1 \} 1158 1159 1160 1161 \label{eq:lem:crat0:sfdv0:sprc0:03:08} 1162 \phi_2 \subseteq \{K_4 - K_2, \ldots , K_4 - 1 \} 1163 1164 1165 We first prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}). 1166 If $state_n \in \phi_2$ at the time the base function 1167 is invoked, then 1168 \texttt{A()}'' will be invoked. We also know that 1169 since $state_n \in \phi_2$, $state_n + K_2 \geq K_4$, so 1170 1171 1172 \label{eq:lem:crat0:sfdv0:sprc0:03:09} 1173 state_{n+1} \;\; =|_{state_n \in \phi_2} \;\; state_n - (K_4 - K_2) . 1174 1175 1176 Thus so long as $state_n \in \phi_2$, $state_{n+1} < state_n$ 1177 as specified above in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}). 1178 With each invocation of the base subroutine, \texttt{state} will 1179 walk downward'' through $\phi_2$. It can 1180 also be observed that when \texttt{state} drops below the smallest 1181 element of $\phi_2$, the next value of \texttt{state} will 1182 be in $\phi_1$. 1183 1184 Note also that although the downward walk specified in 1185 (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}) walks downward in absolute steps 1186 of $K_4-K_2$, this corresponds to $(K_4-K_2) / \gcd(K_2, K_4)$ 1187 \emph{elements} of $\phi_2$, since the elements of $\phi_2$ are 1188 separated by $\gcd(K_2, K_4)$. 1189 1190 Given the downward walk'' specified in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}), 1191 the only question to be answered is how many consecutive values of 1192 \texttt{state}, separated by $K_4-K_2$ (or $(K_4-K_2)/\gcd(K_2, K_4)$ elements), 1193 can fit'' into 1194 $\phi_2$. Considering that $n(\phi_2) = K_2/\gcd(K_2, K_4)$ 1195 (Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:06}) and that the 1196 downward step represents $(K_4-K_2)/\gcd(K_2, K_4)$ set elements, 1197 (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) comes immediately by 1198 a graphical argument. 1199 1200 We now prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}). 1201 This can be proved using exactly the same arguments 1202 as for (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}), but 1203 considering the upward walk through $\phi_1$ rather 1204 than the downward walk through $\phi_2$. 1205 1206 As with Lemma \ref{lem:crat0:sfdv0:sprc0:02}, 1207 note that the choices of $K_1$ and $K_3$ do not 1208 materially affect the proof above. $K_1$ and 1209 $K_3$ only set a delay until the rational counting 1210 algorithm reaches steady state. $K_3$ only shifts 1211 the sets $\phi_1$ and $\phi_2$. 1212 \end{vworklemmaproof} 1213 \begin{vworklemmaparsection}{Remark \#1} 1214 This lemma proves an \emph{extremely} important property for the 1215 usability of Algorithm \ref{alg:crat0:sfdv0:01a}. It says that once 1216 steady state has been reached, the variability in the number of consecutive 1217 times \texttt{A()}'' is run or not run is at most one count. 1218 \end{vworklemmaparsection} 1219 \begin{vworklemmaparsection}{Remark \#2} 1220 It is probably also possible to construct a rational counting algorithm 1221 so that the number of consecutive times \texttt{A()}'' is run is constant, 1222 but the algorithm achieves long-term accuracy by varying only the number 1223 of consecutive times \texttt{A()}'' is not run (or vice-versa), but this 1224 is not done here. 1225 \end{vworklemmaparsection} 1226 \begin{vworklemmaparsection}{Remark \#3} 1227 There is no requirement that $K_2$ and $K_4$ be coprime. In fact, as 1228 demonstrated later, it may be advantageous to choose a large $K_2$ and 1229 $K_4$ to approximate a simple ratio so that very fine adjustments can be 1230 made. For example, if the ideal ratio is 1/2, it may be desirable 1231 in some applications to 1232 choose $K_2$=1,000 and $K_4$=2,000 so that fine adjustments can be made 1233 by slightly perturbing $K_2$ or $K_4$. One might adjust 1,000/2,000 downward 1234 to 999/2,000 or upward to 1,001/2,000 by modifying $K_2$ 1235 (both very fine adjustments). 1236 \end{vworklemmaparsection} 1237 \begin{vworklemmaparsection}{Remark \#4} 1238 The most common choice of $K_1$ in practice is 0. If $K_1=0$ is chosen, 1239 it can be shown that the number of initial invocations of the 1240 base subroutine is in the set identified in 1241 (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}). 1242 (See Exercise \ref{exe:crat0:sexe0:07}.) 1243 \end{vworklemmaparsection} 1244 \vworklemmafooter{} 1245 1246 For microcontroller work, it is considered 1247 a desirable property that software components be resilient 1248 to state upset 1249 (see Section \chgrzeroxrefhyphen\ref{chgr0:sdda0:srob0}). 1250 It can be observed that Algorithm \ref{alg:crat0:sfdv0:01a} will 1251 exhibit very anomalous behavior if \texttt{state} is upset to a very negative 1252 value. One possible correction to this shortcoming is illustrated 1253 in Figure \ref{fig:crat0:sfdv0:sprc0:01}. Other possible 1254 corrections are the topic of Exercise \ref{exe:crat0:sexe0:08}. 1255 1256 \begin{figure} 1257 \begin{verbatim} 1258 /* The constants K1 through K4, which parameterize the */ 1259 /* counting behavior, are assumed assigned elsewhere in */ 1260 /* the code. The solution is analyzed in terms of the */ 1261 /* parameters K1 through K4. */ 1262 /* */ 1263 /* We also place the following restrictions on K1 through */ 1264 /* K4: */ 1265 /* K1 : K1 <= K3 - K2. */ 1266 /* K2 : K4 > K2 > 0. */ 1267 /* K3 : No restrictions. */ 1268 /* K4 : K4 > K2 > 0. */ 1269 1270 void base_rate_func(void) 1271 { 1272 static int state = K1; 1273 1274 state += K2; 1275 1276 if ((state < K1) || (state >= K3)) 1277 { 1278 state -= K4; 1279 A(); 1280 } 1281 } 1282 \end{verbatim} 1283 \caption{Algorithm \ref{alg:crat0:sfdv0:01a} With State Upset Shortcoming 1284 Corrected} 1285 \label{fig:crat0:sfdv0:sprc0:01} 1286 \end{figure} 1287 1288 \begin{vworkexamplestatement} 1289 \label{ex:crat0:sfdv0:sprc0:01} 1290 Determine the behavior of Algorithm \ref{alg:crat0:sfdv0:01a} with 1291 $K_1=0$, $K_2=30$, and $K_3=K_4=50$. 1292 \end{vworkexamplestatement} 1293 \begin{vworkexampleparsection}{Solution} 1294 We first predict the behavior, and then trace the algorithm to 1295 verify whether the predictions are accurate. 1296 1297 We make the following predictions: 1298 1299 \begin{itemize} 1300 \item The steady state sequence of invocations of \texttt{A()}'' will 1301 be periodic with period 1302 $K_4/\gcd(K_2, K_4) = 50/10 = 5$, as described 1303 in Lemma \ref{lem:crat0:sfdv0:sprc0:02}. 1304 \item The number of initial invocations of the 1305 base subroutine in which \texttt{A()}'' 1306 is not run will be 1307 $\lceil (K_4 - K_2) / K_2 \rceil = \lceil 2/3 \rceil = 1$, 1308 as described in Remark \#4 of 1309 Lemma \ref{lem:crat0:sfdv0:sprc0:03} and in the solution to 1310 Exercise \ref{exe:crat0:sexe0:07}. 1311 \item In steady state, the number of consecutive invocations of the 1312 base subroutine during which \texttt{A()}'' 1313 is not executed will always be 1, as 1314 described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:02} of 1315 Lemma \ref{lem:crat0:sfdv0:sprc0:03}. 1316 (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:02} 1317 yields \ 1318 $\{ \lfloor 20/30 \rfloor , \lceil 20/30 \rceil \} \cap \vworkintsetpos % 1319 = \{ 0,1 \} \cap \{1, 2, \ldots \} = \{ 1 \}$.) 1320 \item In steady state, the number of consecutive invocations of the 1321 base subroutine during which \texttt{A()}'' 1322 is executed will always be 1 or 2, as 1323 described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:01} of 1324 Lemma \ref{lem:crat0:sfdv0:sprc0:03}. 1325 (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:01} 1326 yields \ 1327 $\{ \lfloor 30/20 \rfloor , \lceil 30/20 \rceil \} \cap \vworkintsetpos % 1328 = \{ 1,2 \} \cap \{1, 2, \ldots \} = \{ 1,2 \}$.) 1329 \item The rational counting algorithm will have 1330 perfect long-term accuracy. 1331 \end{itemize} 1332 1333 We can verify the predictions above by tracing the behavior of 1334 Algorithm \ref{alg:crat0:sfdv0:01a}. We adopt the convention 1335 that $A_n = 1$ if subroutine \texttt{A()}'' is invoked during 1336 the $n$th invocation of the base subroutine. 1337 Table \ref{tbl:crat0:sfdv0:sprc0:01} 1338 contains the results of tracing Algorithm \ref{alg:crat0:sfdv0:01a} 1339 with $K_1=0$, $K_2=30$, and $K_3=K_4=50$. 1340 1341 \begin{table} 1342 \caption{Trace Of Algorithm \ref{alg:crat0:sfdv0:01a} With 1343 $K_1=0$, $K_2=30$, And $K_3=K_4=50$ (Example \ref{ex:crat0:sfdv0:sprc0:01})} 1344 \label{tbl:crat0:sfdv0:sprc0:01} 1345 \begin{center} 1346 \begin{tabular}{|c|c|c|} 1347 \hline 1348 Index ($n$) & $state_n$ & $A_n$ \\ 1349 \hline 1350 \hline 1351 0 & 0 & N/A \\ 1352 \hline 1353 1 & 30 & 0 \\ 1354 \hline 1355 2 & 10 & 1 \\ 1356 \hline 1357 3 & 40 & 0 \\ 1358 \hline 1359 4 & 20 & 1 \\ 1360 \hline 1361 5 & 0 & 1 \\ 1362 \hline 1363 6 & 30 & 0 \\ 1364 \hline 1365 7 & 10 & 1 \\ 1366 \hline 1367 8 & 40 & 0 \\ 1368 \hline 1369 9 & 20 & 1 \\ 1370 \hline 1371 10 & 0 & 1 \\ 1372 \hline 1373 \end{tabular} 1374 \end{center} 1375 \end{table} 1376 1377 It can be verfied from the table that all of the 1378 predicted properties are exhibited by the 1379 algorithm. 1380 \end{vworkexampleparsection} 1381 \vworkexamplefooter{} 1382 1383 A second characteristic of Algorithm \ref{alg:crat0:sfdv0:01a} 1384 that should be analyzed carefully is the behavior 1385 of the algorithm if parameters $K_2$ and $K_4$ are adjusted 1386 on the fly''. On-the-fly'' adjustment 1387 raises the following concerns. We assume for convenience 1388 that $K_1=0$ and $K_3=K_4$. 1389 1390 \begin{enumerate} 1391 \item \label{enum:crat0:sfdv0:sprc0:01:01} 1392 \textbf{Critical section protocol:} if the 1393 rational counting algorithm is implemented in a process which 1394 is asynchronous to the process which desires to change 1395 $K_2$ and $K_4$, what precautions must be taken? 1396 \item \label{enum:crat0:sfdv0:sprc0:01:02} 1397 \textbf{Anomalous behavior:} will the rational 1398 counting algorithm behave in a \emph{very} unexpected way 1399 if $K_2$ and $K_4$ are changed on the fly? 1400 \item \label{enum:crat0:sfdv0:sprc0:01:03} 1401 \textbf{Preservation of accuracy:} even if the behavior 1402 exhibited is not \emph{extremely} anomalous, how should 1403 $K_2$ and $K_4$ be modified on the fly so as to preserve the 1404 maximum accuracy? 1405 \end{enumerate} 1406 1407 \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:02}):} It can be observed 1408 with Algorithm \ref{alg:crat0:sfdv0:01a} that neither increasing 1409 nor decreasing $K_2$ nor $K_4$ on the fly 1410 will lead to \emph{highly} anomalous 1411 behavior. Each invocation of the algorithm will map 1412 \texttt{state} back into the set identified in 1413 (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Thus on-the-fly changes 1414 to $K_2$ and $K_4$ will establish the rational counting algorithm 1415 immediately into steady-state behavior, and the result will not be 1416 \emph{highly} anomalous if such on-the-fly changes are not 1417 made very often. 1418 1419 \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:03}):} It can be deduced 1420 from 1421 (\ref{eq:lem:crat0:sfdv0:sprc0:04:02}), 1422 (\ref{eq:lem:crat0:sfdv0:sprc0:04:03}), and 1423 (\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) that the value of the 1424 \texttt{state} variable in Algorithm \ref{alg:crat0:sfdv0:01a} 1425 satisfies the relationship 1426 1427 1428 \label{eq:crat0:sfdv0:sprc0:01} 1429 \overline{N_O} - N_O = \frac{state}{K_4} ; 1430 1431 1432 \noindent{}in other words, the \texttt{state} variable 1433 contains the remainder of an effective division by $K_4$ 1434 and thus maintains the fractional part of $\overline{N_O}$. 1435 Altering $K_4$ on the fly to a new value 1436 (say, $\overline{K_4}$) may be problematic, because 1437 to preserve the current fractional part 1438 of $\overline{N_O}$, one must adjust it for 1439 the new denominator $\overline{K_4}$. This requires 1440 solving the equation 1441 1442 1443 \label{eq:crat0:sfdv0:sprc0:02} 1444 \frac{state}{K_4} = \frac{n}{\;\;\overline{K_4}\;\;} 1445 1446 1447 \noindent{}for $n$ which must be an integer to avoid 1448 loss of information. In general, 1449 this would require that $K_4 \vworkdivides \overline{K_4}$, 1450 a constraint which would be rarely met. Thus, for high-precision 1451 applications where a new rational counting rate should become effective 1452 seamlessly, the best strategy would seem to be to modify $K_2$ only. 1453 It can be verified that modifying $K_2$ on the fly accomplishes 1454 a perfect rate transition. 1455 1456 \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:01}):} In microcontroller work, 1457 ordinal data types often represent machine-native data types. In such cases, 1458 it may be possible for one process to set $K_2$ or $K_4$ 1459 for another process that is asynchronous with respect to it by relying 1460 on the atomicity of machine instructions (i.e. without formal mutual 1461 exclusion protocol). However, in other cases where the ordinal data types 1462 of $K_2$ or $K_4$ are larger than can be accomodated by 1463 a single machine instruction or where $K_2$ and $K_4$ must be modified 1464 together atomically, mutual exclusion protocol should be used to 1465 prevent anomalous behavior due to race conditions (see 1466 Exercise \ref{exe:crat0:sexe0:14}). 1467 1468 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1469 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1470 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1471 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01b}] 1472 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01b}} 1473 %Section tag: PRC1 1474 \label{crat0:sfdv0:sprc1} 1475 1476 Algorithm \ref{alg:crat0:sfdv0:01a} 1477 has the disadvantage that it requires $K_2/K_4 < 1$ (i.e. it can only 1478 decrease frequency, but never increase frequency). This deficiency 1479 can be corrected by using 1480 Algorithm \ref{alg:crat0:sfdv0:01b}. 1481 1482 Note that Algorithm \ref{alg:crat0:sfdv0:01b} will properly deal with $K_2$ and 1483 $K_4$ chosen such that $0 < K_2/K_4 < \infty$. 1484 1485 The most common reason that one may want a counting algorithm 1486 that will correctly handle 1487 $K_2/K_4 \geq 1$ is to conveniently handle $K_2/K_4 \approx 1$. 1488 In practice, $K_2/K_4$ may represent a quantity that is 1489 normally very close to 1490 1 but may also be slightly less than or slightly greater than 1. 1491 For example, one may use $K_2/K_4 \approx 1$ to correct for a 1492 crystal or a resonator which deviates slightly from its nominal 1493 frequency. We illustrate this with the following example. 1494 1495 \begin{vworkexamplestatement} 1496 \label{ex:crat0:sfdv0:sprc1:01} 1497 A microcontroller software load keeps time via an interrupt 1498 service routine that runs every 1ms, but this frequency may be 1499 off by as much as 1 part in 10,000 due to variations in 1500 crystal or resonator manufacture. The interrupt service routine 1501 updates a counter which represents the number of milliseconds elapsed since 1502 the software load was reset. Devise a rational counting strategy 1503 based on Algorithm \ref{alg:crat0:sfdv0:01b} 1504 which will allow the time accuracy to be trimmed to within 1505 one second per year or less by adjusting only $K_4$, and implement the counting strategy 1506 in software. 1507 \end{vworkexamplestatement} 1508 \begin{vworkexampleparsection}{Solution} 1509 $K_2/K_4$ will be nominally very close to 1 ($K_2 \approx K_4$). 1510 If we assume that each year has 365.2422\footnote{The period of the earth's 1511 rotation about the sun is not an integral number of days, which is why the 1512 rules for leap years exist. Ironically, the assignment of leap years is itself 1513 a problem very similar to the rational counting problems discussed in this chapter.} days 1514 ($\approx$ 31,556,926 seconds), then choosing 1515 $K_2 \approx K_4 = 31,556,926$ will yield satisfactory results. 1516 If we may need to compensate for up to 1 part in 10,000 of crystal or resonator 1517 inaccuracy, we may need to adjust $K_2$ as low as 0.9999 $\times$ 31,556,926 $\approx$ 1518 31,553,770 (to compensate for a fast 1519 crystal or resonator) or as 1520 high as 1.0001 $\times$ 31,556,926 1521 $\approx$ 31,560,082 1522 (to compensate for a slow crystal or resonator). Choosing 1523 $K_4 = 31,556,926$ yields the convenient relationship that each 1524 count in $K_2$ corresponds to one second per year. 1525 1526 \begin{figure} 1527 \begin{verbatim} 1528 /* The constants K1 through K4, which parameterize the */ 1529 /* counting behavior, are assumed assigned elsewhere in */ 1530 /* the code. */ 1531 /* */ 1532 /* The variable time_count below is the number of milli- */ 1533 /* seconds since the software was reset. */ 1534 int time_count = 0; 1535 1536 /* It is assumed that the base rate subroutine below is */ 1537 /* called every millisecond (or, at least what should be */ 1538 /* every millisecond of the crystal or resonator were */ 1539 /* perfect). */ 1540 1541 void base_rate_sub(void) 1542 { 1543 static int state = K1; 1544 1545 state += K2; 1546 1547 while (state >= K3) 1548 { 1549 state -= K4; 1550 time_count++; 1551 } 1552 } 1553 \end{verbatim} 1554 \caption{Algorithm \ref{alg:crat0:sfdv0:01b} Applied To Timekeeping 1555 (Example \ref{ex:crat0:sfdv0:sprc1:01})} 1556 \label{fig:ex:crat0:sfdv0:sprc1:01:01} 1557 \end{figure} 1558 1559 Figure \ref{fig:ex:crat0:sfdv0:sprc1:01:01} provides an illustration 1560 of Algorithm \ref{alg:crat0:sfdv0:01b} applied in this scenario. 1561 We assume that $K_4$ contains the constant value 31,556,926 1562 and that $K_2$ is modified about this value either downwards or upwards 1563 to trim the timekeeping. Note that Algorithm \ref{alg:crat0:sfdv0:01b} will correctly 1564 handle $K_2 \geq K_4$. 1565 1566 Also note in the implementation illustrated in Figure 1567 \ref{fig:ex:crat0:sfdv0:sprc1:01:01} that large integers (27 bits or more) 1568 are required. (See also Exercise \ref{exe:crat0:sexe0:09}). 1569 \end{vworkexampleparsection} 1570 \vworkexamplefooter{} 1571 1572 It may not be obvious whether Algorithm \ref{alg:crat0:sfdv0:01b} has the 1573 same or similar desirable properties as Algorithm \ref{alg:crat0:sfdv0:01a} 1574 presented 1575 in Lemmas 1576 \ref{lem:crat0:sfdv0:sprc0:01}, 1577 \ref{lem:crat0:sfdv0:sprc0:02}, 1578 \ref{lem:crat0:sfdv0:sprc0:04}, 1579 and 1580 \ref{lem:crat0:sfdv0:sprc0:03}. 1581 Algorithm \ref{alg:crat0:sfdv0:01b} does have these desirable 1582 properties, and these properties are presented as 1583 Lemmas \ref{lem:crat0:sfdv0:sprc1:01}, 1584 \ref{lem:crat0:sfdv0:sprc1:02}, 1585 \ref{lem:crat0:sfdv0:sprc1:03}, and 1586 \ref{lem:crat0:sfdv0:sprc1:04}. 1587 The proofs of these lemmas are identical or very similar to the proofs 1588 of Lemmas 1589 \ref{lem:crat0:sfdv0:sprc0:01}, 1590 \ref{lem:crat0:sfdv0:sprc0:02}, 1591 \ref{lem:crat0:sfdv0:sprc0:04}, 1592 and 1593 \ref{lem:crat0:sfdv0:sprc0:03}; 1594 and so these proofs when not identical are presented as exercises. 1595 Note that Algorithm \ref{alg:crat0:sfdv0:01b} behaves identically to 1596 Algorithm \ref{alg:crat0:sfdv0:01a} when $K_2 < K_4$, and the 1597 case of $K_2=K_4$ is trivial, so in general only 1598 the behavior when $K_2 > K_4$ remains to be proved. 1599 1600 \begin{vworklemmastatement} 1601 \label{lem:crat0:sfdv0:sprc1:01} 1602 $N_{STARTUP}$, the number of invocations of the base subroutine 1603 in Algorithm \ref{alg:crat0:sfdv0:01b} before \texttt{A()}'' is called 1604 for the first time, is given by 1605 1606 1607 \label{eq:lem:crat0:sfdv0:sprc1:01:01} 1608 N_{STARTUP} = 1609 \left\lceil 1610 { 1611 \frac{-K_1 - K_2 + K_3}{K_2} 1612 } 1613 \right\rceil . 1614 1615 \end{vworklemmastatement} 1616 \begin{vworklemmaproof} 1617 The proof is identical to the proof of Lemma 1618 \ref{lem:crat0:sfdv0:sprc0:01}. 1619 \end{vworklemmaproof} 1620 1621 1622 \begin{vworklemmastatement} 1623 \label{lem:crat0:sfdv0:sprc1:02} 1624 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01b} 1625 base subroutine 1626 is called, let $N_O$ be the number of times the 1627 \texttt{A()}'' subroutine is called, let 1628 $f_I$ be the frequency of invocation of the 1629 Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let 1630 $f_O$ be the frequency of invocation of 1631 \texttt{A()}''. 1632 1633 1634 \label{eq:lem:crat0:sfdv0:sprc1:02:01} 1635 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} 1636 = 1637 \frac{f_O}{f_I} 1638 = 1639 \frac{K_2}{K_4} . 1640 1641 \end{vworklemmastatement} 1642 \begin{vworklemmaproof} 1643 See Exercise \ref{exe:crat0:sexe0:10}. 1644 \end{vworklemmaproof} 1645 1646 \begin{vworklemmastatement} 1647 \label{lem:crat0:sfdv0:sprc1:03} 1648 If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$\footnote{See also 1649 footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between 1650 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b} 1651 and the ideal'' mapping is always 1652 in the set 1653 1654 1655 \label{eq:lem:crat0:sfdv0:sprc1:03:01} 1656 \left[ - \frac{K_4 - 1}{K_4} , 0 \right] , 1657 1658 1659 and no algorithm can be constructed to 1660 confine the error to a smaller interval. 1661 \end{vworklemmastatement} 1662 \begin{vworklemmaproof} 1663 The proof is identical to the proof of Lemma \ref{lem:crat0:sfdv0:sprc0:04}. 1664 \end{vworklemmaproof} 1665 1666 \begin{vworklemmastatement} 1667 \label{lem:crat0:sfdv0:sprc1:04} 1668 For Algorithm \ref{alg:crat0:sfdv0:01b} 1669 with 1670 $K_2 \geq K_4$, once steady 1671 state has been achieved (see Exercise 1672 \ref{exe:crat0:sexe0:13}), each invocation of the 1673 base subroutine will result in 1674 a number of invocations of 1675 \texttt{A()}'' which is in the set 1676 1677 1678 \label{eq:lem:crat0:sfdv0:sprc1:04:01} 1679 \left\{ 1680 \left\lfloor \frac{K_2}{K_4} \right\rfloor , 1681 \left\lceil \frac{K_2}{K_4} \right\rceil 1682 \right\}, 1683 1684 1685 which contains one integer if $K_4 \vworkdivides K_2$, 1686 or two integers otherwise. With $K_2 < K_4$, 1687 the behavior will be as specified in Lemma 1688 \ref{lem:crat0:sfdv0:sprc0:03}. 1689 \end{vworklemmastatement} 1690 \begin{vworklemmaproof} 1691 See Exercise \ref{exe:crat0:sexe0:12}. 1692 \end{vworklemmaproof} 1693 \begin{vworklemmaparsection}{Remark} 1694 Note that Lemma \ref{lem:crat0:sfdv0:sprc0:03} 1695 and this lemma specify different aspects of behavior, 1696 which is why (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) 1697 and (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}) take 1698 different forms than 1699 (\ref{eq:lem:crat0:sfdv0:sprc1:04:01}). 1700 Lemma \ref{lem:crat0:sfdv0:sprc0:03} specifies the number of consecutive 1701 invocations of the base subroutine for which \texttt{A()}'' 1702 will be run, but with $K_2 \geq K_4$ it does not make sense to 1703 specify behavior in this way since \texttt{A()}'' will be run 1704 on \emph{every} invocation of the base subroutine. This lemma specifies 1705 the number of times \texttt{A()}'' will be run on a \emph{single} 1706 invocation of the base subroutine (which is not meaningful if 1707 $K_2 < K_4$ since the result will always be 0 or 1). 1708 \end{vworklemmaparsection} 1709 %\vworklemmafooter{} 1710 1711 1712 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1713 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1714 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1715 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01c}] 1716 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01c}} 1717 %Section tag: PRX0 1718 \label{crat0:sfdv0:sprx0} 1719 1720 Algorithm \ref{alg:crat0:sfdv0:01c}\footnote{Algorithm \ref{alg:crat0:sfdv0:01c} 1721 was contributed in March, 2003 1722 by Chuck B. Falconer \cite{bibref:i:chuckbfalconer} 1723 via the 1724 \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded} 1725 newsgroup.} 1726 is a variant of Algorithm \ref{alg:crat0:sfdv0:01a} 1727 which has one fewer 1728 degrees of freedom than Algorithms \ref{alg:crat0:sfdv0:01a} 1729 and \ref{alg:crat0:sfdv0:01b} and can be implemented 1730 more efficiently under most instruction sets. Algorithm \ref{alg:crat0:sfdv0:01c} 1731 is superior to Algorithms \ref{alg:crat0:sfdv0:01a} 1732 and \ref{alg:crat0:sfdv0:01b} 1733 from a computational efficiency 1734 point of view, but is less intuitive. 1735 1736 The superiority in computational efficiency of Algorithm \ref{alg:crat0:sfdv0:01c} 1737 comes from the possibility of using an implicit test against zero 1738 (rather than an explicit 1739 test against $K_3$, as is found in Algorithms \ref{alg:crat0:sfdv0:01a} 1740 and \ref{alg:crat0:sfdv0:01b}). 1741 Many machine instruction sets automatically set flags to indicate a negative 1742 result when the 1743 subtraction of $K_2$ is performed, thus often allowing a conditional branch 1744 without an additional instruction. Whether an instruction will be saved in 1745 the code of Figure \ref{fig:crat0:sfdv0:01c} depends on the sophistication 1746 of the C' compiler, but of course if the algorithm were coded in 1747 assembly-language an instruction could be saved on most processors. 1748 1749 The properties of rational counting Algorithm \ref{alg:crat0:sfdv0:01c} are nearly 1750 identical to those of Algorithm \ref{alg:crat0:sfdv0:01a}, 1751 and we prove the important properties 1752 now. 1753 1754 \begin{vworklemmastatement} 1755 \label{lem:crat0:sfdv0:sprx0:01} 1756 $N_{STARTUP}$, the number of invocations of the base subroutine 1757 in Algorithm \ref{alg:crat0:sfdv0:01c} before \texttt{A()}'' is called 1758 for the first time, is given by 1759 1760 1761 \label{eq:lem:crat0:sfdv0:sprx0:01:01} 1762 N_{STARTUP} = 1763 \left\lfloor 1764 { 1765 \frac{K_1}{K_2} 1766 } 1767 \right\rfloor . 1768 1769 \end{vworklemmastatement} 1770 \begin{vworklemmaproof} 1771 The value of \texttt{state} when tested against 1772 zero in the \texttt{if()} statement during the $n$th invocation 1773 of the base subroutine is $K_1 - n K_2$. In order for the test 1774 not to be met on the $n$th invocation 1775 of the base subroutine, we require that 1776 1777 1778 \label{eq:lem:crat0:sfdv0:sprx0:01:02} 1779 K_1 - n K_2 \geq 0 1780 1781 1782 \noindent{}or equivalently that 1783 1784 1785 \label{eq:lem:crat0:sfdv0:sprx0:01:03} 1786 n \leq \frac{K_1}{K_2} . 1787 1788 1789 Solving (\ref{eq:lem:crat0:sfdv0:sprx0:01:03}) for the 1790 largest value of $n \in \vworkintset$ which still meets the criterion 1791 yields (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}). Note that 1792 the derivation of (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}) requires 1793 that the restrictions on $K_1$, $K_2$, and $K_3$ documented in 1794 Figure \ref{fig:crat0:sfdv0:01c} be met. 1795 \end{vworklemmaproof} 1796 1797 \begin{vworklemmastatement} 1798 \label{lem:crat0:sfdv0:sprx0:02} 1799 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01c} 1800 base subroutine 1801 is called, let $N_O$ be the number of times the 1802 \texttt{A()}'' subroutine is called, let 1803 $f_I$ be the frequency of invocation of the 1804 Algorithm \ref{alg:crat0:sfdv0:01a} 1805 base subroutine, and let 1806 $f_O$ be the frequency of invocation of 1807 \texttt{A()}''. Provided the constraints 1808 on $K_1$, $K_2$, and $K_3$ documented in 1809 Figure \ref{fig:crat0:sfdv0:01c} are met, 1810 1811 1812 \label{eq:lem:crat0:sfdv0:sprx0:02:01} 1813 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} 1814 = 1815 \frac{f_O}{f_I} 1816 = 1817 \frac{K_2}{K_4} . 1818 1819 \end{vworklemmastatement} 1820 \begin{vworklemmaproof} 1821 (\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) indicates that once 1822 an initial delay (determined by $K_1$) has finished, 1823 $N_O/N_I$ will converge on a steady-state value of 1824 $K_2/K_4$. 1825 1826 The most straightforward way to analyze Algorithm \ref{alg:crat0:sfdv0:01c} 1827 is to show how an algorithm already 1828 understood (Algorithm \ref{alg:crat0:sfdv0:01a}) 1829 can be transformed to 1830 Algorithm \ref{alg:crat0:sfdv0:01c} 1831 in a way where the analysis of Algorithm \ref{alg:crat0:sfdv0:01a} 1832 also applies to Algorithm \ref{alg:crat0:sfdv0:01c}. 1833 Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01} shows 1834 how such a transformation can be performed in 1835 four steps. 1836 1837 \begin{figure} 1838 (a) Algorithm \ref{alg:crat0:sfdv0:01a} unchanged. 1839 $state_{a,n} \in \{0, 1, \ldots, K_4 - 1 \}$. 1840 \begin{verbatim} 1841 state += K2; 1842 if (state >= K4) 1843 { 1844 state -= K4; 1845 A(); 1846 } 1847 \end{verbatim} 1848 (b) \texttt{>=}'' changed to \texttt{>}''. $state_{b,n} \in \{1, 2, \ldots, K_4 \}$, 1849 $state_{b,n} = state_{a,n} + 1$. 1850 \begin{verbatim} 1851 state += K2; 1852 if (state > K4) 1853 { 1854 state -= K4; 1855 A(); 1856 } 1857 \end{verbatim} 1858 (c) Test against $K_4$ changed to test against zero. 1859 $state_{c,n} \in \{-K_4 + 1, -K_4 + 2, \ldots, 0 \}$, 1860 $state_{c,n} = state_{b,n} - K_4$. 1861 \begin{verbatim} 1862 state += K2; 1863 if (state > 0) 1864 { 1865 state -= K4; 1866 A(); 1867 } 1868 \end{verbatim} 1869 (d) Sign inversion. 1870 $state_{d,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$, 1871 $state_{d,n} = - state_{c,n}$. 1872 \begin{verbatim} 1873 state -= K2; 1874 if (state < 0) 1875 { 1876 state += K4; 1877 A(); 1878 } 1879 \end{verbatim} 1880 (e) C' expression rearrangement. 1881 $state_{e,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$, 1882 $state_{e,n} = state_{d,n}$. 1883 \begin{verbatim} 1884 if ((state -= K2) < 0) 1885 { 1886 state += K4; 1887 A(); 1888 } 1889 \end{verbatim} 1890 \caption{4-Step Transformation Of Algorithm \ref{alg:crat0:sfdv0:01a} 1891 To Algorithm \ref{alg:crat0:sfdv0:01c}} 1892 \label{fig:lem:crat0:sfdv0:sprx0:02:01} 1893 \end{figure} 1894 1895 In Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}, each of the 1896 four steps required to transform from Algorithm \ref{alg:crat0:sfdv0:01a} to 1897 Algorithm \ref{alg:crat0:sfdv0:01c} includes an equation to transform the 1898 \texttt{state} variable. Combining all of these 1899 transformations yields 1900 1901 \begin{eqnarray} 1902 \label{eq:lem:crat0:sfdv0:sprx0:02:02} 1903 state_{e,n} & = & K_4 - 1 - state_{a,n} \\ 1904 \label{eq:lem:crat0:sfdv0:sprx0:02:03} 1905 state_{a,n} & = & K_4 - 1 - state_{e,n} 1906 \end{eqnarray} 1907 1908 We thus see that Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(a) 1909 (corresponding to Algorithm \ref{alg:crat0:sfdv0:01a}) and 1910 Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(e) 1911 (corresponding to Algorithm \ref{alg:crat0:sfdv0:01c}) have 1912 \texttt{state} semantics which involve the same range 1913 but a reversed order. (\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) 1914 follows directly from this observation and from 1915 Lemma \ref{lem:crat0:sfdv0:sprc0:02}. 1916 \end{vworklemmaproof} 1917 %\vworklemmafooter{} 1918 1919 \begin{vworklemmastatement} 1920 \label{lem:crat0:sfdv0:sprx0:03} 1921 If $K_1=0$ and $\gcd(K_2, K_4)=1$\footnote{See also 1922 footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between 1923 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01c} 1924 and the ideal'' mapping is always 1925 in the set 1926 1927 1928 \label{eq:lem:crat0:sfdv0:sprx0:03:01} 1929 \left[ 0, \frac{K_4 - 1}{K_4} \right] , 1930 1931 1932 and no algorithm can be constructed to 1933 confine the error to a smaller interval. 1934 \end{vworklemmastatement} 1935 \begin{vworklemmaproof} 1936 Using the duality illustrated by 1937 (\ref{eq:lem:crat0:sfdv0:sprx0:02:02}) and 1938 (\ref{eq:lem:crat0:sfdv0:sprx0:02:03}), 1939 starting Algorithm \ref{alg:crat0:sfdv0:01c} with 1940 $state_0=0$ will yield a dual state vector 1941 with respect to starting Algorithm \ref{alg:crat0:sfdv0:01a} with 1942 $state_0=K_4-1$. Thus, 1943 1944 1945 \label{eq:lem:crat0:sfdv0:sprx0:03:02} 1946 N_O = \left\lfloor \frac{n K_2 + K_4 - 1}{K_4} \right\rfloor . 1947 1948 1949 Using this altered value of $N_O$ in (\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) 1950 leads directly to (\ref{eq:lem:crat0:sfdv0:sprx0:03:01}). 1951 1952 The proof that there can be no better algorithm is identical 1953 to the same proof for Lemma \ref{lem:crat0:sfdv0:sprc0:04} (Exercise \ref{exe:crat0:sexe0:06}). 1954 \end{vworklemmaproof} 1955 %\vworklemmafooter{} 1956 1957 \begin{vworklemmastatement} 1958 \label{lem:crat0:sfdv0:sprx0:04} 1959 For Algorithm \ref{alg:crat0:sfdv0:01c}, once steady 1960 state has been achieved, the number of consecutive 1961 base subroutine invocations during which subroutine 1962 \texttt{A()}'' is executed is always in the set 1963 1964 1965 \label{eq:lem:crat0:sfdv0:sprx0:04:01} 1966 \left\{ 1967 \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor , 1968 \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil 1969 \right\} \cap \vworkintsetpos, 1970 1971 1972 which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$, 1973 or two integers otherwise. 1974 1975 Once steady state has been achieved, the number of 1976 consecutive base function invocations during which 1977 subroutine \texttt{A()}'' is not executed is 1978 always in the set 1979 1980 1981 \label{eq:lem:crat0:sfdv0:sprx0:04:02} 1982 \left\{ 1983 \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor , 1984 \left\lceil \frac{K_4-K_2}{K_2} \right\rceil 1985 \right\} \cap \vworkintsetpos, 1986 1987 1988 which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$, 1989 or two integers otherwise. 1990 \end{vworklemmastatement} 1991 \begin{vworklemmaproof} 1992 The proof comes directly from the duality between algorithm 1993 Algorithms \ref{alg:crat0:sfdv0:01a} 1994 and \ref{alg:crat0:sfdv0:01c} established in the 1995 proof of Lemma \ref{lem:crat0:sfdv0:sprx0:01}, so that the results 1996 from Lemma \ref{lem:crat0:sfdv0:sprc0:03} apply without modification. 1997 \end{vworklemmaproof} 1998 \vworklemmafooter{} 1999 2000 2001 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2002 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2004 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01d}] 2005 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01d}} 2006 %Section tag: PRX1 2007 \label{crat0:sfdv0:sprx1} 2008 2009 Algorithm \ref{alg:crat0:sfdv0:01d}\footnote{Algorithm \ref{alg:crat0:sfdv0:01d} 2010 was contributed in March, 2003 2011 by John Larkin \cite{bibref:i:johnlarkin} 2012 via the 2013 \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded} 2014 newsgroup.} 2015 (Figure \ref{fig:crat0:sfdv0:01d}) is a further 2016 economization of Algorithms \ref{alg:crat0:sfdv0:01a} 2017 through \ref{alg:crat0:sfdv0:01c} that can be made by eliminating 2018 the addition or subtraction of $K_4$ and test against $K_3$ 2019 and instead using the 2020 inherent machine integer size of $W$ bits to perform 2021 arithmetic modulo $2^W$. Thus, effectively, Algorithm \ref{alg:crat0:sfdv0:01d} 2022 is equivalent to Algorithm \ref{alg:crat0:sfdv0:01a} with 2023 $K_4 = K_3 = 2^W$. 2024 2025 Figure \ref{fig:crat0:sfdv0:01d} shows both 2026 assembly-language (Texas Instruments TMS-370C8) and 2027 C' implementations of the algorithm. The assembly-language 2028 version uses the carry flag of the processor and thus 2029 is \emph{very} efficient. Because C' does not have access 2030 to the processor flags, the 'C' version is less efficient. 2031 The less than'' comparison when 2032 using unsigned integers is equivalent to a rollover test. 2033 2034 It is easy to see from the figure that Algorithm \ref{alg:crat0:sfdv0:01d} 2035 is equivalent in all 2036 respects to Algorithm \ref{alg:crat0:sfdv0:01a} with 2037 $K_3 = K_4$ fixed at $2^W$. It is not necessary to enforce any constraints 2038 on $K_2$ because $K_2 < K_3 = K_4 = 2^W$ due to the inherent size of 2039 a machine integer. Note that unlike Algorithms \ref{alg:crat0:sfdv0:01a} 2040 through \ref{alg:crat0:sfdv0:01c} which allow $K_2$ and $K_4$ to be chosen independently 2041 and from the Farey series of appropriate order, Algorithm \ref{alg:crat0:sfdv0:01c} 2042 only allows 2043 $K_2/K_4$ of the form $K_2/2^W$. 2044 2045 The properties below follow immediately 2046 from the properties of Algorithm \ref{alg:crat0:sfdv0:01a}. 2047 2048 \begin{vworklemmastatement} 2049 \label{lem:crat0:sfdv0:sprx1:01} 2050 $N_{STARTUP}$, the number of invocations of the base subroutine 2051 in Algorithm \ref{alg:crat0:sfdv0:01d} before \texttt{A()}'' is called 2052 for the first time, is given by 2053 2054 2055 \label{eq:lem:crat0:sfdv0:sprx1:01:01} 2056 N_{STARTUP} = 2057 \left\lfloor 2058 { 2059 \frac{2^W - K_1 - 1}{K_2} 2060 } 2061 \right\rfloor . 2062 2063 \end{vworklemmastatement} 2064 \begin{vworklemmaproof} 2065 The value of \texttt{state} after the $n$th invocation 2066 is $state_n = K_1 + n K_2$. In order for the test in the 2067 \texttt{if()} statement not to be met, we require that 2068 2069 2070 \label{eq:lem:crat0:sfdv0:sprx1:01:02} 2071 K_1 + n K_2 \leq 2^W - 1 2072 2073 2074 \noindent{}or equivalently that 2075 2076 2077 \label{eq:lem:crat0:sfdv0:sprx1:01:03} 2078 n \leq \frac{2^W - K_1 - 1}{K_2} . 2079 2080 2081 Solving (\ref{eq:lem:crat0:sfdv0:sprx1:01:03}) for the largest 2082 value of $n \in \vworkintset$ which still meets the criterion 2083 yields (\ref{eq:lem:crat0:sfdv0:sprx1:01:01}). 2084 \end{vworklemmaproof} 2085 2086 \begin{vworklemmastatement} 2087 \label{lem:crat0:sfdv0:sprx1:02} 2088 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine 2089 is called, let $N_O$ be the number of times the 2090 \texttt{A()}'' subroutine is called, let 2091 $f_I$ be the frequency of invocation of the 2092 Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine, and let 2093 $f_O$ be the frequency of invocation of 2094 \texttt{A()}''. Then 2095 2096 2097 \label{eq:lem:crat0:sfdv0:sprx1:02:01} 2098 \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} 2099 = 2100 \frac{f_O}{f_I} 2101 = 2102 \frac{K_2}{2^W} , 2103 2104 2105 where $W$ is the number of bits in a machine unsigned integer. 2106 Note that $K_2 < 2^W$ since $K_2 \in \{ 0, 1, \ldots , 2^W-1 \}$. 2107 \end{vworklemmastatement} 2108 \begin{vworklemmaproof} 2109 The proof is identical to the proof of 2110 Lemma \ref{lem:crat0:sfdv0:sprc0:02} with $K_3=K_4=2^W$. 2111 Note that Algorithm \ref{alg:crat0:sfdv0:01a} calculates $n K_2 \bmod K_4$ by 2112 subtraction, whereas Algorithm \ref{alg:crat0:sfdv0:01d} calculates 2113 $n K_2 \bmod 2^W$ by the properties of a $W$-bit counter 2114 which is allowed to roll over. 2115 \end{vworklemmaproof} 2116 %\vworklemmafooter{} 2117 2118 2119 \begin{vworklemmastatement} 2120 \label{lem:crat0:sfdv0:sprx1:03} 2121 If $\gcd(K_2, 2^W)=1$\footnote{See also footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} 2122 in this chapter. Note also that in this context the condition $\gcd(K_2, 2^W)=1$ 2123 is equivalent to the condition that $K_2$ be odd.}, the error between 2124 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01d} 2125 and the ideal'' mapping is always 2126 in the set 2127 2128 2129 \label{eq:lem:crat0:sfdv0:sprx1:03:01} 2130 \left[ - \frac{2^W - 1}{2^W} , 0 \right] , 2131 2132 2133 and no algorithm can be constructed to 2134 confine the error to a smaller interval. 2135 \end{vworklemmastatement} 2136 \begin{vworklemmaproof} 2137 The proof is identical to the proof of Lemma 2138 \ref{lem:crat0:sfdv0:sprc0:04} with $K_4 = 2^W$. 2139 \end{vworklemmaproof} 2140 %\vworklemmafooter{} 2141 2142 \begin{vworklemmastatement} 2143 \label{lem:crat0:sfdv0:sprx1:04} 2144 For Algorithm \ref{alg:crat0:sfdv0:01d} 2145 (Figure \ref{fig:crat0:sfdv0:01d}), once steady 2146 state has been achieved, the number of consecutive 2147 base subroutine invocations during which subroutine 2148 \texttt{A()}'' is executed is always in the set 2149 2150 2151 \label{eq:lem:crat0:sfdv0:sprx1:04:01} 2152 \left\{ 2153 \left\lfloor \frac{K_2}{2^W - K_2} \right\rfloor , 2154 \left\lceil \frac{K_2}{2^W - K_2} \right\rceil 2155 \right\} \cap \vworkintsetpos, 2156 2157 2158 which contains one integer if $K_2/2^W \leq 1/2$ or $(2^W-K_2) \vworkdivides K_2$, 2159 or two integers otherwise. 2160 2161 Once steady state has been achieved, the number of 2162 consecutive base function invocations during which 2163 subroutine \texttt{A()}'' is not executed is 2164 always in the set 2165 2166 2167 \label{eq:lem:crat0:sfdv0:sprx1:04:02} 2168 \left\{ 2169 \left\lfloor \frac{2^W-K_2}{K_2} \right\rfloor , 2170 \left\lceil \frac{2^W-K_2}{K_2} \right\rceil 2171 \right\} \cap \vworkintsetpos, 2172 2173 2174 which contains one integer if $K_2/2^W \geq 1/2$ or $K_2 \vworkdivides 2^W$, 2175 or two integers otherwise. 2176 \end{vworklemmastatement} 2177 \begin{vworklemmaproof} 2178 The proof is identical to the proof of Lemma 2179 \ref{lem:crat0:sfdv0:sprc0:03} with $K_4 = 2^W$. 2180 \end{vworklemmaproof} 2181 \vworklemmafooter{} 2182 2183 2184 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2185 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2186 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2187 \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:02a}] 2188 {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:02a}} 2189 %Section tag: PRC2 2190 \label{crat0:sfdv0:sprc2} 2191 2192 Another useful rational counting algorithm is Algorithm \ref{alg:crat0:sfdv0:02a}. 2193 At first glance, it may appear that Algorithm \ref{alg:crat0:sfdv0:02a} 2194 is qualitatively 2195 different than Algorithms \ref{alg:crat0:sfdv0:01a} 2196 and \ref{alg:crat0:sfdv0:01b}. 2197 However, as the following lemmas demonstrate, Algorithm \ref{alg:crat0:sfdv0:02a} 2198 can be easily rearranged to be in the form 2199 of Algorithm \ref{alg:crat0:sfdv0:01a}. 2200 2201 \begin{vworklemmastatement} 2202 \label{lem:crat0:sfdv0:sprc2:01} 2203 $N_{STARTUP}$, the number of invocations of the base subroutine 2204 in Algorithm \ref{alg:crat0:sfdv0:02a} before \texttt{A()}'' is called 2205 for the first time, is given by 2206 2207 2208 \label{eq:lem:crat0:sfdv0:sprc2:01:01} 2209 N_{STARTUP} = 2210 \left\lceil 2211 { 2212 \frac{K_3 - K_1}{K_2} 2213 } 2214 \right\rceil . 2215 2216 \end{vworklemmastatement} 2217 \begin{vworklemmaproof} 2218 The value of \texttt{state} after the $n$th invocation 2219 is $K_1 + n K_2$. In order for the test in the 2220 \texttt{if()} statement to be met on the $n+1$'th invocation 2221 of the base subroutine, we require that 2222 2223 2224 \label{eq:lem:crat0:sfdv0:sprc2:01:02} 2225 K_1 + n K_2 \geq K_3 2226 2227 2228 \noindent{}or equivalently that 2229 2230 2231 \label{eq:lem:crat0:sfdv0:sprc2:01:03} 2232 n \geq \frac{K_3 - K_1}{K_2} . 2233 2234 2235 Solving (\ref{eq:lem:crat0:sfdv0:sprc2:01:03}) for the smallest 2236 value of $n \in \vworkintset$ which still meets the criterion 2237 yields (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}). Note that 2238 the derivation of (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}) requires 2239 that the restrictions on $K_1$ through $K_4$ documented in 2240 Figure \ref{fig:crat0:sfdv0:02a} be met. 2241 \end{vworklemmaproof} 2242 2243 \begin{vworklemmastatement} 2244 \label{lem:crat0:sfdv0:sprc2:02} 2245 Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine 2246 is called, let $N_{OA}$ be the number of times the 2247 \texttt{A()}'' subroutine is called, let 2248 $f_I$ be the frequency of invocation of the 2249 Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine, and let 2250 $f_{OA}$ be the frequency of invocation of 2251 \texttt{A()}''. Then, the proportion of times the 2252 \texttt{A()}'' subroutine is called is given by 2253 2254 2255 \label{eq:lem:crat0:sfdv0:sprc2:02:01} 2256 \lim_{N_I\rightarrow\infty}\frac{N_{OA}}{N_I} 2257 = 2258 \frac{f_{OA}}{f_I} 2259 = 2260 \frac{K_2}{K_4 + K_2} , 2261 2262 2263 and the proportion of times the \texttt{B()}'' subroutine is called 2264 is given by 2265 2266 2267 \label{eq:lem:crat0:sfdv0:sprc2:02:02} 2268 \lim_{N_I\rightarrow\infty}\frac{N_{OB}}{N_I} 2269 = 2270 \frac{f_{OB}}{f_I} 2271 = 2272 1 - \frac{f_{OA}}{f_I} 2273 = 2274 \frac{K_4}{K_4 + K_2} . 2275 2276 \end{vworklemmastatement} 2277 \begin{vworklemmaproof} 2278 As in Lemma \ref{} and without 2279 loss of generality, we assume for analytic 2280 convenience that $K_1=0$ and $K_3=K_4$. Note that 2281 $K_1$ and $K_3$ influence only the transient startup 2282 behavior of the algorithm. 2283 2284 It can be observed from the algorithm that once steady 2285 state is achieved, \texttt{state} will be confined to the set 2286 2287 2288 \label{eq:lem:crat0:sfdv0:sprc2:02:10} 2289 state \in \{ 0, 1, \ldots , K_4 + K_2 - 1 \} . 2290 2291 2292 It is certainly possible to use results from 2293 number theory and analyze which values in the 2294 set (\ref{eq:lem:crat0:sfdv0:sprc2:02:10}) can be 2295 attained and the order in which they can be attained. 2296 However, an easier approach is to observe that 2297 Algorithm \ref{alg:crat0:sfdv0:02a} 2298 can be rearranged to take the form of 2299 rational counting Algorithm \ref{alg:crat0:sfdv0:01a}. 2300 This rearranged 2301 algorithm is presented as 2302 Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}. Note that the 2303 algorithm is rearranged only for easier analysis. 2304 2305 \begin{figure} 2306 \begin{verbatim} 2307 void base_rate_sub(void) 2308 { 2309 static unsigned int state = K1; 2310 2311 state += K2; 2312 2313 if (state >= (K4 + K2)) 2314 { 2315 state -= (K4 + K2); 2316 A(); 2317 } 2318 else 2319 { 2320 B(); 2321 } 2322 } 2323 \end{verbatim} 2324 \caption{Algorithm \ref{alg:crat0:sfdv0:02a} Modified To Resemble Algorithm \ref{alg:crat0:sfdv0:01a} 2325 (Proof Of Lemma \ref{lem:crat0:sfdv0:sprc2:02})} 2326 \label{fig:lem:crat0:sfdv0:sprc2:02:01} 2327 \end{figure} 2328 2329 In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, the 2330 statement \texttt{state += K2}'' has been removed from the 2331 \texttt{else} clause and placed above the \texttt{if()} statement, 2332 and other constants have been adjusted accordingly. 2333 It can be observed that the figure 2334 is structurally identical to rational counting algorithm, except for the 2335 \texttt{else} clause (which does not affect the counting behavior) and 2336 the specific constants for testing and incrementation. 2337 2338 In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, as contrasted with 2339 Algorithm \ref{alg:crat0:sfdv0:01a}, $K_4 + K_2$'' takes the 2340 place of $K_4$. $\gcd(K_2, K_4 + K_2) = \gcd(K_2, K_4)$ 2341 (see Lemma \cprizeroxrefhyphen\ref{lem:cpri0:gcd0:01}), so the 2342 results from 2343 \end{vworklemmaproof} 2344 2345 \begin{vworklemmastatement} 2346 \label{lem:crat0:sfdv0:sprc2:03} 2347 If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$, the error between 2348 the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b} 2349 and the ideal'' mapping is always 2350 in the set 2351 2352 2353 \label{eq:lem:crat0:sfdv0:sprc2:03:01} 2354 \left[ - \frac{K_4 - 1}{K_4} , 0 \right] , 2355 2356 2357 and no algorithm can be constructed to 2358 confine the error to a smaller interval. 2359 \end{vworklemmastatement} 2360 \begin{vworklemmaproof} 2361 The proof is identical to Lemma \ref{lem:crat0:sfdv0:sprc0:04}. 2362 \end{vworklemmaproof} 2363 2364 2365 2366 2367 2368 2369 2370 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2371 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2372 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2373 \section{Bresenham's Line Algorithm} 2374 %Section tag: BLA0 2375 \label{crat0:sbla0} 2376 2377 \index{Bresenham's line algorithm}\emph{Bresenham's line algorithm} is a 2378 very efficient algorithm for drawing lines on devices that have 2379 a rectangular array of pixels which can be individually illuminated. 2380 Bresenham's line algorithm is efficient for small microcontrollers 2381 because it relies only 2382 on integer addition, subtraction, shifting, and comparison. 2383 2384 Bresenham's line algorithm is presented for two reasons: 2385 2386 \begin{itemize} 2387 \item The algorithm is useful for drawing lines on LCD 2388 displays and other devices typically controlled by 2389 microcontrollers. 2390 \item The algorithm is an [extremely optimized] application 2391 of the rational 2392 counting algorithms presented in this chapter. 2393 \end{itemize} 2394 2395 \begin{figure} 2396 \begin{center} 2397 \begin{huge} 2398 Figure Space Reserved 2399 \end{huge} 2400 \end{center} 2401 \caption{Raster Grid For Development Of Bresenham's Line Algorithm} 2402 \label{fig:crat0:sbla0:01} 2403 \end{figure} 2404 2405 Assume that we wish to draw a line from $(0,0)$ to $(x_f, y_f)$ on 2406 a raster device (Figure \ref{fig:crat0:sbla0:01}). For simplicity of 2407 development, assume that $y_f \leq x_f$ (i.e. that the slope $m \leq 1$). 2408 2409 For each value of $x \in \vworkintset$, the ideal value of $y$ is given 2410 by 2411 2412 2413 \label{eq:crat0:sbla0:01} 2414 y = mx = \frac{y_f}{x_f} x = \frac{y_f x}{x_f} . 2415 2416 2417 \noindent{}However, on a raster device, we must usually 2418 choose an inexact pixel to illuminate, since it is typically 2419 rare that $x_f \vworkdivides y_f x$. If 2420 $x_f \vworkdivides y_f x$, then the ideal value of $y$ is 2421 an integer, and we choose to illuminate 2422 $(x, (y_f x)/x_f)$. However, if $x_f \vworknotdivides y_f x$, 2423 then we must choose either a pixel with the same y-coordinate 2424 as the previous pixel (we call this choice D') or the pixel 2425 with a y-coordinate one greater than the previous pixel (we 2426 call this choice U'). 2427 The fractional part of the quotient 2428 $(y_f x) / x_f$ indicates whether D or U is closer to the ideal line. 2429 If $y_f x \bmod x_f \geq x_f/2$, we choose U, otherwise we choose D 2430 (note that the decision to choose U in the equality case is arbitrary). 2431 2432 Using the rational approximation techniques presented in 2433 Section \ref{crat0:sfdv0}, it is straightforward to 2434 develop an algorithm, which is presented as the code 2435 in Figure \ref{fig:crat0:sbla0:02}. 2436 Note that this code will only work if $m = y_f/x_f \leq 1$. 2437 2438 \begin{figure} 2439 \begin{verbatim} 2440 /* Draws a line from (0,0) to (x_f,y_f) on a raster */ 2441 /* device. */ 2442 2443 void bresenham_line(int x_f, int y_f) 2444 { 2445 int d=0; /* The modulo counter. */ 2446 int x=0, y=0; 2447 /* x- and y-coordinates currently being */ 2448 /* evaluated. */ 2449 int d_old; /* Remembers previous value of d. */ 2450 2451 plotpoint(0,0); /* Plot initial point. */ 2452 while (x <= x_f) 2453 { 2454 d_old = d; 2455 d += y_f; 2456 if (d >= x_f) 2457 d -= x_f; 2458 x++; 2459 if ( 2460 ( 2461 (d == 0) && (d_old < x_f/2) 2462 ) 2463 || 2464 ( 2465 (d >= x_f/2) 2466 && 2467 ((d_old < x_f/2) || (d_old >= d)) 2468 ) 2469 ) 2470 y++; 2471 plotpoint(x,y); 2472 } 2473 } 2474 \end{verbatim} 2475 \caption{First Attempt At A Raster Device Line Algorithm 2476 Using Rational Counting Techniques} 2477 \label{fig:crat0:sbla0:02} 2478 \end{figure} 2479 2480 There are a few efficiency refinements that can be made to 2481 the code in Figure \ref{fig:crat0:sbla0:02}, but overall 2482 it is a very efficient algorithm. Note that 2483 nearly all compilers will handle the integer 2484 division by two using a shift 2485 operation rather than a division. 2486 2487 We can however substantially simplify and economize the code of 2488 Figure \ref{fig:crat0:sbla0:02} by using the technique 2489 presented in Figures \ref{fig:crat0:sfdv0:fab0:03} and 2490 \ref{fig:crat0:sfdv0:fab0:04}, and this improved code is 2491 presented as Figure \ref{fig:crat0:sbla0:03}. 2492 2493 \begin{figure} 2494 \begin{verbatim} 2495 /* Draws a line from (0,0) to (x_f,y_f) on a raster */ 2496 /* device. */ 2497 2498 void bresenham_line(int x_f, int y_f) 2499 { 2500 int d=y_f; /* Position of the ideal line minus */ 2501 /* the position of the line we are */ 2502 /* drawing, in units of 1/x_f. The */ 2503 /* initialization value is y_f because */ 2504 /* the algorithm is looking one pixel */ 2505 /* ahead in the x direction, so we */ 2506 /* begin at x=1. */ 2507 int x=0, y=0; 2508 /* x- and y-coordinates currently being */ 2509 /* evaluated. */ 2510 plotpoint(0,0); /* Plot initial point. */ 2511 while (x <= x_f) 2512 { 2513 x++; /* We move to the right regardless. */ 2514 if (d >= x_f/2) 2515 { 2516 /* The "U" choice. We must jump up a pixel */ 2517 /* to keep up with the ideal line. */ 2518 d += (y_f - x_f); 2519 y++; /* Jump up a pixel. */ 2520 } 2521 else /* d < x_f/2 */ 2522 { 2523 /* The "D" choice. Distance is not large */ 2524 /* enough to jump up a pixel. */ 2525 d += y_f; 2526 } 2527 plotpoint(x,y); 2528 } 2529 } 2530 \end{verbatim} 2531 \caption{Second Attempt At A Raster Device Line Algorithm 2532 Using Rational Counting Techniques} 2533 \label{fig:crat0:sbla0:03} 2534 \end{figure} 2535 2536 In order to understand the code of Figure \ref{fig:crat0:sbla0:03}, 2537 it is helpful to view the problem in an alternate way. 2538 For any $x \in \vworkintset$, let 2539 $d$ be the distance between the position of the ideal line 2540 (characterized by $y = y_f x / x_f$) and 2541 the actual pixel which will be illuminated. It is easy to 2542 observe that: 2543 2544 \begin{itemize} 2545 \item When drawing a raster line, if one proceeds from 2546 $(x, y)$ to $(x+1, y)$ (i.e. makes the D'' choice), 2547 $d$ will increase by $y_f/x_f$. 2548 \item When drawing a raster line, if one proceeds from 2549 $(x,y)$ to $(x+1, y+1)$ (i.e. makes the U'' choice), 2550 $d$ will increase by $(y_f - x_f)/x_f$. (The increase 2551 of $y_f/x_f$ comes about because the ideal line proceeds 2552 upward from $x$ to $x+1$, while the decrease of $x_f/x_f = 1$ 2553 comes about because the line being drawn jumps upward by one 2554 unit, thus tending to catch'' the ideal line.) 2555 \end{itemize} 2556 2557 The code of Figure \ref{fig:crat0:sbla0:03} implements the 2558 two observations above in a straightforward way. $d$ is maintained 2559 in units of $1/x_f$, and when U'' is chosen over D'' whenever 2560 the gap between the ideal line and the current row of pixels 2561 being drawn becomes too large. 2562 2563 The code in Figure \ref{fig:crat0:sbla0:03} does however contain logical 2564 and performance problems which should be corrected: 2565 2566 \begin{itemize} 2567 \item The test of $d$ against $x_f/2$ will perform as intended. 2568 For example, if $d=2$ and $x_f=5$, the test 2569 \texttt{d >= x\_f/2}'' in the code will evaluate true 2570 although the actual condition is false. To correct this 2571 defect, the units of $d$ should be changed from 2572 $1/x_f$ to $1/(2 x_f)$. 2573 \item The quantity $y_f - x_f$ is calculated repeatedly. This 2574 calculation should be moved out of the \emph{while()} loop. 2575 \item The test against $x_f$ may be more economical if changed to 2576 a test against 0 (but this requires a different initialization 2577 assignment for $d$). 2578 \end{itemize} 2579 2580 Figure \ref{fig:crat0:sbla0:04} corrects these defects 2581 from Figure \ref{fig:crat0:sbla0:03}. 2582 Figure \ref{fig:crat0:sbla0:04} is essentially the Bresenham 2583 line algorithm, except that it only draws starting from the 2584 origin and will only draw a line with a slope 2585 $m = y_f/x_f \leq 1$. 2586 2587 \begin{figure} 2588 \begin{verbatim} 2589 /* Draws a line from (0,0) to (x_f,y_f) on a raster */ 2590 /* device. */ 2591 2592 void bresenham_line(int x_f, int y_f) 2593 { 2594 int d = 2 * y_f - x_f; 2595 /* Position of the ideal line minus */ 2596 /* the position of the line we are */ 2597 /* drawing, in units of 1/(2 * x_f). */ 2598 /* Initialization value of 2 * y_f is */ 2599 /* because algorithm is looking one */ 2600 /* pixel ahead. Value of -x_f is from */ 2601 /* shifting the midpoint test (the */ 2602 /* "if" statement below) downward to a */ 2603 /* test against zero. */ 2604 int dD = 2 * y_f; 2605 int dU = dD - x_f; 2606 /* Amounts to add to d if "D" and "U" */ 2607 /* pixels are chosen, respectively. */ 2608 /* Calculated here outside of loop. */ 2609 int x=0, y=0; 2610 /* x- and y-coordinates currently being */ 2611 /* evaluated. */ 2612 plotpoint(0,0); /* Plot initial point. */ 2613 while (x <= x_f) 2614 { 2615 x++; /* We move to the right regardless. */ 2616 if (d >= 0) 2617 { 2618 /* The "U" choice. We must jump up a pixel */ 2619 /* to keep up with the ideal line. */ 2620 d += dU; 2621 y++; /* Jump up a pixel. */ 2622 } 2623 else /* d < 0 */ 2624 { 2625 /* The "D" choice. Distance is not large */ 2626 /* enough to jump up a pixel. */ 2627 d += dD; 2628 } 2629 plotpoint(x,y); 2630 } 2631 } 2632 \end{verbatim} 2633 \caption{Third Attempt At A Raster Device Line Algorithm 2634 Using Rational Counting Techniques} 2635 \label{fig:crat0:sbla0:04} 2636 \end{figure} 2637 2638 2639 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2640 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2641 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2642 \section{Authors And Acknowledgements} 2643 %Section tag: ACK0 2644 This chapter was primarily written by 2645 \index{Ashley, David T.} David T. Ashley 2646 \cite{bibref:i:daveashley}. 2647 2648 We would like to gratefully acknowledge the assistance of 2649 \index{Falconer, Chuck B.} Chuck B. Falconer \cite{bibref:i:chuckbfalconer}, 2650 \index{Hoffmann, Klaus} Klaus Hoffmann \cite{bibref:i:klaushoffmann}, 2651 \index{Larkin, John} John Larkin \cite{bibref:i:johnlarkin}, 2652 \index{Smith, Thad} Thad Smith \cite{bibref:i:thadsmith}, 2653 and 2654 \index{Voipio, Tauno} Tauno Voipio \cite{bibref:i:taunovoipio} 2655 for insight into rational counting approaches, contributed via the 2656 \texttt{sci.math} \cite{bibref:n:scimathnewsgroup} 2657 and 2658 \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded} 2659 newsgroups. 2660 2661 2662 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2663 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2664 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2665 \section{Exercises} 2666 %Section tag: EXE0 2667 2668 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2669 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2670 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2671 \subsection[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation] 2672 {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation} 2673 2674 \begin{vworkexercisestatement} 2675 \label{exe:crat0:sexe0:a01} 2676 Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03}) 2677 and (\ref{eq:crat0:shqq0:dph0:07}) if $r_A$ is chosen 2678 without rounding, i.e. 2679 $h=\lfloor r_I 2^q \rfloor$ and therefore 2680 $r_A=\lfloor r_I 2^q \rfloor/2^q$. 2681 \end{vworkexercisestatement} 2682 \vworkexercisefooter{} 2683 2684 \begin{vworkexercisestatement} 2685 \label{exe:crat0:sexe0:a02} 2686 Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03}) 2687 and (\ref{eq:crat0:shqq0:dph0:07}) if 2688 $z$ is chosen for rounding with the midpoint case rounded 2689 down, i.e. $z=2^{q-1}-1$, and applied as in 2690 (\ref{eq:crat0:sint0:01}). 2691 \end{vworkexercisestatement} 2692 \vworkexercisefooter{} 2693 2694 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2695 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2696 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2697 \subsection{Rational Counting} 2698 2699 2700 \begin{vworkexercisestatement} 2701 \label{exe:crat0:sexe0:01} 2702 For Algorithm \ref{alg:crat0:sfdv0:01a}, 2703 assume that one chooses $K_1 > K_3 - K_2$ (in contradiction to the 2704 restrictions in Figure \ref{fig:crat0:sfdv0:01a}). 2705 Derive a result similar to Lemma \ref{lem:crat0:sfdv0:sprc0:01} 2706 for the number of base subroutine invocations in which 2707 \texttt{A()}'' is run before it is 2708 \emph{not} run for the first time. 2709 \end{vworkexercisestatement} 2710 \vworkexercisefooter{} 2711 2712 \begin{vworkexercisestatement} 2713 \label{exe:crat0:sexe0:02} 2714 This will be the $\epsilon$ lemma proof. 2715 \end{vworkexercisestatement} 2716 \vworkexercisefooter{} 2717 2718 \begin{vworkexercisestatement} 2719 \label{exe:crat0:sexe0:03} 2720 Rederive appropriate results similar to 2721 Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where 2722 $\gcd(K_2, K_4) > 1$. 2723 \end{vworkexercisestatement} 2724 \vworkexercisefooter{} 2725 2726 \begin{vworkexercisestatement} 2727 \label{exe:crat0:sexe0:04} 2728 Rederive appropriate results similar to 2729 Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where 2730 $K_1 \neq 0$. 2731 \end{vworkexercisestatement} 2732 \vworkexercisefooter{} 2733 2734 \begin{vworkexercisestatement} 2735 \label{exe:crat0:sexe0:05} 2736 Rederive appropriate results similar to 2737 Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where 2738 $\gcd(K_2, K_4) > 1$ and $K_1 \neq 0$. 2739 \end{vworkexercisestatement} 2740 \vworkexercisefooter{} 2741 2742 \begin{vworkexercisestatement} 2743 \label{exe:crat0:sexe0:06} 2744 For Lemma \ref{lem:crat0:sfdv0:sprc0:04}, 2745 complete the missing proof: 2746 show that if $\gcd(K_2, K_4) = 1$, no algorithm can 2747 lead to a tighter bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). 2748 \textbf{Hint:} start with the observation 2749 that with 2750 $\gcd(K_2, K_4) = 1$, $n K_2 \bmod K_4$ will attain every value in 2751 the set $\{ 0, \ldots , K_4-1 \}$. 2752 \end{vworkexercisestatement} 2753 \vworkexercisefooter{} 2754 2755 \begin{vworkexercisestatement} 2756 \label{exe:crat0:sexe0:07} 2757 For Lemma \ref{lem:crat0:sfdv0:sprc0:03}, 2758 show that if $K_1=0$, the number of initial invocations 2759 of the base subroutine before `\texttt{A()}'' is first 2760 called is in the set specified in 2761 (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}). 2762 \end{vworkexercisestatement} 2763 \vworkexercisefooter{} 2764 2765 \begin{vworkexercisestatement} 2766 \label{exe:crat0:sexe0:08} 2767 Develop other techniques to correct the state upset vulnerability 2768 of Algorithm \ref{alg:crat0:sfdv0:01a} besides 2769 the technique illustrated in 2770 Figure \ref{fig:crat0:sfdv0:sprc0:01}. 2771 \end{vworkexercisestatement} 2772 \vworkexercisefooter{} 2773 2774 \begin{vworkexercisestatement} 2775 \label{exe:crat0:sexe0:09} 2776 Show for Example \ref{ex:crat0:sfdv0:sprc1:01} that integers of at least 2777 27 bits are required. 2778 \end{vworkexercisestatement} 2779 \vworkexercisefooter{} 2780 2781 \begin{vworkexercisestatement} 2782 \label{exe:crat0:sexe0:10} 2783 Prove Lemma \ref{lem:crat0:sfdv0:sprc1:02}. 2784 \end{vworkexercisestatement} 2785 \vworkexercisefooter{} 2786 2787 \begin{vworkexercisestatement} 2788 \label{exe:crat0:sexe0:12} 2789 Prove Lemma \ref{lem:crat0:sfdv0:sprc1:04}. 2790 \end{vworkexercisestatement} 2791 \vworkexercisefooter{} 2792 2793 \begin{vworkexercisestatement} 2794 \label{exe:crat0:sexe0:13} 2795 Define the term \emph{steady state} as used in 2796 Lemma \ref{lem:crat0:sfdv0:sprc1:04} in terms of 2797 set membership of the \texttt{state} variable. 2798 \end{vworkexercisestatement} 2799 \vworkexercisefooter{} 2800 2801 \begin{vworkexercisestatement} 2802 \label{exe:crat0:sexe0:14} 2803 For Algorithm \ref{alg:crat0:sfdv0:01a}, devise examples of anomalous behavior due to 2804 race conditions that may occur if $K_2$ and/or $K_4$ are set in a process 2805 which is asynchronous with respect to the process which implements the 2806 rational counting algorithm if mutual exclusion protocol is not 2807 implemented. 2808 \end{vworkexercisestatement} 2809 \vworkexercisefooter{} 2810 2811 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2812 \vfill 2813 \noindent\begin{figure}[!b] 2814 \noindent\rule[-0.25in]{\textwidth}{1pt} 2815 \begin{tiny} 2816 \begin{verbatim} 2817 $HeadURL$ 2818 $Revision$ 2819 $Date$ 2820 $Author$ 2821 \end{verbatim} 2822 \end{tiny} 2823 \noindent\rule[0.25in]{\textwidth}{1pt} 2824 \end{figure} 2825 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2826 % 2827 %End of file C_RAT0.TEX

## Properties

Name Value
svn:eol-style native
svn:keywords Author Date Id Revision URL Header