Parent Directory | Revision Log

Revision **6** -
(**show annotations**)
(**download**)
(**as text**)

*Fri Oct 7 01:36:46 2016 UTC*
(6 years, 11 months ago)
by *dashley*

File MIME type: application/x-tex

File size: 111163 byte(s)

File MIME type: application/x-tex

File size: 111163 byte(s)

Initial commit after migrating from CVS.

1 | %$Header: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_rat0/c_rat0.tex,v 1.28 2004/02/22 19:27:48 dtashley Exp $ |

2 | |

3 | \chapter{Rational Linear Approximation} |

4 | |

5 | \label{crat0} |

6 | |

7 | \beginchapterquote{``Die ganzen Zahlen hat der liebe Gott gemacht, |

8 | alles andere ist Menschenwerk.''\footnote{German |

9 | language: God made the integers; everything |

10 | else was made by man.}} |

11 | {Leopold Kronecker} |

12 | |

13 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

14 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

15 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

16 | \section{Introduction} |

17 | %Section tag: INT0 |

18 | \label{crat0:sint0} |

19 | |

20 | In this chapter, we consider practical applications of |

21 | rational approximation. |

22 | Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0} |

23 | have presented algorithms for finding |

24 | the closest rational numbers to an arbitrary real number, |

25 | subject to constraints on the numerator and denominator. |

26 | The basis of these algorithms is complex and comes from number theory, and so |

27 | these algorithms and their basis have been presented in separate chapters. |

28 | |

29 | In Section \ref{crat0:srla0}, rational linear approximation itself |

30 | and associated error bounds are presented. By \emph{rational linear |

31 | approximation} we mean simply the approximation of a line |

32 | $y = r_I x$ ($y, r_I, x \in \vworkrealset$) by a line |

33 | |

34 | \begin{equation} |

35 | \label{eq:crat0:sint0:01} |

36 | y = \left\lfloor |

37 | \frac{h \lfloor x \rfloor + z}{k} |

38 | \right\rfloor , |

39 | \end{equation} |

40 | |

41 | \noindent{}where we choose $h/k \approx r_I$ and optionally choose $z$ to |

42 | shift the error introduced. Note that (\ref{eq:crat0:sint0:01}) is |

43 | very economical for microcontroller instruction sets, since only integer |

44 | arithmetic is required. We may choose $h/k$ from a Farey series (see |

45 | Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}), or |

46 | we may choose a ratio $h/2^q$ so that the division in (\ref{eq:crat0:sint0:01}) |

47 | can be implemented |

48 | by a bitwise right shift. |

49 | |

50 | Section \ref{crat0:srla0} discusses linear rational approximation |

51 | in general, with a special eye on error analysis. |

52 | |

53 | Section \ref{crat0:spwi0} discusses piecewise linear rational approximation, |

54 | which is the approximation of a curve or complex mapping by a |

55 | number of joined line segments. |

56 | |

57 | Section \ref{crat0:sfdv0} discusses frequency division and rational counting. |

58 | Such techniques share the same mathematical framework as rational linear |

59 | approximation, and as with rational linear approximation the ratio |

60 | involved may be chosen from a Farey series or with a denominator of $2^q$, depending |

61 | on the algorithm employed. |

62 | |

63 | Section \ref{crat0:sbla0} discusses Bresenham's classic line algorithm, |

64 | which is a practical application of rational linear approximation. |

65 | |

66 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

67 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

68 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

69 | \section{Rational Linear Approximation} |

70 | %Section tag: RLA0 |

71 | \label{crat0:srla0} |

72 | |

73 | It occurs frequently in embedded software design that one wishes to |

74 | implement a linear scaling from a domain to a range of the form |

75 | |

76 | \begin{equation} |

77 | \label{eq:crat0:srla0:01} |

78 | f(x) = r_I x , |

79 | \end{equation} |

80 | |

81 | \noindent{}where $r_I$ is the \emph{ideal} |

82 | |

83 | |

84 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

85 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

86 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

87 | \subsection{Model Functions} |

88 | %Section tag: mfu0 |

89 | \label{crat0:srla0:smfu0} |

90 | |

91 | In general, we seek to approximate the ideal function |

92 | |

93 | |

94 | \noindent{}by some less ideal function where |

95 | |

96 | \begin{itemize} |

97 | \item $r_A \neq r_I$, although we seek to choose $r_A \approx r_I$. |

98 | \item The input to the function, $x$, may already contain |

99 | quantization error. |

100 | \item Although $r_I x \in \vworkrealsetnonneg$, we must choose |

101 | an integer as the function output. |

102 | \end{itemize} |

103 | |

104 | In modeling quantization error, we use the floor function\index{floor function} |

105 | ($\lfloor\cdot\rfloor$) |

106 | for algebraic simplicity. The floor function precisely |

107 | describes the behavior of integer division instructions (where |

108 | remainders are discarded), but may not describe other sources of |

109 | quantization, such as quantization that occurs in A/D conversion. |

110 | However, techniques identical to those presented in this |

111 | section may be used when quantization is not best described |

112 | by the floor function, and these results are left to the reader. |

113 | |

114 | Traditionally, because addition of integers is an inexpensive |

115 | machine operation, a parameter $z \in \vworkintset$ may optionally |

116 | be added to the product $hx$ in order to round or otherwise |

117 | shift the result. |

118 | |

119 | If $x$ is assumed to be without error, the ideal function is |

120 | given by (\ref{eq:crat0:srla0:smfu0:01}), whereas the function |

121 | that can be economically implemented is |

122 | |

123 | \begin{equation} |

124 | \label{eq:crat0:srla0:smfu0:02} |

125 | g(x) = \left\lfloor \frac{hx + z}{k} \right\rfloor |

126 | = |

127 | \left\lfloor r_A x + \frac{z}{k} \right\rfloor . |

128 | \end{equation} |

129 | |

130 | If, on the other hand, $x$ may be already quantized, |

131 | the function that can actually be implemented is |

132 | |

133 | \begin{equation} |

134 | \label{eq:crat0:srla0:smfu0:03} |

135 | h(x) = \left\lfloor \frac{h \lfloor x \rfloor + z}{k} \right\rfloor |

136 | = |

137 | \left\lfloor r_A \lfloor x \rfloor + \frac{z}{k} \right\rfloor . |

138 | \end{equation} |

139 | |

140 | |

141 | |

142 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

143 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

144 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

145 | \section[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation] |

146 | {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation} |

147 | %Section tag: HQQ0 |

148 | \label{crat0:shqq0} |

149 | |

150 | \index{h/2q@$h/2^q$ rational linear approximation} |

151 | \index{rational linear approximation!h/2q@$h/2^q$} |

152 | The algorithms presented in |

153 | Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0} |

154 | will always provide the rational number $h/k$ closest to |

155 | an arbitrary real number $r_I$ subject to the constraints |

156 | $h \leq h_{MAX}$ and $k \leq k_{MAX}$. |

157 | |

158 | However, because shifting in order |

159 | to implement multiplication or division by a power of 2 |

160 | is at least as fast (and often \emph{much} faster) |

161 | on all processors as arbitrary multiplication or division, |

162 | and because not all processors have multiplication and division instructions, |

163 | it is worthwhile to examine choosing $h/k$ so that either $h$ or $k$ are |

164 | powers of 2. |

165 | |

166 | There are thus three rational linear approximation techniques to be |

167 | examined: |

168 | |

169 | \begin{enumerate} |

170 | \item \emph{$h/k$ rational linear approximation}, in which an arbitrary |

171 | $h \leq h_{MAX}$ and an arbitrary $k \leq k_{MAX}$ are used, |

172 | with $r_A = h/k$. $h$ and $k$ can be chosen using the algorithms |

173 | presented in Chapters \cfryzeroxrefhyphen\ref{cfry0} and \ccfrzeroxrefhyphen\ref{ccfr0}. |

174 | Implementation of this technique would most often involve a single integer |

175 | multiplication instruction to form the product $hx$, followed by an optional single |

176 | addition instruction to form the sum $hx+z$, and then |

177 | followed by by a single division instruction |

178 | to form the quotient $\lfloor (hx+z)/k \rfloor$. Implementation may also less commonly involve |

179 | multiplication, addition, and division of operands too large to be processed |

180 | with single machine instructions. |

181 | \item \emph{$h/2^q$ rational linear approximation}, in which an arbitrary |

182 | $h \leq h_{MAX}$ and an integral power of two $k=2^q$ are used, with |

183 | $r_A = h/2^q$. |

184 | Implementation of this technique would most often involve a single integer |

185 | multiplication instruction to form the product $hx$, followed by an optional single |

186 | addition instruction to form the sum $hx+z$, and then |

187 | followed by right shift instruction(s) |

188 | to form the quotient $\lfloor (hx+z)/2^q \rfloor$. Implementation may also less commonly involve |

189 | multiplication, addition, and right shift of operands too large to be processed |

190 | with single machine instructions. |

191 | \item \emph{$2^q/k$ rational linear approximation}, in which an integral |

192 | power of two $h=2^q$ and an arbitrary $k \leq k_{MAX}$ are used, with |

193 | $r_A = 2^q/k$. |

194 | Implementation of this technique would most often involve left shift |

195 | instruction(s) to form the product $2^qx$, followed by an optional single |

196 | addition instruction to form the sum $2^qx+z$, and then |

197 | followed by a single division instruction to form |

198 | the quotient $\lfloor (2^qx+z)/k \rfloor$. Implementation may also less |

199 | commonly involve |

200 | left shift, addition, and division of operands too large to be processed |

201 | with single machine instructions. |

202 | \end{enumerate} |

203 | |

204 | We use the nomenclature ``\emph{$h/k$ rational linear approximation}'', |

205 | ``\emph{$h/2^q$ rational linear approximation}'', and |

206 | ``\emph{$2^q/k$ rational linear approximation}'' to identify the three |

207 | techniques enumerated above. |

208 | |

209 | |

210 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

211 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

212 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

213 | \subsection{Integer Arithmetic and Processor Instruction Set Characteristics} |

214 | %Subsection tag: pis0 |

215 | \label{crat0:shqq0:pis0} |

216 | |

217 | The following observations about integer arithmetic and about processors |

218 | used in embedded control can be made: |

219 | |

220 | \begin{enumerate} |

221 | \item \label{enum:crat0:shqq0:pis0:01:01a} |

222 | \emph{Shifting is the fastest method of integer multiplication or division |

223 | (by $2^q$ only), |

224 | followed by utilization of the processor multiplication or division instructions (for arbitrary |

225 | operands), |

226 | followed by software implementation of multiplication or division (for arbitrary operands).} |

227 | Relative costs vary depending on the processor, but the monotonic |

228 | ordering always holds. |

229 | $h/2^q$ and $2^q/k$ rational linear |

230 | approximation are thus worthy of investigation. (Note also that in many practical |

231 | applications of $h/2^q$ and $2^q/k$ rational linear approximation, |

232 | the required shift is performed by |

233 | addressing the operand with an offset, |

234 | and so has no cost.) |

235 | \item \label{enum:crat0:shqq0:pis0:01:01b} |

236 | \emph{Shifting is $O(N)$ (where $N$ is the number of bits in the argument), |

237 | but both |

238 | multiplication and division are $O(N^2)$ for |

239 | practical\footnote{\index{Karatsuba multiplication}Karatsuba |

240 | multiplication, for example, is |

241 | $O(N^{\log_2 3}) \approx O(N^{1.58}) \ll O(N^2)$. However, Karatsuba |

242 | multiplication cannot be applied economically to the small |

243 | operands that typically occur in embedded control work. It would |

244 | be rare in embedded control applications |

245 | for the length of a multiplication operand to exceed four |

246 | times the length that is accommodated by a machine instruction; and this |

247 | is far below the threshold at which Karatsuba multiplication is |

248 | economical. Thus, for all intents and purposes in embedded control work, |

249 | multiplication is $O(N^2)$.} operands (where |

250 | $N$ is the number of bits in each |

251 | operand).} It follows that $2^q/k$ and $h/2^q$ rational |

252 | linear approximation |

253 | will scale to large operands better than $h/k$ rational linear approximation. |

254 | \item \label{enum:crat0:shqq0:pis0:01:02a} |

255 | \emph{Integer division instructions take as long or longer than |

256 | integer multiplication instructions.} In designing digital logic |

257 | to implement basic integer arithmetic, division is the operation most difficult |

258 | to perform economically.\footnote{For some processors, the penalty is extreme. |

259 | For example, on the NEC V850 (a RISC processor), |

260 | a division requires 36 clock cycles, |

261 | whereas multiplication, addition, and subtraction each effectively |

262 | require 1 clock cycle.} |

263 | It follows that multiplication using operands that exceed the machine's word size |

264 | is often far less expensive than division using operands that exceed the |

265 | machine's word size. |

266 | \item \label{enum:crat0:shqq0:pis0:01:03a} |

267 | \emph{All processors that have an integer division instruction also |

268 | have an integer multiplication instruction.} |

269 | Phrased |

270 | differently, no processor has an integer division instruction but no |

271 | integer multiplication instruction. |

272 | \end{enumerate} |

273 | |

274 | Enumerated items |

275 | (\ref{enum:crat0:shqq0:pis0:01:01a}) through |

276 | (\ref{enum:crat0:shqq0:pis0:01:03a}) above lead to the following conclusions. |

277 | |

278 | \begin{enumerate} |

279 | \item $h/2^q$ rational linear approximation is likely to be implementable |

280 | more efficiently on most processors than $h/k$ rational linear approximation. |

281 | (\emph{Rationale:} shift instruction(s) or accessing a |

282 | memory address with an offset |

283 | is |

284 | likely to be more economical than division, particularly if $k$ would exceed |

285 | the native |

286 | operand size of the processor.) |

287 | \item $h/2^q$ rational linear approximation is likely to be a more useful |

288 | technique than $2^q/k$ rational linear approximation. |

289 | (\emph{Rationale:} the generally high cost of division compared to |

290 | multiplication, and the existence of processors that possess a multiplication |

291 | instruction but no division instruction.) |

292 | \end{enumerate} |

293 | |

294 | |

295 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

296 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

297 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

298 | \subsection[Design Procedure For \protect\mbox{\protect$h/2^q$} Rational Linear Approximations] |

299 | {Design Procedure For \protect\mbox{\protect\boldmath$h/2^q$} Rational Linear Approximation} |

300 | %Subsection tag: dph0 |

301 | \label{crat0:shqq0:dph0} |

302 | |

303 | An $h/2^q$ rational linear approximation is parameterized by: |

304 | |

305 | \begin{itemize} |

306 | \item The unsigned or signed nature of $h$ and $x$. (Rational linear approximations |

307 | may involve either signed or unsigned domains and ranges. Furthermore, |

308 | signed integers may be maintained using either 2's-complement |

309 | or sign-magnitude representation, and the processor instruction set |

310 | may or may not directly support signed multiplication.) |

311 | \item $r_I$, the real number we wish to approximate by $r_A = h/2^q$. |

312 | \item $x_{MAX}$, the maximum possible value of the input argument $x$. (Typically, |

313 | software contains a test to clip the output if $x > x_{MAX}$.) |

314 | \item $w_h$, the width in bits allowed for $h$. (Typically, $w_h$ is |

315 | the maximum operand size of a machine multiplication instruction.) |

316 | \item $w_r$, the width in bits allowed for the result $hx$. (Typically, |

317 | $w_r$ is the maximum result size of a machine multiplication instruction.) |

318 | \item The rounding mode when choosing $h$ (and thus effectively $r_A$) |

319 | based on $r_I$. It is common to choose the |

320 | closest value, |

321 | $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$ |

322 | or |

323 | $r_A=\lceil r_I 2^q - 1/2 \rceil/2^q$, |

324 | but other choices are possible. |

325 | \item The rounding mode for the result (i.e. the choice of $z$ in |

326 | Eq. \ref{eq:crat0:sint0:01}). |

327 | \end{itemize} |

328 | |

329 | This section develops a design procedure for $h/2^q$ rational linear |

330 | approximations with the most typical set of assumptions: unsigned arithmetic, |

331 | $r_A=\lfloor r_I 2^q + 1/2 \rfloor/2^q$, |

332 | and $z=0$. Design procedures for other scenarios are presented as exercises. |

333 | |

334 | By definition, $h$ is constrained in two ways: |

335 | |

336 | \begin{equation} |

337 | \label{eq:crat0:shqq0:dph0:00} |

338 | h \leq 2^{w_h} - 1 |

339 | \end{equation} |

340 | |

341 | \noindent{}and |

342 | |

343 | \begin{equation} |

344 | \label{eq:crat0:shqq0:dph0:01} |

345 | h \leq \frac{2^{w_r} - 1}{x_{MAX}} . |

346 | \end{equation} |

347 | |

348 | \noindent{}(\ref{eq:crat0:shqq0:dph0:00}) comes directly from the |

349 | requirement that $h$ fit in $w_h$ bits. |

350 | (\ref{eq:crat0:shqq0:dph0:01}) comes directly from the requirement |

351 | that $hx$ fit in $w_r$ bits. |

352 | (\ref{eq:crat0:shqq0:dph0:00}) and (\ref{eq:crat0:shqq0:dph0:01}) |

353 | may be combined to form one inequality: |

354 | |

355 | \begin{equation} |

356 | \label{eq:crat0:shqq0:dph0:02} |

357 | h \leq \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) . |

358 | \end{equation} |

359 | |

360 | If $q$ is known, the choice of $h$ that will be made so as to minimize |

361 | $|r_A-r_I| = |h/2^q - r_I|$ is |

362 | |

363 | \begin{equation} |

364 | \label{eq:crat0:shqq0:dph0:03} |

365 | h=\left\lfloor r_I 2^q + \frac{1}{2} \right\rfloor . |

366 | \end{equation} |

367 | |

368 | \noindent{}It is required that the choice of $h$ specified by |

369 | (\ref{eq:crat0:shqq0:dph0:03}) meet |

370 | (\ref{eq:crat0:shqq0:dph0:02}). Making the most pessimistic |

371 | assumption about the rounding of $h$ and substituting into |

372 | (\ref{eq:crat0:shqq0:dph0:02}) leads to |

373 | |

374 | \begin{equation} |

375 | \label{eq:crat0:shqq0:dph0:04} |

376 | r_I 2^q + \frac{1}{2} |

377 | \leq |

378 | \min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) . |

379 | \end{equation} |

380 | |

381 | \noindent{}Isolating $q$ in (\ref{eq:crat0:shqq0:dph0:04}) |

382 | yields |

383 | |

384 | \begin{equation} |

385 | \label{eq:crat0:shqq0:dph0:05} |

386 | 2^q |

387 | \leq |

388 | \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}} |

389 | {r_I}. |

390 | \end{equation} |

391 | |

392 | \noindent{}Solving |

393 | (\ref{eq:crat0:shqq0:dph0:05}) |

394 | for maximum value of $q$ that meets the constraint yields |

395 | |

396 | \begin{equation} |

397 | \label{eq:crat0:shqq0:dph0:06} |

398 | q= |

399 | \left\lfloor |

400 | { |

401 | \log_2 |

402 | \left( |

403 | { |

404 | \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I} |

405 | } |

406 | \right) |

407 | } |

408 | \right\rfloor . |

409 | \end{equation} |

410 | |

411 | \noindent{}(\ref{eq:crat0:shqq0:dph0:06}) |

412 | can be rewritten for easier calculation using most calculators (which do |

413 | not allow the direct evaluation of base-2 logarithms): |

414 | |

415 | \begin{equation} |

416 | \label{eq:crat0:shqq0:dph0:07} |

417 | q= |

418 | \left\lfloor |

419 | \frac |

420 | { |

421 | { |

422 | \ln |

423 | \left( |

424 | { |

425 | \frac{\min \left( { 2^{w_h} - 1, \frac{2^{w_r} - 1}{x_{MAX}} } \right ) - \frac{1}{2}}{r_I} |

426 | } |

427 | \right) |

428 | } |

429 | } |

430 | {\ln 2} |

431 | \right\rfloor . |

432 | \end{equation} |

433 | |

434 | \noindent{}Once $q$ is established using (\ref{eq:crat0:shqq0:dph0:07}), |

435 | $h$ can be calculated using (\ref{eq:crat0:shqq0:dph0:03}). |

436 | |

437 | In embedded control work (as well as in operating system internals), |

438 | $h/2^q$ rational linear approximations are often used in conjunction with |

439 | tabulated constants or calibratable parameters |

440 | where each constant or calibratable parameter may vary over a range of |

441 | $[0, r_I]$, and where $r_I$ is the value used in the design procedure |

442 | presented above. In these applications, the values of $h$ are |

443 | tabulated, but $q$ is invariant (usually hard-coded) |

444 | and is chosen at design time based on the upper bound $r_I$ |

445 | of the interval $[0, r_I]$ in which each tabulated constant or calibratable |

446 | parameter will fall. With $q$ fixed, |

447 | $r_A$ can be adjusted in steps of $1/2^q$. |

448 | |

449 | If $r_I$ is invariant, a final design step may be to reduce the rational |

450 | number $h/2^q$ by dividing some or all occurrences of 2 as a factor from both the |

451 | numerator and denominator. With some processors and in some applications, this |

452 | may save execution time by reducing the number of shift instructions that |

453 | must be executed, reducing the execution time of the shift instructions |

454 | that are executed, or allowing shifting via offset addressing. |

455 | For example, on a byte-addressible machine, if the design procedure |

456 | yields $h=608$ and $q=10$, it may be desirable to divide both $h$ and $2^q$ by 4 to |

457 | yield $h=152$ and $q=8$, as this allows the shift by 8 to be done by fetching |

458 | alternate bytes (rather than by actual shifting). In other applications, it may |

459 | be desirable to remove \emph{all} occurrences of 2 as a prime factor |

460 | from $h$. |

461 | |

462 | For an invariant $r_I$, a suitable design procedure is: |

463 | |

464 | \begin{enumerate} |

465 | \item Choose $q$ using (\ref{eq:crat0:shqq0:dph0:07}). |

466 | \item With $q$ fixed, choose $h$ using (\ref{eq:crat0:shqq0:dph0:03}). |

467 | \item If economies can be achieved on the target processor, |

468 | examine the possibility of removing some or all occurrences |

469 | of 2 as a prime factor from $h$ and decreasing $q$. |

470 | \end{enumerate} |

471 | |

472 | For tabulated or calibratable constants in the |

473 | interval $[0,r_I]$, a suitable design procedure is to use the |

474 | procedure presented immediately above but without the third step. |

475 | Each tabulated value of $h$ is chosen using (\ref{eq:crat0:shqq0:dph0:03}). |

476 | |

477 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

478 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

479 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

480 | \subsection[Design Procedure For \protect\mbox{\protect$2^q/k$} Rational Linear Approximations] |

481 | {Design Procedure For \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation} |

482 | %Subsection tag: dpk0 |

483 | \label{crat0:shqq0:dpk0} |

484 | |

485 | TBD. |

486 | |

487 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

488 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

489 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

490 | \section{Piecewise Rational Linear Approximation} |

491 | %Section tag: PWI0 |

492 | \label{crat0:spwi0} |

493 | |

494 | TBD. |

495 | |

496 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

497 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

498 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

499 | \section[Frequency Division And Rational Counting] |

500 | {Frequency Division And Rational Counting Techniques} |

501 | %Section tag: FDV0 |

502 | \label{crat0:sfdv0} |

503 | |

504 | \index{frequency division}\index{rational counting}\index{counting}% |

505 | Often, software must ``divide down'' an execution rate. For example, |

506 | an interrupt service routine may be scheduled by hardware every |

507 | 10ms, but may perform useful processing only every 50ms. This requires |

508 | that the ISR maintain a counter and only perform useful processing |

509 | every fifth invocation. This section deals with counting strategies |

510 | used to achieve invocation frequency division and other similar results. |

511 | |

512 | Frequency division and |

513 | rational counting techniques presented in this section find application |

514 | primarily in the following scenarios: |

515 | |

516 | \begin{itemize} |

517 | \item ISRs and other software components which must divide down |

518 | their invocation rate. |

519 | \item Pulse counting and scaling from encoders and other |

520 | similar systems. |

521 | \item The correction of inaccuracies in timebases (such as crystals |

522 | which oscillate at a frequency different than the |

523 | nominal rate). |

524 | \end{itemize} |

525 | |

526 | Because the techniques presented must be usable with inexpensive |

527 | microcontrollers, such techniques must meet these constraints: |

528 | |

529 | \begin{enumerate} |

530 | \item \label{enum:01:crat0:sfdv0:econex} |

531 | The counting techniques must be economical to execute on |

532 | an inexpensive microcontroller. |

533 | \item \label{enum:01:crat0:sfdv0:econcccalc} |

534 | An inexpensive microcontroller must be capable of calculating any |

535 | constants used as limits in counting (i.e. it cannot necessarily |

536 | be assumed that a more powerful computer calculates these constants, |

537 | and it cannot be assumed that these limits do not change on the fly). |

538 | \end{enumerate} |

539 | |

540 | In this section, we analyze the behavior of several types of |

541 | rational counting algorithms, supplied as Algorithms |

542 | \ref{alg:crat0:sfdv0:01a} |

543 | through |

544 | \ref{alg:crat0:sfdv0:02a}. |

545 | |

546 | \begin{algorithm} |

547 | \begin{verbatim} |

548 | /* The constants K1 through K4, which parameterize the */ |

549 | /* counting behavior, are assumed assigned elsewhere in */ |

550 | /* the code. The solution is analyzed in terms of the */ |

551 | /* parameters K1 through K4. */ |

552 | /* */ |

553 | /* We also place the following restrictions on K1 through */ |

554 | /* K4: */ |

555 | /* K1 : K1 <= K3 - K2. */ |

556 | /* K2 : K4 > K2 > 0. */ |

557 | /* K3 : No restrictions. */ |

558 | /* K4 : K4 > K2 > 0. */ |

559 | |

560 | void base_rate_sub(void) |

561 | { |

562 | static int state = K1; |

563 | |

564 | state += K2; |

565 | |

566 | if (state >= K3) |

567 | { |

568 | state -= K4; |

569 | A(); |

570 | } |

571 | } |

572 | \end{verbatim} |

573 | \caption{Rational Counting Algorithm For $K_2/K_4 < 1$} |

574 | \label{alg:crat0:sfdv0:01a} |

575 | \end{algorithm} |

576 | |

577 | \begin{algorithm} |

578 | \begin{verbatim} |

579 | /* The constants K1 through K4, which parameterize the */ |

580 | /* counting behavior, are assumed assigned elsewhere in */ |

581 | /* the code. The solution is analyzed in terms of the */ |

582 | /* parameters K1 through K4. */ |

583 | /* */ |

584 | /* We also place the following restrictions on K1 through */ |

585 | /* K4: */ |

586 | /* K1 : K1 <= K3 - K2. */ |

587 | /* K2 : K2 > 0. */ |

588 | /* K3 : No restrictions. */ |

589 | /* K4 : K4 > 0. */ |

590 | |

591 | void base_rate_sub(void) |

592 | { |

593 | static int state = K1; |

594 | |

595 | state += K2; |

596 | |

597 | while (state >= K3) |

598 | { |

599 | state -= K4; |

600 | A(); |

601 | } |

602 | } |

603 | \end{verbatim} |

604 | \caption{Rational Counting Algorithm For $K_2/K_4 \geq 1$} |

605 | \label{alg:crat0:sfdv0:01b} |

606 | \end{algorithm} |

607 | |

608 | \begin{algorithm} |

609 | \begin{verbatim} |

610 | /* The constants K1, K2, and K4, which parameterize the */ |

611 | /* counting behavior, are assumed assigned elsewhere in */ |

612 | /* the code. The solution is analyzed in terms of the */ |

613 | /* parameters K1 through K4. */ |

614 | /* */ |

615 | /* We also place the following restrictions on K1, K2, */ |

616 | /* and K4: */ |

617 | /* K1 : K1 >= 0. */ |

618 | /* K2 : K4 > K2 > 0. */ |

619 | /* K4 : K4 > K2 > 0. */ |

620 | /* */ |

621 | /* Special thanks to Chuck B. Falconer (of the */ |

622 | /* comp.arch.embedded newsgroup) for this rational */ |

623 | /* counting algorithm. */ |

624 | /* */ |

625 | /* Note below that the test against K3 does not exist, */ |

626 | /* instead a test against zero is used, which many */ |

627 | /* machine instruction sets will do as part of the */ |

628 | /* subtraction (but perhaps this needs to be coded in */ |

629 | /* A/L). This saves machine code and also eliminates */ |

630 | /* one unnecessary degree of freedom (K3). */ |

631 | |

632 | void base_rate_sub(void) |

633 | { |

634 | static int state = K1; |

635 | |

636 | if ((state -= K2) < 0) |

637 | { |

638 | state += K4; |

639 | A(); |

640 | } |

641 | } |

642 | \end{verbatim} |

643 | \caption{Zero-Test Rational Counting Algorithm For $K_2/K_4 < 1$} |

644 | \label{alg:crat0:sfdv0:01c} |

645 | \end{algorithm} |

646 | |

647 | \begin{algorithm} |

648 | \begin{verbatim} |

649 | ;Special thanks to John Larkin (of the comp.arch.embedded |

650 | ;newsgroup) for this rational counting algorithm. |

651 | ; |

652 | ;This is the TMS-370C8 assembly-language version of the |

653 | ;algorithm. The algorithm is parameterized solely by |

654 | ;K1 and K2, with no restrictions on their values, because |

655 | ;the values are naturally constrained by the data types. |

656 | ;K1, which is the initial value of "state", is assumed |

657 | ;assigned elsewhere. The snippet shown here uses only |

658 | ;K2. |

659 | MOV state, A ;Get "state". |

660 | ADD #K2, A ;Increase by K2. Carry flag |

661 | ;will be set if rollover to or |

662 | ;past zero. |

663 | PUSH ST ;Save carry flag. |

664 | MOV A, state ;Move new value back. |

665 | POP ST ;Restore carry flag. |

666 | JNC done ;If didn't roll, don't run sub. |

667 | CALL A_SUBROUTINE ;Run sub. |

668 | done: |

669 | |

670 | /* This is the 'C' version of the algorithm. It is not */ |

671 | /* as easy or efficient in 'C' to detect rollover. */ |

672 | |

673 | void base_rate_sub(void) |

674 | { |

675 | static unsigned int state = K1; |

676 | unsigned int old_state; |

677 | |

678 | old_state = state; |

679 | state += K2; |

680 | if (state < old_state) |

681 | { |

682 | A(); |

683 | } |

684 | } |

685 | \end{verbatim} |

686 | \caption{$2^q$ Rollover Rational Counting Algorithm} |

687 | \label{alg:crat0:sfdv0:01d} |

688 | \end{algorithm} |

689 | |

690 | \begin{algorithm} |

691 | \begin{verbatim} |

692 | /* The constants K1 through K4, which parameterize the */ |

693 | /* counting behavior, are assumed assigned elsewhere in */ |

694 | /* the code. The solution is analyzed in terms of the */ |

695 | /* parameters K1 through K4. */ |

696 | /* */ |

697 | /* We also place the following restrictions on K1 through */ |

698 | /* K4: */ |

699 | /* K1 : K1 <= K3. */ |

700 | /* K2 : K2 > 0. */ |

701 | /* K3 : No restrictions. */ |

702 | /* K4 : K4 > 0. */ |

703 | |

704 | void base_rate_sub(void) |

705 | { |

706 | static unsigned int state = K1; |

707 | |

708 | if (state >= K3) |

709 | { |

710 | state -= K4; |

711 | A(); |

712 | } |

713 | else |

714 | { |

715 | state += K2; |

716 | B(); |

717 | } |

718 | } |

719 | \end{verbatim} |

720 | \caption{Rational Counting Algorithm With \texttt{else} Clause} |

721 | \label{alg:crat0:sfdv0:02a} |

722 | \end{algorithm} |

723 | |

724 | |

725 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

726 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

727 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

728 | \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01a}] |

729 | {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01a}} |

730 | %Section tag: PRC0 |

731 | \label{crat0:sfdv0:sprc0} |

732 | |

733 | Algorithm \ref{alg:crat0:sfdv0:01a} |

734 | is used frequently in microcontroller |

735 | software. A base rate subroutine\footnote{For brevity, we usually |

736 | call this just the \emph{base subroutine}.} (named ``\texttt{base\_rate\_sub()}'' |

737 | in the algorithm) is called at a periodic rate, and subroutine |

738 | ``\texttt{A()}'' is called at a lesser rate. |

739 | We are interested in determining the relationships between the rates |

740 | as a function of $K_1$, $K_2$, $K_3$, and $K_4$; and we are interested |

741 | in developing other properties. |

742 | |

743 | Notationally when analyzing rational counting algorithms, we agree |

744 | that $state_n$ denotes the value of the \texttt{state} variable |

745 | after the $n$th invocation and before the $n+1$'th invocation |

746 | of the base rate subroutine. |

747 | Using this convention with Algorithm \ref{alg:crat0:sfdv0:01a}, |

748 | $state_0 = K_1$.\footnote{Algorithm \ref{alg:crat0:sfdv0:01a} |

749 | requires a knowledge of |

750 | `C' to fully understand. The \texttt{static} keyword ensures that the |

751 | variable \texttt{state} is initialized only once, at the time the program |

752 | is loaded. \texttt{state} is \emph{not} initialized each time the |

753 | base subroutine runs.} |

754 | |

755 | We can first easily derive the number of initial invocations of |

756 | the base subroutine before ``\texttt{A()}'' is called for the first |

757 | time. |

758 | |

759 | \begin{vworklemmastatement} |

760 | \label{lem:crat0:sfdv0:sprc0:01} |

761 | $N_{STARTUP}$, the number of invocations of the base subroutine |

762 | in Algorithm \ref{alg:crat0:sfdv0:01a} before ``\texttt{A()}'' is called |

763 | for the first time, is given by |

764 | |

765 | \begin{equation} |

766 | \label{eq:lem:crat0:sfdv0:sprc0:01:01} |

767 | N_{STARTUP} = |

768 | \left\lceil |

769 | { |

770 | \frac{-K_1 - K_2 + K_3}{K_2} |

771 | } |

772 | \right\rceil . |

773 | \end{equation} |

774 | \end{vworklemmastatement} |

775 | \begin{vworklemmaproof} |

776 | The value of \texttt{state} after the $n$th invocation |

777 | is $state_n = K_1 + n K_2$. In order for the test in the |

778 | \texttt{if()} statement not to be met, we require that |

779 | |

780 | \begin{equation} |

781 | \label{eq:lem:crat0:sfdv0:sprc0:01:02} |

782 | K_1 + n K_2 < K_3 |

783 | \end{equation} |

784 | |

785 | \noindent{}or equivalently that |

786 | |

787 | \begin{equation} |

788 | \label{eq:lem:crat0:sfdv0:sprc0:01:03} |

789 | n < \frac{K_3 - K_1}{K_2} . |

790 | \end{equation} |

791 | |

792 | Solving (\ref{eq:lem:crat0:sfdv0:sprc0:01:03}) for the largest |

793 | value of $n \in \vworkintset$ which still meets the criterion |

794 | yields (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}). Note that |

795 | the derivation of (\ref{eq:lem:crat0:sfdv0:sprc0:01:01}) requires |

796 | that the restrictions on $K_1$ through $K_4$ documented in |

797 | Algorithm \ref{alg:crat0:sfdv0:01a} be met. |

798 | \end{vworklemmaproof} |

799 | \begin{vworklemmaparsection}{Remarks} |

800 | Note that if one chooses $K_1 > K_3 - K_2$ (in contradiction to the |

801 | restrictions in Algorithm \ref{alg:crat0:sfdv0:01a}), it is possible |

802 | to devise a counting scheme (and results analogous to this lemma) where |

803 | ``\texttt{A()}'' is run a number of times before it is |

804 | \emph{not} run for the first time. The construction of an analogous |

805 | lemma is the topic of Exercise \ref{exe:crat0:sexe0:01}. |

806 | \end{vworklemmaparsection} |

807 | |

808 | \begin{vworklemmastatement} |

809 | \label{lem:crat0:sfdv0:sprc0:02} |

810 | Let $N_I$ be the number of times the Algorithm |

811 | \ref{alg:crat0:sfdv0:01a} base subroutine |

812 | is called, let $N_O$ be the number of times the |

813 | ``\texttt{A()}'' subroutine is called, let |

814 | $f_I$ be the frequency of invocation of the |

815 | Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let |

816 | $f_O$ be the frequency of invocation of |

817 | ``\texttt{A()}''. Provided the constraints |

818 | on $K_1$ through $K_4$ documented in |

819 | Algorithm \ref{alg:crat0:sfdv0:01a} are met, |

820 | |

821 | \begin{equation} |

822 | \label{eq:lem:crat0:sfdv0:sprc0:02:01} |

823 | \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} |

824 | = |

825 | \frac{f_O}{f_I} |

826 | = |

827 | \frac{K_2}{K_4} . |

828 | \end{equation} |

829 | \end{vworklemmastatement} |

830 | \begin{vworklemmaproof} |

831 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) indicates that once |

832 | the initial delay (determined by $K_1$ and $K_3$) has finished, |

833 | $N_O/N_I$ will converge on a steady-state value of |

834 | $K_2/K_4$. |

835 | |

836 | Assume that $K_1=0$ and $K_3=K_4$. The |

837 | conditional subtraction then calculates |

838 | $state \bmod K_4$. After the $n$th |

839 | invocation of the base subroutine, the value |

840 | of \texttt{state} will be |

841 | |

842 | \begin{equation} |

843 | \label{eq:lem:crat0:sfdv0:sprc0:02:02} |

844 | state_n|_{K_1=0, K_3=K_4} = n K_2 \bmod K_4 . |

845 | \end{equation} |

846 | |

847 | Assume that for two distinct values of |

848 | $n \in \vworkintsetnonneg$, $n_1$ and $n_2$, |

849 | the value of the \texttt{state} variable is the same: |

850 | |

851 | \begin{equation} |

852 | \label{eq:lem:crat0:sfdv0:sprc0:02:03} |

853 | n_1 K_2 \bmod K_4 = n_2 K_2 \bmod K_4. |

854 | \end{equation} |

855 | |

856 | Then |

857 | |

858 | \begin{equation} |

859 | \label{eq:lem:crat0:sfdv0:sprc0:02:04} |

860 | (n_2 - n_1) K_2 = i K_4, \; \exists i \in \vworkintsetpos . |

861 | \end{equation} |

862 | |

863 | However, we have no knowledge of whether $K_2$ and $K_4$ are |

864 | coprime (they are not required to be). We may rewrite |

865 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:04}) equivalently as |

866 | |

867 | \begin{equation} |

868 | \label{eq:lem:crat0:sfdv0:sprc0:02:05} |

869 | (n_2 - n_1) \frac{K_2}{\gcd(K_2, K_4)} = i \frac{K_4}{\gcd(K_2, K_4)}, |

870 | \; \exists i \in \vworkintsetpos |

871 | \end{equation} |

872 | |

873 | where of course by definition |

874 | |

875 | \begin{equation} |

876 | \label{eq:lem:crat0:sfdv0:sprc0:02:06} |

877 | \gcd \left( { \frac{K_2}{\gcd(K_2, K_4)}, \frac{K_4}{\gcd(K_2, K_4)} } \right) = 1. |

878 | \end{equation} |

879 | |

880 | In order to satisfy (\ref{eq:lem:crat0:sfdv0:sprc0:02:05}), |

881 | $n_2 - n_1$ must contain all of the prime factors of |

882 | $K_4/\gcd(K_2,K_4)$ in at least the same multiplicities, |

883 | and it follows that the set of values |

884 | of $n_2-n_1$ that satisfies |

885 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:03}) is |

886 | precisely the set of multiples of $K_4/\gcd(K_2,K_4)$: |

887 | |

888 | \begin{equation} |

889 | \label{eq:lem:crat0:sfdv0:sprc0:02:07} |

890 | n_2 - n_1 = j \frac{K_4}{\gcd(K_2, K_4)}, \; \exists j \in \vworkintsetpos . |

891 | \end{equation} |

892 | |

893 | Examining (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}), it can |

894 | also be seen that |

895 | |

896 | \begin{equation} |

897 | \label{eq:lem:crat0:sfdv0:sprc0:02:08} |

898 | \gcd(K_2, K_4) \vworkdivides (n K_2 \bmod K_4), |

899 | \end{equation} |

900 | |

901 | and so |

902 | |

903 | \begin{eqnarray} |

904 | \label{eq:lem:crat0:sfdv0:sprc0:02:09} |

905 | & n K_2 \bmod K_4 \in & \\ |

906 | \nonumber |

907 | & \{ 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , K_4 - \gcd(K_2, K_4) \} , & |

908 | \end{eqnarray} |

909 | |

910 | a set which contains exactly $K_4/\gcd(K_2, K_4)$ elements. |

911 | |

912 | Thus we've established by the pigeonhole principle |

913 | that the sequence of the |

914 | values of the variable \texttt{state} |

915 | specified by (\ref{eq:lem:crat0:sfdv0:sprc0:02:02}) |

916 | repeats perfectly with periodicity $K_4/\gcd(K_2, K_4)$, |

917 | and we've established that in one period, every element of the set |

918 | specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) appears exactly |

919 | once. (However, we have not specified the order in which the |

920 | elements appear, but this is not important for this lemma. In general |

921 | the elements appear out of the order shown in |

922 | Eq. \ref{eq:lem:crat0:sfdv0:sprc0:02:09}.) |

923 | |

924 | To establish the frequency with which the test against |

925 | $K_4$ is met, note that if $state_n + K_2 \geq K_4$, then |

926 | |

927 | \begin{eqnarray} |

928 | \label{eq:lem:crat0:sfdv0:sprc0:02:10} |

929 | & \displaystyle{state_n \in \left\{ \frac{K_4-K_2}{\gcd(K_2,K_4)} \gcd(K_2, K_4), \right.} & \\ |

930 | \nonumber & \displaystyle{\left. \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4), \ldots , |

931 | K_4 - \gcd(K_2, K_4)\right\} ,} & |

932 | \end{eqnarray} |

933 | |

934 | which has a cardinality $K_2/K_4$ that of the set in |

935 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Since the |

936 | \texttt{state} variable cycles through the set in |

937 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) with perfect periodicity and since |

938 | $K_2/K_4$ of the set elements lead to the \texttt{if()} statement |

939 | test being |

940 | met, (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) is also met as |

941 | $N_I\rightarrow\infty$. |

942 | |

943 | Note that if $K_1 \neq 0$, it simply changes the startup |

944 | behavior of the rational counting. So long as $K_2 < K_4$, |

945 | Algorithm \ref{alg:crat0:sfdv0:01a} will reach a steady state where |

946 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) holds. |

947 | Note that if $K_3 \neq K_4$, it simply ``shifts'' the sets |

948 | specified in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) |

949 | and (\ref{eq:lem:crat0:sfdv0:sprc0:02:10}), but |

950 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}) still holds. |

951 | The lemma has thus been proved |

952 | for every case. (We have neglected to give |

953 | the formal proof as required by the definition of a limit that |

954 | for any arbitrarily small error $\epsilon$, a |

955 | finite $N_I$ can be found so that |

956 | the error is at or below $\epsilon$; however the skeptical reader |

957 | is encouraged to complete Exercise \ref{exe:crat0:sexe0:02}.) |

958 | \end{vworklemmaproof} |

959 | \begin{vworklemmaparsection}{Remarks} |

960 | It is possible to view the long-term accuracy of |

961 | Algorithm \ref{alg:crat0:sfdv0:01a} in terms of a limit, as is done in |

962 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:01}). However, it is also |

963 | possible to observe that $K_1$ and $K_3$ set a delay until |

964 | the counting algorithm reaches steady state. |

965 | With $K_3=K_4$, the attainment of |

966 | steady state is characterized by the \texttt{state} variable |

967 | being assigned for the first time to one of the values in |

968 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Once in steady state, |

969 | the algorithm cycles with perfect periodic behavior through all of the |

970 | $K_4/\gcd(K_2,K_4)$ elements in |

971 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}), but not necessarily in |

972 | the order shown in the equation. |

973 | During this period of length $K_4/\gcd(K_2,K_4)$, |

974 | exactly $K_2/\gcd(K_2,K_4)$ invocations of the base |

975 | subroutine result in |

976 | subroutine ``\texttt{A()}'' being run, and exactly |

977 | $(K_4-K_2)/\gcd(K_2,K_4)$ do not. Thus, after reaching steady-state the |

978 | algorithm has \emph{perfect} accuracy if one considers periods of |

979 | length $K_4/\gcd(K_2,K_4)$. |

980 | \end{vworklemmaparsection} |

981 | %\vworklemmafooter{} |

982 | |

983 | \begin{vworklemmastatement} |

984 | \label{lem:crat0:sfdv0:sprc0:04} |

985 | If $K_3=K_4$, $K_1=0$, and |

986 | $\gcd(K_2, K_4)=1$\footnote{\label{footnote:lem:crat0:sfdv0:sprc0:04:01}If |

987 | $\gcd(K_2, K_4) > 1$, then by Theorem |

988 | \cprizeroxrefhyphen\ref{thm:cpri0:ppn0:00a} the largest |

989 | value that $n K_2 \bmod K_4$ can attain is |

990 | $K_4-\gcd(K_2, K_4)$ and the interval in |

991 | (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is correspondingly |

992 | smaller. (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}) is |

993 | technically correct but not as conservative as possible. |

994 | This is a minor point and we do not dwell on it.}, the error between |

995 | the approximation to $N_O$ implemented by |

996 | Algorithm \ref{alg:crat0:sfdv0:01a} and the ``ideal'' mapping is always |

997 | in the set |

998 | |

999 | \begin{equation} |

1000 | \label{eq:lem:crat0:sfdv0:sprc0:04:01} |

1001 | \left[ - \frac{K_4 - 1}{K_4} , 0 \right] , |

1002 | \end{equation} |

1003 | |

1004 | and no algorithm can be constructed to |

1005 | confine the error to a smaller interval. |

1006 | \end{vworklemmastatement} |

1007 | \begin{vworklemmaproof} |

1008 | With $K_1=0$ and $K_3 = K_4$, it can be verified analytically that |

1009 | the total number of times the function ``\texttt{A()}'' has been |

1010 | invoked up to and including the $n$th invocation of the base subroutine |

1011 | is |

1012 | |

1013 | \begin{equation} |

1014 | \label{eq:lem:crat0:sfdv0:sprc0:04:02} |

1015 | N_O = \left\lfloor \frac{n K_2}{K_4} \right\rfloor . |

1016 | \end{equation} |

1017 | |

1018 | On the other hand, the ``ideal'' number of invocations, which |

1019 | we denote $\overline{N_O}$, is given by |

1020 | |

1021 | \begin{equation} |

1022 | \label{eq:lem:crat0:sfdv0:sprc0:04:03} |

1023 | \overline{N_O} = \frac{n K_2}{K_4} . |

1024 | \end{equation} |

1025 | |

1026 | Quantization of the rational number in (\ref{eq:lem:crat0:sfdv0:sprc0:04:02}) |

1027 | can introduce an error of up to $-(K_4-1)/K_4$, therefore |

1028 | |

1029 | \begin{equation} |

1030 | \label{eq:lem:crat0:sfdv0:sprc0:04:04} |

1031 | N_O - \overline{N_O} = |

1032 | \left\lfloor \frac{n K_2}{K_4} \right\rfloor - \frac{n K_2}{K_4} |

1033 | \in \left[ - \frac{K_4 - 1}{K_4} , 0 \right] . |

1034 | \end{equation} |

1035 | |

1036 | This proves the error bound for Algorithm \ref{alg:crat0:sfdv0:01a}. |

1037 | The proof that there can be no better algorithm is the topic |

1038 | of Exercise \ref{exe:crat0:sexe0:06}. |

1039 | \end{vworklemmaproof} |

1040 | \begin{vworklemmaparsection}{Remarks} |

1041 | Algorithm \ref{alg:crat0:sfdv0:01a} is \emph{optimal} in the |

1042 | sense that no algorithm can achieve a tighter error |

1043 | bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). As |

1044 | demonstrated in Exercises \ref{exe:crat0:sexe0:04} |

1045 | and \ref{exe:crat0:sexe0:05}, $K_1 \neq 0$ can be chosen |

1046 | to shift the interval in (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}), but |

1047 | the span of the interval cannot be reduced. |

1048 | \end{vworklemmaparsection} |

1049 | \vworklemmafooter{} |

1050 | |

1051 | Lemmas \ref{lem:crat0:sfdv0:sprc0:02} |

1052 | and \ref{lem:crat0:sfdv0:sprc0:04} have demonstrated that the ratio of |

1053 | counts $N_O/N_I$ will asymptotically |

1054 | approach $K_2/K_4$ |

1055 | (i.e. the long-term accuracy of Algorithm \ref{alg:crat0:sfdv0:01a} |

1056 | is \emph{perfect}). |

1057 | However, |

1058 | for many applications it is also desirable to have a lack of |

1059 | ``bursty'' behavior. We demonstrate the lack of bursty |

1060 | behavior in the following lemma. |

1061 | |

1062 | \begin{vworklemmastatement} |

1063 | \label{lem:crat0:sfdv0:sprc0:03} |

1064 | For Algorithm \ref{alg:crat0:sfdv0:01a}, once steady |

1065 | state has been achieved, the number of consecutive |

1066 | base subroutine invocations during which subroutine |

1067 | ``\texttt{A()}'' is executed is always in the set |

1068 | |

1069 | \begin{equation} |

1070 | \label{eq:lem:crat0:sfdv0:sprc0:03:01} |

1071 | \left\{ |

1072 | \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor , |

1073 | \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil |

1074 | \right\} \cap \vworkintsetpos, |

1075 | \end{equation} |

1076 | |

1077 | which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$, |

1078 | or two integers otherwise. |

1079 | |

1080 | Once steady state has been achieved, the number of |

1081 | consecutive base function invocations during which |

1082 | subroutine ``\texttt{A()}'' is not executed is |

1083 | always in the set |

1084 | |

1085 | \begin{equation} |

1086 | \label{eq:lem:crat0:sfdv0:sprc0:03:02} |

1087 | \left\{ |

1088 | \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor , |

1089 | \left\lceil \frac{K_4-K_2}{K_2} \right\rceil |

1090 | \right\} \cap \vworkintsetpos, |

1091 | \end{equation} |

1092 | |

1093 | which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$, |

1094 | or two integers otherwise. |

1095 | \end{vworklemmastatement} |

1096 | \begin{vworklemmaproof} |

1097 | As before in Lemma \ref{lem:crat0:sfdv0:sprc0:02} |

1098 | for convenience and without |

1099 | loss of generality, assume $K_3=K_4$ and |

1100 | $K_1=0$. Then after a transient period |

1101 | determined by $K_1$ and $K_3$, the \texttt{state} |

1102 | variable will be assigned one of the values in |

1103 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) and cycle through |

1104 | those values in an unestablished order but with perfect |

1105 | periodicity. To accomplish this proof, we must establish |

1106 | something about the order in which the \texttt{state} variable attains |

1107 | the values in the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). |

1108 | |

1109 | We can partition the set in (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}) |

1110 | into two sets; the set of values of \texttt{state} for which if the |

1111 | base subroutine is invoked with \texttt{state} in this set, subroutine |

1112 | ``\texttt{A()}'' will not be invoked (we call this set $\phi_1$), |

1113 | and the set of values of \texttt{state} for which if the |

1114 | base subroutine is invoked with \texttt{state} in this set, subroutine |

1115 | ``\texttt{A()}'' will be invoked (we call this set $\phi_2$). |

1116 | $\phi_1$ and $\phi_2$ are identified below. |

1117 | |

1118 | \begin{eqnarray} |

1119 | \label{eq:lem:crat0:sfdv0:sprc0:03:03} |

1120 | & \phi_1 = & \\ |

1121 | \nonumber & |

1122 | \displaystyle{\left\{ |

1123 | 0, \gcd(K_2, K_4), 2 \gcd(K_2, K_4), \ldots , |

1124 | \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} - 1 \right) \gcd(K_2, K_4) |

1125 | \right\}} & |

1126 | \end{eqnarray} |

1127 | |

1128 | \begin{eqnarray} |

1129 | \label{eq:lem:crat0:sfdv0:sprc0:03:04} |

1130 | & \displaystyle{ |

1131 | \phi_2 = \left\{\left(\frac{K_4-K_2}{\gcd(K_2,K_4)}\right) \gcd(K_2, K_4),\right.} & \\ |

1132 | \nonumber & \displaystyle{\left. |

1133 | \left(\frac{K_4-K_2}{\gcd(K_2,K_4)} + 1 \right) \gcd(K_2, K_4) , |

1134 | \ldots , |

1135 | K_4 - \gcd(K_2, K_4) |

1136 | \right\}} & |

1137 | \end{eqnarray} |

1138 | |

1139 | We can also make the following four additional useful observations |

1140 | about $\phi_1$ and $\phi_2$. Note that |

1141 | (\ref{eq:lem:crat0:sfdv0:sprc0:03:07}) and |

1142 | (\ref{eq:lem:crat0:sfdv0:sprc0:03:08}) become equality |

1143 | if $\gcd(K_2, K_4) = 1$. |

1144 | |

1145 | \begin{equation} |

1146 | \label{eq:lem:crat0:sfdv0:sprc0:03:05} |

1147 | n(\phi_1) = \frac{K_4 - K_2}{\gcd(K_2, K_4)} |

1148 | \end{equation} |

1149 | |

1150 | \begin{equation} |

1151 | \label{eq:lem:crat0:sfdv0:sprc0:03:06} |

1152 | n(\phi_2) = \frac{K_2}{\gcd(K_2, K_4)} |

1153 | \end{equation} |

1154 | |

1155 | \begin{equation} |

1156 | \label{eq:lem:crat0:sfdv0:sprc0:03:07} |

1157 | \phi_1 \subseteq \{ 0, 1, \ldots , K_4 - K_2 - 1 \} |

1158 | \end{equation} |

1159 | |

1160 | \begin{equation} |

1161 | \label{eq:lem:crat0:sfdv0:sprc0:03:08} |

1162 | \phi_2 \subseteq \{K_4 - K_2, \ldots , K_4 - 1 \} |

1163 | \end{equation} |

1164 | |

1165 | We first prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}). |

1166 | If $state_n \in \phi_2$ at the time the base function |

1167 | is invoked, then |

1168 | ``\texttt{A()}'' will be invoked. We also know that |

1169 | since $state_n \in \phi_2$, $state_n + K_2 \geq K_4$, so |

1170 | |

1171 | \begin{equation} |

1172 | \label{eq:lem:crat0:sfdv0:sprc0:03:09} |

1173 | state_{n+1} \;\; =|_{state_n \in \phi_2} \;\; state_n - (K_4 - K_2) . |

1174 | \end{equation} |

1175 | |

1176 | Thus so long as $state_n \in \phi_2$, $state_{n+1} < state_n$ |

1177 | as specified above in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}). |

1178 | With each invocation of the base subroutine, \texttt{state} will |

1179 | ``walk downward'' through $\phi_2$. It can |

1180 | also be observed that when \texttt{state} drops below the smallest |

1181 | element of $\phi_2$, the next value of \texttt{state} will |

1182 | be in $\phi_1$. |

1183 | |

1184 | Note also that although the downward walk specified in |

1185 | (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}) walks downward in absolute steps |

1186 | of $K_4-K_2$, this corresponds to $(K_4-K_2) / \gcd(K_2, K_4)$ |

1187 | \emph{elements} of $\phi_2$, since the elements of $\phi_2$ are |

1188 | separated by $\gcd(K_2, K_4)$. |

1189 | |

1190 | Given the ``downward walk'' specified in (\ref{eq:lem:crat0:sfdv0:sprc0:03:09}), |

1191 | the only question to be answered is how many consecutive values of |

1192 | \texttt{state}, separated by $K_4-K_2$ (or $(K_4-K_2)/\gcd(K_2, K_4)$ elements), |

1193 | can ``fit'' into |

1194 | $\phi_2$. Considering that $n(\phi_2) = K_2/\gcd(K_2, K_4)$ |

1195 | (Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:06}) and that the |

1196 | downward step represents $(K_4-K_2)/\gcd(K_2, K_4)$ set elements, |

1197 | (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) comes immediately by |

1198 | a graphical argument. |

1199 | |

1200 | We now prove (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}). |

1201 | This can be proved using exactly the same arguments |

1202 | as for (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}), but |

1203 | considering the upward walk through $\phi_1$ rather |

1204 | than the downward walk through $\phi_2$. |

1205 | |

1206 | As with Lemma \ref{lem:crat0:sfdv0:sprc0:02}, |

1207 | note that the choices of $K_1$ and $K_3$ do not |

1208 | materially affect the proof above. $K_1$ and |

1209 | $K_3$ only set a delay until the rational counting |

1210 | algorithm reaches steady state. $K_3$ only shifts |

1211 | the sets $\phi_1$ and $\phi_2$. |

1212 | \end{vworklemmaproof} |

1213 | \begin{vworklemmaparsection}{Remark \#1} |

1214 | This lemma proves an \emph{extremely} important property for the |

1215 | usability of Algorithm \ref{alg:crat0:sfdv0:01a}. It says that once |

1216 | steady state has been reached, the variability in the number of consecutive |

1217 | times ``\texttt{A()}'' is run or not run is at most one count. |

1218 | \end{vworklemmaparsection} |

1219 | \begin{vworklemmaparsection}{Remark \#2} |

1220 | It is probably also possible to construct a rational counting algorithm |

1221 | so that the number of consecutive times ``\texttt{A()}'' is run is constant, |

1222 | but the algorithm achieves long-term accuracy by varying only the number |

1223 | of consecutive times ``\texttt{A()}'' is not run (or vice-versa), but this |

1224 | is not done here. |

1225 | \end{vworklemmaparsection} |

1226 | \begin{vworklemmaparsection}{Remark \#3} |

1227 | There is no requirement that $K_2$ and $K_4$ be coprime. In fact, as |

1228 | demonstrated later, it may be advantageous to choose a large $K_2$ and |

1229 | $K_4$ to approximate a simple ratio so that very fine adjustments can be |

1230 | made. For example, if the ideal ratio is 1/2, it may be desirable |

1231 | in some applications to |

1232 | choose $K_2$=1,000 and $K_4$=2,000 so that fine adjustments can be made |

1233 | by slightly perturbing $K_2$ or $K_4$. One might adjust 1,000/2,000 downward |

1234 | to 999/2,000 or upward to 1,001/2,000 by modifying $K_2$ |

1235 | (both very fine adjustments). |

1236 | \end{vworklemmaparsection} |

1237 | \begin{vworklemmaparsection}{Remark \#4} |

1238 | The most common choice of $K_1$ in practice is 0. If $K_1=0$ is chosen, |

1239 | it can be shown that the number of initial invocations of the |

1240 | base subroutine is in the set identified in |

1241 | (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}). |

1242 | (See Exercise \ref{exe:crat0:sexe0:07}.) |

1243 | \end{vworklemmaparsection} |

1244 | \vworklemmafooter{} |

1245 | |

1246 | For microcontroller work, it is considered |

1247 | a desirable property that software components be resilient |

1248 | to state upset |

1249 | (see Section \chgrzeroxrefhyphen\ref{chgr0:sdda0:srob0}). |

1250 | It can be observed that Algorithm \ref{alg:crat0:sfdv0:01a} will |

1251 | exhibit very anomalous behavior if \texttt{state} is upset to a very negative |

1252 | value. One possible correction to this shortcoming is illustrated |

1253 | in Figure \ref{fig:crat0:sfdv0:sprc0:01}. Other possible |

1254 | corrections are the topic of Exercise \ref{exe:crat0:sexe0:08}. |

1255 | |

1256 | \begin{figure} |

1257 | \begin{verbatim} |

1258 | /* The constants K1 through K4, which parameterize the */ |

1259 | /* counting behavior, are assumed assigned elsewhere in */ |

1260 | /* the code. The solution is analyzed in terms of the */ |

1261 | /* parameters K1 through K4. */ |

1262 | /* */ |

1263 | /* We also place the following restrictions on K1 through */ |

1264 | /* K4: */ |

1265 | /* K1 : K1 <= K3 - K2. */ |

1266 | /* K2 : K4 > K2 > 0. */ |

1267 | /* K3 : No restrictions. */ |

1268 | /* K4 : K4 > K2 > 0. */ |

1269 | |

1270 | void base_rate_func(void) |

1271 | { |

1272 | static int state = K1; |

1273 | |

1274 | state += K2; |

1275 | |

1276 | if ((state < K1) || (state >= K3)) |

1277 | { |

1278 | state -= K4; |

1279 | A(); |

1280 | } |

1281 | } |

1282 | \end{verbatim} |

1283 | \caption{Algorithm \ref{alg:crat0:sfdv0:01a} With State Upset Shortcoming |

1284 | Corrected} |

1285 | \label{fig:crat0:sfdv0:sprc0:01} |

1286 | \end{figure} |

1287 | |

1288 | \begin{vworkexamplestatement} |

1289 | \label{ex:crat0:sfdv0:sprc0:01} |

1290 | Determine the behavior of Algorithm \ref{alg:crat0:sfdv0:01a} with |

1291 | $K_1=0$, $K_2=30$, and $K_3=K_4=50$. |

1292 | \end{vworkexamplestatement} |

1293 | \begin{vworkexampleparsection}{Solution} |

1294 | We first predict the behavior, and then trace the algorithm to |

1295 | verify whether the predictions are accurate. |

1296 | |

1297 | We make the following predictions: |

1298 | |

1299 | \begin{itemize} |

1300 | \item The steady state sequence of invocations of ``\texttt{A()}'' will |

1301 | be periodic with period |

1302 | $K_4/\gcd(K_2, K_4) = 50/10 = 5$, as described |

1303 | in Lemma \ref{lem:crat0:sfdv0:sprc0:02}. |

1304 | \item The number of initial invocations of the |

1305 | base subroutine in which ``\texttt{A()}'' |

1306 | is not run will be |

1307 | $\lceil (K_4 - K_2) / K_2 \rceil = \lceil 2/3 \rceil = 1$, |

1308 | as described in Remark \#4 of |

1309 | Lemma \ref{lem:crat0:sfdv0:sprc0:03} and in the solution to |

1310 | Exercise \ref{exe:crat0:sexe0:07}. |

1311 | \item In steady state, the number of consecutive invocations of the |

1312 | base subroutine during which ``\texttt{A()}'' |

1313 | is not executed will always be 1, as |

1314 | described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:02} of |

1315 | Lemma \ref{lem:crat0:sfdv0:sprc0:03}. |

1316 | (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:02} |

1317 | yields \ |

1318 | $\{ \lfloor 20/30 \rfloor , \lceil 20/30 \rceil \} \cap \vworkintsetpos % |

1319 | = \{ 0,1 \} \cap \{1, 2, \ldots \} = \{ 1 \}$.) |

1320 | \item In steady state, the number of consecutive invocations of the |

1321 | base subroutine during which ``\texttt{A()}'' |

1322 | is executed will always be 1 or 2, as |

1323 | described in Equation \ref{eq:lem:crat0:sfdv0:sprc0:03:01} of |

1324 | Lemma \ref{lem:crat0:sfdv0:sprc0:03}. |

1325 | (Applying Eq. \ref{eq:lem:crat0:sfdv0:sprc0:03:01} |

1326 | yields \ |

1327 | $\{ \lfloor 30/20 \rfloor , \lceil 30/20 \rceil \} \cap \vworkintsetpos % |

1328 | = \{ 1,2 \} \cap \{1, 2, \ldots \} = \{ 1,2 \}$.) |

1329 | \item The rational counting algorithm will have |

1330 | perfect long-term accuracy. |

1331 | \end{itemize} |

1332 | |

1333 | We can verify the predictions above by tracing the behavior of |

1334 | Algorithm \ref{alg:crat0:sfdv0:01a}. We adopt the convention |

1335 | that $A_n = 1$ if subroutine ``\texttt{A()}'' is invoked during |

1336 | the $n$th invocation of the base subroutine. |

1337 | Table \ref{tbl:crat0:sfdv0:sprc0:01} |

1338 | contains the results of tracing Algorithm \ref{alg:crat0:sfdv0:01a} |

1339 | with $K_1=0$, $K_2=30$, and $K_3=K_4=50$. |

1340 | |

1341 | \begin{table} |

1342 | \caption{Trace Of Algorithm \ref{alg:crat0:sfdv0:01a} With |

1343 | $K_1=0$, $K_2=30$, And $K_3=K_4=50$ (Example \ref{ex:crat0:sfdv0:sprc0:01})} |

1344 | \label{tbl:crat0:sfdv0:sprc0:01} |

1345 | \begin{center} |

1346 | \begin{tabular}{|c|c|c|} |

1347 | \hline |

1348 | Index ($n$) & $state_n$ & $A_n$ \\ |

1349 | \hline |

1350 | \hline |

1351 | 0 & 0 & N/A \\ |

1352 | \hline |

1353 | 1 & 30 & 0 \\ |

1354 | \hline |

1355 | 2 & 10 & 1 \\ |

1356 | \hline |

1357 | 3 & 40 & 0 \\ |

1358 | \hline |

1359 | 4 & 20 & 1 \\ |

1360 | \hline |

1361 | 5 & 0 & 1 \\ |

1362 | \hline |

1363 | 6 & 30 & 0 \\ |

1364 | \hline |

1365 | 7 & 10 & 1 \\ |

1366 | \hline |

1367 | 8 & 40 & 0 \\ |

1368 | \hline |

1369 | 9 & 20 & 1 \\ |

1370 | \hline |

1371 | 10 & 0 & 1 \\ |

1372 | \hline |

1373 | \end{tabular} |

1374 | \end{center} |

1375 | \end{table} |

1376 | |

1377 | It can be verfied from the table that all of the |

1378 | predicted properties are exhibited by the |

1379 | algorithm. |

1380 | \end{vworkexampleparsection} |

1381 | \vworkexamplefooter{} |

1382 | |

1383 | A second characteristic of Algorithm \ref{alg:crat0:sfdv0:01a} |

1384 | that should be analyzed carefully is the behavior |

1385 | of the algorithm if parameters $K_2$ and $K_4$ are adjusted |

1386 | ``on the fly''. ``On-the-fly'' adjustment |

1387 | raises the following concerns. We assume for convenience |

1388 | that $K_1=0$ and $K_3=K_4$. |

1389 | |

1390 | \begin{enumerate} |

1391 | \item \label{enum:crat0:sfdv0:sprc0:01:01} |

1392 | \textbf{Critical section protocol:} if the |

1393 | rational counting algorithm is implemented in a process which |

1394 | is asynchronous to the process which desires to change |

1395 | $K_2$ and $K_4$, what precautions must be taken? |

1396 | \item \label{enum:crat0:sfdv0:sprc0:01:02} |

1397 | \textbf{Anomalous behavior:} will the rational |

1398 | counting algorithm behave in a \emph{very} unexpected way |

1399 | if $K_2$ and $K_4$ are changed on the fly? |

1400 | \item \label{enum:crat0:sfdv0:sprc0:01:03} |

1401 | \textbf{Preservation of accuracy:} even if the behavior |

1402 | exhibited is not \emph{extremely} anomalous, how should |

1403 | $K_2$ and $K_4$ be modified on the fly so as to preserve the |

1404 | maximum accuracy? |

1405 | \end{enumerate} |

1406 | |

1407 | \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:02}):} It can be observed |

1408 | with Algorithm \ref{alg:crat0:sfdv0:01a} that neither increasing |

1409 | nor decreasing $K_2$ nor $K_4$ on the fly |

1410 | will lead to \emph{highly} anomalous |

1411 | behavior. Each invocation of the algorithm will map |

1412 | \texttt{state} back into the set identified in |

1413 | (\ref{eq:lem:crat0:sfdv0:sprc0:02:09}). Thus on-the-fly changes |

1414 | to $K_2$ and $K_4$ will establish the rational counting algorithm |

1415 | immediately into steady-state behavior, and the result will not be |

1416 | \emph{highly} anomalous if such on-the-fly changes are not |

1417 | made very often. |

1418 | |

1419 | \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:03}):} It can be deduced |

1420 | from |

1421 | (\ref{eq:lem:crat0:sfdv0:sprc0:04:02}), |

1422 | (\ref{eq:lem:crat0:sfdv0:sprc0:04:03}), and |

1423 | (\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) that the value of the |

1424 | \texttt{state} variable in Algorithm \ref{alg:crat0:sfdv0:01a} |

1425 | satisfies the relationship |

1426 | |

1427 | \begin{equation} |

1428 | \label{eq:crat0:sfdv0:sprc0:01} |

1429 | \overline{N_O} - N_O = \frac{state}{K_4} ; |

1430 | \end{equation} |

1431 | |

1432 | \noindent{}in other words, the \texttt{state} variable |

1433 | contains the remainder of an effective division by $K_4$ |

1434 | and thus maintains the fractional part of $\overline{N_O}$. |

1435 | Altering $K_4$ on the fly to a new value |

1436 | (say, $\overline{K_4}$) may be problematic, because |

1437 | to preserve the current fractional part |

1438 | of $\overline{N_O}$, one must adjust it for |

1439 | the new denominator $\overline{K_4}$. This requires |

1440 | solving the equation |

1441 | |

1442 | \begin{equation} |

1443 | \label{eq:crat0:sfdv0:sprc0:02} |

1444 | \frac{state}{K_4} = \frac{n}{\;\;\overline{K_4}\;\;} |

1445 | \end{equation} |

1446 | |

1447 | \noindent{}for $n$ which must be an integer to avoid |

1448 | loss of information. In general, |

1449 | this would require that $K_4 \vworkdivides \overline{K_4}$, |

1450 | a constraint which would be rarely met. Thus, for high-precision |

1451 | applications where a new rational counting rate should become effective |

1452 | seamlessly, the best strategy would seem to be to modify $K_2$ only. |

1453 | It can be verified that modifying $K_2$ on the fly accomplishes |

1454 | a perfect rate transition. |

1455 | |

1456 | \textbf{(Concern \#\ref{enum:crat0:sfdv0:sprc0:01:01}):} In microcontroller work, |

1457 | ordinal data types often represent machine-native data types. In such cases, |

1458 | it may be possible for one process to set $K_2$ or $K_4$ |

1459 | for another process that is asynchronous with respect to it by relying |

1460 | on the atomicity of machine instructions (i.e. without formal mutual |

1461 | exclusion protocol). However, in other cases where the ordinal data types |

1462 | of $K_2$ or $K_4$ are larger than can be accomodated by |

1463 | a single machine instruction or where $K_2$ and $K_4$ must be modified |

1464 | together atomically, mutual exclusion protocol should be used to |

1465 | prevent anomalous behavior due to race conditions (see |

1466 | Exercise \ref{exe:crat0:sexe0:14}). |

1467 | |

1468 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

1469 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

1470 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

1471 | \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01b}] |

1472 | {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01b}} |

1473 | %Section tag: PRC1 |

1474 | \label{crat0:sfdv0:sprc1} |

1475 | |

1476 | Algorithm \ref{alg:crat0:sfdv0:01a} |

1477 | has the disadvantage that it requires $K_2/K_4 < 1$ (i.e. it can only |

1478 | decrease frequency, but never increase frequency). This deficiency |

1479 | can be corrected by using |

1480 | Algorithm \ref{alg:crat0:sfdv0:01b}. |

1481 | |

1482 | Note that Algorithm \ref{alg:crat0:sfdv0:01b} will properly deal with $K_2$ and |

1483 | $K_4$ chosen such that $0 < K_2/K_4 < \infty$. |

1484 | |

1485 | The most common reason that one may want a counting algorithm |

1486 | that will correctly handle |

1487 | $K_2/K_4 \geq 1$ is to conveniently handle $K_2/K_4 \approx 1$. |

1488 | In practice, $K_2/K_4$ may represent a quantity that is |

1489 | normally very close to |

1490 | 1 but may also be slightly less than or slightly greater than 1. |

1491 | For example, one may use $K_2/K_4 \approx 1$ to correct for a |

1492 | crystal or a resonator which deviates slightly from its nominal |

1493 | frequency. We illustrate this with the following example. |

1494 | |

1495 | \begin{vworkexamplestatement} |

1496 | \label{ex:crat0:sfdv0:sprc1:01} |

1497 | A microcontroller software load keeps time via an interrupt |

1498 | service routine that runs every 1ms, but this frequency may be |

1499 | off by as much as 1 part in 10,000 due to variations in |

1500 | crystal or resonator manufacture. The interrupt service routine |

1501 | updates a counter which represents the number of milliseconds elapsed since |

1502 | the software load was reset. Devise a rational counting strategy |

1503 | based on Algorithm \ref{alg:crat0:sfdv0:01b} |

1504 | which will allow the time accuracy to be trimmed to within |

1505 | one second per year or less by adjusting only $K_4$, and implement the counting strategy |

1506 | in software. |

1507 | \end{vworkexamplestatement} |

1508 | \begin{vworkexampleparsection}{Solution} |

1509 | $K_2/K_4$ will be nominally very close to 1 ($K_2 \approx K_4$). |

1510 | If we assume that each year has 365.2422\footnote{The period of the earth's |

1511 | rotation about the sun is not an integral number of days, which is why the |

1512 | rules for leap years exist. Ironically, the assignment of leap years is itself |

1513 | a problem very similar to the rational counting problems discussed in this chapter.} days |

1514 | ($\approx$ 31,556,926 seconds), then choosing |

1515 | $K_2 \approx K_4 = 31,556,926$ will yield satisfactory results. |

1516 | If we may need to compensate for up to 1 part in 10,000 of crystal or resonator |

1517 | inaccuracy, we may need to adjust $K_2$ as low as 0.9999 $\times$ 31,556,926 $\approx$ |

1518 | 31,553,770 (to compensate for a fast |

1519 | crystal or resonator) or as |

1520 | high as 1.0001 $\times$ 31,556,926 |

1521 | $\approx$ 31,560,082 |

1522 | (to compensate for a slow crystal or resonator). Choosing |

1523 | $K_4 = 31,556,926$ yields the convenient relationship that each |

1524 | count in $K_2$ corresponds to one second per year. |

1525 | |

1526 | \begin{figure} |

1527 | \begin{verbatim} |

1528 | /* The constants K1 through K4, which parameterize the */ |

1529 | /* counting behavior, are assumed assigned elsewhere in */ |

1530 | /* the code. */ |

1531 | /* */ |

1532 | /* The variable time_count below is the number of milli- */ |

1533 | /* seconds since the software was reset. */ |

1534 | int time_count = 0; |

1535 | |

1536 | /* It is assumed that the base rate subroutine below is */ |

1537 | /* called every millisecond (or, at least what should be */ |

1538 | /* every millisecond of the crystal or resonator were */ |

1539 | /* perfect). */ |

1540 | |

1541 | void base_rate_sub(void) |

1542 | { |

1543 | static int state = K1; |

1544 | |

1545 | state += K2; |

1546 | |

1547 | while (state >= K3) |

1548 | { |

1549 | state -= K4; |

1550 | time_count++; |

1551 | } |

1552 | } |

1553 | \end{verbatim} |

1554 | \caption{Algorithm \ref{alg:crat0:sfdv0:01b} Applied To Timekeeping |

1555 | (Example \ref{ex:crat0:sfdv0:sprc1:01})} |

1556 | \label{fig:ex:crat0:sfdv0:sprc1:01:01} |

1557 | \end{figure} |

1558 | |

1559 | Figure \ref{fig:ex:crat0:sfdv0:sprc1:01:01} provides an illustration |

1560 | of Algorithm \ref{alg:crat0:sfdv0:01b} applied in this scenario. |

1561 | We assume that $K_4$ contains the constant value 31,556,926 |

1562 | and that $K_2$ is modified about this value either downwards or upwards |

1563 | to trim the timekeeping. Note that Algorithm \ref{alg:crat0:sfdv0:01b} will correctly |

1564 | handle $K_2 \geq K_4$. |

1565 | |

1566 | Also note in the implementation illustrated in Figure |

1567 | \ref{fig:ex:crat0:sfdv0:sprc1:01:01} that large integers (27 bits or more) |

1568 | are required. (See also Exercise \ref{exe:crat0:sexe0:09}). |

1569 | \end{vworkexampleparsection} |

1570 | \vworkexamplefooter{} |

1571 | |

1572 | It may not be obvious whether Algorithm \ref{alg:crat0:sfdv0:01b} has the |

1573 | same or similar desirable properties as Algorithm \ref{alg:crat0:sfdv0:01a} |

1574 | presented |

1575 | in Lemmas |

1576 | \ref{lem:crat0:sfdv0:sprc0:01}, |

1577 | \ref{lem:crat0:sfdv0:sprc0:02}, |

1578 | \ref{lem:crat0:sfdv0:sprc0:04}, |

1579 | and |

1580 | \ref{lem:crat0:sfdv0:sprc0:03}. |

1581 | Algorithm \ref{alg:crat0:sfdv0:01b} does have these desirable |

1582 | properties, and these properties are presented as |

1583 | Lemmas \ref{lem:crat0:sfdv0:sprc1:01}, |

1584 | \ref{lem:crat0:sfdv0:sprc1:02}, |

1585 | \ref{lem:crat0:sfdv0:sprc1:03}, and |

1586 | \ref{lem:crat0:sfdv0:sprc1:04}. |

1587 | The proofs of these lemmas are identical or very similar to the proofs |

1588 | of Lemmas |

1589 | \ref{lem:crat0:sfdv0:sprc0:01}, |

1590 | \ref{lem:crat0:sfdv0:sprc0:02}, |

1591 | \ref{lem:crat0:sfdv0:sprc0:04}, |

1592 | and |

1593 | \ref{lem:crat0:sfdv0:sprc0:03}; |

1594 | and so these proofs when not identical are presented as exercises. |

1595 | Note that Algorithm \ref{alg:crat0:sfdv0:01b} behaves identically to |

1596 | Algorithm \ref{alg:crat0:sfdv0:01a} when $K_2 < K_4$, and the |

1597 | case of $K_2=K_4$ is trivial, so in general only |

1598 | the behavior when $K_2 > K_4$ remains to be proved. |

1599 | |

1600 | \begin{vworklemmastatement} |

1601 | \label{lem:crat0:sfdv0:sprc1:01} |

1602 | $N_{STARTUP}$, the number of invocations of the base subroutine |

1603 | in Algorithm \ref{alg:crat0:sfdv0:01b} before ``\texttt{A()}'' is called |

1604 | for the first time, is given by |

1605 | |

1606 | \begin{equation} |

1607 | \label{eq:lem:crat0:sfdv0:sprc1:01:01} |

1608 | N_{STARTUP} = |

1609 | \left\lceil |

1610 | { |

1611 | \frac{-K_1 - K_2 + K_3}{K_2} |

1612 | } |

1613 | \right\rceil . |

1614 | \end{equation} |

1615 | \end{vworklemmastatement} |

1616 | \begin{vworklemmaproof} |

1617 | The proof is identical to the proof of Lemma |

1618 | \ref{lem:crat0:sfdv0:sprc0:01}. |

1619 | \end{vworklemmaproof} |

1620 | |

1621 | |

1622 | \begin{vworklemmastatement} |

1623 | \label{lem:crat0:sfdv0:sprc1:02} |

1624 | Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01b} |

1625 | base subroutine |

1626 | is called, let $N_O$ be the number of times the |

1627 | ``\texttt{A()}'' subroutine is called, let |

1628 | $f_I$ be the frequency of invocation of the |

1629 | Algorithm \ref{alg:crat0:sfdv0:01a} base subroutine, and let |

1630 | $f_O$ be the frequency of invocation of |

1631 | ``\texttt{A()}''. |

1632 | |

1633 | \begin{equation} |

1634 | \label{eq:lem:crat0:sfdv0:sprc1:02:01} |

1635 | \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} |

1636 | = |

1637 | \frac{f_O}{f_I} |

1638 | = |

1639 | \frac{K_2}{K_4} . |

1640 | \end{equation} |

1641 | \end{vworklemmastatement} |

1642 | \begin{vworklemmaproof} |

1643 | See Exercise \ref{exe:crat0:sexe0:10}. |

1644 | \end{vworklemmaproof} |

1645 | |

1646 | \begin{vworklemmastatement} |

1647 | \label{lem:crat0:sfdv0:sprc1:03} |

1648 | If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$\footnote{See also |

1649 | footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between |

1650 | the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b} |

1651 | and the ``ideal'' mapping is always |

1652 | in the set |

1653 | |

1654 | \begin{equation} |

1655 | \label{eq:lem:crat0:sfdv0:sprc1:03:01} |

1656 | \left[ - \frac{K_4 - 1}{K_4} , 0 \right] , |

1657 | \end{equation} |

1658 | |

1659 | and no algorithm can be constructed to |

1660 | confine the error to a smaller interval. |

1661 | \end{vworklemmastatement} |

1662 | \begin{vworklemmaproof} |

1663 | The proof is identical to the proof of Lemma \ref{lem:crat0:sfdv0:sprc0:04}. |

1664 | \end{vworklemmaproof} |

1665 | |

1666 | \begin{vworklemmastatement} |

1667 | \label{lem:crat0:sfdv0:sprc1:04} |

1668 | For Algorithm \ref{alg:crat0:sfdv0:01b} |

1669 | with |

1670 | $K_2 \geq K_4$, once steady |

1671 | state has been achieved (see Exercise |

1672 | \ref{exe:crat0:sexe0:13}), each invocation of the |

1673 | base subroutine will result in |

1674 | a number of invocations of |

1675 | ``\texttt{A()}'' which is in the set |

1676 | |

1677 | \begin{equation} |

1678 | \label{eq:lem:crat0:sfdv0:sprc1:04:01} |

1679 | \left\{ |

1680 | \left\lfloor \frac{K_2}{K_4} \right\rfloor , |

1681 | \left\lceil \frac{K_2}{K_4} \right\rceil |

1682 | \right\}, |

1683 | \end{equation} |

1684 | |

1685 | which contains one integer if $K_4 \vworkdivides K_2$, |

1686 | or two integers otherwise. With $K_2 < K_4$, |

1687 | the behavior will be as specified in Lemma |

1688 | \ref{lem:crat0:sfdv0:sprc0:03}. |

1689 | \end{vworklemmastatement} |

1690 | \begin{vworklemmaproof} |

1691 | See Exercise \ref{exe:crat0:sexe0:12}. |

1692 | \end{vworklemmaproof} |

1693 | \begin{vworklemmaparsection}{Remark} |

1694 | Note that Lemma \ref{lem:crat0:sfdv0:sprc0:03} |

1695 | and this lemma specify different aspects of behavior, |

1696 | which is why (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}) |

1697 | and (\ref{eq:lem:crat0:sfdv0:sprc0:03:02}) take |

1698 | different forms than |

1699 | (\ref{eq:lem:crat0:sfdv0:sprc1:04:01}). |

1700 | Lemma \ref{lem:crat0:sfdv0:sprc0:03} specifies the number of consecutive |

1701 | invocations of the base subroutine for which ``\texttt{A()}'' |

1702 | will be run, but with $K_2 \geq K_4$ it does not make sense to |

1703 | specify behavior in this way since ``\texttt{A()}'' will be run |

1704 | on \emph{every} invocation of the base subroutine. This lemma specifies |

1705 | the number of times ``\texttt{A()}'' will be run on a \emph{single} |

1706 | invocation of the base subroutine (which is not meaningful if |

1707 | $K_2 < K_4$ since the result will always be 0 or 1). |

1708 | \end{vworklemmaparsection} |

1709 | %\vworklemmafooter{} |

1710 | |

1711 | |

1712 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

1713 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

1714 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

1715 | \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01c}] |

1716 | {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01c}} |

1717 | %Section tag: PRX0 |

1718 | \label{crat0:sfdv0:sprx0} |

1719 | |

1720 | Algorithm \ref{alg:crat0:sfdv0:01c}\footnote{Algorithm \ref{alg:crat0:sfdv0:01c} |

1721 | was contributed in March, 2003 |

1722 | by Chuck B. Falconer \cite{bibref:i:chuckbfalconer} |

1723 | via the |

1724 | \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded} |

1725 | newsgroup.} |

1726 | is a variant of Algorithm \ref{alg:crat0:sfdv0:01a} |

1727 | which has one fewer |

1728 | degrees of freedom than Algorithms \ref{alg:crat0:sfdv0:01a} |

1729 | and \ref{alg:crat0:sfdv0:01b} and can be implemented |

1730 | more efficiently under most instruction sets. Algorithm \ref{alg:crat0:sfdv0:01c} |

1731 | is superior to Algorithms \ref{alg:crat0:sfdv0:01a} |

1732 | and \ref{alg:crat0:sfdv0:01b} |

1733 | from a computational efficiency |

1734 | point of view, but is less intuitive. |

1735 | |

1736 | The superiority in computational efficiency of Algorithm \ref{alg:crat0:sfdv0:01c} |

1737 | comes from the possibility of using an implicit test against zero |

1738 | (rather than an explicit |

1739 | test against $K_3$, as is found in Algorithms \ref{alg:crat0:sfdv0:01a} |

1740 | and \ref{alg:crat0:sfdv0:01b}). |

1741 | Many machine instruction sets automatically set flags to indicate a negative |

1742 | result when the |

1743 | subtraction of $K_2$ is performed, thus often allowing a conditional branch |

1744 | without an additional instruction. Whether an instruction will be saved in |

1745 | the code of Figure \ref{fig:crat0:sfdv0:01c} depends on the sophistication |

1746 | of the `C' compiler, but of course if the algorithm were coded in |

1747 | assembly-language an instruction could be saved on most processors. |

1748 | |

1749 | The properties of rational counting Algorithm \ref{alg:crat0:sfdv0:01c} are nearly |

1750 | identical to those of Algorithm \ref{alg:crat0:sfdv0:01a}, |

1751 | and we prove the important properties |

1752 | now. |

1753 | |

1754 | \begin{vworklemmastatement} |

1755 | \label{lem:crat0:sfdv0:sprx0:01} |

1756 | $N_{STARTUP}$, the number of invocations of the base subroutine |

1757 | in Algorithm \ref{alg:crat0:sfdv0:01c} before ``\texttt{A()}'' is called |

1758 | for the first time, is given by |

1759 | |

1760 | \begin{equation} |

1761 | \label{eq:lem:crat0:sfdv0:sprx0:01:01} |

1762 | N_{STARTUP} = |

1763 | \left\lfloor |

1764 | { |

1765 | \frac{K_1}{K_2} |

1766 | } |

1767 | \right\rfloor . |

1768 | \end{equation} |

1769 | \end{vworklemmastatement} |

1770 | \begin{vworklemmaproof} |

1771 | The value of \texttt{state} when tested against |

1772 | zero in the \texttt{if()} statement during the $n$th invocation |

1773 | of the base subroutine is $K_1 - n K_2$. In order for the test |

1774 | not to be met on the $n$th invocation |

1775 | of the base subroutine, we require that |

1776 | |

1777 | \begin{equation} |

1778 | \label{eq:lem:crat0:sfdv0:sprx0:01:02} |

1779 | K_1 - n K_2 \geq 0 |

1780 | \end{equation} |

1781 | |

1782 | \noindent{}or equivalently that |

1783 | |

1784 | \begin{equation} |

1785 | \label{eq:lem:crat0:sfdv0:sprx0:01:03} |

1786 | n \leq \frac{K_1}{K_2} . |

1787 | \end{equation} |

1788 | |

1789 | Solving (\ref{eq:lem:crat0:sfdv0:sprx0:01:03}) for the |

1790 | largest value of $n \in \vworkintset$ which still meets the criterion |

1791 | yields (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}). Note that |

1792 | the derivation of (\ref{eq:lem:crat0:sfdv0:sprx0:01:01}) requires |

1793 | that the restrictions on $K_1$, $K_2$, and $K_3$ documented in |

1794 | Figure \ref{fig:crat0:sfdv0:01c} be met. |

1795 | \end{vworklemmaproof} |

1796 | |

1797 | \begin{vworklemmastatement} |

1798 | \label{lem:crat0:sfdv0:sprx0:02} |

1799 | Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01c} |

1800 | base subroutine |

1801 | is called, let $N_O$ be the number of times the |

1802 | ``\texttt{A()}'' subroutine is called, let |

1803 | $f_I$ be the frequency of invocation of the |

1804 | Algorithm \ref{alg:crat0:sfdv0:01a} |

1805 | base subroutine, and let |

1806 | $f_O$ be the frequency of invocation of |

1807 | ``\texttt{A()}''. Provided the constraints |

1808 | on $K_1$, $K_2$, and $K_3$ documented in |

1809 | Figure \ref{fig:crat0:sfdv0:01c} are met, |

1810 | |

1811 | \begin{equation} |

1812 | \label{eq:lem:crat0:sfdv0:sprx0:02:01} |

1813 | \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} |

1814 | = |

1815 | \frac{f_O}{f_I} |

1816 | = |

1817 | \frac{K_2}{K_4} . |

1818 | \end{equation} |

1819 | \end{vworklemmastatement} |

1820 | \begin{vworklemmaproof} |

1821 | (\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) indicates that once |

1822 | an initial delay (determined by $K_1$) has finished, |

1823 | $N_O/N_I$ will converge on a steady-state value of |

1824 | $K_2/K_4$. |

1825 | |

1826 | The most straightforward way to analyze Algorithm \ref{alg:crat0:sfdv0:01c} |

1827 | is to show how an algorithm already |

1828 | understood (Algorithm \ref{alg:crat0:sfdv0:01a}) |

1829 | can be transformed to |

1830 | Algorithm \ref{alg:crat0:sfdv0:01c} |

1831 | in a way where the analysis of Algorithm \ref{alg:crat0:sfdv0:01a} |

1832 | also applies to Algorithm \ref{alg:crat0:sfdv0:01c}. |

1833 | Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01} shows |

1834 | how such a transformation can be performed in |

1835 | four steps. |

1836 | |

1837 | \begin{figure} |

1838 | (a) Algorithm \ref{alg:crat0:sfdv0:01a} unchanged. |

1839 | $state_{a,n} \in \{0, 1, \ldots, K_4 - 1 \}$. |

1840 | \begin{verbatim} |

1841 | state += K2; |

1842 | if (state >= K4) |

1843 | { |

1844 | state -= K4; |

1845 | A(); |

1846 | } |

1847 | \end{verbatim} |

1848 | (b) ``\texttt{>=}'' changed to ``\texttt{>}''. $state_{b,n} \in \{1, 2, \ldots, K_4 \}$, |

1849 | $state_{b,n} = state_{a,n} + 1$. |

1850 | \begin{verbatim} |

1851 | state += K2; |

1852 | if (state > K4) |

1853 | { |

1854 | state -= K4; |

1855 | A(); |

1856 | } |

1857 | \end{verbatim} |

1858 | (c) Test against $K_4$ changed to test against zero. |

1859 | $state_{c,n} \in \{-K_4 + 1, -K_4 + 2, \ldots, 0 \}$, |

1860 | $state_{c,n} = state_{b,n} - K_4$. |

1861 | \begin{verbatim} |

1862 | state += K2; |

1863 | if (state > 0) |

1864 | { |

1865 | state -= K4; |

1866 | A(); |

1867 | } |

1868 | \end{verbatim} |

1869 | (d) Sign inversion. |

1870 | $state_{d,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$, |

1871 | $state_{d,n} = - state_{c,n}$. |

1872 | \begin{verbatim} |

1873 | state -= K2; |

1874 | if (state < 0) |

1875 | { |

1876 | state += K4; |

1877 | A(); |

1878 | } |

1879 | \end{verbatim} |

1880 | (e) `C' expression rearrangement. |

1881 | $state_{e,n} \in \{ 0, 1, \ldots, K_4 - 1 \}$, |

1882 | $state_{e,n} = state_{d,n}$. |

1883 | \begin{verbatim} |

1884 | if ((state -= K2) < 0) |

1885 | { |

1886 | state += K4; |

1887 | A(); |

1888 | } |

1889 | \end{verbatim} |

1890 | \caption{4-Step Transformation Of Algorithm \ref{alg:crat0:sfdv0:01a} |

1891 | To Algorithm \ref{alg:crat0:sfdv0:01c}} |

1892 | \label{fig:lem:crat0:sfdv0:sprx0:02:01} |

1893 | \end{figure} |

1894 | |

1895 | In Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}, each of the |

1896 | four steps required to transform from Algorithm \ref{alg:crat0:sfdv0:01a} to |

1897 | Algorithm \ref{alg:crat0:sfdv0:01c} includes an equation to transform the |

1898 | \texttt{state} variable. Combining all of these |

1899 | transformations yields |

1900 | |

1901 | \begin{eqnarray} |

1902 | \label{eq:lem:crat0:sfdv0:sprx0:02:02} |

1903 | state_{e,n} & = & K_4 - 1 - state_{a,n} \\ |

1904 | \label{eq:lem:crat0:sfdv0:sprx0:02:03} |

1905 | state_{a,n} & = & K_4 - 1 - state_{e,n} |

1906 | \end{eqnarray} |

1907 | |

1908 | We thus see that Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(a) |

1909 | (corresponding to Algorithm \ref{alg:crat0:sfdv0:01a}) and |

1910 | Figure \ref{fig:lem:crat0:sfdv0:sprx0:02:01}(e) |

1911 | (corresponding to Algorithm \ref{alg:crat0:sfdv0:01c}) have |

1912 | \texttt{state} semantics which involve the same range |

1913 | but a reversed order. (\ref{eq:lem:crat0:sfdv0:sprx0:02:01}) |

1914 | follows directly from this observation and from |

1915 | Lemma \ref{lem:crat0:sfdv0:sprc0:02}. |

1916 | \end{vworklemmaproof} |

1917 | %\vworklemmafooter{} |

1918 | |

1919 | \begin{vworklemmastatement} |

1920 | \label{lem:crat0:sfdv0:sprx0:03} |

1921 | If $K_1=0$ and $\gcd(K_2, K_4)=1$\footnote{See also |

1922 | footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} in this chapter.}, the error between |

1923 | the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01c} |

1924 | and the ``ideal'' mapping is always |

1925 | in the set |

1926 | |

1927 | \begin{equation} |

1928 | \label{eq:lem:crat0:sfdv0:sprx0:03:01} |

1929 | \left[ 0, \frac{K_4 - 1}{K_4} \right] , |

1930 | \end{equation} |

1931 | |

1932 | and no algorithm can be constructed to |

1933 | confine the error to a smaller interval. |

1934 | \end{vworklemmastatement} |

1935 | \begin{vworklemmaproof} |

1936 | Using the duality illustrated by |

1937 | (\ref{eq:lem:crat0:sfdv0:sprx0:02:02}) and |

1938 | (\ref{eq:lem:crat0:sfdv0:sprx0:02:03}), |

1939 | starting Algorithm \ref{alg:crat0:sfdv0:01c} with |

1940 | $state_0=0$ will yield a dual state vector |

1941 | with respect to starting Algorithm \ref{alg:crat0:sfdv0:01a} with |

1942 | $state_0=K_4-1$. Thus, |

1943 | |

1944 | \begin{equation} |

1945 | \label{eq:lem:crat0:sfdv0:sprx0:03:02} |

1946 | N_O = \left\lfloor \frac{n K_2 + K_4 - 1}{K_4} \right\rfloor . |

1947 | \end{equation} |

1948 | |

1949 | Using this altered value of $N_O$ in (\ref{eq:lem:crat0:sfdv0:sprc0:04:04}) |

1950 | leads directly to (\ref{eq:lem:crat0:sfdv0:sprx0:03:01}). |

1951 | |

1952 | The proof that there can be no better algorithm is identical |

1953 | to the same proof for Lemma \ref{lem:crat0:sfdv0:sprc0:04} (Exercise \ref{exe:crat0:sexe0:06}). |

1954 | \end{vworklemmaproof} |

1955 | %\vworklemmafooter{} |

1956 | |

1957 | \begin{vworklemmastatement} |

1958 | \label{lem:crat0:sfdv0:sprx0:04} |

1959 | For Algorithm \ref{alg:crat0:sfdv0:01c}, once steady |

1960 | state has been achieved, the number of consecutive |

1961 | base subroutine invocations during which subroutine |

1962 | ``\texttt{A()}'' is executed is always in the set |

1963 | |

1964 | \begin{equation} |

1965 | \label{eq:lem:crat0:sfdv0:sprx0:04:01} |

1966 | \left\{ |

1967 | \left\lfloor \frac{K_2}{K_4 - K_2} \right\rfloor , |

1968 | \left\lceil \frac{K_2}{K_4 - K_2} \right\rceil |

1969 | \right\} \cap \vworkintsetpos, |

1970 | \end{equation} |

1971 | |

1972 | which contains one integer if $K_2/K_4 \leq 1/2$ or $(K_4-K_2) \vworkdivides K_2$, |

1973 | or two integers otherwise. |

1974 | |

1975 | Once steady state has been achieved, the number of |

1976 | consecutive base function invocations during which |

1977 | subroutine ``\texttt{A()}'' is not executed is |

1978 | always in the set |

1979 | |

1980 | \begin{equation} |

1981 | \label{eq:lem:crat0:sfdv0:sprx0:04:02} |

1982 | \left\{ |

1983 | \left\lfloor \frac{K_4-K_2}{K_2} \right\rfloor , |

1984 | \left\lceil \frac{K_4-K_2}{K_2} \right\rceil |

1985 | \right\} \cap \vworkintsetpos, |

1986 | \end{equation} |

1987 | |

1988 | which contains one integer if $K_2/K_4 \geq 1/2$ or $K_2 \vworkdivides K_4$, |

1989 | or two integers otherwise. |

1990 | \end{vworklemmastatement} |

1991 | \begin{vworklemmaproof} |

1992 | The proof comes directly from the duality between algorithm |

1993 | Algorithms \ref{alg:crat0:sfdv0:01a} |

1994 | and \ref{alg:crat0:sfdv0:01c} established in the |

1995 | proof of Lemma \ref{lem:crat0:sfdv0:sprx0:01}, so that the results |

1996 | from Lemma \ref{lem:crat0:sfdv0:sprc0:03} apply without modification. |

1997 | \end{vworklemmaproof} |

1998 | \vworklemmafooter{} |

1999 | |

2000 | |

2001 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2002 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2003 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2004 | \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:01d}] |

2005 | {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:01d}} |

2006 | %Section tag: PRX1 |

2007 | \label{crat0:sfdv0:sprx1} |

2008 | |

2009 | Algorithm \ref{alg:crat0:sfdv0:01d}\footnote{Algorithm \ref{alg:crat0:sfdv0:01d} |

2010 | was contributed in March, 2003 |

2011 | by John Larkin \cite{bibref:i:johnlarkin} |

2012 | via the |

2013 | \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded} |

2014 | newsgroup.} |

2015 | (Figure \ref{fig:crat0:sfdv0:01d}) is a further |

2016 | economization of Algorithms \ref{alg:crat0:sfdv0:01a} |

2017 | through \ref{alg:crat0:sfdv0:01c} that can be made by eliminating |

2018 | the addition or subtraction of $K_4$ and test against $K_3$ |

2019 | and instead using the |

2020 | inherent machine integer size of $W$ bits to perform |

2021 | arithmetic modulo $2^W$. Thus, effectively, Algorithm \ref{alg:crat0:sfdv0:01d} |

2022 | is equivalent to Algorithm \ref{alg:crat0:sfdv0:01a} with |

2023 | $K_4 = K_3 = 2^W$. |

2024 | |

2025 | Figure \ref{fig:crat0:sfdv0:01d} shows both |

2026 | assembly-language (Texas Instruments TMS-370C8) and |

2027 | `C' implementations of the algorithm. The assembly-language |

2028 | version uses the carry flag of the processor and thus |

2029 | is \emph{very} efficient. Because `C' does not have access |

2030 | to the processor flags, the 'C' version is less efficient. |

2031 | The ``less than'' comparison when |

2032 | using unsigned integers is equivalent to a rollover test. |

2033 | |

2034 | It is easy to see from the figure that Algorithm \ref{alg:crat0:sfdv0:01d} |

2035 | is equivalent in all |

2036 | respects to Algorithm \ref{alg:crat0:sfdv0:01a} with |

2037 | $K_3 = K_4$ fixed at $2^W$. It is not necessary to enforce any constraints |

2038 | on $K_2$ because $K_2 < K_3 = K_4 = 2^W$ due to the inherent size of |

2039 | a machine integer. Note that unlike Algorithms \ref{alg:crat0:sfdv0:01a} |

2040 | through \ref{alg:crat0:sfdv0:01c} which allow $K_2$ and $K_4$ to be chosen independently |

2041 | and from the Farey series of appropriate order, Algorithm \ref{alg:crat0:sfdv0:01c} |

2042 | only allows |

2043 | $K_2/K_4$ of the form $K_2/2^W$. |

2044 | |

2045 | The properties below follow immediately |

2046 | from the properties of Algorithm \ref{alg:crat0:sfdv0:01a}. |

2047 | |

2048 | \begin{vworklemmastatement} |

2049 | \label{lem:crat0:sfdv0:sprx1:01} |

2050 | $N_{STARTUP}$, the number of invocations of the base subroutine |

2051 | in Algorithm \ref{alg:crat0:sfdv0:01d} before ``\texttt{A()}'' is called |

2052 | for the first time, is given by |

2053 | |

2054 | \begin{equation} |

2055 | \label{eq:lem:crat0:sfdv0:sprx1:01:01} |

2056 | N_{STARTUP} = |

2057 | \left\lfloor |

2058 | { |

2059 | \frac{2^W - K_1 - 1}{K_2} |

2060 | } |

2061 | \right\rfloor . |

2062 | \end{equation} |

2063 | \end{vworklemmastatement} |

2064 | \begin{vworklemmaproof} |

2065 | The value of \texttt{state} after the $n$th invocation |

2066 | is $state_n = K_1 + n K_2$. In order for the test in the |

2067 | \texttt{if()} statement not to be met, we require that |

2068 | |

2069 | \begin{equation} |

2070 | \label{eq:lem:crat0:sfdv0:sprx1:01:02} |

2071 | K_1 + n K_2 \leq 2^W - 1 |

2072 | \end{equation} |

2073 | |

2074 | \noindent{}or equivalently that |

2075 | |

2076 | \begin{equation} |

2077 | \label{eq:lem:crat0:sfdv0:sprx1:01:03} |

2078 | n \leq \frac{2^W - K_1 - 1}{K_2} . |

2079 | \end{equation} |

2080 | |

2081 | Solving (\ref{eq:lem:crat0:sfdv0:sprx1:01:03}) for the largest |

2082 | value of $n \in \vworkintset$ which still meets the criterion |

2083 | yields (\ref{eq:lem:crat0:sfdv0:sprx1:01:01}). |

2084 | \end{vworklemmaproof} |

2085 | |

2086 | \begin{vworklemmastatement} |

2087 | \label{lem:crat0:sfdv0:sprx1:02} |

2088 | Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine |

2089 | is called, let $N_O$ be the number of times the |

2090 | ``\texttt{A()}'' subroutine is called, let |

2091 | $f_I$ be the frequency of invocation of the |

2092 | Algorithm \ref{alg:crat0:sfdv0:01d} base subroutine, and let |

2093 | $f_O$ be the frequency of invocation of |

2094 | ``\texttt{A()}''. Then |

2095 | |

2096 | \begin{equation} |

2097 | \label{eq:lem:crat0:sfdv0:sprx1:02:01} |

2098 | \lim_{N_I\rightarrow\infty}\frac{N_O}{N_I} |

2099 | = |

2100 | \frac{f_O}{f_I} |

2101 | = |

2102 | \frac{K_2}{2^W} , |

2103 | \end{equation} |

2104 | |

2105 | where $W$ is the number of bits in a machine unsigned integer. |

2106 | Note that $K_2 < 2^W$ since $K_2 \in \{ 0, 1, \ldots , 2^W-1 \}$. |

2107 | \end{vworklemmastatement} |

2108 | \begin{vworklemmaproof} |

2109 | The proof is identical to the proof of |

2110 | Lemma \ref{lem:crat0:sfdv0:sprc0:02} with $K_3=K_4=2^W$. |

2111 | Note that Algorithm \ref{alg:crat0:sfdv0:01a} calculates $n K_2 \bmod K_4$ by |

2112 | subtraction, whereas Algorithm \ref{alg:crat0:sfdv0:01d} calculates |

2113 | $n K_2 \bmod 2^W$ by the properties of a $W$-bit counter |

2114 | which is allowed to roll over. |

2115 | \end{vworklemmaproof} |

2116 | %\vworklemmafooter{} |

2117 | |

2118 | |

2119 | \begin{vworklemmastatement} |

2120 | \label{lem:crat0:sfdv0:sprx1:03} |

2121 | If $\gcd(K_2, 2^W)=1$\footnote{See also footnote \ref{footnote:lem:crat0:sfdv0:sprc0:04:01} |

2122 | in this chapter. Note also that in this context the condition $\gcd(K_2, 2^W)=1$ |

2123 | is equivalent to the condition that $K_2$ be odd.}, the error between |

2124 | the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01d} |

2125 | and the ``ideal'' mapping is always |

2126 | in the set |

2127 | |

2128 | \begin{equation} |

2129 | \label{eq:lem:crat0:sfdv0:sprx1:03:01} |

2130 | \left[ - \frac{2^W - 1}{2^W} , 0 \right] , |

2131 | \end{equation} |

2132 | |

2133 | and no algorithm can be constructed to |

2134 | confine the error to a smaller interval. |

2135 | \end{vworklemmastatement} |

2136 | \begin{vworklemmaproof} |

2137 | The proof is identical to the proof of Lemma |

2138 | \ref{lem:crat0:sfdv0:sprc0:04} with $K_4 = 2^W$. |

2139 | \end{vworklemmaproof} |

2140 | %\vworklemmafooter{} |

2141 | |

2142 | \begin{vworklemmastatement} |

2143 | \label{lem:crat0:sfdv0:sprx1:04} |

2144 | For Algorithm \ref{alg:crat0:sfdv0:01d} |

2145 | (Figure \ref{fig:crat0:sfdv0:01d}), once steady |

2146 | state has been achieved, the number of consecutive |

2147 | base subroutine invocations during which subroutine |

2148 | ``\texttt{A()}'' is executed is always in the set |

2149 | |

2150 | \begin{equation} |

2151 | \label{eq:lem:crat0:sfdv0:sprx1:04:01} |

2152 | \left\{ |

2153 | \left\lfloor \frac{K_2}{2^W - K_2} \right\rfloor , |

2154 | \left\lceil \frac{K_2}{2^W - K_2} \right\rceil |

2155 | \right\} \cap \vworkintsetpos, |

2156 | \end{equation} |

2157 | |

2158 | which contains one integer if $K_2/2^W \leq 1/2$ or $(2^W-K_2) \vworkdivides K_2$, |

2159 | or two integers otherwise. |

2160 | |

2161 | Once steady state has been achieved, the number of |

2162 | consecutive base function invocations during which |

2163 | subroutine ``\texttt{A()}'' is not executed is |

2164 | always in the set |

2165 | |

2166 | \begin{equation} |

2167 | \label{eq:lem:crat0:sfdv0:sprx1:04:02} |

2168 | \left\{ |

2169 | \left\lfloor \frac{2^W-K_2}{K_2} \right\rfloor , |

2170 | \left\lceil \frac{2^W-K_2}{K_2} \right\rceil |

2171 | \right\} \cap \vworkintsetpos, |

2172 | \end{equation} |

2173 | |

2174 | which contains one integer if $K_2/2^W \geq 1/2$ or $K_2 \vworkdivides 2^W$, |

2175 | or two integers otherwise. |

2176 | \end{vworklemmastatement} |

2177 | \begin{vworklemmaproof} |

2178 | The proof is identical to the proof of Lemma |

2179 | \ref{lem:crat0:sfdv0:sprc0:03} with $K_4 = 2^W$. |

2180 | \end{vworklemmaproof} |

2181 | \vworklemmafooter{} |

2182 | |

2183 | |

2184 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2185 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2186 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2187 | \subsection[Properties Of Algorithm \ref{alg:crat0:sfdv0:02a}] |

2188 | {Properties Of Rational Counting Algorithm \ref{alg:crat0:sfdv0:02a}} |

2189 | %Section tag: PRC2 |

2190 | \label{crat0:sfdv0:sprc2} |

2191 | |

2192 | Another useful rational counting algorithm is Algorithm \ref{alg:crat0:sfdv0:02a}. |

2193 | At first glance, it may appear that Algorithm \ref{alg:crat0:sfdv0:02a} |

2194 | is qualitatively |

2195 | different than Algorithms \ref{alg:crat0:sfdv0:01a} |

2196 | and \ref{alg:crat0:sfdv0:01b}. |

2197 | However, as the following lemmas demonstrate, Algorithm \ref{alg:crat0:sfdv0:02a} |

2198 | can be easily rearranged to be in the form |

2199 | of Algorithm \ref{alg:crat0:sfdv0:01a}. |

2200 | |

2201 | \begin{vworklemmastatement} |

2202 | \label{lem:crat0:sfdv0:sprc2:01} |

2203 | $N_{STARTUP}$, the number of invocations of the base subroutine |

2204 | in Algorithm \ref{alg:crat0:sfdv0:02a} before ``\texttt{A()}'' is called |

2205 | for the first time, is given by |

2206 | |

2207 | \begin{equation} |

2208 | \label{eq:lem:crat0:sfdv0:sprc2:01:01} |

2209 | N_{STARTUP} = |

2210 | \left\lceil |

2211 | { |

2212 | \frac{K_3 - K_1}{K_2} |

2213 | } |

2214 | \right\rceil . |

2215 | \end{equation} |

2216 | \end{vworklemmastatement} |

2217 | \begin{vworklemmaproof} |

2218 | The value of \texttt{state} after the $n$th invocation |

2219 | is $K_1 + n K_2$. In order for the test in the |

2220 | \texttt{if()} statement to be met on the $n+1$'th invocation |

2221 | of the base subroutine, we require that |

2222 | |

2223 | \begin{equation} |

2224 | \label{eq:lem:crat0:sfdv0:sprc2:01:02} |

2225 | K_1 + n K_2 \geq K_3 |

2226 | \end{equation} |

2227 | |

2228 | \noindent{}or equivalently that |

2229 | |

2230 | \begin{equation} |

2231 | \label{eq:lem:crat0:sfdv0:sprc2:01:03} |

2232 | n \geq \frac{K_3 - K_1}{K_2} . |

2233 | \end{equation} |

2234 | |

2235 | Solving (\ref{eq:lem:crat0:sfdv0:sprc2:01:03}) for the smallest |

2236 | value of $n \in \vworkintset$ which still meets the criterion |

2237 | yields (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}). Note that |

2238 | the derivation of (\ref{eq:lem:crat0:sfdv0:sprc2:01:01}) requires |

2239 | that the restrictions on $K_1$ through $K_4$ documented in |

2240 | Figure \ref{fig:crat0:sfdv0:02a} be met. |

2241 | \end{vworklemmaproof} |

2242 | |

2243 | \begin{vworklemmastatement} |

2244 | \label{lem:crat0:sfdv0:sprc2:02} |

2245 | Let $N_I$ be the number of times the Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine |

2246 | is called, let $N_{OA}$ be the number of times the |

2247 | ``\texttt{A()}'' subroutine is called, let |

2248 | $f_I$ be the frequency of invocation of the |

2249 | Algorithm \ref{alg:crat0:sfdv0:02a} base subroutine, and let |

2250 | $f_{OA}$ be the frequency of invocation of |

2251 | ``\texttt{A()}''. Then, the proportion of times the |

2252 | ``\texttt{A()}'' subroutine is called is given by |

2253 | |

2254 | \begin{equation} |

2255 | \label{eq:lem:crat0:sfdv0:sprc2:02:01} |

2256 | \lim_{N_I\rightarrow\infty}\frac{N_{OA}}{N_I} |

2257 | = |

2258 | \frac{f_{OA}}{f_I} |

2259 | = |

2260 | \frac{K_2}{K_4 + K_2} , |

2261 | \end{equation} |

2262 | |

2263 | and the proportion of times the ``\texttt{B()}'' subroutine is called |

2264 | is given by |

2265 | |

2266 | \begin{equation} |

2267 | \label{eq:lem:crat0:sfdv0:sprc2:02:02} |

2268 | \lim_{N_I\rightarrow\infty}\frac{N_{OB}}{N_I} |

2269 | = |

2270 | \frac{f_{OB}}{f_I} |

2271 | = |

2272 | 1 - \frac{f_{OA}}{f_I} |

2273 | = |

2274 | \frac{K_4}{K_4 + K_2} . |

2275 | \end{equation} |

2276 | \end{vworklemmastatement} |

2277 | \begin{vworklemmaproof} |

2278 | As in Lemma \ref{} and without |

2279 | loss of generality, we assume for analytic |

2280 | convenience that $K_1=0$ and $K_3=K_4$. Note that |

2281 | $K_1$ and $K_3$ influence only the transient startup |

2282 | behavior of the algorithm. |

2283 | |

2284 | It can be observed from the algorithm that once steady |

2285 | state is achieved, \texttt{state} will be confined to the set |

2286 | |

2287 | \begin{equation} |

2288 | \label{eq:lem:crat0:sfdv0:sprc2:02:10} |

2289 | state \in \{ 0, 1, \ldots , K_4 + K_2 - 1 \} . |

2290 | \end{equation} |

2291 | |

2292 | It is certainly possible to use results from |

2293 | number theory and analyze which values in the |

2294 | set (\ref{eq:lem:crat0:sfdv0:sprc2:02:10}) can be |

2295 | attained and the order in which they can be attained. |

2296 | However, an easier approach is to observe that |

2297 | Algorithm \ref{alg:crat0:sfdv0:02a} |

2298 | can be rearranged to take the form of |

2299 | rational counting Algorithm \ref{alg:crat0:sfdv0:01a}. |

2300 | This rearranged |

2301 | algorithm is presented as |

2302 | Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}. Note that the |

2303 | algorithm is rearranged only for easier analysis. |

2304 | |

2305 | \begin{figure} |

2306 | \begin{verbatim} |

2307 | void base_rate_sub(void) |

2308 | { |

2309 | static unsigned int state = K1; |

2310 | |

2311 | state += K2; |

2312 | |

2313 | if (state >= (K4 + K2)) |

2314 | { |

2315 | state -= (K4 + K2); |

2316 | A(); |

2317 | } |

2318 | else |

2319 | { |

2320 | B(); |

2321 | } |

2322 | } |

2323 | \end{verbatim} |

2324 | \caption{Algorithm \ref{alg:crat0:sfdv0:02a} Modified To Resemble Algorithm \ref{alg:crat0:sfdv0:01a} |

2325 | (Proof Of Lemma \ref{lem:crat0:sfdv0:sprc2:02})} |

2326 | \label{fig:lem:crat0:sfdv0:sprc2:02:01} |

2327 | \end{figure} |

2328 | |

2329 | In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, the |

2330 | statement ``\texttt{state += K2}'' has been removed from the |

2331 | \texttt{else} clause and placed above the \texttt{if()} statement, |

2332 | and other constants have been adjusted accordingly. |

2333 | It can be observed that the figure |

2334 | is structurally identical to rational counting algorithm, except for the |

2335 | \texttt{else} clause (which does not affect the counting behavior) and |

2336 | the specific constants for testing and incrementation. |

2337 | |

2338 | In Figure \ref{fig:lem:crat0:sfdv0:sprc2:02:01}, as contrasted with |

2339 | Algorithm \ref{alg:crat0:sfdv0:01a}, ``$K_4 + K_2$'' takes the |

2340 | place of $K_4$. $\gcd(K_2, K_4 + K_2) = \gcd(K_2, K_4)$ |

2341 | (see Lemma \cprizeroxrefhyphen\ref{lem:cpri0:gcd0:01}), so the |

2342 | results from |

2343 | \end{vworklemmaproof} |

2344 | |

2345 | \begin{vworklemmastatement} |

2346 | \label{lem:crat0:sfdv0:sprc2:03} |

2347 | If $K_3=K_4$, $K_1=0$, and $\gcd(K_2, K_4)=1$, the error between |

2348 | the approximation to $N_O$ implemented by Algorithm \ref{alg:crat0:sfdv0:01b} |

2349 | and the ``ideal'' mapping is always |

2350 | in the set |

2351 | |

2352 | \begin{equation} |

2353 | \label{eq:lem:crat0:sfdv0:sprc2:03:01} |

2354 | \left[ - \frac{K_4 - 1}{K_4} , 0 \right] , |

2355 | \end{equation} |

2356 | |

2357 | and no algorithm can be constructed to |

2358 | confine the error to a smaller interval. |

2359 | \end{vworklemmastatement} |

2360 | \begin{vworklemmaproof} |

2361 | The proof is identical to Lemma \ref{lem:crat0:sfdv0:sprc0:04}. |

2362 | \end{vworklemmaproof} |

2363 | |

2364 | |

2365 | |

2366 | |

2367 | |

2368 | |

2369 | |

2370 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2371 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2372 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2373 | \section{Bresenham's Line Algorithm} |

2374 | %Section tag: BLA0 |

2375 | \label{crat0:sbla0} |

2376 | |

2377 | \index{Bresenham's line algorithm}\emph{Bresenham's line algorithm} is a |

2378 | very efficient algorithm for drawing lines on devices that have |

2379 | a rectangular array of pixels which can be individually illuminated. |

2380 | Bresenham's line algorithm is efficient for small microcontrollers |

2381 | because it relies only |

2382 | on integer addition, subtraction, shifting, and comparison. |

2383 | |

2384 | Bresenham's line algorithm is presented for two reasons: |

2385 | |

2386 | \begin{itemize} |

2387 | \item The algorithm is useful for drawing lines on LCD |

2388 | displays and other devices typically controlled by |

2389 | microcontrollers. |

2390 | \item The algorithm is an [extremely optimized] application |

2391 | of the rational |

2392 | counting algorithms presented in this chapter. |

2393 | \end{itemize} |

2394 | |

2395 | \begin{figure} |

2396 | \begin{center} |

2397 | \begin{huge} |

2398 | Figure Space Reserved |

2399 | \end{huge} |

2400 | \end{center} |

2401 | \caption{Raster Grid For Development Of Bresenham's Line Algorithm} |

2402 | \label{fig:crat0:sbla0:01} |

2403 | \end{figure} |

2404 | |

2405 | Assume that we wish to draw a line from $(0,0)$ to $(x_f, y_f)$ on |

2406 | a raster device (Figure \ref{fig:crat0:sbla0:01}). For simplicity of |

2407 | development, assume that $y_f \leq x_f$ (i.e. that the slope $m \leq 1$). |

2408 | |

2409 | For each value of $x \in \vworkintset$, the ideal value of $y$ is given |

2410 | by |

2411 | |

2412 | \begin{equation} |

2413 | \label{eq:crat0:sbla0:01} |

2414 | y = mx = \frac{y_f}{x_f} x = \frac{y_f x}{x_f} . |

2415 | \end{equation} |

2416 | |

2417 | \noindent{}However, on a raster device, we must usually |

2418 | choose an inexact pixel to illuminate, since it is typically |

2419 | rare that $x_f \vworkdivides y_f x$. If |

2420 | $x_f \vworkdivides y_f x$, then the ideal value of $y$ is |

2421 | an integer, and we choose to illuminate |

2422 | $(x, (y_f x)/x_f)$. However, if $x_f \vworknotdivides y_f x$, |

2423 | then we must choose either a pixel with the same y-coordinate |

2424 | as the previous pixel (we call this choice `D') or the pixel |

2425 | with a y-coordinate one greater than the previous pixel (we |

2426 | call this choice `U'). |

2427 | The fractional part of the quotient |

2428 | $(y_f x) / x_f$ indicates whether D or U is closer to the ideal line. |

2429 | If $y_f x \bmod x_f \geq x_f/2$, we choose U, otherwise we choose D |

2430 | (note that the decision to choose U in the equality case is arbitrary). |

2431 | |

2432 | Using the rational approximation techniques presented in |

2433 | Section \ref{crat0:sfdv0}, it is straightforward to |

2434 | develop an algorithm, which is presented as the code |

2435 | in Figure \ref{fig:crat0:sbla0:02}. |

2436 | Note that this code will only work if $m = y_f/x_f \leq 1$. |

2437 | |

2438 | \begin{figure} |

2439 | \begin{verbatim} |

2440 | /* Draws a line from (0,0) to (x_f,y_f) on a raster */ |

2441 | /* device. */ |

2442 | |

2443 | void bresenham_line(int x_f, int y_f) |

2444 | { |

2445 | int d=0; /* The modulo counter. */ |

2446 | int x=0, y=0; |

2447 | /* x- and y-coordinates currently being */ |

2448 | /* evaluated. */ |

2449 | int d_old; /* Remembers previous value of d. */ |

2450 | |

2451 | plotpoint(0,0); /* Plot initial point. */ |

2452 | while (x <= x_f) |

2453 | { |

2454 | d_old = d; |

2455 | d += y_f; |

2456 | if (d >= x_f) |

2457 | d -= x_f; |

2458 | x++; |

2459 | if ( |

2460 | ( |

2461 | (d == 0) && (d_old < x_f/2) |

2462 | ) |

2463 | || |

2464 | ( |

2465 | (d >= x_f/2) |

2466 | && |

2467 | ((d_old < x_f/2) || (d_old >= d)) |

2468 | ) |

2469 | ) |

2470 | y++; |

2471 | plotpoint(x,y); |

2472 | } |

2473 | } |

2474 | \end{verbatim} |

2475 | \caption{First Attempt At A Raster Device Line Algorithm |

2476 | Using Rational Counting Techniques} |

2477 | \label{fig:crat0:sbla0:02} |

2478 | \end{figure} |

2479 | |

2480 | There are a few efficiency refinements that can be made to |

2481 | the code in Figure \ref{fig:crat0:sbla0:02}, but overall |

2482 | it is a very efficient algorithm. Note that |

2483 | nearly all compilers will handle the integer |

2484 | division by two using a shift |

2485 | operation rather than a division. |

2486 | |

2487 | We can however substantially simplify and economize the code of |

2488 | Figure \ref{fig:crat0:sbla0:02} by using the technique |

2489 | presented in Figures \ref{fig:crat0:sfdv0:fab0:03} and |

2490 | \ref{fig:crat0:sfdv0:fab0:04}, and this improved code is |

2491 | presented as Figure \ref{fig:crat0:sbla0:03}. |

2492 | |

2493 | \begin{figure} |

2494 | \begin{verbatim} |

2495 | /* Draws a line from (0,0) to (x_f,y_f) on a raster */ |

2496 | /* device. */ |

2497 | |

2498 | void bresenham_line(int x_f, int y_f) |

2499 | { |

2500 | int d=y_f; /* Position of the ideal line minus */ |

2501 | /* the position of the line we are */ |

2502 | /* drawing, in units of 1/x_f. The */ |

2503 | /* initialization value is y_f because */ |

2504 | /* the algorithm is looking one pixel */ |

2505 | /* ahead in the x direction, so we */ |

2506 | /* begin at x=1. */ |

2507 | int x=0, y=0; |

2508 | /* x- and y-coordinates currently being */ |

2509 | /* evaluated. */ |

2510 | plotpoint(0,0); /* Plot initial point. */ |

2511 | while (x <= x_f) |

2512 | { |

2513 | x++; /* We move to the right regardless. */ |

2514 | if (d >= x_f/2) |

2515 | { |

2516 | /* The "U" choice. We must jump up a pixel */ |

2517 | /* to keep up with the ideal line. */ |

2518 | d += (y_f - x_f); |

2519 | y++; /* Jump up a pixel. */ |

2520 | } |

2521 | else /* d < x_f/2 */ |

2522 | { |

2523 | /* The "D" choice. Distance is not large */ |

2524 | /* enough to jump up a pixel. */ |

2525 | d += y_f; |

2526 | } |

2527 | plotpoint(x,y); |

2528 | } |

2529 | } |

2530 | \end{verbatim} |

2531 | \caption{Second Attempt At A Raster Device Line Algorithm |

2532 | Using Rational Counting Techniques} |

2533 | \label{fig:crat0:sbla0:03} |

2534 | \end{figure} |

2535 | |

2536 | In order to understand the code of Figure \ref{fig:crat0:sbla0:03}, |

2537 | it is helpful to view the problem in an alternate way. |

2538 | For any $x \in \vworkintset$, let |

2539 | $d$ be the distance between the position of the ideal line |

2540 | (characterized by $y = y_f x / x_f$) and |

2541 | the actual pixel which will be illuminated. It is easy to |

2542 | observe that: |

2543 | |

2544 | \begin{itemize} |

2545 | \item When drawing a raster line, if one proceeds from |

2546 | $(x, y)$ to $(x+1, y)$ (i.e. makes the ``D'' choice), |

2547 | $d$ will increase by $y_f/x_f$. |

2548 | \item When drawing a raster line, if one proceeds from |

2549 | $(x,y)$ to $(x+1, y+1)$ (i.e. makes the ``U'' choice), |

2550 | $d$ will increase by $(y_f - x_f)/x_f$. (The increase |

2551 | of $y_f/x_f$ comes about because the ideal line proceeds |

2552 | upward from $x$ to $x+1$, while the decrease of $x_f/x_f = 1$ |

2553 | comes about because the line being drawn jumps upward by one |

2554 | unit, thus tending to ``catch'' the ideal line.) |

2555 | \end{itemize} |

2556 | |

2557 | The code of Figure \ref{fig:crat0:sbla0:03} implements the |

2558 | two observations above in a straightforward way. $d$ is maintained |

2559 | in units of $1/x_f$, and when ``U'' is chosen over ``D'' whenever |

2560 | the gap between the ideal line and the current row of pixels |

2561 | being drawn becomes too large. |

2562 | |

2563 | The code in Figure \ref{fig:crat0:sbla0:03} does however contain logical |

2564 | and performance problems which should be corrected: |

2565 | |

2566 | \begin{itemize} |

2567 | \item The test of $d$ against $x_f/2$ will perform as intended. |

2568 | For example, if $d=2$ and $x_f=5$, the test |

2569 | ``\texttt{d >= x\_f/2}'' in the code will evaluate true |

2570 | although the actual condition is false. To correct this |

2571 | defect, the units of $d$ should be changed from |

2572 | $1/x_f$ to $1/(2 x_f)$. |

2573 | \item The quantity $y_f - x_f$ is calculated repeatedly. This |

2574 | calculation should be moved out of the \emph{while()} loop. |

2575 | \item The test against $x_f$ may be more economical if changed to |

2576 | a test against 0 (but this requires a different initialization |

2577 | assignment for $d$). |

2578 | \end{itemize} |

2579 | |

2580 | Figure \ref{fig:crat0:sbla0:04} corrects these defects |

2581 | from Figure \ref{fig:crat0:sbla0:03}. |

2582 | Figure \ref{fig:crat0:sbla0:04} is essentially the Bresenham |

2583 | line algorithm, except that it only draws starting from the |

2584 | origin and will only draw a line with a slope |

2585 | $m = y_f/x_f \leq 1$. |

2586 | |

2587 | \begin{figure} |

2588 | \begin{verbatim} |

2589 | /* Draws a line from (0,0) to (x_f,y_f) on a raster */ |

2590 | /* device. */ |

2591 | |

2592 | void bresenham_line(int x_f, int y_f) |

2593 | { |

2594 | int d = 2 * y_f - x_f; |

2595 | /* Position of the ideal line minus */ |

2596 | /* the position of the line we are */ |

2597 | /* drawing, in units of 1/(2 * x_f). */ |

2598 | /* Initialization value of 2 * y_f is */ |

2599 | /* because algorithm is looking one */ |

2600 | /* pixel ahead. Value of -x_f is from */ |

2601 | /* shifting the midpoint test (the */ |

2602 | /* "if" statement below) downward to a */ |

2603 | /* test against zero. */ |

2604 | int dD = 2 * y_f; |

2605 | int dU = dD - x_f; |

2606 | /* Amounts to add to d if "D" and "U" */ |

2607 | /* pixels are chosen, respectively. */ |

2608 | /* Calculated here outside of loop. */ |

2609 | int x=0, y=0; |

2610 | /* x- and y-coordinates currently being */ |

2611 | /* evaluated. */ |

2612 | plotpoint(0,0); /* Plot initial point. */ |

2613 | while (x <= x_f) |

2614 | { |

2615 | x++; /* We move to the right regardless. */ |

2616 | if (d >= 0) |

2617 | { |

2618 | /* The "U" choice. We must jump up a pixel */ |

2619 | /* to keep up with the ideal line. */ |

2620 | d += dU; |

2621 | y++; /* Jump up a pixel. */ |

2622 | } |

2623 | else /* d < 0 */ |

2624 | { |

2625 | /* The "D" choice. Distance is not large */ |

2626 | /* enough to jump up a pixel. */ |

2627 | d += dD; |

2628 | } |

2629 | plotpoint(x,y); |

2630 | } |

2631 | } |

2632 | \end{verbatim} |

2633 | \caption{Third Attempt At A Raster Device Line Algorithm |

2634 | Using Rational Counting Techniques} |

2635 | \label{fig:crat0:sbla0:04} |

2636 | \end{figure} |

2637 | |

2638 | |

2639 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2640 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2641 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2642 | \section{Authors And Acknowledgements} |

2643 | %Section tag: ACK0 |

2644 | This chapter was primarily written by |

2645 | \index{Ashley, David T.} David T. Ashley |

2646 | \cite{bibref:i:daveashley}. |

2647 | |

2648 | We would like to gratefully acknowledge the assistance of |

2649 | \index{Falconer, Chuck B.} Chuck B. Falconer \cite{bibref:i:chuckbfalconer}, |

2650 | \index{Hoffmann, Klaus} Klaus Hoffmann \cite{bibref:i:klaushoffmann}, |

2651 | \index{Larkin, John} John Larkin \cite{bibref:i:johnlarkin}, |

2652 | \index{Smith, Thad} Thad Smith \cite{bibref:i:thadsmith}, |

2653 | and |

2654 | \index{Voipio, Tauno} Tauno Voipio \cite{bibref:i:taunovoipio} |

2655 | for insight into rational counting approaches, contributed via the |

2656 | \texttt{sci.math} \cite{bibref:n:scimathnewsgroup} |

2657 | and |

2658 | \texttt{comp.arch.embedded} \cite{bibref:n:comparchembedded} |

2659 | newsgroups. |

2660 | |

2661 | |

2662 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2663 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2664 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2665 | \section{Exercises} |

2666 | %Section tag: EXE0 |

2667 | |

2668 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2669 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2670 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2671 | \subsection[\protect\mbox{\protect$h/2^q$} and \protect\mbox{\protect$2^q/k$} Rational Linear Approximation] |

2672 | {\protect\mbox{\protect\boldmath$h/2^q$} and \protect\mbox{\protect\boldmath$2^q/k$} Rational Linear Approximation} |

2673 | |

2674 | \begin{vworkexercisestatement} |

2675 | \label{exe:crat0:sexe0:a01} |

2676 | Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03}) |

2677 | and (\ref{eq:crat0:shqq0:dph0:07}) if $r_A$ is chosen |

2678 | without rounding, i.e. |

2679 | $h=\lfloor r_I 2^q \rfloor$ and therefore |

2680 | $r_A=\lfloor r_I 2^q \rfloor/2^q$. |

2681 | \end{vworkexercisestatement} |

2682 | \vworkexercisefooter{} |

2683 | |

2684 | \begin{vworkexercisestatement} |

2685 | \label{exe:crat0:sexe0:a02} |

2686 | Derive equations analogous to (\ref{eq:crat0:shqq0:dph0:03}) |

2687 | and (\ref{eq:crat0:shqq0:dph0:07}) if |

2688 | $z$ is chosen for rounding with the midpoint case rounded |

2689 | down, i.e. $z=2^{q-1}-1$, and applied as in |

2690 | (\ref{eq:crat0:sint0:01}). |

2691 | \end{vworkexercisestatement} |

2692 | \vworkexercisefooter{} |

2693 | |

2694 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2695 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2696 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2697 | \subsection{Rational Counting} |

2698 | |

2699 | |

2700 | \begin{vworkexercisestatement} |

2701 | \label{exe:crat0:sexe0:01} |

2702 | For Algorithm \ref{alg:crat0:sfdv0:01a}, |

2703 | assume that one chooses $K_1 > K_3 - K_2$ (in contradiction to the |

2704 | restrictions in Figure \ref{fig:crat0:sfdv0:01a}). |

2705 | Derive a result similar to Lemma \ref{lem:crat0:sfdv0:sprc0:01} |

2706 | for the number of base subroutine invocations in which |

2707 | ``\texttt{A()}'' is run before it is |

2708 | \emph{not} run for the first time. |

2709 | \end{vworkexercisestatement} |

2710 | \vworkexercisefooter{} |

2711 | |

2712 | \begin{vworkexercisestatement} |

2713 | \label{exe:crat0:sexe0:02} |

2714 | This will be the $\epsilon$ lemma proof. |

2715 | \end{vworkexercisestatement} |

2716 | \vworkexercisefooter{} |

2717 | |

2718 | \begin{vworkexercisestatement} |

2719 | \label{exe:crat0:sexe0:03} |

2720 | Rederive appropriate results similar to |

2721 | Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where |

2722 | $\gcd(K_2, K_4) > 1$. |

2723 | \end{vworkexercisestatement} |

2724 | \vworkexercisefooter{} |

2725 | |

2726 | \begin{vworkexercisestatement} |

2727 | \label{exe:crat0:sexe0:04} |

2728 | Rederive appropriate results similar to |

2729 | Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where |

2730 | $K_1 \neq 0$. |

2731 | \end{vworkexercisestatement} |

2732 | \vworkexercisefooter{} |

2733 | |

2734 | \begin{vworkexercisestatement} |

2735 | \label{exe:crat0:sexe0:05} |

2736 | Rederive appropriate results similar to |

2737 | Lemma \ref{lem:crat0:sfdv0:sprc0:04} in the case where |

2738 | $\gcd(K_2, K_4) > 1$ and $K_1 \neq 0$. |

2739 | \end{vworkexercisestatement} |

2740 | \vworkexercisefooter{} |

2741 | |

2742 | \begin{vworkexercisestatement} |

2743 | \label{exe:crat0:sexe0:06} |

2744 | For Lemma \ref{lem:crat0:sfdv0:sprc0:04}, |

2745 | complete the missing proof: |

2746 | show that if $\gcd(K_2, K_4) = 1$, no algorithm can |

2747 | lead to a tighter bound than (\ref{eq:lem:crat0:sfdv0:sprc0:04:01}). |

2748 | \textbf{Hint:} start with the observation |

2749 | that with |

2750 | $\gcd(K_2, K_4) = 1$, $n K_2 \bmod K_4$ will attain every value in |

2751 | the set $\{ 0, \ldots , K_4-1 \}$. |

2752 | \end{vworkexercisestatement} |

2753 | \vworkexercisefooter{} |

2754 | |

2755 | \begin{vworkexercisestatement} |

2756 | \label{exe:crat0:sexe0:07} |

2757 | For Lemma \ref{lem:crat0:sfdv0:sprc0:03}, |

2758 | show that if $K_1=0$, the number of initial invocations |

2759 | of the base subroutine before ``\texttt{A()}'' is first |

2760 | called is in the set specified in |

2761 | (\ref{eq:lem:crat0:sfdv0:sprc0:03:01}). |

2762 | \end{vworkexercisestatement} |

2763 | \vworkexercisefooter{} |

2764 | |

2765 | \begin{vworkexercisestatement} |

2766 | \label{exe:crat0:sexe0:08} |

2767 | Develop other techniques to correct the state upset vulnerability |

2768 | of Algorithm \ref{alg:crat0:sfdv0:01a} besides |

2769 | the technique illustrated in |

2770 | Figure \ref{fig:crat0:sfdv0:sprc0:01}. |

2771 | \end{vworkexercisestatement} |

2772 | \vworkexercisefooter{} |

2773 | |

2774 | \begin{vworkexercisestatement} |

2775 | \label{exe:crat0:sexe0:09} |

2776 | Show for Example \ref{ex:crat0:sfdv0:sprc1:01} that integers of at least |

2777 | 27 bits are required. |

2778 | \end{vworkexercisestatement} |

2779 | \vworkexercisefooter{} |

2780 | |

2781 | \begin{vworkexercisestatement} |

2782 | \label{exe:crat0:sexe0:10} |

2783 | Prove Lemma \ref{lem:crat0:sfdv0:sprc1:02}. |

2784 | \end{vworkexercisestatement} |

2785 | \vworkexercisefooter{} |

2786 | |

2787 | \begin{vworkexercisestatement} |

2788 | \label{exe:crat0:sexe0:12} |

2789 | Prove Lemma \ref{lem:crat0:sfdv0:sprc1:04}. |

2790 | \end{vworkexercisestatement} |

2791 | \vworkexercisefooter{} |

2792 | |

2793 | \begin{vworkexercisestatement} |

2794 | \label{exe:crat0:sexe0:13} |

2795 | Define the term \emph{steady state} as used in |

2796 | Lemma \ref{lem:crat0:sfdv0:sprc1:04} in terms of |

2797 | set membership of the \texttt{state} variable. |

2798 | \end{vworkexercisestatement} |

2799 | \vworkexercisefooter{} |

2800 | |

2801 | \begin{vworkexercisestatement} |

2802 | \label{exe:crat0:sexe0:14} |

2803 | For Algorithm \ref{alg:crat0:sfdv0:01a}, devise examples of anomalous behavior due to |

2804 | race conditions that may occur if $K_2$ and/or $K_4$ are set in a process |

2805 | which is asynchronous with respect to the process which implements the |

2806 | rational counting algorithm if mutual exclusion protocol is not |

2807 | implemented. |

2808 | \end{vworkexercisestatement} |

2809 | \vworkexercisefooter{} |

2810 | |

2811 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2812 | \vfill |

2813 | \noindent\begin{figure}[!b] |

2814 | \noindent\rule[-0.25in]{\textwidth}{1pt} |

2815 | \begin{tiny} |

2816 | \begin{verbatim} |

2817 | $RCSfile: c_rat0.tex,v $ |

2818 | $Source: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_rat0/c_rat0.tex,v $ |

2819 | $Revision: 1.28 $ |

2820 | $Author: dtashley $ |

2821 | $Date: 2004/02/22 19:27:48 $ |

2822 | \end{verbatim} |

2823 | \end{tiny} |

2824 | \noindent\rule[0.25in]{\textwidth}{1pt} |

2825 | \end{figure} |

2826 | |

2827 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |

2828 | % $Log: c_rat0.tex,v $ |

2829 | % Revision 1.28 2004/02/22 19:27:48 dtashley |

2830 | % Edits. |

2831 | % |

2832 | % Revision 1.27 2004/02/22 15:01:53 dtashley |

2833 | % Edits. |

2834 | % |

2835 | % Revision 1.26 2003/12/06 17:48:49 dtashley |

2836 | % Final edits before move back to SourceForge. |

2837 | % |

2838 | % Revision 1.25 2003/04/08 01:21:16 dtashley |

2839 | % Checkin after major ripup to mechanism for documenting algorithms. |

2840 | % |

2841 | % Revision 1.24 2003/04/07 09:38:23 dtashley |

2842 | % Safety checkin before major tearup with algorithms. |

2843 | % |

2844 | % Revision 1.23 2003/04/04 04:05:40 dtashley |

2845 | % Safety checkin before another major edit. |

2846 | % |

2847 | % Revision 1.22 2003/04/03 19:49:36 dtashley |

2848 | % Global corrections to typeface of "gcd" made as per Jan-Hinnerk Reichert's |

2849 | % recommendation. |

2850 | % |

2851 | % Revision 1.21 2003/04/03 19:33:13 dtashley |

2852 | % Substantial edits. Safety checkin. Preparing to make corrections to |

2853 | % gcd typeface pointed out my Jan-Hinnerk Reichert. |

2854 | % |

2855 | % Revision 1.20 2003/04/02 08:21:16 dtashley |

2856 | % Substantial edits, safety checkin. |

2857 | % |

2858 | % Revision 1.19 2003/03/30 05:37:20 dtashley |

2859 | % Evening safety checkin. Substantial edits. |

2860 | % |

2861 | % Revision 1.18 2003/03/28 07:24:16 dtashley |

2862 | % Safety checkin, substantial edits. |

2863 | % |

2864 | % Revision 1.17 2003/03/25 05:31:40 dtashley |

2865 | % Substantial edits, safety checkin. |

2866 | % |

2867 | % Revision 1.16 2003/03/21 06:34:54 dtashley |

2868 | % Major revisions. Safety checkin. |

2869 | % |

2870 | % Revision 1.15 2003/03/18 06:20:48 dtashley |

2871 | % Substantial edits, safety checkin. |

2872 | % |

2873 | % Revision 1.14 2003/03/13 06:28:36 dtashley |

2874 | % Substantial progress, edits. |

2875 | % |

2876 | % Revision 1.13 2003/03/08 04:11:19 dtashley |

2877 | % Friday evening safety checkin. |

2878 | % |

2879 | % Revision 1.12 2003/03/05 02:37:34 dtashley |

2880 | % Safety checkin before major edits. |

2881 | % |

2882 | % Revision 1.11 2003/03/03 23:50:44 dtashley |

2883 | % Substantial edits. Safety checkin. |

2884 | % |

2885 | % Revision 1.10 2002/04/27 00:21:04 dtashley |

2886 | % Substantial edits--preparing for review. |

2887 | % |

2888 | % Revision 1.9 2002/04/26 03:47:22 dtashley |

2889 | % Substantial edits. |

2890 | % |

2891 | % Revision 1.8 2002/04/23 02:58:53 dtashley |

2892 | % Edits. |

2893 | % |

2894 | % Revision 1.7 2002/04/22 07:27:32 dtashley |

2895 | % Preparing to work on desktop computer again. |

2896 | % |

2897 | % Revision 1.6 2002/04/22 04:47:30 dtashley |

2898 | % Preparing to work on laptop. |

2899 | % |

2900 | % Revision 1.5 2002/04/22 02:11:54 dtashley |

2901 | % Preparing to resume work on desktop. |

2902 | % |

2903 | % Revision 1.4 2002/04/22 00:14:56 dtashley |

2904 | % Edits before resuming work on desktop. |

2905 | % |

2906 | % Revision 1.3 2002/04/21 23:05:09 dtashley |

2907 | % Version control information straightened out. |

2908 | % |

2909 | %End of file C_RAT0.TEX |

dashley@gmail.com | ViewVC Help |

Powered by ViewVC 1.1.25 |