# Diff of /pubs/books/ucbka/trunk/c_cil0/c_cil0.tex

revision 139 by dashley, Thu Oct 6 03:15:02 2016 UTC revision 140 by dashley, Mon Jul 3 01:59:16 2017 UTC
# Line 1  Line 1
1  %$Header: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_cil0/c_cil0.tex,v 1.24 2003/11/03 02:14:24 dtashley Exp$  %$Header$
2
3  \chapter[\ccilzeroshorttitle{}]{\ccilzerolongtitle{}}  \chapter[\ccilzeroshorttitle{}]{\ccilzerolongtitle{}}
4
5  \label{ccil0}  \label{ccil0}
6
7  \beginchapterquote{If our ancestors had invented arithmetic by counting with  \beginchapterquote{If our ancestors had invented arithmetic by counting with
8                       their two fists or their eight fingers, instead of their                       their two fists or their eight fingers, instead of their
9                       ten digits', we would never have to worry about                       ten digits', we would never have to worry about
10                       writing binary-decimal conversion routines.                       writing binary-decimal conversion routines.
11                       (And we would perhaps never have learned as much about                       (And we would perhaps never have learned as much about
12                       number systems.)''}                       number systems.)''}
13                       {Donald E. Knuth, \cite[p. 319]{bibref:b:knuthclassic2ndedvol2}}                       {Donald E. Knuth, \cite[p. 319]{bibref:b:knuthclassic2ndedvol2}}
14
15  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
16  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
17  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
18  \section{Introduction}  \section{Introduction}
19  %Section tag: INT0  %Section tag: INT0
20  \label{ccil0:sint0}  \label{ccil0:sint0}
21
22  Low-cost microcontrollers have no support for floating-point arithmetic,  Low-cost microcontrollers have no support for floating-point arithmetic,
23  and so integer arithmetic and fixed-point arithmetic are used nearly exclusively  and so integer arithmetic and fixed-point arithmetic are used nearly exclusively
24  in embedded systems.  The ability to implement integer arithmetic  in embedded systems.  The ability to implement integer arithmetic
25  economically is a critical skill in the development of embedded  economically is a critical skill in the development of embedded
26  systems.  systems.
27
28  Integer arithmetic algorithms are critically important in embedded  Integer arithmetic algorithms are critically important in embedded
29  systems for the following reasons:  systems for the following reasons:
30
31  \begin{itemize}  \begin{itemize}
32  \item Mistakes in the implementation of arithmetic are frequently  \item Mistakes in the implementation of arithmetic are frequently
33        responsible for product problems.  (Mistakes are not confined        responsible for product problems.  (Mistakes are not confined
34        to obvious errors---errors such as filters which do not converge        to obvious errors---errors such as filters which do not converge
35        on their input are also responsible for product problems.)        on their input are also responsible for product problems.)
36  \item Floating-point arithmetic is not available or ill-advised  \item Floating-point arithmetic is not available or ill-advised
37        for nearly all small embedded systems for the following reasons:        for nearly all small embedded systems for the following reasons:
38        \begin{itemize}        \begin{itemize}
39        \item Low-cost microcontrollers do not possess hardware support for        \item Low-cost microcontrollers do not possess hardware support for
40              floating-point arithmetic.              floating-point arithmetic.
41        \item Implementation of floating-point arithmetic in software is        \item Implementation of floating-point arithmetic in software is
42              computationally expensive.              computationally expensive.
43        \item Implementation of floating-point arithmetic in software may        \item Implementation of floating-point arithmetic in software may
44              require large floating-point libraries, typically consuming              require large floating-point libraries, typically consuming
45              1K-4K of ROM.              1K-4K of ROM.
46        \item Safety-critical software standards typically prohibit the        \item Safety-critical software standards typically prohibit the
47              use of floating-point arithmetic.              use of floating-point arithmetic.
48        \end{itemize}        \end{itemize}
49  \item Integer arithmetic algorithms (other than addition and subtraction)  \item Integer arithmetic algorithms (other than addition and subtraction)
50        are quite tedious and error-prone for a software developer to design, implement, and        are quite tedious and error-prone for a software developer to design, implement, and
51        unit test.  The implementation of such algorithms represents        unit test.  The implementation of such algorithms represents
52        cost and risk.  Cost and risk benefits are achieved if the algorithms in detail are        cost and risk.  Cost and risk benefits are achieved if the algorithms in detail are
53        available in advance (thus precluding design activities), or        available in advance (thus precluding design activities), or
54        better yet if ready-to-use integer algorithm libraries are available.        better yet if ready-to-use integer algorithm libraries are available.
55  \end{itemize}  \end{itemize}
56
57  This chapter describes the more fundamental principles and algorithms  This chapter describes the more fundamental principles and algorithms
58  (representation, fixed-point arithmetic, treatment of overflow, comparison,  (representation, fixed-point arithmetic, treatment of overflow, comparison,
59  addition, subtraction, multiplication, and division).  A section  addition, subtraction, multiplication, and division).  A section
60  (Section \ref{ccil0:smim0}) is also included  (Section \ref{ccil0:smim0}) is also included
61  on miscellaneous mappings involving integers which  on miscellaneous mappings involving integers which
62  are not numerical in intent.  are not numerical in intent.
63  Chapter %\cdtazeroxrefhyphen\cdtazerovolarabic{}  Chapter %\cdtazeroxrefhyphen\cdtazerovolarabic{}
64  TBD  TBD
65  describes more complicated  describes more complicated
66  integer algorithms and techniques (discrete-time operations  integer algorithms and techniques (discrete-time operations
67  such as filtering, integration, and differentiation as well as more  such as filtering, integration, and differentiation as well as more
68  complex functions such as square root).  The split between these two chapters  complex functions such as square root).  The split between these two chapters
69  is arbitrary; and in fact the material could have been divided differently  is arbitrary; and in fact the material could have been divided differently
70  or combined.  or combined.
71
72  Treatment of the topics in this chapter is largely in accordance with  Treatment of the topics in this chapter is largely in accordance with
73  Knuth \cite{bibref:b:knuthclassic2ndedvol2}.  The principal issues in  Knuth \cite{bibref:b:knuthclassic2ndedvol2}.  The principal issues in
74  the implementation of integer algorithms are:  the implementation of integer algorithms are:
75
76  \begin{itemize}  \begin{itemize}
77  \item \textbf{How to use the arithmetic [or other] instructions provided by the machine to  \item \textbf{How to use the arithmetic [or other] instructions provided by the machine to
78        operate on larger operands.}  Microcontrollers typically provide arithmetic        operate on larger operands.}  Microcontrollers typically provide arithmetic
79        instructions (comparison, shifting, addition, subtraction, and often but not        instructions (comparison, shifting, addition, subtraction, and often but not
80        always multiplication and/or division) that operate on 8-bit or 16-bit integers.          always multiplication and/or division) that operate on 8-bit or 16-bit integers.
81        A key question        A key question
82        is how small-operand instructions scale up''---that is, if and how they can        is how small-operand instructions scale up''---that is, if and how they can
83        be used to assist in the implementation of integer arithmetic for much larger        be used to assist in the implementation of integer arithmetic for much larger
84        operands.        operands.
85  \item \textbf{The order of the algorithm involved.}  The order of algorithms  \item \textbf{The order of the algorithm involved.}  The order of algorithms
86        is a complicated issue when applied to microcontroller work.  Many sophisticated        is a complicated issue when applied to microcontroller work.  Many sophisticated
87        algorithms have a breakpoint below which they are less economical than        algorithms have a breakpoint below which they are less economical than
88        an inferior algorithm.  Some applications (such as generating cryptographic keys        an inferior algorithm.  Some applications (such as generating cryptographic keys
89        when integers thousands of bits long must be tested for primality) will        when integers thousands of bits long must be tested for primality) will
90        benefit from sophisticated algorithms becuase the operand sizes are large enough        benefit from sophisticated algorithms becuase the operand sizes are large enough
91        to pass any such breakpoints.  However, in microcontroller work, the need to manipulate        to pass any such breakpoints.  However, in microcontroller work, the need to manipulate
92        integers longer than 64 bits is very rare; thus, the breakpoints that indicate the        integers longer than 64 bits is very rare; thus, the breakpoints that indicate the
93        use of more sophisticated algorithms may not be reached.  In microcontroller work,        use of more sophisticated algorithms may not be reached.  In microcontroller work,
94        depending on the operand sizes, there are circumstances in which an        depending on the operand sizes, there are circumstances in which an
95        $O(n^2)$ algorithm may be preferable to an $O(\log n)$ algorithm.  Generally,        $O(n^2)$ algorithm may be preferable to an $O(\log n)$ algorithm.  Generally,
96        the order of algorithms must be balanced against operand sizes.        the order of algorithms must be balanced against operand sizes.
97  \end{itemize}  \end{itemize}
98
99  We do diverge from Knuth in some areas.  The most prominent divergence is  We do diverge from Knuth in some areas.  The most prominent divergence is
100  in the proofs offered for some important theorems and lemmas.  Knuth  in the proofs offered for some important theorems and lemmas.  Knuth
101  employs contrapositive proof formats in many circumstances, whereas we prefer  employs contrapositive proof formats in many circumstances, whereas we prefer
102  to use linear proofs that are more understandable to engineers and microcontroller  to use linear proofs that are more understandable to engineers and microcontroller
103  software developers.  software developers.
104
105  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
106  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
107  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
109          {Paradigms And Principles Of Microcontroller Arithmetic}          {Paradigms And Principles Of Microcontroller Arithmetic}
110  %Section tag: PPM0  %Section tag: PPM0
111  \label{ccil0:sppm0}  \label{ccil0:sppm0}
112
113  How should one think about microcontroller arithmetic?  What principles  How should one think about microcontroller arithmetic?  What principles
114  guide us in its design and implementation?  In this section,  guide us in its design and implementation?  In this section,
115  we provide some general principles and paradigms of thought.  we provide some general principles and paradigms of thought.
116
117
118  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
119  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
120  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
121  \subsection{Microcontroller Arithmetic As An Accident Of Silicon Design}  \subsection{Microcontroller Arithmetic As An Accident Of Silicon Design}
122  %Subsection tag: MAS0  %Subsection tag: MAS0
123  \label{ccil0:sppm0:smas0}  \label{ccil0:sppm0:smas0}
124
125  In chapters  In chapters
126  \cfryzeroxrefhyphen{}\ref{cfry0},  \cfryzeroxrefhyphen{}\ref{cfry0},
127  \ccfrzeroxrefhyphen{}\ref{ccfr0},  \ccfrzeroxrefhyphen{}\ref{ccfr0},
128  and  and
129  \cratzeroxrefhyphen{}\ref{crat0}  \cratzeroxrefhyphen{}\ref{crat0}
130  we consider rational approximation,  we consider rational approximation,
131  both in the form $h/k$ and $h/2^q$.  Both forms of rational approximation  both in the form $h/k$ and $h/2^q$.  Both forms of rational approximation
132  tend to be effective because we know that all modern processors possess  tend to be effective because we know that all modern processors possess
133  shift instructions, most possess integer multiply instructions, and many  shift instructions, most possess integer multiply instructions, and many
134  possess integer divide instructions.  In other words, the design  possess integer divide instructions.  In other words, the design
135  of the machine instruction set drives the strategies for implementation  of the machine instruction set drives the strategies for implementation
136  of arithmetic, and makes some strategies attractive.  of arithmetic, and makes some strategies attractive.
137
138  Similarly, the observation that all microcontrollers provide instructions  Similarly, the observation that all microcontrollers provide instructions
139  for integer arithmetic creates the attractiveness of fixed-point arithmetic.  for integer arithmetic creates the attractiveness of fixed-point arithmetic.
140
141  Thus, we might view our approaches to microcontroller arithmetic as  Thus, we might view our approaches to microcontroller arithmetic as
142  an accident'' of silicon design, or as being driven by silicon  an accident'' of silicon design, or as being driven by silicon
143  design.  design.
144
145  Generally, we seek to determine the best way to use the primitive  Generally, we seek to determine the best way to use the primitive
146  operations provided by the machine (the instruction set) to  operations provided by the machine (the instruction set) to
147  accomplish the mappings of interest.  accomplish the mappings of interest.
148
149  The classic'' algorithms  The classic'' algorithms
150  presented by Knuth  presented by Knuth
151  Knuth (\cite[pp. 265-284]{bibref:b:knuthclassic2ndedvol2}) are especially  Knuth (\cite[pp. 265-284]{bibref:b:knuthclassic2ndedvol2}) are especially
152  designed to use the small'' addition, subtraction, multiplication, and  designed to use the small'' addition, subtraction, multiplication, and
153  division provided by the machine to add, subtract, multiply, and divide arbitrarily  division provided by the machine to add, subtract, multiply, and divide arbitrarily
154  large integers.  In  large integers.  In
155  \cite[pp. 265-266]{bibref:b:knuthclassic2ndedvol2}) Knuth writes:  \cite[pp. 265-266]{bibref:b:knuthclassic2ndedvol2}) Knuth writes:
156
157  \begin{quote}  \begin{quote}
158  \emph{The most important fact to understand about extended-precision numbers  \emph{The most important fact to understand about extended-precision numbers
159  is that they may be regarded as numbers written in radix-$w$ notation,  is that they may be regarded as numbers written in radix-$w$ notation,
160  where $w$ is the computer's word size.  For example, an integer that  where $w$ is the computer's word size.  For example, an integer that
161  fills 10 words on a computer whose word size is $w=10^{10}$ has 100  fills 10 words on a computer whose word size is $w=10^{10}$ has 100
162  decimal digits; but we will consider it to be a 10-place number to  decimal digits; but we will consider it to be a 10-place number to
163  the base $10^{10}$.  This viewpoint is justified for the same reason  the base $10^{10}$.  This viewpoint is justified for the same reason
164  that we may convert, say, from binary to hexadecimal notation,  that we may convert, say, from binary to hexadecimal notation,
165  simply by grouping the bits together.}  simply by grouping the bits together.}
166
167  \emph{In these terms, we are given the following primitive operations to work with:}  \emph{In these terms, we are given the following primitive operations to work with:}
168
169  \begin{itemize}  \begin{itemize}
170  \item \emph{a$_0$) addition or subtraction of one-place integers, giving a one-place  \item \emph{a$_0$) addition or subtraction of one-place integers, giving a one-place
172  \item \emph{b$_0$) multiplication of a one-place integer by another one-place integer,  \item \emph{b$_0$) multiplication of a one-place integer by another one-place integer,
174  \item \emph{c$_0$) division of a two-place integer by a one-place integer,  \item \emph{c$_0$) division of a two-place integer by a one-place integer,
175               provided that the quotient is a one-place integer, and yielding               provided that the quotient is a one-place integer, and yielding
176                              also a one-place remainder.}                              also a one-place remainder.}
177  \end{itemize}  \end{itemize}
178
179  \noindent{}\emph{By adjusting the word size, if necessary, nearly all computers  \noindent{}\emph{By adjusting the word size, if necessary, nearly all computers
180  will have these three operations available; so we will construct algorithms  will have these three operations available; so we will construct algorithms
181  (a), (b), and (c) mentioned above in terms of the primitive operations  (a), (b), and (c) mentioned above in terms of the primitive operations
182  (a$_0$), (b$_0$), and (c$_0$).}  (a$_0$), (b$_0$), and (c$_0$).}
183
184  \emph{Since we are visualizing extended-precision integers as base $b$ numbers, it is  \emph{Since we are visualizing extended-precision integers as base $b$ numbers, it is
185  sometimes helpful to think of the situation when $b = 10$, and to imagine  sometimes helpful to think of the situation when $b = 10$, and to imagine
186  that we are doing the arithmetic by hand.  Then operation (a$_0$) is analogous  that we are doing the arithmetic by hand.  Then operation (a$_0$) is analogous
187  to memorizing the addition table; (b$_0$) is analogous to memorizing the  to memorizing the addition table; (b$_0$) is analogous to memorizing the
188  multiplication table, and (c$_0$) is essentially memorizing the multiplication  multiplication table, and (c$_0$) is essentially memorizing the multiplication
189  table in reverse.  The more complicated operations (a), (b), (c) on  table in reverse.  The more complicated operations (a), (b), (c) on
190  high-precision numbers can now be done using simple addition, subraction,  high-precision numbers can now be done using simple addition, subraction,
191  multiplication, and long-division procedures that children are taught  multiplication, and long-division procedures that children are taught
192  in elementary school.}  in elementary school.}
193  \end{quote}  \end{quote}
194
195  The critical issue for implementation of integer arithmetic with large operands  The critical issue for implementation of integer arithmetic with large operands
196  is how to use small-operand instructions to operate on larger operands---in other words,  is how to use small-operand instructions to operate on larger operands---in other words,
197  how to scale up'' the capability provided by the instruction set.  how to scale up'' the capability provided by the instruction set.
198
199
200  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
201  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
202  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
203  \subsection{Microcontroller Arithmetic As A Mapping From Quantized Domain To  \subsection{Microcontroller Arithmetic As A Mapping From Quantized Domain To
204              Quantized Range}              Quantized Range}
205  %Subsection tag: MAM0  %Subsection tag: MAM0
206  \label{ccil0:sppm0:smam0}  \label{ccil0:sppm0:smam0}
207
208  Microcontroller software accepts inputs which are quantized.  In nearly all cases,  Microcontroller software accepts inputs which are quantized.  In nearly all cases,
209  this involves a mapping from $\vworkrealset$ to $\vworkintset$.  Often, because  this involves a mapping from $\vworkrealset$ to $\vworkintset$.  Often, because
210  microcontroller products are optimized for cost, the quantization hardware  microcontroller products are optimized for cost, the quantization hardware
211  delivers quite poor precision, frequently less than 8 bits.    delivers quite poor precision, frequently less than 8 bits.
212
213  When a quantized input is accepted, it defines an inquality.  Knowledge of  When a quantized input is accepted, it defines an inquality.  Knowledge of
214  the quantized input (an integer) confines the actual input (a real  the quantized input (an integer) confines the actual input (a real
215  number, before  number, before
216  quantization) to an interval.  With a low-cost hardware design, the  quantization) to an interval.  With a low-cost hardware design, the
217  interval can be fairly large.  Usually, by adding cost, the  interval can be fairly large.  Usually, by adding cost, the
219
220  Microcontroller outputs tend to be quantized as well, so it is  Microcontroller outputs tend to be quantized as well, so it is
221  accurate to also characterize outputs as integers.  For example, a PWM signal  accurate to also characterize outputs as integers.  For example, a PWM signal
222  generated by a microcontroller or the output of a D/A converter is  generated by a microcontroller or the output of a D/A converter is
223  controlled by data that is an integer.  Like inputs, often the granularity''  controlled by data that is an integer.  Like inputs, often the granularity''
224  with which outputs can be controlled is quite coarse---again, 8 bits or  with which outputs can be controlled is quite coarse---again, 8 bits or
225  less is not uncommon.  less is not uncommon.
226
227  Thus, we may view microcontroller software as a mapping from poor-quality  Thus, we may view microcontroller software as a mapping from poor-quality
228  inputs to poor-quality outputs.  inputs to poor-quality outputs.
229
230  In such a framework, where the nature of inputs and outputs introduces  In such a framework, where the nature of inputs and outputs introduces
231  substantial error, it is imperative not to introduce additional error  substantial error, it is imperative not to introduce additional error
232  in computer arithmetic.  In other words, given inputs which are  in computer arithmetic.  In other words, given inputs which are
233  integers, the responsibility of the software is to choose the best  integers, the responsibility of the software is to choose the best
234  integers as outputs.  Usually this means that calculations should be  integers as outputs.  Usually this means that calculations should be
235  devised so as not to lose any information (i.e.  devised so as not to lose any information (i.e.
236  not to lose remainders, for example).  Losing information is usually  not to lose remainders, for example).  Losing information is usually
237  equivalent to not being able to make the most correct mapping from input  equivalent to not being able to make the most correct mapping from input
238  to output.  Lossy'' arithmetic can degrade the performance of a system,  to output.  Lossy'' arithmetic can degrade the performance of a system,
239  since poor arithmetic may compound an inexpensive hardware design.  since poor arithmetic may compound an inexpensive hardware design.
240
241
242  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
243  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
244  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
245  \subsection{Microcontroller Arithmetic As A Simulation Of Continuous Controllers}  \subsection{Microcontroller Arithmetic As A Simulation Of Continuous Controllers}
246  %Subsection tag: MAE0  %Subsection tag: MAE0
247  \label{ccil0:sppm0:smae0}  \label{ccil0:sppm0:smae0}
248
249  Control systems have not always employed digital controllers.  Control systems have not always employed digital controllers.
250  Many books and web sites (see \cite{bibref:w:historycontrol01}, for example)  Many books and web sites (see \cite{bibref:w:historycontrol01}, for example)
251  discuss the historical development of feedback control.  Controllers  discuss the historical development of feedback control.  Controllers
252  have not historically been digital, or even electronic.  have not historically been digital, or even electronic.
253  Early controllers for governing steam devices or windmills were  Early controllers for governing steam devices or windmills were
254  ultimately mechanical, and relied upon inertia or other physical  ultimately mechanical, and relied upon inertia or other physical
255  properties.  It is possible to realize abstract notions  properties.  It is possible to realize abstract notions
256  (integrators, differentiators, gains) using hydraulic systems or other mechanical systems;  (integrators, differentiators, gains) using hydraulic systems or other mechanical systems;
257  and in fact hydraulic feedback controllers were used in early rockets  and in fact hydraulic feedback controllers were used in early rockets
258  and aircraft.  Very naturally, abstract notions (integrators,  and aircraft.  Very naturally, abstract notions (integrators,
259  differentiators, gains) can be implemented using analog  differentiators, gains) can be implemented using analog
260  electronic components.  The most common implementations involve  electronic components.  The most common implementations involve
261  operational amplifiers, and the behavior of such implementations comes  operational amplifiers, and the behavior of such implementations comes
262  very close to the ideal mathematical models.  very close to the ideal mathematical models.
263
264  Mechanical, hydraulic, and non-digital electronic controllers have  Mechanical, hydraulic, and non-digital electronic controllers have
265  one very desirable characteristic---\emph{clipping}.  If, for example,  one very desirable characteristic---\emph{clipping}.  If, for example,
266  one provides an analog differentiator with a $dV/dt$ which  one provides an analog differentiator with a $dV/dt$ which
267  is too large, the output that the differentiator can  is too large, the output that the differentiator can
268  provide is limited, usually by the supply voltage available to an operational  provide is limited, usually by the supply voltage available to an operational
269  amplifier.  amplifier.
270  The differentiator \emph{must} clip.  The differentiator \emph{must} clip.
271
272  Clipping often leads to behavior which is close to what  Clipping often leads to behavior which is close to what
273  intuition would expect (i.e. we would present  intuition would expect (i.e. we would present
274  clipping as an occasional advantage).  For example, if an input to  clipping as an occasional advantage).  For example, if an input to
275  an analog control system suffers a failure, the behavior  an analog control system suffers a failure, the behavior
276  the of the controller is limited, as is its internal  the of the controller is limited, as is its internal
277  state.  Similarly, when the  state.  Similarly, when the
278  input is restored, the controller will usually recover  input is restored, the controller will usually recover
279  in a reasonable time because the  in a reasonable time because the
280  state of the controller (typically maintained in capacitors) is limited  state of the controller (typically maintained in capacitors) is limited
281  in the magnitude it can attain.  in the magnitude it can attain.
282
283  We might view a digital controller as an emulation of  We might view a digital controller as an emulation of
284  an analog controller.  We may want to cause the  an analog controller.  We may want to cause the
285  controller to have limits (i.e. rails) internally, for  controller to have limits (i.e. rails) internally, for
286  example to prevent excessive integrator windup''.  We discuss  example to prevent excessive integrator windup''.  We discuss
287  this further in Section \ref{ccil0:sode0}.  this further in Section \ref{ccil0:sode0}.
288
289
290  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
291  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
292  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
293  \section{Practical Design Issues}  \section{Practical Design Issues}
294  %Section tag: PDI0  %Section tag: PDI0
295  \label{ccil0:spdi0}  \label{ccil0:spdi0}
296
297  In this section, we consider practical issues surrounding the design  In this section, we consider practical issues surrounding the design
298  and construction of a set of integer arithmetic subroutines.  and construction of a set of integer arithmetic subroutines.
299  In practice, such a collection of subroutines is likely to be  In practice, such a collection of subroutines is likely to be
300  arranged into a library.  The purpose of the library would be to  arranged into a library.  The purpose of the library would be to
301  free the clients (or callers) of the library from the complexity of  free the clients (or callers) of the library from the complexity of
302  large integer calculations.  large integer calculations.
303
304  The design decisions surrounding the construction of a library vary in  The design decisions surrounding the construction of a library vary in
305  the objectivity with which they can be approached.  Some design decisions  the objectivity with which they can be approached.  Some design decisions
306  (such as the best mechanism for passing parameters) can be approached  (such as the best mechanism for passing parameters) can be approached
307  rigorously because the measures of goodness are unequivocal (minimal ROM consumption  rigorously because the measures of goodness are unequivocal (minimal ROM consumption
308  or execution time, for example).  However, other design decisions, particulary the  or execution time, for example).  However, other design decisions, particulary the
309  decision of the exact nature of the interface between an arithmetic library and  decision of the exact nature of the interface between an arithmetic library and
310  its clients, are more subjective.  One size does \emph{not} fit all.  its clients, are more subjective.  One size does \emph{not} fit all.
311
312
313  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
314  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
315  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
316  \subsection{Parameter Passing And Temporary Storage Mechanisms}  \subsection{Parameter Passing And Temporary Storage Mechanisms}
317  %Subsection tag: PPM0  %Subsection tag: PPM0
318  \label{ccil0:spdi0:sppm0}  \label{ccil0:spdi0:sppm0}
319
320  In small microcontroller work, the desire to save ROM and execution time  In small microcontroller work, the desire to save ROM and execution time
321  may lead to inelegant software construction.  Because an arithmetic library  may lead to inelegant software construction.  Because an arithmetic library
322  used in microcontroller work may be called from many different places  used in microcontroller work may be called from many different places
323  throughout ROM, serious thought should be given to optimizing the  throughout ROM, serious thought should be given to optimizing the
324  parameter passing mechanisms, even perhaps at the expense of elegance.  parameter passing mechanisms, even perhaps at the expense of elegance.
325  The way in which the arithmetic library allocates temporary storage is  The way in which the arithmetic library allocates temporary storage is
326  also a point of concern, because the most elegant way of allocating temporary  also a point of concern, because the most elegant way of allocating temporary
327  storage (on the stack) may either not be feasible (because of the possibility  storage (on the stack) may either not be feasible (because of the possibility
328  of stack overflow) or may not be efficient (because the addressing modes of  of stack overflow) or may not be efficient (because the addressing modes of
329  the machine make data on the stack inefficient to address).  In this section  the machine make data on the stack inefficient to address).  In this section
330  we discuss both parameter passing and temporary storage mechanisms.  we discuss both parameter passing and temporary storage mechanisms.
331
332  In the remainder of the discussion, we make the following assumptions  In the remainder of the discussion, we make the following assumptions
334
335  \begin{enumerate}  \begin{enumerate}
336  \item \textbf{The arithmetic library need not be re-entrant.}  \item \textbf{The arithmetic library need not be re-entrant.}
337        Most small'' microcontroller software loads use a non-preemptive        Most small'' microcontroller software loads use a non-preemptive
338        scheduling paradigm, so this is a reasonable assumption.  We also        scheduling paradigm, so this is a reasonable assumption.  We also
339        make the reasonable assumption that ISR's may not make calls into        make the reasonable assumption that ISR's may not make calls into
340        the arithmetic library.        the arithmetic library.
341  \item \textbf{Dynamic memory allocation, other than on the stack,  \item \textbf{Dynamic memory allocation, other than on the stack,
342        is not allowed by the software architecture.}  This is also        is not allowed by the software architecture.}  This is also
343        a reasonable assumption in small'' microcontroller software.        a reasonable assumption in small'' microcontroller software.
344  \end{enumerate}  \end{enumerate}
345
346  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
347  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
348  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
349  \subsubsection{Parameter Passing Mechanisms}  \subsubsection{Parameter Passing Mechanisms}
350  %Subsubsection tag: PPM0  %Subsubsection tag: PPM0
351  \label{ccil0:spdi0:sppm0:sppm0}  \label{ccil0:spdi0:sppm0:sppm0}
352
353  If an arithmetic library exists in a microcontroller software load,  If an arithmetic library exists in a microcontroller software load,
354  it may be called many times throughout ROM.  Thus, the parameter  it may be called many times throughout ROM.  Thus, the parameter
355  passing mechanisms chosen may have a large effect on ROM consumption  passing mechanisms chosen may have a large effect on ROM consumption
356  (due to the setup required for each subroutine call multiplied by  (due to the setup required for each subroutine call multiplied by
357  many instances throughout ROM) and execution time  many instances throughout ROM) and execution time
358  (because in microcontroller software longer instructions nearly always  (because in microcontroller software longer instructions nearly always
359  require more time).  Because of the criticality of ROM consumption,  require more time).  Because of the criticality of ROM consumption,
360  parameter passing mechanisms that lack elegance may be attractive.  parameter passing mechanisms that lack elegance may be attractive.
361
362  In the category of parameter passing, we also include the way in which  In the category of parameter passing, we also include the way in which
363  return value(s) are passed back to the caller.  return value(s) are passed back to the caller.
364
365  The following parameter-passing mechanisms may be employed:  The following parameter-passing mechanisms may be employed:
366
367  \begin{enumerate}  \begin{enumerate}
368  \item \textbf{Pass by value as storage class \emph{automatic}.}  \item \textbf{Pass by value as storage class \emph{automatic}.}
369        The most common scenario is that the arithmetic        The most common scenario is that the arithmetic
370        library is written in assembly-language to be called from        library is written in assembly-language to be called from
371        C', and so the assembly-language subroutines must adhere to the        C', and so the assembly-language subroutines must adhere to the
372        parameter-passing conventions used by the compiler.        parameter-passing conventions used by the compiler.
373        This usually means that the entire input or output value        This usually means that the entire input or output value
374        will be passed in CPU registers or on the stack.  Somewhat rarely,        will be passed in CPU registers or on the stack.  Somewhat rarely,
375        a compiler will pass parameters in static locations.\footnote{The        a compiler will pass parameters in static locations.\footnote{The
376        usual reason for a C' compiler to pass parameters in static locations        usual reason for a C' compiler to pass parameters in static locations
377        is because the instruction set of the machine was not designed for        is because the instruction set of the machine was not designed for
378        higher-level languages, and references to [usually \emph{near}] memory        higher-level languages, and references to [usually \emph{near}] memory
379        are cheaper than stack references.  Such compilers also typically        are cheaper than stack references.  Such compilers also typically
380        analyze the calling tree of the program where possible and use this        analyze the calling tree of the program where possible and use this
381        information to overlay the parameter-passing memory areas of        information to overlay the parameter-passing memory areas of
382        subroutines that cannot be active simultaneously.  Without the ability        subroutines that cannot be active simultaneously.  Without the ability
383        to analyze the calling tree and make overlay decisions based on it,        to analyze the calling tree and make overlay decisions based on it,
384        memory would be exhausted, because each subroutine would need to have its        memory would be exhausted, because each subroutine would need to have its
385        own static storage for parameters and local variables.}        own static storage for parameters and local variables.}
386  \item \textbf{Pass by reference.}  \item \textbf{Pass by reference.}
387        Typically, it is convenient to pass pointer(s) to area(s) of memory        Typically, it is convenient to pass pointer(s) to area(s) of memory
388        containing input operands, and also a pointer to an area of memory        containing input operands, and also a pointer to an area of memory
389        owned by the caller which is written with the result by the arithmetic subroutine.        owned by the caller which is written with the result by the arithmetic subroutine.
390        The efficiency of this approach depends on the compiler and the instruction        The efficiency of this approach depends on the compiler and the instruction
391        set of the machine.  If the instruction set of the machine cannot        set of the machine.  If the instruction set of the machine cannot
392        make effective use of pointers or stack frames, an arithmetic subroutine        make effective use of pointers or stack frames, an arithmetic subroutine
393        might be constructed so that it first copies the operands to a static area of memory        might be constructed so that it first copies the operands to a static area of memory
394        reserved for the arithmetic library, then performs the necessary arithmetic        reserved for the arithmetic library, then performs the necessary arithmetic
395        operations on the operands in the static area,        operations on the operands in the static area,
396        then copies the result(s) back to the area owned by the caller.        then copies the result(s) back to the area owned by the caller.
397  \item \textbf{Pass by common data block.}  \item \textbf{Pass by common data block.}
398        In some cases, it may be preferable to reserve a block of memory in which to        In some cases, it may be preferable to reserve a block of memory in which to
399        pass parameters to arithmetic library functions, and from which to retrieve        pass parameters to arithmetic library functions, and from which to retrieve
400        results after an arithmetic library function returns.  The allocation of such        results after an arithmetic library function returns.  The allocation of such
401        a static memory block may be done manually\footnote{Note to self:  need to        a static memory block may be done manually\footnote{Note to self:  need to
402        include gentleman's agreements on memory usage in my note on software architecture.}        include gentleman's agreements on memory usage in my note on software architecture.}
403        or automatically (by development tools which can analyze the function calling tree and        or automatically (by development tools which can analyze the function calling tree and
404        manage the overlaying).        manage the overlaying).
405  \end{enumerate}  \end{enumerate}
406
407
408
409  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
410  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
411  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
412  \subsubsection{Temporary Storage Mechanisms}  \subsubsection{Temporary Storage Mechanisms}
413  %Subsubsection tag: TSM0  %Subsubsection tag: TSM0
414  \label{ccil0:spdi0:sppm0:stsm0}  \label{ccil0:spdi0:sppm0:stsm0}
415
416  Need to indicate clearly on section on software architectures the  Need to indicate clearly on section on software architectures the
417  primary temporary storage mechanisms:  primary temporary storage mechanisms:
418
419  \begin{itemize}  \begin{itemize}
420  \item Stack.  \item Stack.
421  \item Memory block with overlay functionality.  \item Memory block with overlay functionality.
422  \end{itemize}  \end{itemize}
423
424  Need to expand software architecture section to cover this, so don't  Need to expand software architecture section to cover this, so don't
425  discuss here.  discuss here.
426
427
428  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
429  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
430  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
431  \subsection{Reporting Of Overflow, Underflow, And Domain Errors}  \subsection{Reporting Of Overflow, Underflow, And Domain Errors}
432  %Subsection tag: OUD0  %Subsection tag: OUD0
433  \label{ccil0:spdi0:soud0}  \label{ccil0:spdi0:soud0}
434
435  Long integer data types used in microcontroller work are typically  Long integer data types used in microcontroller work are typically
436  of a static size (they cannot grow in size in as operations are  of a static size (they cannot grow in size in as operations are
437  performed on them).  The reason for the typical static sizes is that  performed on them).  The reason for the typical static sizes is that
438  dynamic allocation (except for allocation and  dynamic allocation (except for allocation and
439  deallocation on the stack as subroutines  deallocation on the stack as subroutines
440  are called and return) is rarely used in small microcontroller work.  are called and return) is rarely used in small microcontroller work.
441  It will come about in the normal usage of an integer arithmetic  It will come about in the normal usage of an integer arithmetic
442  library that an attempt will be made to operate on integers in  library that an attempt will be made to operate on integers in
443  a way which generates an overflow, generates an  a way which generates an overflow, generates an
444  underflow, or  underflow, or
445  represents a domain error (division by zero or  represents a domain error (division by zero or
446  square root of a negative integer, for example).  square root of a negative integer, for example).
447  An important design decision is how such normal exceptions should be  An important design decision is how such normal exceptions should be
448  handled.  handled.
449
450  Possible design decisions in this area are:  Possible design decisions in this area are:
451
452  \begin{enumerate}  \begin{enumerate}
453  \item \label{enum:ccil0:spdi0:soud0:01:01}  \item \label{enum:ccil0:spdi0:soud0:01:01}
454        \textbf{To design arithmetic subroutines so that exceptions are not        \textbf{To design arithmetic subroutines so that exceptions are not
455        possible.}          possible.}
456        For example, multiplying an $m$-word integer by an $n$-word integer        For example, multiplying an $m$-word integer by an $n$-word integer
457        will always generate an integer that will fit within $m+n$ words.        will always generate an integer that will fit within $m+n$ words.
458        If a multiplication subroutine is designed so that the caller must        If a multiplication subroutine is designed so that the caller must
459        provide an $m$-word operand and an $n$-word operand and a pointer to        provide an $m$-word operand and an $n$-word operand and a pointer to
460        an $(m+n)$-word area of memory for the result, an overflow cannot occur.        an $(m+n)$-word area of memory for the result, an overflow cannot occur.
461        Such a design decision essentially pushes overflow detection back up to        Such a design decision essentially pushes overflow detection back up to
462        the callers of arithmetic subroutines.        the callers of arithmetic subroutines.
463  \item \label{enum:ccil0:spdi0:soud0:01:02}  \item \label{enum:ccil0:spdi0:soud0:01:02}
464        \textbf{To design arithmetic subroutines so that exceptions are possible,        \textbf{To design arithmetic subroutines so that exceptions are possible,
465        but not to detect the exceptions, thus providing an implementation that        but not to detect the exceptions, thus providing an implementation that
466        will produce incorrect results with some operand data values.}        will produce incorrect results with some operand data values.}
467        For example, if an arithmetic subroutine is designed to add an $m$-word        For example, if an arithmetic subroutine is designed to add an $m$-word
468        operand to another $m$-word operand to produce an $m$-word result, overflow        operand to another $m$-word operand to produce an $m$-word result, overflow
469        is possible.  A design decision to fail to detect such exceptions pushes        is possible.  A design decision to fail to detect such exceptions pushes
470        the responsibility up to the callers of the arithmetic subroutines.        the responsibility up to the callers of the arithmetic subroutines.
471        Callers must devise a method for not calling arithmetic subroutines        Callers must devise a method for not calling arithmetic subroutines
472        with data values that will cause an exception, or else to detect an exception        with data values that will cause an exception, or else to detect an exception
473        when it has occurred.        when it has occurred.
474  \item \label{enum:ccil0:spdi0:soud0:01:03}  \item \label{enum:ccil0:spdi0:soud0:01:03}
475        \textbf{To rail'' the result in response to an exception.}          \textbf{To rail'' the result in response to an exception.}
476        It was stated earlier that analog control system functional blocks        It was stated earlier that analog control system functional blocks
477        built with operational        built with operational
478        amplifiers typically have an output which cannot go beyond the        amplifiers typically have an output which cannot go beyond the
479        supply rails.  One may implement similar behavior in arithmetic subroutines.        supply rails.  One may implement similar behavior in arithmetic subroutines.
480        In an addition subroutine which adds two $m$-word operands to produce an        In an addition subroutine which adds two $m$-word operands to produce an
481        $m$-word result (with each word having $w$ bits), it would be natural to        $m$-word result (with each word having $w$ bits), it would be natural to
482        return $2^{mw}-1$ in the event of an overflow in a positive direction and        return $2^{mw}-1$ in the event of an overflow in a positive direction and
483        $-2^{mw}$ in the event of an overlfow in a negative direction.  Note that        $-2^{mw}$ in the event of an overlfow in a negative direction.  Note that
484        the caller will not be able to distinguish a rail'' value which represents        the caller will not be able to distinguish a rail'' value which represents
485        a valid result from a rail'' value substituted to indicate an exception.        a valid result from a rail'' value substituted to indicate an exception.
486  \item \label{enum:ccil0:spdi0:soud0:01:04}  \item \label{enum:ccil0:spdi0:soud0:01:04}
487        \textbf{To reserve special result data values to indicate exceptions.}        \textbf{To reserve special result data values to indicate exceptions.}
488        Depending on the arithmetic subroutine being implemented, it may be possible        Depending on the arithmetic subroutine being implemented, it may be possible
489        to reserve certain result data values to indicate exceptions.  This approach        to reserve certain result data values to indicate exceptions.  This approach
490        is often awkward, as most mathematical subroutines are naturally defined so that        is often awkward, as most mathematical subroutines are naturally defined so that
491        all bit patterns in the memory reserved for the result are valid numbers.        all bit patterns in the memory reserved for the result are valid numbers.
492        Additionally, with long result data values, it may not be economical to        Additionally, with long result data values, it may not be economical to
493        compare the result against the reserved exception values.  Thus, this is seldom        compare the result against the reserved exception values.  Thus, this is seldom
494        an optimal way to deal with exceptions.        an optimal way to deal with exceptions.
495
496        Additionally, if this approach is employed, the semantics of how exception        Additionally, if this approach is employed, the semantics of how exception
497        values combine with other exception values and data values must be decided.        values combine with other exception values and data values must be decided.
498  \item \label{enum:ccil0:spdi0:soud0:01:05}  \item \label{enum:ccil0:spdi0:soud0:01:05}
499        \textbf{To return exception codes to the caller separate from the result data.}        \textbf{To return exception codes to the caller separate from the result data.}
500        In the C' language, pointers are often used to supply an arithmetic subroutine        In the C' language, pointers are often used to supply an arithmetic subroutine
501        with the input operands and to provide the arithmetic subroutine with a location        with the input operands and to provide the arithmetic subroutine with a location
502        (which belongs to the caller) in which to store the result data.  Thus, the return        (which belongs to the caller) in which to store the result data.  Thus, the return
503        value of the arithmetic subroutine (normally assigned through the subroutine name)        value of the arithmetic subroutine (normally assigned through the subroutine name)
504        is often available to return exception codes.  For example,        is often available to return exception codes.  For example,
505        a C' function may be defined as        a C' function may be defined as
506
507        \begin{verbatim}        \begin{verbatim}
509                                 INT128 *arg1,                                 INT128 *arg1,
510                                 INT128 *arg2),                                 INT128 *arg2),
511        \end{verbatim}        \end{verbatim}
512
513        leaving the returned \texttt{unsigned char} value available to return        leaving the returned \texttt{unsigned char} value available to return
514        exception information.  Note that this arrangement has the following advantages:        exception information.  Note that this arrangement has the following advantages:
515
516        \begin{enumerate}        \begin{enumerate}
517        \item All bit patterns in the result data memory area area available        \item All bit patterns in the result data memory area area available
518              as data bit patterns.              as data bit patterns.
519        \item The exception data is very economical to test, because it is placed        \item The exception data is very economical to test, because it is placed
520              in a machine-native data type.              in a machine-native data type.
521        \item The exception data can easily be discarded or by the caller if desired.        \item The exception data can easily be discarded or by the caller if desired.
522        \item All decisions about how to handle exceptions are left to the caller.        \item All decisions about how to handle exceptions are left to the caller.
523        \end{enumerate}        \end{enumerate}
524  \item \label{enum:ccil0:spdi0:soud0:01:06}  \item \label{enum:ccil0:spdi0:soud0:01:06}
525        \textbf{To maintain exception data with each result integer.}        \textbf{To maintain exception data with each result integer.}
526        It is possible to reserve bits for exception information which are part of the        It is possible to reserve bits for exception information which are part of the
527        long integer data type.  This exception state essentially conveys        long integer data type.  This exception state essentially conveys
528        \emph{NaN}\footnote{\emph{N}ot \emph{a} \emph{n}umber.} information---integers with        \emph{NaN}\footnote{\emph{N}ot \emph{a} \emph{n}umber.} information---integers with
529        exception information set are not true numbers, but rather they are different from the        exception information set are not true numbers, but rather they are different from the
530        true result in some way.  As with (\ref{enum:ccil0:spdi0:soud0:01:04}), the        true result in some way.  As with (\ref{enum:ccil0:spdi0:soud0:01:04}), the
531        semantics of how to combine NaN values and NaN values with ordinary non-NaN numbers        semantics of how to combine NaN values and NaN values with ordinary non-NaN numbers
532        must be defined.        must be defined.
533  \item \label{enum:ccil0:spdi0:soud0:01:07}  \item \label{enum:ccil0:spdi0:soud0:01:07}
534        \textbf{To maintain a global exception state variable.}        \textbf{To maintain a global exception state variable.}
535        A variable or set of variables can be reserved which hold the        A variable or set of variables can be reserved which hold the
536        exception information, if any, from the most recent call to        exception information, if any, from the most recent call to
537        a function in the arithmetic library.        a function in the arithmetic library.
538
539        To save CPU cycles, the arithmetic library can        To save CPU cycles, the arithmetic library can
540        be designed so that it will assign the global exception state variable only if an        be designed so that it will assign the global exception state variable only if an
541        exception occurs---the caller then has the responsibility of clearing the exception        exception occurs---the caller then has the responsibility of clearing the exception
542        state variable before making any call into the arithmetic library where the        state variable before making any call into the arithmetic library where the
543        exception result is of interest.  The interface between the caller and the library        exception result is of interest.  The interface between the caller and the library
544        can be further optimized if the library only OR's data into the variable containing        can be further optimized if the library only OR's data into the variable containing
545        the exception state.  Using this optimization, the caller can clear the exception state        the exception state.  Using this optimization, the caller can clear the exception state
546        variable, make several calls into the arithmetic library, and then retrieve a meaningful        variable, make several calls into the arithmetic library, and then retrieve a meaningful
547        exception state variable value summarizing several arithmetic operations.        exception state variable value summarizing several arithmetic operations.
548  \item \label{enum:ccil0:spdi0:soud0:01:08}  \item \label{enum:ccil0:spdi0:soud0:01:08}
549        \textbf{Hybrid approaches.}        \textbf{Hybrid approaches.}
550        The approaches (\ref{enum:ccil0:spdi0:soud0:01:01})        The approaches (\ref{enum:ccil0:spdi0:soud0:01:01})
551        through (\ref{enum:ccil0:spdi0:soud0:01:07}) can be combined.        through (\ref{enum:ccil0:spdi0:soud0:01:07}) can be combined.
552        For example, approach (\ref{enum:ccil0:spdi0:soud0:01:03})        For example, approach (\ref{enum:ccil0:spdi0:soud0:01:03})
553        might be combined with approach (\ref{enum:ccil0:spdi0:soud0:01:05})        might be combined with approach (\ref{enum:ccil0:spdi0:soud0:01:05})
554        so that exceptions are railed'', but the caller may also be made        so that exceptions are railed'', but the caller may also be made
555        aware that an exception has occured.  Many hybrid approaches are possible.        aware that an exception has occured.  Many hybrid approaches are possible.
556  \end{enumerate}  \end{enumerate}
557
558
559  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
560  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
561  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
562  \subsection{Semantics Of Combining Overflow, Underflow, And Domain Errors}  \subsection{Semantics Of Combining Overflow, Underflow, And Domain Errors}
563  %Subsection tag: CMB0  %Subsection tag: CMB0
564  \label{ccil0:spdi0:scmb0}  \label{ccil0:spdi0:scmb0}
565
566  For control system arithmetic, some form of clipping as suggested  For control system arithmetic, some form of clipping as suggested
567  in Section \ref{ccil0:sppm0:smae0} is probably the  in Section \ref{ccil0:sppm0:smae0} is probably the
568  best approach.  Definitely, an overflow should generate a result  best approach.  Definitely, an overflow should generate a result
569  which is the largest representable integer, and an  which is the largest representable integer, and an
570  underflow should generate a result which is the smallest  underflow should generate a result which is the smallest
571  representable integer.  representable integer.
572
573  In addition to treating an overflow by clipping, it may be  In addition to treating an overflow by clipping, it may be
574  advantageous to reserve a flag in the representation of a  advantageous to reserve a flag in the representation of a
575  multiple-precision integer to record that an overflow has occured and been clipped.  multiple-precision integer to record that an overflow has occured and been clipped.
576  Some functions which accept the integer as input may be interested  Some functions which accept the integer as input may be interested
577  in the value of such a flag, where othere---perhaps most---may  in the value of such a flag, where othere---perhaps most---may
578  not.  not.
579
580  The correct course of action in the event of a domain error (such  The correct course of action in the event of a domain error (such
581  as division by zero) is less clear.  It is noteworthy that in a  as division by zero) is less clear.  It is noteworthy that in a
582  normal control system, domain errors cannot occur (but overflows  normal control system, domain errors cannot occur (but overflows
583  can).  can).
584
585  The best approach when a domain error is involved probably  The best approach when a domain error is involved probably
586  depends on the basis for the underlying calculation.  For  depends on the basis for the underlying calculation.  For
587  example, if integer division is used as part of a  example, if integer division is used as part of a
588  strategy for software ratiometric conversion, a value  strategy for software ratiometric conversion, a value
589  of zero in the denominator probably represents extreme electrical  of zero in the denominator probably represents extreme electrical
590  noise, and the most sane approach may be to replace the  noise, and the most sane approach may be to replace the
591  denominator by one.  However, in other contexts it may be appropriate  denominator by one.  However, in other contexts it may be appropriate
592  to think in terms of  to think in terms of
593
594
595  \lim_{k \rightarrow 0^+} \frac{kn}{kd}  \lim_{k \rightarrow 0^+} \frac{kn}{kd}
596
597
598  \noindent{}or  \noindent{}or
599
600
601  \lim_{k \rightarrow 0^-} \frac{kn}{kd} .  \lim_{k \rightarrow 0^-} \frac{kn}{kd} .
602
603
604  Put in other terms, there is not a clear one solution  Put in other terms, there is not a clear one solution
605  meets all needs'' approach to dealing with domain  meets all needs'' approach to dealing with domain
606  errors.  errors.
607
608  As in the case of overflows, it may be advantageous to reserve a bit  As in the case of overflows, it may be advantageous to reserve a bit
609  flag to signal that a domain error has occured and that the result  flag to signal that a domain error has occured and that the result
610  is not valid or not reliable.  Note that floating point chips  is not valid or not reliable.  Note that floating point chips
611  (such as the 80x87) provide similar indications of domain errors.  (such as the 80x87) provide similar indications of domain errors.
612
614  overflow or domain error flags propagate through binary or  overflow or domain error flags propagate through binary or
615  unary operators.  For example, if two numbers are multiplied, and  unary operators.  For example, if two numbers are multiplied, and
616  one of the two has the overflow flag set, it may be wise set the  one of the two has the overflow flag set, it may be wise set the
617  overflow flag in the  result.  A scheme for how warning  overflow flag in the  result.  A scheme for how warning
618  flags propagate may be beneficial.  flags propagate may be beneficial.
619
620
621  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
622  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
623  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
624  \subsection{Variable Versus Constant Subroutine Execution Time}  \subsection{Variable Versus Constant Subroutine Execution Time}
625  %Subsection tag: VVC0  %Subsection tag: VVC0
626  \label{ccil0:spdi0:svvc0}  \label{ccil0:spdi0:svvc0}
627
628  As a design goal of an embedded system, we seek to minimize the  As a design goal of an embedded system, we seek to minimize the
629  timing variability of software components.  An arithmetic subroutine  timing variability of software components.  An arithmetic subroutine
630  that with a high probability takes a short time to execute and with a  that with a high probability takes a short time to execute and with a
631  low probability takes a long time to execute, and where the execution  low probability takes a long time to execute, and where the execution
632  time is data dependent, is a serious risk.  An embedded software product  time is data dependent, is a serious risk.  An embedded software product
633  may pass all release testing, but then fail in the field because of  may pass all release testing, but then fail in the field because of
634  specific data values used in calculations.  specific data values used in calculations.
635
636  A very conservative design goal would be to design every arithmetic subroutine  A very conservative design goal would be to design every arithmetic subroutine
637  to require exactly the same execution time, regardless of data values.  to require exactly the same execution time, regardless of data values.
638  This goal is not practical because machine instructions themselves  This goal is not practical because machine instructions themselves
639  usually have a variable execution time, particularly for multiplication  usually have a variable execution time, particularly for multiplication
640  and division instructions.  A fallback goal would be to avoid  and division instructions.  A fallback goal would be to avoid
641  large differences between minimum and maximum execution time, without  large differences between minimum and maximum execution time, without
642  increasing the maximum execution time.  A very practical step to take  increasing the maximum execution time.  A very practical step to take
643  (using division as an example)  (using division as an example)
644  is to insert artifical delays into easily detectable exception cases (such as  is to insert artifical delays into easily detectable exception cases (such as
645  division by zero) so that the exception case takes as long as the  division by zero) so that the exception case takes as long as the
646  minimum time for a division with valid operands.  minimum time for a division with valid operands.
647
648
649  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
650  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
651  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
652  \section{Fixed-Point Arithmetic}  \section{Fixed-Point Arithmetic}
653  %Section tag: FPA0  %Section tag: FPA0
654  \label{ccil0:sfpa0}  \label{ccil0:sfpa0}
655
656  \emph{Fixed-point arithmetic}\index{fixed-point arithmetic}  \emph{Fixed-point arithmetic}\index{fixed-point arithmetic}
657  is a scheme for the representation  is a scheme for the representation
658  of engineering quantities (conceptually real numbers with optional  of engineering quantities (conceptually real numbers with optional
659  units) by integers so that calculations can be performed  units) by integers so that calculations can be performed
660  on these quantitites using [usually multiple-precision]  on these quantitites using [usually multiple-precision]
661  integer arithmetic.  integer arithmetic.
662
663  In discussing fixed-point arithmetic,  In discussing fixed-point arithmetic,
664  we must be careful to distinguish between the  we must be careful to distinguish between the
665  \emph{represented value} (the engineering quantity)  \emph{represented value} (the engineering quantity)
666  and the \emph{representation} (the integer which represents  and the \emph{representation} (the integer which represents
667  the engineering quantity).  In most cases, we must also  the engineering quantity).  In most cases, we must also
668  be careful to devise a system to track the units of the  be careful to devise a system to track the units of the
669  represented values, as, especially with control systems,  represented values, as, especially with control systems,
670  the units of represented values (due to integration and  the units of represented values (due to integration and
671  differentiation) can become very complex  differentiation) can become very complex
672  and mistakes are easy to make.  and mistakes are easy to make.
673
674  Fixed-point arithmetic is the dominant paradigm of construction  Fixed-point arithmetic is the dominant paradigm of construction
675  for calculations in small microcontroller systems.  It may not be  for calculations in small microcontroller systems.  It may not be
676  clear why this should be so or what advantages it offers [over  clear why this should be so or what advantages it offers [over
677  floating-point arithmetic].  The reasons for  floating-point arithmetic].  The reasons for
678  this predominance are:  this predominance are:
679
680  \begin{itemize}  \begin{itemize}
681  \item Fixed-point calculations tend to be very efficient, because  \item Fixed-point calculations tend to be very efficient, because
682        they make direct use of the integer arithmetic instructions in the        they make direct use of the integer arithmetic instructions in the
683        microcontroller's instruction set.  On the other hand, floating-point        microcontroller's instruction set.  On the other hand, floating-point
684        arithmetic operations tend to be much slower.        arithmetic operations tend to be much slower.
685
686  \item Floating-point calculations typically require a floating-point  \item Floating-point calculations typically require a floating-point
687        library, which may consume at least several hundred bytes        library, which may consume at least several hundred bytes
688        of ROM.        of ROM.
689
690  \item Some safety-critical software standards prohibit the use of  \item Some safety-critical software standards prohibit the use of
691        floating-point arithmetic because it can result in nebulous        floating-point arithmetic because it can result in nebulous
692        behavior.  Fixed-point arithmetic avoids these concerns.        behavior.  Fixed-point arithmetic avoids these concerns.
693  \end{itemize}  \end{itemize}
694
695  In order to carry out fixed-point arithmetic---that is, in order to  In order to carry out fixed-point arithmetic---that is, in order to
696  operate on engineering quantities as integers---we  operate on engineering quantities as integers---we
697  require that the relationship between the  require that the relationship between the
698  represented value and the representation be of the form  represented value and the representation be of the form
699
700
701  \label{eq:ccil0:sfpa0:00}  \label{eq:ccil0:sfpa0:00}
702  x = r_I u + \Psi,  x = r_I u + \Psi,
703
704
705  \noindent{}where $x \in \vworkrealset$ (possibly with units) is the represented  \noindent{}where $x \in \vworkrealset$ (possibly with units) is the represented
706  value, $u \in \vworkintset$ is the representation,  value, $u \in \vworkintset$ is the representation,
707  $r_I \in \vworkrealset$ is the scaling factor (possibly with  $r_I \in \vworkrealset$ is the scaling factor (possibly with
708  units), and $\Psi \in \vworkrealset$ (possibly with units) is the offset.  units), and $\Psi \in \vworkrealset$ (possibly with units) is the offset.
709  Note that the units of $r_I$, $\Psi$, and $x$  must match.  Note that the units of $r_I$, $\Psi$, and $x$  must match.
710
711  We further require that $r_I \vworkdivides{} \Psi$\footnote{We \emph{are}  We further require that $r_I \vworkdivides{} \Psi$\footnote{We \emph{are}
712  aware that this is an abuse of nomenclature, as  aware that this is an abuse of nomenclature, as
713  $\vworkdivides$' (divides'') is traditionally only applied to integers.}  $\vworkdivides$' (divides'') is traditionally only applied to integers.}
714  (i.e. that $\Psi$ be an  (i.e. that $\Psi$ be an
715  integral multiple of $r_I$)  integral multiple of $r_I$)
716  so that the offset in the represented value corresponds to an integer  so that the offset in the represented value corresponds to an integer
717  in the representation.  Without this restriction, we could not remove the  in the representation.  Without this restriction, we could not remove the
718  offset from the representation with integer subtraction only.  Note that  offset from the representation with integer subtraction only.  Note that
719  we do \emph{not} require that $r_I$ or $\Psi$ be rational, although  we do \emph{not} require that $r_I$ or $\Psi$ be rational, although
720  they must both be rational or both be irrational in order to satisfy  they must both be rational or both be irrational in order to satisfy
721  (\ref{eq:ccil0:sfpa0:00}).  (\ref{eq:ccil0:sfpa0:00}).
722
723  In (\ref{eq:ccil0:sfpa0:00}), since we've required that $r_I \vworkdivides{} \Psi$,  In (\ref{eq:ccil0:sfpa0:00}), since we've required that $r_I \vworkdivides{} \Psi$,
724  we can replace $\Psi$ by  we can replace $\Psi$ by
725
726
727  \label{eq:ccil0:sfpa0:00b}  \label{eq:ccil0:sfpa0:00b}
728  \psi = \frac{\Psi}{r_I}  \psi = \frac{\Psi}{r_I}
729
730
731  \noindent{}to obtain  \noindent{}to obtain
732
733
734  \label{eq:ccil0:sfpa0:01}  \label{eq:ccil0:sfpa0:01}
735  x = r_I (u + \psi),  x = r_I (u + \psi),
736
737
738  \noindent{}where $\psi \in \vworkintset$ is the offset in representational  \noindent{}where $\psi \in \vworkintset$ is the offset in representational
739  counts.  counts.
740
741  Note that we can also easily obtain the following relationships from the  Note that we can also easily obtain the following relationships from the
742  defining equations (\ref{eq:ccil0:sfpa0:00}),  defining equations (\ref{eq:ccil0:sfpa0:00}),
743  (\ref{eq:ccil0:sfpa0:00b}), and (\ref{eq:ccil0:sfpa0:01}).  (\ref{eq:ccil0:sfpa0:00b}), and (\ref{eq:ccil0:sfpa0:01}).
744
745
746  \label{eq:ccil0:sfpa0:02}  \label{eq:ccil0:sfpa0:02}
747  u = \frac{x - \Psi}{r_I} = \frac{x}{r_I} - \psi  u = \frac{x - \Psi}{r_I} = \frac{x}{r_I} - \psi
748
749
750
751  \label{eq:ccil0:sfpa0:03}  \label{eq:ccil0:sfpa0:03}
752  \psi = \frac{\Psi}{r_I}  \psi = \frac{\Psi}{r_I}
753  \Longleftrightarrow  \Longleftrightarrow
754  \Psi = r_I \psi  \Psi = r_I \psi
755  \Longleftrightarrow  \Longleftrightarrow
756  r_I = \frac{\Psi}{\psi}  r_I = \frac{\Psi}{\psi}
757
758
759
760  For example, in a 16-bit signed integer (which inherently may range  For example, in a 16-bit signed integer (which inherently may range
761  from -32768 to 32767 inclusive), one might used a fixed-point  from -32768 to 32767 inclusive), one might used a fixed-point
762  representation of 100 integer counts per $^\circ$C ($r_I = 0.01 \; ^\circ$C)  representation of 100 integer counts per $^\circ$C ($r_I = 0.01 \; ^\circ$C)
763  with an offset of 100$^\circ$C ($\Psi$ = 100$^\circ$C or equivalently  with an offset of 100$^\circ$C ($\Psi$ = 100$^\circ$C or equivalently
764  $\psi$ = 10000), giving  $\psi$ = 10000), giving
765  a representational range from -227.68$^\circ$C to 427.67$^\circ$C inclusive.  a representational range from -227.68$^\circ$C to 427.67$^\circ$C inclusive.
766
767  If $r_I = 2^N, \; N \in \vworkintset$, then the radix point of the represented  If $r_I = 2^N, \; N \in \vworkintset$, then the radix point of the represented
768  value is positioned  value is positioned
769  between two bits of the representation---this arrangement may have computational  between two bits of the representation---this arrangement may have computational
771  if the whole and fractional parts of the represented value need to be separated  if the whole and fractional parts of the represented value need to be separated
772  easily in the representation.  Note also that if $r_I = 2^{WN}$ where $W$ is the  easily in the representation.  Note also that if $r_I = 2^{WN}$ where $W$ is the
773  machine integer size of the computer (in bits), then the radix point of the represented value  machine integer size of the computer (in bits), then the radix point of the represented value
774  occurs between two addressable machine integers of the representation, which can be convenient.  occurs between two addressable machine integers of the representation, which can be convenient.
775  However, we do not require for our definition of a fixed-point representation that  However, we do not require for our definition of a fixed-point representation that
776  $r_I$ be an integral power of two; and in fact we do not even require by our  $r_I$ be an integral power of two; and in fact we do not even require by our
777  definition that $r_I$ be rational.  Note that in general our definition above is the  definition that $r_I$ be rational.  Note that in general our definition above is the
778  weakest set of conditions so that real-world engineering values can be manipulated  weakest set of conditions so that real-world engineering values can be manipulated
779  using integer operations performed upon the representation.  using integer operations performed upon the representation.
780
781
782  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
783  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
784  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
785  \section{Representation Of Integers}  \section{Representation Of Integers}
786  %Section Tag: ROI0  %Section Tag: ROI0
787  \label{ccil0:sroi0}  \label{ccil0:sroi0}
788
789  In this section, we discuss common representations of integers, both  In this section, we discuss common representations of integers, both
790  \emph{machine} integers and \emph{synthetic long} integers.    \emph{machine} integers and \emph{synthetic long} integers.
791  By \emph{representation} we mean the mapping between the abstract  By \emph{representation} we mean the mapping between the abstract
792  mathematical notion of an integer and the way it is stored in the computer  mathematical notion of an integer and the way it is stored in the computer
793  (voltage levels and the programming model).  (voltage levels and the programming model).
794  Although  Although
795  in Knuth's development of integer arithmetic  in Knuth's development of integer arithmetic
796  \cite{bibref:b:knuthclassic2ndedvol2}  \cite{bibref:b:knuthclassic2ndedvol2}
797   it is assumed that   it is assumed that
798  integers may be represented in any base, we don't require such generality  integers may be represented in any base, we don't require such generality
799  in practice and in this work we confine ourselves for the most  in practice and in this work we confine ourselves for the most
800  part to $b=2^n$, and often to $n = 8i$, $i \in \vworkintsetpos$.  The assumption  part to $b=2^n$, and often to $n = 8i$, $i \in \vworkintsetpos$.  The assumption
801  of $b=2^n$ characterizes all modern digital computers, and we feel comfortable  of $b=2^n$ characterizes all modern digital computers, and we feel comfortable
802  making this assumption throughout the work.  However, the assumption  making this assumption throughout the work.  However, the assumption
803  $n = 8i$, $i \in \vworkintsetpos$ does not hold universally, and so we  $n = 8i$, $i \in \vworkintsetpos$ does not hold universally, and so we
804  most often do not make this assumption.  most often do not make this assumption.
805
806  By \emph{machine} integer, we mean an integer upon which the computer  By \emph{machine} integer, we mean an integer upon which the computer
807  can operate in a single instruction (such as to add, increment,  can operate in a single instruction (such as to add, increment,
808  load, store, etc.).  For most microcontrollers, machine integers are  load, store, etc.).  For most microcontrollers, machine integers are
809  either 8 or 16 bits in size.  The representation of a machine integer  either 8 or 16 bits in size.  The representation of a machine integer
810  is designed and specified by the microcontroller manufacturer.  In principle,  is designed and specified by the microcontroller manufacturer.  In principle,
811  nothing would prevent a microcontroller manufacturer from devising  nothing would prevent a microcontroller manufacturer from devising
812  and implementing a novel way of representing machine integers and supporting  and implementing a novel way of representing machine integers and supporting
813  this novel representation with an instruction set.  However, in  this novel representation with an instruction set.  However, in
814  practice, all machine integers are either simple unsigned integers or two's  practice, all machine integers are either simple unsigned integers or two's
815  complement signed integers.  In addition to the efficiency of these  complement signed integers.  In addition to the efficiency of these
816  representations with respect to the design of digital logic, these representations  representations with respect to the design of digital logic, these representations
817  are so standard and so pervasive that they are universally tacitly assumed.  are so standard and so pervasive that they are universally tacitly assumed.
818  For example, \texttt{if (i \& 0x1)}'' is an accepted C-language  For example, \texttt{if (i \& 0x1)}'' is an accepted C-language
819  idiom for if $i$ is odd'', and it is expected that such code will  idiom for if $i$ is odd'', and it is expected that such code will
820  work on all platforms.  work on all platforms.
821
822  By \emph{synthetic long} integer, we mean an integer of  By \emph{synthetic long} integer, we mean an integer of
823  arbitrary\footnote{By \emph{arbitrary}, we do not necessarily mean that  arbitrary\footnote{By \emph{arbitrary}, we do not necessarily mean that
824  the integer can grow to be arbitrarily large in magnitude, or that  the integer can grow to be arbitrarily large in magnitude, or that
825  its maximum size is not known at compile time.  We mean \emph{arbitrarily}  its maximum size is not known at compile time.  We mean \emph{arbitrarily}
826  longer than a machine integer.  Multiple-precision arithmetic libraries  longer than a machine integer.  Multiple-precision arithmetic libraries
827  can be divided into two classes---those that fix the size of the integers  can be divided into two classes---those that fix the size of the integers
828  at compile time, and those that use dynamic allocation and allow integers to  at compile time, and those that use dynamic allocation and allow integers to
829  grow as needed at run time.  The former category  grow as needed at run time.  The former category
830  is normally used for small microcontroller work, whereas the latter category  is normally used for small microcontroller work, whereas the latter category
831  (such as the GNU MP Library \cite{bibref:s:gnumultipleprecisionarithmeticlibrary})  (such as the GNU MP Library \cite{bibref:s:gnumultipleprecisionarithmeticlibrary})
832  is normally used in scientific and number theoretic calculation and on  is normally used in scientific and number theoretic calculation and on
833  more powerful platforms than microcontrollers.  The representational concepts  more powerful platforms than microcontrollers.  The representational concepts
834  we present here apply to both categories.}  we present here apply to both categories.}
835  length that is formed by concatenating machine integers.  There is some  length that is formed by concatenating machine integers.  There is some
836  subjectivity in deciding the representation of multiple-precision integers,  subjectivity in deciding the representation of multiple-precision integers,
837  and we discuss in the subsections  and we discuss in the subsections
838  \ref{ccil0:sroi0:srou0} and  \ref{ccil0:sroi0:srou0} and
839  \ref{ccil0:sroi0:sros0} which immediately follow.  \ref{ccil0:sroi0:sros0} which immediately follow.
840
841
842  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
843  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
844  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
845  \subsection{Representation Of Unsigned Integers}  \subsection{Representation Of Unsigned Integers}
846  %Subsection Tag: ROU0  %Subsection Tag: ROU0
847  \label{ccil0:sroi0:srou0}  \label{ccil0:sroi0:srou0}
848
849  Unsigned machine integers are always represented as an ordered  Unsigned machine integers are always represented as an ordered
850  array of bits (Figure \ref{fig:ccil0:sroi0:srou0:00}).  For an  array of bits (Figure \ref{fig:ccil0:sroi0:srou0:00}).  For an
851  $m$-bit unsigned integer $u$, we denote these bits $u_{[m-1]}$ through  $m$-bit unsigned integer $u$, we denote these bits $u_{[m-1]}$ through
852  $u_{[0]}$, with $u_{[m-1]}$ the most significant bit.  The value  $u_{[0]}$, with $u_{[m-1]}$ the most significant bit.  The value
853  of $u$ is the sum of the values of each bit multiplied  of $u$ is the sum of the values of each bit multiplied
854  by the power of 2 it represents:  by the power of 2 it represents:
855
856  \begin{figure}  \begin{figure}
857  \centering  \centering
858  \includegraphics[width=4.6in]{c_cil0/uintrep1.eps}  \includegraphics[width=4.6in]{c_cil0/uintrep1.eps}
859  \caption{Representation Of Unsigned Machine Integers}  \caption{Representation Of Unsigned Machine Integers}
860  \label{fig:ccil0:sroi0:srou0:00}  \label{fig:ccil0:sroi0:srou0:00}
861  \end{figure}  \end{figure}
862
863
864  \label{eq:ccil0:sroi0:srou0:00}  \label{eq:ccil0:sroi0:srou0:00}
865  u = \sum_{i=0}^{m-1} u_{[i]} 2^i .  u = \sum_{i=0}^{m-1} u_{[i]} 2^i .
866
867
868  In general, an $m$-bit unsigned integer can assume the values of  In general, an $m$-bit unsigned integer can assume the values of
869  0 through $2^m - 1$, so that  0 through $2^m - 1$, so that
870
871
872  \label{eq:ccil0:sroi0:srou0:01}  \label{eq:ccil0:sroi0:srou0:01}
873  u \in \{0, \ldots{} , 2^m - 1 \} .  u \in \{0, \ldots{} , 2^m - 1 \} .
874
875
876  Unsigned synthetic long integers are always represented as an array  Unsigned synthetic long integers are always represented as an array
877  of unsigned machine integers.    of unsigned machine integers.
878  Consistent with the GMP \cite{bibref:s:gnumultipleprecisionarithmeticlibrary},  Consistent with the GMP \cite{bibref:s:gnumultipleprecisionarithmeticlibrary},
879  we call each element of the array a \emph{limb} and we call the size of  we call each element of the array a \emph{limb} and we call the size of
880  each limb the \emph{limbsize}.  This usage is very close to what Knuth  each limb the \emph{limbsize}.  This usage is very close to what Knuth
881  calls the \emph{word}size $w$ in the excerpt presented in  calls the \emph{word}size $w$ in the excerpt presented in
882  Section \ref{ccil0:sppm0:smas0}.  Section \ref{ccil0:sppm0:smas0}.
883
884  Microcontroller processors are more likely than more powerful processors to have  Microcontroller processors are more likely than more powerful processors to have
885  a non-orthogonal instruction set, and so the limbsize may not be consistent between  a non-orthogonal instruction set, and so the limbsize may not be consistent between
886  operations in an arithmetic library.  operations in an arithmetic library.
887  With some processors, the appropriate limbsize may vary depending on the operation being  With some processors, the appropriate limbsize may vary depending on the operation being
888  performed.    performed.
889  For example, a microcontroller processor may be able to add two 16-bit integers to  For example, a microcontroller processor may be able to add two 16-bit integers to
890  obtain a 16-bit result plus a carry, but only be able to multiply two 8-bit integers to  obtain a 16-bit result plus a carry, but only be able to multiply two 8-bit integers to
891  obtain a 16-bit result (thus, the appropriate limbsize for addition may be  obtain a 16-bit result (thus, the appropriate limbsize for addition may be
892  16 bits while the appropriate limbsize for multiplication may be 8 bits).  16 bits while the appropriate limbsize for multiplication may be 8 bits).
893  In such cases, it is usually most efficient to add using 16-bit limbs but  In such cases, it is usually most efficient to add using 16-bit limbs but
894  multiply using 8-bit limbs.  Adding using 8-bit limbs on a machine which will  multiply using 8-bit limbs.  Adding using 8-bit limbs on a machine which will
895  support 16-bit additions is not normally a good design decision---even if the  support 16-bit additions is not normally a good design decision---even if the
896  machine supports an 8-bit addition instruction which is faster than the 16-bit addition  machine supports an 8-bit addition instruction which is faster than the 16-bit addition
897  instruction, $m/2$ 16-bit additions will nearly always be faster than  instruction, $m/2$ 16-bit additions will nearly always be faster than
898  $m$ 8-bit additions.  Using different limbsizes within the same arithmetic library  $m$ 8-bit additions.  Using different limbsizes within the same arithmetic library
899  may require some consideration of alignment and  may require some consideration of alignment and
900  endian issues, but these are implementation details  endian issues, but these are implementation details
901  which are easily solved.  which are easily solved.
902
903  We view a synthetic long unsigned integer as an array of limbs (machine integers)  We view a synthetic long unsigned integer as an array of limbs (machine integers)
904  of some size, and we agree that we will not address the array in any other way than  of some size, and we agree that we will not address the array in any other way than
905  by loading and storing limbs of this size.\footnote{Well \ldots{} not quite.  by loading and storing limbs of this size.\footnote{Well \ldots{} not quite.
906  In software for large computers (personal computers and workstations) with an  In software for large computers (personal computers and workstations) with an
907  orthogonal instruction set, we may be able to adhere to this rule.  However,  orthogonal instruction set, we may be able to adhere to this rule.  However,
908  with microcontrollers, arithmetic libraries which are optimized  with microcontrollers, arithmetic libraries which are optimized
909  may break this rule.}  In particular, because  may break this rule.}  In particular, because
910  computers may be big-endian'' or little-endian'', loading and storing  computers may be big-endian'' or little-endian'', loading and storing
911  smaller units than limbs may lead in  smaller units than limbs may lead in
912  a worst case to software defects or in a best case to non-portable code.  a worst case to software defects or in a best case to non-portable code.
913
914  Assume that $w$ is the number of bits in a limb.  Assume that $w$ is the number of bits in a limb.
915  Notationally, we denote an unsigned  Notationally, we denote an unsigned
916  synthetic long integer as an array of $m$ limbs  synthetic long integer as an array of $m$ limbs
917  $u_{m-1}$ through $u_0$, each containing $w$ bits,  $u_{m-1}$ through $u_0$, each containing $w$ bits,
918  with $u_0$ the least significant machine integer.    with $u_0$ the least significant machine integer.
919  We may also define $b=2^w$ (consistent with Knuth's  We may also define $b=2^w$ (consistent with Knuth's
920  notation).  notation).
921  The value of  The value of
922  such a synthetic long machine integer is  such a synthetic long machine integer is
923
924
925  \label{eq:ccil0:sroi0:srou0:02}  \label{eq:ccil0:sroi0:srou0:02}
926  u = \sum_{i=0}^{m-1} u_{i} 2^{wi}  u = \sum_{i=0}^{m-1} u_{i} 2^{wi}
927  =  =
928  \sum_{i=0}^{m-1} u_{i} b^i.  \sum_{i=0}^{m-1} u_{i} b^i.
929
930
931  As an alternative, we may write the value as the sum of the bit-values,  As an alternative, we may write the value as the sum of the bit-values,
932
933
934  \label{eq:ccil0:sroi0:srou0:03}  \label{eq:ccil0:sroi0:srou0:03}
935  u = \sum_{i=0}^{wm-1} u_{[i]} 2^{i} .  u = \sum_{i=0}^{wm-1} u_{[i]} 2^{i} .
936
937
938  Naturally, the range of such a synthetic long integer is  Naturally, the range of such a synthetic long integer is
939
940
941  \label{eq:ccil0:sroi0:srou0:04}  \label{eq:ccil0:sroi0:srou0:04}
942  u \in \{0, \ldots{} , 2^{wm} - 1 \} .  u \in \{0, \ldots{} , 2^{wm} - 1 \} .
943
944
945  In storing an unsigned synthetic long machine integer, the most natural way  In storing an unsigned synthetic long machine integer, the most natural way
946  to order the array of limbs depends on whether dynamic memory allocation is  to order the array of limbs depends on whether dynamic memory allocation is
947  used by the arithmetic library.  In microcontroller work, where arithmetic  used by the arithmetic library.  In microcontroller work, where arithmetic
948  library subroutines typically operate on fixed-size operands and produce  library subroutines typically operate on fixed-size operands and produce
949  fixed-size results, storing  fixed-size results, storing
950  limbs most significant limb first (i.e. in C', so that element \texttt{[0]}  limbs most significant limb first (i.e. in C', so that element \texttt{[0]}
951  of the array of limbs contains the most significant limb) may be natural  of the array of limbs contains the most significant limb) may be natural
952  and convenient.  However, this ordering would lead to computational waste in a library such  and convenient.  However, this ordering would lead to computational waste in a library such
953  as the GMP \cite{bibref:s:gnumultipleprecisionarithmeticlibrary} where integers  as the GMP \cite{bibref:s:gnumultipleprecisionarithmeticlibrary} where integers
954  may grow arbitrarily large and the library may need to reallocate long synthetic  may grow arbitrarily large and the library may need to reallocate long synthetic
955  integers to contain more limbs, as each reallocation would need to be followed  integers to contain more limbs, as each reallocation would need to be followed
956  by a memory copy to align the integer's existing limbs to the end of the array.  by a memory copy to align the integer's existing limbs to the end of the array.
957  For libraries such as the GMP, it is more practical to store limbs  For libraries such as the GMP, it is more practical to store limbs
958  least-significant limb first, as it eliminates the need to copy memory  least-significant limb first, as it eliminates the need to copy memory
959  when reallocations are done.  when reallocations are done.
960
961
962
963  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
964  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
965  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
966  \subsection{Representation Of Signed Integers}  \subsection{Representation Of Signed Integers}
967  %Subsection Tag: ROS0  %Subsection Tag: ROS0
968  \label{ccil0:sroi0:sros0}  \label{ccil0:sroi0:sros0}
969
970  Signed machine integers are always represented in two's complement form on modern  Signed machine integers are always represented in two's complement form on modern
971  processors.  This representation is universal because of the digital logic  processors.  This representation is universal because of the digital logic
972  conveniences---the same addition and subtraction mappings which are correct  conveniences---the same addition and subtraction mappings which are correct
973  for unsigned machine integers are also correct for signed machine integers,  for unsigned machine integers are also correct for signed machine integers,
974  although the criteria for overflow and comparison are  although the criteria for overflow and comparison are
975  different.  different.
976
977  Most readers are familiar with two's complement representation, so we will not  Most readers are familiar with two's complement representation, so we will not
978  belabor it.  However, we will present essential properties.  belabor it.  However, we will present essential properties.
979  When two's complement representation is used in an $m$-bit machine integer $u$:  When two's complement representation is used in an $m$-bit machine integer $u$:
980
981  \begin{enumerate}  \begin{enumerate}
982  \item All bit patterns with $u_{[m-1]} = 0$ represent non-negative integers, and  \item All bit patterns with $u_{[m-1]} = 0$ represent non-negative integers, and
983        represent the same integer as if the representation were unsigned.        represent the same integer as if the representation were unsigned.
984  \item All bit patterns with $u_{[m-1]} = 1$ represent negative numbers; specifically  \item All bit patterns with $u_{[m-1]} = 1$ represent negative numbers; specifically
985        $u_{[m-2:0]} - 2^{m-1}$; i.e.        $u_{[m-2:0]} - 2^{m-1}$; i.e.
986
987
988        \label{eq:ccil0:sroi0:sros0:00}        \label{eq:ccil0:sroi0:sros0:00}
989        u = - u_{[m-1]} 2^{m-1} + \sum_{i=0}^{m-2} u_{[i]} 2^i .        u = - u_{[m-1]} 2^{m-1} + \sum_{i=0}^{m-2} u_{[i]} 2^i .
990
991
992  \item $u \in \{-2^{m-1}, \ldots{}, 2^{m-1}-1 \}$.  \item $u \in \{-2^{m-1}, \ldots{}, 2^{m-1}-1 \}$.
993  \item All bit patterns represent a unique integer.  \item All bit patterns represent a unique integer.
994  \item For any integer except $-2^{m-1}$,  \item For any integer except $-2^{m-1}$,
995        negation can be performed by forming the one's complement (complementing        negation can be performed by forming the one's complement (complementing
996        every bit), then adding one.  To see why this is true algebraically, note that        every bit), then adding one.  To see why this is true algebraically, note that
997
998  \end{enumerate}  \end{enumerate}
999
1000
1001  However, let us observe that the value of an  However, let us observe that the value of an
1002  $m$-bit two's complement  $m$-bit two's complement
1003  machine integer is  machine integer is
1004
1005
1006  In general, an $m$-bit signed machine integer can assume the values of  In general, an $m$-bit signed machine integer can assume the values of
1007  $-2^{m-1}$ through $2^{m-1} - 1$, so that  $-2^{m-1}$ through $2^{m-1} - 1$, so that
1008
1009
1010  \label{eq:ccil0:sroi0:sros0:01}  \label{eq:ccil0:sroi0:sros0:01}
1011  u \in \{-2^{m-1}, \ldots{} , 2^{m-1} - 1 \} .  u \in \{-2^{m-1}, \ldots{} , 2^{m-1} - 1 \} .
1012
1013
1014  There are [at least] two different representations of signed  There are [at least] two different representations of signed
1015  multiple-precision integers:  multiple-precision integers:
1016
1017  \begin{itemize}  \begin{itemize}
1018  \item Two's complement representation.  \item Two's complement representation.
1019  \item Sign-magnitude representation.  \item Sign-magnitude representation.
1020  \end{itemize}  \end{itemize}
1021
1022  There are two different representations commonly used  There are two different representations commonly used
1023  for signed multiple-precision integers because two's complement  for signed multiple-precision integers because two's complement
1024  representation is not ideal for multiplication and division, although  representation is not ideal for multiplication and division, although
1025  it is ideal for addition and subtraction.  For multiple-precision  it is ideal for addition and subtraction.  For multiple-precision
1026  integer arithmetic, sign-magnitude representation is more common.  integer arithmetic, sign-magnitude representation is more common.
1027
1028  In two's complement representation of multiple-precision integers,  In two's complement representation of multiple-precision integers,
1029  the representation is the same as suggested by  the representation is the same as suggested by
1030  (\ref{eq:ccil0:sroi0:sros0:00}), except  (\ref{eq:ccil0:sroi0:sros0:00}), except
1031  more bits are involved.  For a two's complement representation  more bits are involved.  For a two's complement representation
1032  of a number consisting of $n$ machine integers with $W$ bits per  of a number consisting of $n$ machine integers with $W$ bits per
1033  machine integer,  machine integer,
1034
1035
1036  \label{eq:ccil0:sroi0:sros0:02}  \label{eq:ccil0:sroi0:sros0:02}
1037  u = - u_{B(Wn-1)} 2^{Wn-1} + \sum_{i=0}^{Wn-2} u_{B(i)} 2^i .  u = - u_{B(Wn-1)} 2^{Wn-1} + \sum_{i=0}^{Wn-2} u_{B(i)} 2^i .
1038
1039
1040  Because we would like to know how to compare signed multiple-precision  Because we would like to know how to compare signed multiple-precision
1041  integers in two's complement representation, we can gain some  integers in two's complement representation, we can gain some
1042  insight into the representation by rewriting  insight into the representation by rewriting
1043  (\ref{eq:ccil0:sroi0:sros0:02}) in terms of machine integers:  (\ref{eq:ccil0:sroi0:sros0:02}) in terms of machine integers:
1044
1045
1046  \label{eq:ccil0:sroi0:sros0:03}  \label{eq:ccil0:sroi0:sros0:03}
1047  u = - u_{B(Wn-1)} 2^{Wn-1}  u = - u_{B(Wn-1)} 2^{Wn-1}
1048  +  +
1049  \sum_{i=W(m-1)}^{Wn-2} u_{B(i)} 2^i  \sum_{i=W(m-1)}^{Wn-2} u_{B(i)} 2^i
1050  +  +
1051  \sum_{i=0}^{m-2} u_{i} 2^{Wi} .  \sum_{i=0}^{m-2} u_{i} 2^{Wi} .
1052
1053
1054  (\ref{eq:ccil0:sroi0:sros0:03}) gives some insight into the  (\ref{eq:ccil0:sroi0:sros0:03}) gives some insight into the
1055  relative values of multiple-precision signed two's complement  relative values of multiple-precision signed two's complement
1056  integers with respect to the values of the machine integers  integers with respect to the values of the machine integers
1057  that comprise them.  We discuss this further in  that comprise them.  We discuss this further in
1058  Section \ref{ccil0:scsi0}.  Section \ref{ccil0:scsi0}.
1059
1060  In sign-magnitude representation of multiple-precision signed two's  In sign-magnitude representation of multiple-precision signed two's
1061  complement integers, an integer $u$ is represented as a sign  complement integers, an integer $u$ is represented as a sign
1062  bit (usually a value of one indicates negativity), and a magnitude,  bit (usually a value of one indicates negativity), and a magnitude,
1063  which is $|u|$.  Unlike two's complement representation,  which is $|u|$.  Unlike two's complement representation,
1064  sign-magnitude representation, has two representations of zero---a positive  sign-magnitude representation, has two representations of zero---a positive
1065  one and a negative one.  Either a canonical form for zero should be  one and a negative one.  Either a canonical form for zero should be
1066  adopted, or both values should be treated identically.  adopted, or both values should be treated identically.
1067
1068  Assuming that the sign bit is stored in the most significant bit position,  Assuming that the sign bit is stored in the most significant bit position,
1069  it is easy to deduce that the value of a multiple-precision  it is easy to deduce that the value of a multiple-precision
1070  signed two's complement integer in sign-magnitude representation is  signed two's complement integer in sign-magnitude representation is
1071
1072
1073  \label{eq:ccil0:sros0:srou0:04}  \label{eq:ccil0:sros0:srou0:04}
1074  u = (-1)^{u_{B(m-1)}} \sum_{i=0}^{m-2} u_{B(i)} 2^i .  u = (-1)^{u_{B(m-1)}} \sum_{i=0}^{m-2} u_{B(i)} 2^i .
1075
1076
1077  Sign-magnitude representation is especially convenient because  Sign-magnitude representation is especially convenient because
1078  it allows machine instructions which accept unsigned operands to be used  it allows machine instructions which accept unsigned operands to be used
1079  to make calculations involving signed integers.  to make calculations involving signed integers.
1080
1081
1082
1083  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1084  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1085  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1086  \section{Characteristics Of Practical Processors}  \section{Characteristics Of Practical Processors}
1087  %Section tag: CPP  %Section tag: CPP
1088  \label{ccil0:scpp0}  \label{ccil0:scpp0}
1089
1090  Before discussing specific algorithms, it is necessary to  Before discussing specific algorithms, it is necessary to
1091  discuss the construction of practical processors---how such a processor  discuss the construction of practical processors---how such a processor
1092  manipulates machine integers.  We accept as a typical processor the  manipulates machine integers.  We accept as a typical processor the
1093  TMS-370C8, an 8-bit microcontroller manufactured by  TMS-370C8, an 8-bit microcontroller manufactured by
1094  Texas Instruments.  Texas Instruments.
1095
1096  \begin{figure}  \begin{figure}
1097  \centering  \centering
1098  \includegraphics[width=4.6in]{c_cil0/t370flag.eps}  \includegraphics[width=4.6in]{c_cil0/t370flag.eps}
1099  \caption{Texas Instruments TMS-370C8 Flags}  \caption{Texas Instruments TMS-370C8 Flags}
1100  \label{fig:ccil0:scpp0:00}  \label{fig:ccil0:scpp0:00}
1101  \end{figure}  \end{figure}
1102
1103  \begin{figure}  \begin{figure}
1104  \centering  \centering
1105  \includegraphics[width=4.6in]{c_cil0/t370cjmp.eps}  \includegraphics[width=4.6in]{c_cil0/t370cjmp.eps}
1106  \caption{Texas Instruments TMS-370C8 Conditional Jump Instructions}  \caption{Texas Instruments TMS-370C8 Conditional Jump Instructions}
1107  \label{fig:ccil0:scpp0:01}  \label{fig:ccil0:scpp0:01}
1108  \end{figure}  \end{figure}
1109
1110  A typical microcontroller allows operations on machine integers  A typical microcontroller allows operations on machine integers
1111  in the following steps:  in the following steps:
1112
1113  \begin{itemize}  \begin{itemize}
1114  \item A machine instruction is performed on one or two machine  \item A machine instruction is performed on one or two machine
1115        integer operands (for example:  addition, subtraction,        integer operands (for example:  addition, subtraction,
1116        multiplication, division, increment, decrement, complement,        multiplication, division, increment, decrement, complement,
1117        negation, or comparison).  This machine instruction may        negation, or comparison).  This machine instruction may
1118        produce a result, and usually sets a number of condition flags that        produce a result, and usually sets a number of condition flags that
1119        reflect the nature and validity of the result (Is it zero?        reflect the nature and validity of the result (Is it zero?
1120        Is it negative?  Did the result overflow?).  As an        Is it negative?  Did the result overflow?).  As an
1121        example, the condition        example, the condition
1122        flags of the TMS-370C8 are shown in Figure \ref{fig:ccil0:scpp0:00}.        flags of the TMS-370C8 are shown in Figure \ref{fig:ccil0:scpp0:00}.
1123        .        .
1124  \item A conditional branch instruction is used to branch conditionally  \item A conditional branch instruction is used to branch conditionally
1125        based on the state of the condition flags.  The definition of        based on the state of the condition flags.  The definition of
1126        the condition flags and the way in which the conditional        the condition flags and the way in which the conditional
1127        branch instruction utilizes them is designed to provide a        branch instruction utilizes them is designed to provide a
1128        way to treat both unsigned and signed machine integers.        way to treat both unsigned and signed machine integers.
1129        As an example, the way in which the conditional        As an example, the way in which the conditional
1130        jump instructions of the TMS-370C8 use the flags        jump instructions of the TMS-370C8 use the flags
1131        is shown in Figure \ref{fig:ccil0:scpp0:01}.        is shown in Figure \ref{fig:ccil0:scpp0:01}.
1132  \end{itemize}  \end{itemize}
1133
1134  It is not too often necessary to understand in detail  It is not too often necessary to understand in detail
1135  the Boolean relationships that govern how machine integers are  the Boolean relationships that govern how machine integers are
1136  added, subtracted, and compared; and how signed comparisons differ  added, subtracted, and compared; and how signed comparisons differ
1137  from unsigned comparisons.  In most cases, it is adequate to  from unsigned comparisons.  In most cases, it is adequate to
1138  rely on the design of the microcontroller.  However, we do present  rely on the design of the microcontroller.  However, we do present
1139  rudimentary observations in this section.  rudimentary observations in this section.
1140
1141
1142
1143  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1144  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1145  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1146  \section{Comparison Of Integers}  \section{Comparison Of Integers}
1147
1148
1149  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1150  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1151  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1152  \subsection{Comparison Of Unsigned Integers}  \subsection{Comparison Of Unsigned Integers}
1153
1154
1155  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1156  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1157  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1158  \subsection{Comparison Of Signed Integers}  \subsection{Comparison Of Signed Integers}
1159  %Section Tag: CSI0  %Section Tag: CSI0
1160  \label{ccil0:scsi0}  \label{ccil0:scsi0}
1161
1162
1163  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1164  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1165  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1169
1170  Addition of two $m$-bit integers is a combinational function---that is,  Addition of two $m$-bit integers is a combinational function---that is,
1171  the inputs uniquely determine the output.  Addition of binary  the inputs uniquely determine the output.  Addition of binary
1172  numbers is performed  numbers is performed
1173
1174
1175  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1176  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1177  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1179
1180
1181  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1182  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1183  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1185
1186
1187  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1188  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1189  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1191
1192
1193  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1194  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1195  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1196  \section {Integer Subtraction}  \section {Integer Subtraction}
1197
1198
1199  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1200  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1201  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1202  \subsection{Hardware Implementation Of Subtraction}  \subsection{Hardware Implementation Of Subtraction}
1203
1204
1205  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1206  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1207  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1208  \subsection{Subtraction Of Unsigned Operands}  \subsection{Subtraction Of Unsigned Operands}
1209
1210
1211  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1212  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1213  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1214  \subsection{Subtraction Of Signed Operands}  \subsection{Subtraction Of Signed Operands}
1215
1216
1217
1218  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1219  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1220  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1221  \section{Integer Multiplication}  \section{Integer Multiplication}
1222
1223
1224  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1225  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1226  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1227  \subsection{Hardware Implementation Of Multiplication}  \subsection{Hardware Implementation Of Multiplication}
1228
1229
1230  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1231  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1232  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1233  \subsection{Multiplication Of Unsigned Operands}  \subsection{Multiplication Of Unsigned Operands}
1234
1235
1236  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1237  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1238  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1239  \subsection{Multiplication Of Signed Operands}  \subsection{Multiplication Of Signed Operands}
1240
1241
1242
1243  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1244  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1245  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1246  \section{Integer Division}  \section{Integer Division}
1247  \label{ccil0:sidv0}  \label{ccil0:sidv0}
1248
1249  \index{division}\index{integer division}In this section,  \index{division}\index{integer division}In this section,
1250  we discuss the best known methods of dividing integers using  we discuss the best known methods of dividing integers using
1251  typical microcontroller instruction sets.  In general, given  typical microcontroller instruction sets.  In general, given
1252  two arbitrary integers $p$ and $q$, we are interested in determining  two arbitrary integers $p$ and $q$, we are interested in determining
1253  their quotient $q=\lfloor{}p/q\rfloor$ and remainder  their quotient $q=\lfloor{}p/q\rfloor$ and remainder
1254  $r=p\bmod{}q$ as economically as possible.  $r=p\bmod{}q$ as economically as possible.
1255
1256
1257  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1258  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1259  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1260  \subsection{Hardware Implementation Of Division}  \subsection{Hardware Implementation Of Division}
1261
1262
1263  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1264  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1265  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1266  \subsection{General Unsigned Division Without A Machine Division Instruction}  \subsection{General Unsigned Division Without A Machine Division Instruction}
1267  \label{ccil0:sidv0:sgdn0}  \label{ccil0:sidv0:sgdn0}
1268
1269
1270  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1271  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1272  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1273  \subsection{General Unsigned Division With A Machine Division Instruction}  \subsection{General Unsigned Division With A Machine Division Instruction}
1274  \label{ccil0:sidv0:sgdu0}  \label{ccil0:sidv0:sgdu0}
1275
1276  As mentioned many places in this work, efficiency in microcontroller software  As mentioned many places in this work, efficiency in microcontroller software
1277  involves phrasing computational problems in a way which makes good use of  involves phrasing computational problems in a way which makes good use of
1278  the machine instruction set.  In Section \ref{ccil0:sidv0:sgdn0} we discussed  the machine instruction set.  In Section \ref{ccil0:sidv0:sgdn0} we discussed
1279  the classic shift-compare-subtract algorithm for division.  This algorithm  the classic shift-compare-subtract algorithm for division.  This algorithm
1280  is far from efficient.  A reasonable question to ask is whether we can  is far from efficient.  A reasonable question to ask is whether we can
1281  leverage small'' division capability (provided by the machine instruction set)  leverage small'' division capability (provided by the machine instruction set)
1282  to accomplish large'' divisions (those which we require in practice).  to accomplish large'' divisions (those which we require in practice).
1283  It ends up that this is possible:  the technique involved is effectively  It ends up that this is possible:  the technique involved is effectively
1284  to use machine division instructions to estimate the highest-order bits of  to use machine division instructions to estimate the highest-order bits of
1285  the quotient based on the highest-order bits of the dividend and divisor.  the quotient based on the highest-order bits of the dividend and divisor.
1286
1287  Knuth's discussion of division  Knuth's discussion of division
1288  algorithms \cite[pp. 270-275]{bibref:b:knuthclassic2ndedvol2} is the  algorithms \cite[pp. 270-275]{bibref:b:knuthclassic2ndedvol2} is the
1289  basis for most of the material in this subsection.  However, Knuth has  basis for most of the material in this subsection.  However, Knuth has
1290  a gift for terseness that is sometimes a curse for the reader, and so  a gift for terseness that is sometimes a curse for the reader, and so
1291  we take more time than Knuth to explain certain results.  we take more time than Knuth to explain certain results.
1292
1293  First, as a starting point, we present \emph{Algorithm D} from  First, as a starting point, we present \emph{Algorithm D} from
1294  Knuth \cite[pp. 272-273]{bibref:b:knuthclassic2ndedvol2}.  Then,  Knuth \cite[pp. 272-273]{bibref:b:knuthclassic2ndedvol2}.  Then,
1295  we justify the algorithm and explain why it is valid.  Finally,  we justify the algorithm and explain why it is valid.  Finally,
1296  we supply implementation advice for microcontroller instruction sets.  we supply implementation advice for microcontroller instruction sets.
1297
1298  \begin{vworkalgorithmstatementpar}{Arbitrary Unsigned Division Using  \begin{vworkalgorithmstatementpar}{Arbitrary Unsigned Division Using
1299  Machine Unsigned Division Instructions}  Machine Unsigned Division Instructions}
1300  \label{alg:ccil0:sidv0:sgdu0:01}  \label{alg:ccil0:sidv0:sgdu0:01}
1301  (From Knuth \cite[pp. 272-273]{bibref:b:knuthclassic2ndedvol2})  (From Knuth \cite[pp. 272-273]{bibref:b:knuthclassic2ndedvol2})
1302  Given nonnegative integers $u=(u_{m+n-1} \ldots{} u_1 u_0)_b$  Given nonnegative integers $u=(u_{m+n-1} \ldots{} u_1 u_0)_b$
1303  and $v=(v_{n-1} \ldots{} v_1 v_0)_b$, where  and $v=(v_{n-1} \ldots{} v_1 v_0)_b$, where
1304  $v_{n-1} \neq 0$ and $n > 1$, we form the radix-$b$ quotient  $v_{n-1} \neq 0$ and $n > 1$, we form the radix-$b$ quotient
1305  $\lfloor{}u/v\rfloor{} = (q_m q_{m-1} \ldots{} q_0)_b$ and  $\lfloor{}u/v\rfloor{} = (q_m q_{m-1} \ldots{} q_0)_b$ and
1306  the remainder $u \bmod v = (r_{n-1} \ldots{} r_1 r_0)_b$.  When  the remainder $u \bmod v = (r_{n-1} \ldots{} r_1 r_0)_b$.  When
1307  $n=1$, the simpler algorithm of  $n=1$, the simpler algorithm of
1308  Subsection \ref{ccil0:sidv0:sldm0}  Subsection \ref{ccil0:sidv0:sldm0}
1309  should be used.  should be used.
1310
1311  \begin{algblvl0}  \begin{algblvl0}
1312  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:01}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:01}
1313        [Normalize.] Set $d \gets \lfloor{}b/(v_{n-1} + 1)\rfloor$.        [Normalize.] Set $d \gets \lfloor{}b/(v_{n-1} + 1)\rfloor$.
1314        Then set $(u_{m+n} u_{m+n-1} \ldots{} u_1 u_0)_b$ equal to        Then set $(u_{m+n} u_{m+n-1} \ldots{} u_1 u_0)_b$ equal to
1315        $(u_{m+n-1} \ldots{} u_1 u_0)_b$ times $d$; similarly,        $(u_{m+n-1} \ldots{} u_1 u_0)_b$ times $d$; similarly,
1316        set $(v_{n-1} \ldots{} v_1 v_0)_b$ equal to        set $(v_{n-1} \ldots{} v_1 v_0)_b$ equal to
1317        $(v_{n-1} \ldots{} v_1 v_0)_b$ times $d$.  (Notice the introduction        $(v_{n-1} \ldots{} v_1 v_0)_b$ times $d$.  (Notice the introduction
1318        of a new digit position $u_{m+n}$ at the left of        of a new digit position $u_{m+n}$ at the left of
1319        $u_{m+n-1}$; if $d=1$, all we need to do in this step is set        $u_{m+n-1}$; if $d=1$, all we need to do in this step is set
1320        $u_{m+n} \gets 0$.  On a binary computer it may be preferable        $u_{m+n} \gets 0$.  On a binary computer it may be preferable
1321        to choose $d$ to be a power of 2 instead of using the value        to choose $d$ to be a power of 2 instead of using the value
1322        suggested here; any value of $d$ that results in        suggested here; any value of $d$ that results in
1323        $v_{n-1} \geq \lfloor{}b/2\rfloor$ will suffice.  See also        $v_{n-1} \geq \lfloor{}b/2\rfloor$ will suffice.  See also
1324        exercise 37.)        exercise 37.)
1325  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:02}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:02}
1326        [Initialize $j$.]  Set $j \gets m$.  (The loop on $j$,        [Initialize $j$.]  Set $j \gets m$.  (The loop on $j$,
1327        steps        steps
1328        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}
1329        through        through
1330        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:07},        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:07},
1331        will be essentially a division of        will be essentially a division of
1332        $(u_{j+n} \ldots{} u_{j+1} u_j)_b$ by $(v_{n-1} \ldots{} v_1 v_0)_b$ to        $(u_{j+n} \ldots{} u_{j+1} u_j)_b$ by $(v_{n-1} \ldots{} v_1 v_0)_b$ to
1333        get a single quotient digit $q_j$; see Figure \ref{fig:alg:ccil0:sidv0:sgdu0:01:01}.)        get a single quotient digit $q_j$; see Figure \ref{fig:alg:ccil0:sidv0:sgdu0:01:01}.)
1334
1335  \begin{figure}  \begin{figure}
1336  \centering  \centering
1337  \includegraphics[width=4.6in]{c_cil0/kdfc01.eps}  \includegraphics[width=4.6in]{c_cil0/kdfc01.eps}
1338  \caption{Flowchart For Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} (From \cite[p. 273]{bibref:b:knuthclassic2ndedvol2})}  \caption{Flowchart For Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} (From \cite[p. 273]{bibref:b:knuthclassic2ndedvol2})}
1339  \label{fig:alg:ccil0:sidv0:sgdu0:01:01}  \label{fig:alg:ccil0:sidv0:sgdu0:01:01}
1340  \end{figure}  \end{figure}
1341
1342  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:03}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:03}
1343        [Calculate $\hat{q}$.]  Set        [Calculate $\hat{q}$.]  Set
1344        $\hat{q} \gets \lfloor{}u_{j+n}b + u_{j+n-1})/v_{n-1}\rfloor$ and        $\hat{q} \gets \lfloor{}u_{j+n}b + u_{j+n-1})/v_{n-1}\rfloor$ and
1345        let $\hat{r}$ be the remainder, $(u_{j+n}b + u_{j+n-1}) \bmod v_{n-1}$.        let $\hat{r}$ be the remainder, $(u_{j+n}b + u_{j+n-1}) \bmod v_{n-1}$.
1346        Now test if $\hat{q} = b$ or $\hat{q} v_{n-2} > b\hat{r} + u_{j+n-2}$;        Now test if $\hat{q} = b$ or $\hat{q} v_{n-2} > b\hat{r} + u_{j+n-2}$;
1347        if so, decrease $\hat{q}$ by 1, increase $\hat{r}$ by $v_{n-1}$, and repeat        if so, decrease $\hat{q}$ by 1, increase $\hat{r}$ by $v_{n-1}$, and repeat
1348        this test if $\hat{r} < b$.  (The test of $v_{n-2}$ determines at high        this test if $\hat{r} < b$.  (The test of $v_{n-2}$ determines at high
1349        speed most of the cases in which the trial value $\hat{q}$ is one too large,        speed most of the cases in which the trial value $\hat{q}$ is one too large,
1350        and it eliminates \emph{all} cases where $\hat{q}$ is two too large;        and it eliminates \emph{all} cases where $\hat{q}$ is two too large;
1351        see exercises 19, 20, 21.)        see exercises 19, 20, 21.)
1352  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:04}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:04}
1353        [Multiply and subtract.]  Replace $(u_{j+n} u_{j+n-1} \ldots{} u_j)_b$ by        [Multiply and subtract.]  Replace $(u_{j+n} u_{j+n-1} \ldots{} u_j)_b$ by
1354
1355
1356        \nonumber        \nonumber
1357        (u_{j+n} u_{j+n-1} \ldots{} u_j)_b - \hat{q} (0 v_{n-1} \ldots{} v_1 v_0)_b.        (u_{j+n} u_{j+n-1} \ldots{} u_j)_b - \hat{q} (0 v_{n-1} \ldots{} v_1 v_0)_b.
1358
1359
1360        This computation consists of a simple multiplication by a one-place number,        This computation consists of a simple multiplication by a one-place number,
1361        combined with a subtraction.  The digits $(u_{j+n}, u_{j+n-1}, \ldots{}, u_j)$        combined with a subtraction.  The digits $(u_{j+n}, u_{j+n-1}, \ldots{}, u_j)$
1362        should be kept positive; if the result of this step is actually negative,        should be kept positive; if the result of this step is actually negative,
1363        $(u_{j+n} u_{j+n-1} \ldots{} u_j)_b$ should be left as the true        $(u_{j+n} u_{j+n-1} \ldots{} u_j)_b$ should be left as the true
1364        value plus $b^{n+1}$, namely as the $b$'s complement of the true value, and        value plus $b^{n+1}$, namely as the $b$'s complement of the true value, and
1365        a borrow'' to the left should be remembered.        a borrow'' to the left should be remembered.
1366  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:05}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:05}
1367        [Test remainder.]  Set $q_j \gets \hat{q}$.  If the result of step        [Test remainder.]  Set $q_j \gets \hat{q}$.  If the result of step
1368        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:04} was negative, go to        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:04} was negative, go to
1369        step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}; otherwise go on to        step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}; otherwise go on to
1370        step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:07}.        step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:07}.
1371  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:06}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:06}
1372        [Add back.]  (The probability that this step is necessary is very small, on the        [Add back.]  (The probability that this step is necessary is very small, on the
1373        order of only $2/b$, as shown in exercise 21; test data to activate this step        order of only $2/b$, as shown in exercise 21; test data to activate this step
1374        should therefore be specifically contrived when debugging.)  Decrease        should therefore be specifically contrived when debugging.)  Decrease
1375        $q_j$ by 1, and add $(0 v_{n-1} \ldots v_1 v_0)_b$ to        $q_j$ by 1, and add $(0 v_{n-1} \ldots v_1 v_0)_b$ to
1376        $(u_{j+n} u_{j+n-1} \ldots{} u_{j+1} u_j)_b$.  (A carry will occur to the left of        $(u_{j+n} u_{j+n-1} \ldots{} u_{j+1} u_j)_b$.  (A carry will occur to the left of
1377        $u_{j+n}$, and it should be ignored since it cancels with the borrow that        $u_{j+n}$, and it should be ignored since it cancels with the borrow that
1378        occured in step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:04}.)        occured in step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:04}.)
1379  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:07}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:07}
1380        [Loop on $j$.]  Decrease $j$ by one.  Now if $j \geq 0$, go back to step        [Loop on $j$.]  Decrease $j$ by one.  Now if $j \geq 0$, go back to step
1381        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}.        \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}.
1382  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:08}  \item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:08}
1383        [Unnormalize.]  Now $(q_m \ldots{} q_1 q_0)_b$ is the desired quotient, and        [Unnormalize.]  Now $(q_m \ldots{} q_1 q_0)_b$ is the desired quotient, and
1384        the desired remainder may be obtained by dividing        the desired remainder may be obtained by dividing
1385        $(u_{n-1} \ldots{} u_1 u_0)_b$ by $d$.        $(u_{n-1} \ldots{} u_1 u_0)_b$ by $d$.
1386  \end{algblvl0}  \end{algblvl0}
1387  \end{vworkalgorithmstatementpar}  \end{vworkalgorithmstatementpar}
1388  \vworkalgorithmfooter{}  \vworkalgorithmfooter{}
1389
1390  The general idea of Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} is that  The general idea of Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} is that
1391  digits (machine words) of the quotient $q$ can be successively estimated  digits (machine words) of the quotient $q$ can be successively estimated
1392  based on the first digits of the dividend and divisor.  Knuth  based on the first digits of the dividend and divisor.  Knuth
1393  \cite[p. 271]{bibref:b:knuthclassic2ndedvol2} explores the properties of  \cite[p. 271]{bibref:b:knuthclassic2ndedvol2} explores the properties of
1394  the digit estimate  the digit estimate
1395
1396
1397  \label{eq:ccil0:sidv0:sgdu0:01}  \label{eq:ccil0:sidv0:sgdu0:01}
1398  \hat{q} = \min \left( {\left\lfloor{\frac{u_n b + u_{n-1}}{v_{n-1}}}\right\rfloor, b-1} \right).  \hat{q} = \min \left( {\left\lfloor{\frac{u_n b + u_{n-1}}{v_{n-1}}}\right\rfloor, b-1} \right).
1399
1400
1401  The first point to make about an estimate in the form of  The first point to make about an estimate in the form of
1402  (\ref{eq:ccil0:sidv0:sgdu0:01}) is that it can only be accomplished  (\ref{eq:ccil0:sidv0:sgdu0:01}) is that it can only be accomplished
1403  efficiently if the machine-native division instruction supports  efficiently if the machine-native division instruction supports
1404  overflow detection, since it is possible that  overflow detection, since it is possible that
1405  $(u_n b + u_{n-1})/v_{n-1} \geq b$, even if  $(u_n b + u_{n-1})/v_{n-1} \geq b$, even if
1406  $u/v < b$, as is shown by the following example.  $u/v < b$, as is shown by the following example.
1407
1408  \begin{vworkexamplestatement}  \begin{vworkexamplestatement}
1409  \label{ex:ccil0:sidv0:sgdu0:01:01}  \label{ex:ccil0:sidv0:sgdu0:01:01}
1410  Assume that we wish to apply the estimate of $\hat{q}$ provided by  Assume that we wish to apply the estimate of $\hat{q}$ provided by
1411  (\ref{eq:ccil0:sidv0:sgdu0:01})  (\ref{eq:ccil0:sidv0:sgdu0:01})
1412  to $u=16,776,704$ and $v=65,535$.  Demonstrate that a machine division  to $u=16,776,704$ and $v=65,535$.  Demonstrate that a machine division
1413  overflow will occur when estimating the first digit, assuming a processor  overflow will occur when estimating the first digit, assuming a processor
1414  that can divide a 16-bit dividend by an 8-bit divisor to produce an 8-bit  that can divide a 16-bit dividend by an 8-bit divisor to produce an 8-bit
1415  quotient.  quotient.
1416  \end{vworkexamplestatement}  \end{vworkexamplestatement}
1417  \begin{vworkexampleparsection}{Solution}  \begin{vworkexampleparsection}{Solution}
1418  Note that according to Knuth's intention, the word size on such a machine  Note that according to Knuth's intention, the word size on such a machine
1419  is 8 bits.  Thus, $b=256$.  Note that $u/v = 255 + 255/256 < b = 256$, as  is 8 bits.  Thus, $b=256$.  Note that $u/v = 255 + 255/256 < b = 256$, as
1420  required by Knuth's precondition.  However, although $u/v < b$,  required by Knuth's precondition.  However, although $u/v < b$,
1421  $u = [255 \; 254 \; 0] [256^2 \; 256 \; 1]^T = [u_2 u_1 u_0] [b^2 b^1 b^0]^T$ and  $u = [255 \; 254 \; 0] [256^2 \; 256 \; 1]^T = [u_2 u_1 u_0] [b^2 b^1 b^0]^T$ and
1422  $v = [255 \; 255] [256 \; 1]^T = [v_1 v_0] [b^1 b^0]^T$, so that calculating  $v = [255 \; 255] [256 \; 1]^T = [v_1 v_0] [b^1 b^0]^T$, so that calculating
1423  an estimate $\hat{q}$ as required by (\ref{eq:ccil0:sidv0:sgdu0:01}),  an estimate $\hat{q}$ as required by (\ref{eq:ccil0:sidv0:sgdu0:01}),
1424  $(u_n b + u_{n-1})/v_{n-1} = 65,534/255 = 256 + 254/255 \geq b$, is a division  $(u_n b + u_{n-1})/v_{n-1} = 65,534/255 = 256 + 254/255 \geq b$, is a division
1425  overflow for a single machine division instruction.  Thus, it follows that  overflow for a single machine division instruction.  Thus, it follows that
1426  a machine with division overflow detection can quickly determine that $b-1$ from  a machine with division overflow detection can quickly determine that $b-1$ from
1427  (\ref{eq:ccil0:sidv0:sgdu0:01}) is the minimum, whereas a machine without  (\ref{eq:ccil0:sidv0:sgdu0:01}) is the minimum, whereas a machine without
1428  division overflow  division overflow
1429  detection would have to use several additional machine instructions to make  detection would have to use several additional machine instructions to make
1430  this determination.  this determination.
1431  \end{vworkexampleparsection}  \end{vworkexampleparsection}
1432  \vworkexamplefooter{}  \vworkexamplefooter{}
1433
1434  The second thing to establish about $\hat{q}$ as defined by  The second thing to establish about $\hat{q}$ as defined by
1435  (\ref{eq:ccil0:sidv0:sgdu0:01}) is how good'' of an estimate  (\ref{eq:ccil0:sidv0:sgdu0:01}) is how good'' of an estimate
1436  $\hat{q}$ is---how much information, exactly, about $q$ can we  $\hat{q}$ is---how much information, exactly, about $q$ can we
1437  obtain by examining the first two words of $u$ and the first  obtain by examining the first two words of $u$ and the first
1438  word of $v$?  word of $v$?
1439
1440  We first establish in the following lemma that our estimate of  We first establish in the following lemma that our estimate of
1441  $q$, $\hat{q}$, can be no less than $q$.  $q$, $\hat{q}$, can be no less than $q$.
1442
1443  \begin{vworklemmastatementpar}{\mbox{\boldmath$\hat{q} \geq q$}}  \begin{vworklemmastatementpar}{\mbox{\boldmath$\hat{q} \geq q$}}
1444  \label{lem:ccil0:sidv0:sgdu0:01}  \label{lem:ccil0:sidv0:sgdu0:01}
1445  The estimate of $q$ provided by (\ref{eq:ccil0:sidv0:sgdu0:01}),  The estimate of $q$ provided by (\ref{eq:ccil0:sidv0:sgdu0:01}),
1446  $\hat{q}$ is always at least as great as the actual value of  $\hat{q}$ is always at least as great as the actual value of
1447  $q$, i.e. $\hat{q} \geq q$.  $q$, i.e. $\hat{q} \geq q$.
1448  \end{vworklemmastatementpar}  \end{vworklemmastatementpar}
1449  \begin{vworklemmaproof}  \begin{vworklemmaproof}
1450  Knowledge of $u_{n}$, $u_{n-1}$, and $v_{n-1}$ necessarily confine  Knowledge of $u_{n}$, $u_{n-1}$, and $v_{n-1}$ necessarily confine
1451  the intervals in which the actual values of $u$ and $v$ may be;  the intervals in which the actual values of $u$ and $v$ may be;
1452  specifically:\footnote{In  specifically:\footnote{In
1453  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:04})  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:04})
1454  and  and
1455  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:07}),  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:07}),
1456  we use statements of the form $x = x$'' as an idiom for  we use statements of the form $x = x$'' as an idiom for
1457  $x$ is known''.}  $x$ is known''.}
1458
1459  \begin{eqnarray}  \begin{eqnarray}
1460  \label{eq:lem:ccil0:sidv0:sgdu0:01:01}  \label{eq:lem:ccil0:sidv0:sgdu0:01:01}
1461  u & = & \sum_{i=0}^{n} u_i b^i   \\  u & = & \sum_{i=0}^{n} u_i b^i   \\
1462  \label{eq:lem:ccil0:sidv0:sgdu0:01:02}  \label{eq:lem:ccil0:sidv0:sgdu0:01:02}
1463    & = & u_n b^n + u_{n-1} b^{n-1} + \ldots{} + u_2 b^2 + u_1 b + u_0   \\    & = & u_n b^n + u_{n-1} b^{n-1} + \ldots{} + u_2 b^2 + u_1 b + u_0   \\
1464  \label{eq:lem:ccil0:sidv0:sgdu0:01:03}  \label{eq:lem:ccil0:sidv0:sgdu0:01:03}
1465    & = & (u_n b + u_{n-1}) b^{n-1} + \ldots{} + u_2 b^2 + u_1 b + u_0    & = & (u_n b + u_{n-1}) b^{n-1} + \ldots{} + u_2 b^2 + u_1 b + u_0
1466  \end{eqnarray}  \end{eqnarray}
1467
1468  \begin{eqnarray}  \begin{eqnarray}
1469  \nonumber & (u_n = u_n \wedge u_{n-1} = u_{n-1}) & \\  \nonumber & (u_n = u_n \wedge u_{n-1} = u_{n-1}) & \\
1470  \label{eq:lem:ccil0:sidv0:sgdu0:01:04}  \label{eq:lem:ccil0:sidv0:sgdu0:01:04}
1471  & \vworkvimp & \\  & \vworkvimp & \\
1472  \nonumber & (u_n b + u_{n-1}) b^{n-1} \leq u \leq (u_n b + u_{n-1}) b^{n-1} + b^{n-1} - 1 &  \nonumber & (u_n b + u_{n-1}) b^{n-1} \leq u \leq (u_n b + u_{n-1}) b^{n-1} + b^{n-1} - 1 &
1473  \end{eqnarray}  \end{eqnarray}
1474
1475  \begin{eqnarray}  \begin{eqnarray}
1476  \label{eq:lem:ccil0:sidv0:sgdu0:01:05}  \label{eq:lem:ccil0:sidv0:sgdu0:01:05}
1477  v & = & \sum_{i=0}^{n-1} v_i b^i   \\  v & = & \sum_{i=0}^{n-1} v_i b^i   \\
1478  \label{eq:lem:ccil0:sidv0:sgdu0:01:06}  \label{eq:lem:ccil0:sidv0:sgdu0:01:06}
1479  & = & v_{n-1} b^{n-1} + v_{n-2} b^{n-2} + \ldots{} + v_2 b^2 + v_1 b + v_0  & = & v_{n-1} b^{n-1} + v_{n-2} b^{n-2} + \ldots{} + v_2 b^2 + v_1 b + v_0
1480  \end{eqnarray}  \end{eqnarray}
1481
1482
1483  \label{eq:lem:ccil0:sidv0:sgdu0:01:07}  \label{eq:lem:ccil0:sidv0:sgdu0:01:07}
1484  (v_{n-1} = v_{n-1})  (v_{n-1} = v_{n-1})
1485  \vworkhimp  \vworkhimp
1486  v_{n-1} b^{n-1} \leq v \leq v_{n-1} b^{n-1} + b^{n-1} - 1  v_{n-1} b^{n-1} \leq v \leq v_{n-1} b^{n-1} + b^{n-1} - 1
1487
1488
1489  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:04}) and  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:04}) and
1490  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:07}) reflect the uncertainties in the  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:07}) reflect the uncertainties in the
1491  values of $u$ and $v$ respectively because only the first digit(s) of  values of $u$ and $v$ respectively because only the first digit(s) of
1492  $u$ and $v$ are being considered in forming the estimate $\hat{q}$.  $u$ and $v$ are being considered in forming the estimate $\hat{q}$.
1493
1494  By definition, the actual value of $q$ is $\lfloor{}u/v\rfloor$.  For a  By definition, the actual value of $q$ is $\lfloor{}u/v\rfloor$.  For a
1495  rational function $f(u,v) = u/v$ where $u \in [u_{min}, u_{max}]$ and  rational function $f(u,v) = u/v$ where $u \in [u_{min}, u_{max}]$ and
1496  $v \in [v_{min}, v_{max}]$, the minimum value of $u/v$ occurs at  $v \in [v_{min}, v_{max}]$, the minimum value of $u/v$ occurs at
1497  $u_{min}/v_{max}$, and the maximum value of $u/v$ occurs at  $u_{min}/v_{max}$, and the maximum value of $u/v$ occurs at
1498  $u_{max}/v_{min}$.  We can therefore write that  $u_{max}/v_{min}$.  We can therefore write that
1499
1500
1501  \label{eq:lem:ccil0:sidv0:sgdu0:01:08}  \label{eq:lem:ccil0:sidv0:sgdu0:01:08}
1502  \left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1}}{v_{n-1} b^{n-1} + b^{n-1} - 1}}\right\rfloor  \left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1}}{v_{n-1} b^{n-1} + b^{n-1} - 1}}\right\rfloor
1503  \leq  \leq
1504  q  q
1505  \leq  \leq
1506  \left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1} + b^{n-1} - 1}{v_{n-1} b^{n-1}}}\right\rfloor .  \left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1} + b^{n-1} - 1}{v_{n-1} b^{n-1}}}\right\rfloor .
1507
1508
1509  In other words, knowledge of $u_{n}$, $u_{n-1}$, and $v_{n-1}$ confines $q$ to the  In other words, knowledge of $u_{n}$, $u_{n-1}$, and $v_{n-1}$ confines $q$ to the
1510  interval indicated in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}).  We must prove that,  interval indicated in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}).  We must prove that,
1511  given a specific $u_{n}$, $u_{n-1}$, and $v_{n-1}$, $\hat{q}$ is at least as large as  given a specific $u_{n}$, $u_{n-1}$, and $v_{n-1}$, $\hat{q}$ is at least as large as
1512  the upper bound in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}); otherwise we could find a  the upper bound in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}); otherwise we could find a
1513  $q$ such that $q > \hat{q}$.  We can algebraically manipulate the upper bound in  $q$ such that $q > \hat{q}$.  We can algebraically manipulate the upper bound in
1514  in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}) to yield  in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}) to yield
1515
1516
1517  \label{eq:lem:ccil0:sidv0:sgdu0:01:09}  \label{eq:lem:ccil0:sidv0:sgdu0:01:09}
1518  \left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1}}{v_{n-1} b^{n-1} + b^{n-1} - 1}}\right\rfloor  \left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1}}{v_{n-1} b^{n-1} + b^{n-1} - 1}}\right\rfloor
1519  \leq  \leq
1520  q  q
1521  \leq  \leq
1522  \left\lfloor{\frac{u_n b + u_{n-1} + \frac{b^{n-1}-1}{b^{n-1}}}{v_{n-1}}}\right\rfloor .  \left\lfloor{\frac{u_n b + u_{n-1} + \frac{b^{n-1}-1}{b^{n-1}}}{v_{n-1}}}\right\rfloor .
1523
1524
1525  In (\ref{eq:lem:ccil0:sidv0:sgdu0:01:09}), since $(b^{n-1}-1)/b^{n-1} < 1$ and since  In (\ref{eq:lem:ccil0:sidv0:sgdu0:01:09}), since $(b^{n-1}-1)/b^{n-1} < 1$ and since
1526  $u_n b + u_{n-1}$ is an integer, we can conclude that  $u_n b + u_{n-1}$ is an integer, we can conclude that
1527  $\lfloor u_n b + u_{n-1} + (b^{n-1}-1)/b^{n-1} \rfloor = \lfloor u_n b + u_{n-1} \rfloor$  $\lfloor u_n b + u_{n-1} + (b^{n-1}-1)/b^{n-1} \rfloor = \lfloor u_n b + u_{n-1} \rfloor$
1528  and hence that  and hence that
1529
1530
1531  \label{eq:lem:ccil0:sidv0:sgdu0:01:10}  \label{eq:lem:ccil0:sidv0:sgdu0:01:10}
1532  q  q
1533  \leq  \leq
1534  \left\lfloor{\frac{u_n b + u_{n-1} + \frac{b^{n-1}-1}{b^{n-1}}}{v_{n-1}}}\right\rfloor  \left\lfloor{\frac{u_n b + u_{n-1} + \frac{b^{n-1}-1}{b^{n-1}}}{v_{n-1}}}\right\rfloor
1535  =  =
1536  \left\lfloor{\frac{u_n b + u_{n-1}}{v_{n-1}}}\right\rfloor  \left\lfloor{\frac{u_n b + u_{n-1}}{v_{n-1}}}\right\rfloor
1537  =  =
1538  \hat{q} .  \hat{q} .
1539
1540
1541  Therefore, $\hat{q} \geq q$.  Therefore, $\hat{q} \geq q$.
1542  \end{vworklemmaproof}  \end{vworklemmaproof}
1543  \vworklemmafooter{}  \vworklemmafooter{}
1544
1545  Lemma \ref{lem:ccil0:sidv0:sgdu0:01} establishes that  Lemma \ref{lem:ccil0:sidv0:sgdu0:01} establishes that
1546  a digit estimate $\hat{q}$ based on the first digit of the  a digit estimate $\hat{q}$ based on the first digit of the
1547  divisor $v$ can be no less than the actual digit $q$, i.e.  divisor $v$ can be no less than the actual digit $q$, i.e.
1548  $\hat{q}-q \geq 0$.  However, we must also establish an upper bound  $\hat{q}-q \geq 0$.  However, we must also establish an upper bound
1549  on $\hat{q}-q$.  on $\hat{q}-q$.
1550
1551  Intuitively, based on  Intuitively, based on
1552  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:06}), we might guess that  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:06}), we might guess that
1553  if $v_{n-1}$ is small, the estimate $\hat{q}$ may be quite  if $v_{n-1}$ is small, the estimate $\hat{q}$ may be quite
1554  poor, as the interval to which the actual value of $v$ is confined  poor, as the interval to which the actual value of $v$ is confined
1555  may be quite large.  This fact is the basis for the normalization  may be quite large.  This fact is the basis for the normalization
1556  step [\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:01}] in Algorithm  step [\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:01}] in Algorithm
1557  \ref{alg:ccil0:sidv0:sgdu0:01}.  We now prove a useful result  \ref{alg:ccil0:sidv0:sgdu0:01}.  We now prove a useful result
1558  for how much $u, v$ must be normalized so that $\hat{q}-q \leq 2$.  for how much $u, v$ must be normalized so that $\hat{q}-q \leq 2$.
1559
1560  \begin{vworklemmastatementpar}{Normalization Requirement So That  \begin{vworklemmastatementpar}{Normalization Requirement So That
1561                                 \mbox{\boldmath$\hat{q} - q \leq 2$}}                                 \mbox{\boldmath$\hat{q} - q \leq 2$}}
1562  \label{lem:ccil0:sidv0:sgdu0:02}  \label{lem:ccil0:sidv0:sgdu0:02}
1563  If $v_{n-1} \geq \lfloor b/2 \rfloor$ and $\hat{q}$ is chosen as  If $v_{n-1} \geq \lfloor b/2 \rfloor$ and $\hat{q}$ is chosen as
1564  indicated in (\ref{eq:ccil0:sidv0:sgdu0:01}), then  indicated in (\ref{eq:ccil0:sidv0:sgdu0:01}), then
1565  $0 \leq \hat{q} - q \leq 2$.  $0 \leq \hat{q} - q \leq 2$.
1566  \end{vworklemmastatementpar}  \end{vworklemmastatementpar}
1567  \begin{vworklemmaproof}  \begin{vworklemmaproof}
1568  The lower limit on $\hat{q} - q$ is proved in Lemma \ref{lem:ccil0:sidv0:sgdu0:01}.  The lower limit on $\hat{q} - q$ is proved in Lemma \ref{lem:ccil0:sidv0:sgdu0:01}.
1569  We now seek only to prove that $\hat{q} - q \leq 2$.  We now seek only to prove that $\hat{q} - q \leq 2$.
1570  By definition of $\hat{q}$ and $q$,  By definition of $\hat{q}$ and $q$,
1571
1572
1573  \label{eq:lem:ccil0:sidv0:sgdu0:02:01}  \label{eq:lem:ccil0:sidv0:sgdu0:02:01}
1574  \hat{q} - q  =    \left\lfloor {\frac{u_n b + u_{n-1}}{v_{n-1}}} \right\rfloor  \hat{q} - q  =    \left\lfloor {\frac{u_n b + u_{n-1}}{v_{n-1}}} \right\rfloor
1575                    - \left\lfloor {\frac{u}{v}} \right\rfloor                    - \left\lfloor {\frac{u}{v}} \right\rfloor
1576
1577
1578  When only nonnegative integers are involved,  When only nonnegative integers are involved,
1579  (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02})  (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02})
1580  supplies an exact expression for the floor of a  supplies an exact expression for the floor of a
1581  ratio of integers.  Using (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02}),  ratio of integers.  Using (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02}),
1582  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:01}) can be decomposed into  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:01}) can be decomposed into
1583
1584
1585  \label{eq:lem:ccil0:sidv0:sgdu0:02:02}  \label{eq:lem:ccil0:sidv0:sgdu0:02:02}
1586  \hat{q} - q  =    \frac{u_n b + u_{n-1}}{v_{n-1}}  \hat{q} - q  =    \frac{u_n b + u_{n-1}}{v_{n-1}}
1587                  - \frac{(u_n b + u_{n-1}) \bmod v_{n-1}}{v_{n-1}}                  - \frac{(u_n b + u_{n-1}) \bmod v_{n-1}}{v_{n-1}}
1588                  - \frac{u}{v}                  - \frac{u}{v}
1589                  + \frac{u \bmod v}{v} .                  + \frac{u \bmod v}{v} .
1590
1591
1592  \noindent{}Note that (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) is an exact  \noindent{}Note that (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) is an exact
1593  expression (rather than an  expression (rather than an
1594  inequality).  inequality).
1595
1596  Note in (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) that  Note in (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) that
1597  $(u_n b + u_{n-1}) \bmod v_{n-1} \in [0, v_{n-1}-1]$, and that in general  $(u_n b + u_{n-1}) \bmod v_{n-1} \in [0, v_{n-1}-1]$, and that in general
1598  there is no reason to expect it cannot be zero.  Thus, we can assume that  there is no reason to expect it cannot be zero.  Thus, we can assume that
1599  it \emph{is} zero, which will maximize $\hat{q}-q$.  We can thus convert  it \emph{is} zero, which will maximize $\hat{q}-q$.  We can thus convert
1600  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) into the inequality  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) into the inequality
1601
1602
1603  \label{eq:lem:ccil0:sidv0:sgdu0:02:03}  \label{eq:lem:ccil0:sidv0:sgdu0:02:03}
1604  \hat{q} - q  \leq \frac{u_n b + u_{n-1}}{v_{n-1}}  \hat{q} - q  \leq \frac{u_n b + u_{n-1}}{v_{n-1}}
1605                  - \frac{u}{v}                  - \frac{u}{v}
1606                  + \frac{u \bmod v}{v} .                  + \frac{u \bmod v}{v} .
1607
1608
1609  In (\ref{eq:lem:ccil0:sidv0:sgdu0:02:03}) we can also observe that  In (\ref{eq:lem:ccil0:sidv0:sgdu0:02:03}) we can also observe that
1610  $(u \bmod v)/v \in [0, (v-1)/v]$.  If we replace this expression with  $(u \bmod v)/v \in [0, (v-1)/v]$.  If we replace this expression with
1611  1'' (which is unattainable, but barely), this will change the relational  1'' (which is unattainable, but barely), this will change the relational
1612  operator from $\leq$'' to $<$'':  operator from $\leq$'' to $<$'':
1613
1614
1615  \label{eq:lem:ccil0:sidv0:sgdu0:02:04}  \label{eq:lem:ccil0:sidv0:sgdu0:02:04}
1616  \hat{q} - q  <    \frac{u_n b + u_{n-1}}{v_{n-1}}  \hat{q} - q  <    \frac{u_n b + u_{n-1}}{v_{n-1}}
1617                  - \frac{u}{v}                  - \frac{u}{v}
1618                  + 1 .                  + 1 .
1619
1620
1621  The result we wish to show is that with $v_{n-1} \geq \lfloor b/2 \rfloor$,  The result we wish to show is that with $v_{n-1} \geq \lfloor b/2 \rfloor$,
1622  $\hat{q}-q \leq 2$.  To simplify the subsequent algebraic manipulations, note in  $\hat{q}-q \leq 2$.  To simplify the subsequent algebraic manipulations, note in
1623  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:04}) that  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:04}) that
1624
1625  \begin{eqnarray}  \begin{eqnarray}
1626  \label{eq:lem:ccil0:sidv0:sgdu0:02:05}  \label{eq:lem:ccil0:sidv0:sgdu0:02:05}
1627            & \hat{q} - q  \leq 2 &  \\                & \hat{q} - q  \leq 2 &  \\
1628  \nonumber & \vworkvertequiv     &  \\  \nonumber & \vworkvertequiv     &  \\
1629  \label{eq:lem:ccil0:sidv0:sgdu0:02:06}  \label{eq:lem:ccil0:sidv0:sgdu0:02:06}
1630            & \displaystyle \frac{u_n b + u_{n-1}}{v_{n-1}}            & \displaystyle \frac{u_n b + u_{n-1}}{v_{n-1}}
1631                  - \frac{u}{v}                  - \frac{u}{v}
1632                  + 1 \leq 3 &          \\                  + 1 \leq 3 &          \\
1633  \nonumber & \vworkvertequiv     &  \\  \nonumber & \vworkvertequiv     &  \\
1634  \label{eq:lem:ccil0:sidv0:sgdu0:02:07}  \label{eq:lem:ccil0:sidv0:sgdu0:02:07}
1635            & \displaystyle \frac{u_n b + u_{n-1}}{v_{n-1}}            & \displaystyle \frac{u_n b + u_{n-1}}{v_{n-1}}
1636                  - \frac{u}{v} \leq 2    &                  - \frac{u}{v} \leq 2    &
1637  \end{eqnarray}  \end{eqnarray}
1638
1639  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:06}) may be counterintuitive, so  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:06}) may be counterintuitive, so
1640  further explanation is offered here.  Since $\hat{q} \in \vworkintset$ and $q \in \vworkintset$,  further explanation is offered here.  Since $\hat{q} \in \vworkintset$ and $q \in \vworkintset$,
1641  $\hat{q}-q \in \vworkintset$.  Thus, proving  $\hat{q}-q \in \vworkintset$.  Thus, proving
1642  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:06}) or  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:06}) or
1643  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}) proves that  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}) proves that
1644  $\hat{q}-q \in \{ \ldots, -1, 0, 1, 2 \}$ (however, by  $\hat{q}-q \in \{ \ldots, -1, 0, 1, 2 \}$ (however, by
1645  Lemma \ref{lem:ccil0:sidv0:sgdu0:01}, $\hat{q}-q \geq 0$, so in fact  Lemma \ref{lem:ccil0:sidv0:sgdu0:01}, $\hat{q}-q \geq 0$, so in fact
1646  what would be proved is that $\hat{q}-q \in \{ 0, 1, 2 \}$).  For  what would be proved is that $\hat{q}-q \in \{ 0, 1, 2 \}$).  For
1647  algebraic simplicity,  algebraic simplicity,
1648  we choose to prove (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}).  we choose to prove (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}).
1649
1650  First, adjust numerator and denominator of the first term in  First, adjust numerator and denominator of the first term in
1651  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}) by $b^{n-1}$ so that the terms more closely  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}) by $b^{n-1}$ so that the terms more closely
1652  resemble $u$ and $v$ in  resemble $u$ and $v$ in
1653  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:02})  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:02})
1654  and  and
1655  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:06}):  (\ref{eq:lem:ccil0:sidv0:sgdu0:01:06}):
1656
1657
1658  \label{eq:lem:ccil0:sidv0:sgdu0:02:08}  \label{eq:lem:ccil0:sidv0:sgdu0:02:08}
1659  \frac{u_n b^n + u_{n-1} b^{n-1}}{v_{n-1} b^{n-1}}  \frac{u_n b^n + u_{n-1} b^{n-1}}{v_{n-1} b^{n-1}}
1660                  - \frac{u}{v} \leq 2 .                  - \frac{u}{v} \leq 2 .
1661
1662
1663  For logical implication to be maintained,  For logical implication to be maintained,
1664  we must make the most pessimistic choices and assumptions possible in  we must make the most pessimistic choices and assumptions possible in
1665  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:08}) in order to maximize the value  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:08}) in order to maximize the value
1666  of the left side of the inequality.  of the left side of the inequality.
1667  The first assumption to be made is the error in estimating  The first assumption to be made is the error in estimating
1668  $u$ and $v$ based on their most significant digits.  It can be  $u$ and $v$ based on their most significant digits.  It can be
1669  seen that (\ref{eq:lem:ccil0:sidv0:sgdu0:02:08}) will be maximized if:  seen that (\ref{eq:lem:ccil0:sidv0:sgdu0:02:08}) will be maximized if:
1670
1671  \begin{itemize}  \begin{itemize}
1672  \item We assume that $u = u_{n} b^n + u_{n-1} b^{n-1}$ (i.e. that we estimate  \item We assume that $u = u_{n} b^n + u_{n-1} b^{n-1}$ (i.e. that we estimate
1673        $u$ precisely).        $u$ precisely).
1674  \item We assume that $v = v_{n-1} b^{n-1} + b^{n-1} - 1$ (i.e. that  \item We assume that $v = v_{n-1} b^{n-1} + b^{n-1} - 1$ (i.e. that
1675        we underestimate $v$ by the maximum amount possible).        we underestimate $v$ by the maximum amount possible).
1676  \item We minimize the value of $v_{n-1} b^{n-1}$.  \item We minimize the value of $v_{n-1} b^{n-1}$.
1677  \end{itemize}  \end{itemize}
1678
1679  Assuming that $u$ is estimated precisely yields  Assuming that $u$ is estimated precisely yields
1680
1681
1682  \label{eq:lem:ccil0:sidv0:sgdu0:02:09}  \label{eq:lem:ccil0:sidv0:sgdu0:02:09}
1683  \frac{u}{v_{n-1} b^{n-1}}  \frac{u}{v_{n-1} b^{n-1}}
1684                  - \frac{u}{v} \leq 2 .                  - \frac{u}{v} \leq 2 .
1685
1686
1687  Assuming that $v$ is underestimated by the maximum amount possible  Assuming that $v$ is underestimated by the maximum amount possible
1688  yields  yields
1689
1690
1691  \label{eq:lem:ccil0:sidv0:sgdu0:02:10}  \label{eq:lem:ccil0:sidv0:sgdu0:02:10}
1692  \frac{u}{v - b^{n-1} + 1}  \frac{u}{v - b^{n-1} + 1}
1693                  - \frac{u}{v} \leq 2 .                  - \frac{u}{v} \leq 2 .
1694
1695
1696  Finally, with $b$ and $v$ fixed, $u$ can be maximized by noting that  Finally, with $b$ and $v$ fixed, $u$ can be maximized by noting that
1697  $u \leq bv - 1$ (by the problem assumption that the quotient is a single digit).  $u \leq bv - 1$ (by the problem assumption that the quotient is a single digit).
1698  However, for algebraic simplicity, we make the substitution $u=bv$ (rather than  However, for algebraic simplicity, we make the substitution $u=bv$ (rather than
1699  $u=bv-1$), since the weaker upper bound is strong enough to prove the  $u=bv-1$), since the weaker upper bound is strong enough to prove the
1700  first result we seek.  first result we seek.
1701
1702
1703  \label{eq:lem:ccil0:sidv0:sgdu0:02:11}  \label{eq:lem:ccil0:sidv0:sgdu0:02:11}
1704  \frac{bv}{v - b^{n-1} + 1}  \frac{bv}{v - b^{n-1} + 1}
1705                  - \frac{bv}{v} \leq 2                  - \frac{bv}{v} \leq 2
1706
1707
1708
1709  \label{eq:lem:ccil0:sidv0:sgdu0:02:12}  \label{eq:lem:ccil0:sidv0:sgdu0:02:12}
1710  \frac{bv}{v - b^{n-1} + 1}  \frac{bv}{v - b^{n-1} + 1}
1711                  - b \leq 2                  - b \leq 2
1712
1713
1714  Solving (\ref{eq:lem:ccil0:sidv0:sgdu0:02:12})  Solving (\ref{eq:lem:ccil0:sidv0:sgdu0:02:12})
1715  for $v$ yields  for $v$ yields
1716
1717
1718  \label{eq:lem:ccil0:sidv0:sgdu0:02:13}  \label{eq:lem:ccil0:sidv0:sgdu0:02:13}
1719  v \geq \frac{b^n}{2} + b^{n-1} - \frac{1}{b} - 1  v \geq \frac{b^n}{2} + b^{n-1} - \frac{1}{b} - 1
1720
1721
1722  Again using the assumption that $v$ is underestimated by the maximum amount  Again using the assumption that $v$ is underestimated by the maximum amount
1723  possible, we may make the substitution that $v = v_{n-1} b^{n-1} + b^{n-1} -1$,  possible, we may make the substitution that $v = v_{n-1} b^{n-1} + b^{n-1} -1$,
1725
1726
1727  \label{eq:lem:ccil0:sidv0:sgdu0:02:13b}  \label{eq:lem:ccil0:sidv0:sgdu0:02:13b}
1728  v_{n-1} \geq \frac{b}{2} - \frac{1}{2 b^{n-2}} .  v_{n-1} \geq \frac{b}{2} - \frac{1}{2 b^{n-2}} .
1729
1730
1731  There are two cases to consider:  $b$ even and $b$ odd.  If $b$ is even, the  There are two cases to consider:  $b$ even and $b$ odd.  If $b$ is even, the
1732  proof is complete, as $\lfloor b/2 \rfloor = b/2$ and the choice of  proof is complete, as $\lfloor b/2 \rfloor = b/2$ and the choice of
1733  $v_{n-1} = \lfloor b/2 \rfloor$ will automatically satisfy  $v_{n-1} = \lfloor b/2 \rfloor$ will automatically satisfy
1734  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}).  However, if $b$ is odd,  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}).  However, if $b$ is odd,
1735  $\lfloor b/2 \rfloor = b/2 - 1/2 < b/2 - 1/2b^{n-2}$, violating  $\lfloor b/2 \rfloor = b/2 - 1/2 < b/2 - 1/2b^{n-2}$, violating
1736  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}),  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}),
1737  and so we need to  and so we need to
1738  further examine this case.  further examine this case.
1739
1740  If $b$ is odd and $v_{n-1} = \lfloor b/2 \rfloor$, then  If $b$ is odd and $v_{n-1} = \lfloor b/2 \rfloor$, then
1741  $v_{n-1} = b/2 - 1/2$, violating (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}).  $v_{n-1} = b/2 - 1/2$, violating (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}).
1742  However, any larger choice of $v_{n-1}$ (such as  However, any larger choice of $v_{n-1}$ (such as
1743  $\lfloor b/2 \rfloor + 1$, $\lfloor b/2 \rfloor + 2$, etc.) satisfies  $\lfloor b/2 \rfloor + 1$, $\lfloor b/2 \rfloor + 2$, etc.) satisfies
1744  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}); so that it remains only to prove  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}); so that it remains only to prove
1745  the $v_{n-1} = \lfloor b/2 \rfloor = b/2 - 1/2$  the $v_{n-1} = \lfloor b/2 \rfloor = b/2 - 1/2$
1746  case.  case.
1747
1748  If $v_{n-1} = \lfloor b/2 \rfloor = b/2 - 1/2$, then  If $v_{n-1} = \lfloor b/2 \rfloor = b/2 - 1/2$, then
1749
1750  \begin{eqnarray}  \begin{eqnarray}
1751  \label{eq:lem:ccil0:sidv0:sgdu0:02:14}  \label{eq:lem:ccil0:sidv0:sgdu0:02:14}
1752  v & \in & \left[  v & \in & \left[
1753  \left( \frac{b}{2} - \frac{1}{2}\right) b^{n-1},  \left( \frac{b}{2} - \frac{1}{2}\right) b^{n-1},
1754  \left( \frac{b}{2} - \frac{1}{2}\right) b^{n-1} + b^{n-1} - 1  \left( \frac{b}{2} - \frac{1}{2}\right) b^{n-1} + b^{n-1} - 1
1755  \right] \\  \right] \\
1756  \nonumber & = &  \nonumber & = &
1757  \left[  \left[
1758  \frac{b^n}{2} - \frac{b^{n-1}}{2},  \frac{b^n}{2} - \frac{b^{n-1}}{2},
1759  \frac{b^n}{2} + \frac{b^{n-1}}{2} -1  \frac{b^n}{2} + \frac{b^{n-1}}{2} -1
1760  \right] .  \right] .
1761  \end{eqnarray}  \end{eqnarray}
1762
1763  Note in this case that the estimation error ($v - v_{n-1}b^{n-1}$) and the  Note in this case that the estimation error ($v - v_{n-1}b^{n-1}$) and the
1764  value of $v$ are not independent; and in fact it is this aspect  value of $v$ are not independent; and in fact it is this aspect
1765  of the problem that has led to the violation of  of the problem that has led to the violation of
1766  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}) with $b$ odd and  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}) with $b$ odd and
1767  $v_{n-1} = \lfloor b/2 \rfloor$.  $v_{n-1} = \lfloor b/2 \rfloor$.
1768
1769  In order to prove the case of $b$ odd and $v = \lfloor b/2 \rfloor = b/2 - 1/2$,  In order to prove the case of $b$ odd and $v = \lfloor b/2 \rfloor = b/2 - 1/2$,
1770  we must reexamine some simplifying assumptions made earlier in order to obtain  we must reexamine some simplifying assumptions made earlier in order to obtain
1771  tighter inequalities.  In (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}), we can no  tighter inequalities.  In (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}), we can no
1772  longer accept the maximum of $(u \bmod v)/v$ as one; instead we construct the  longer accept the maximum of $(u \bmod v)/v$ as one; instead we construct the
1773  tighter inequality  tighter inequality
1774
1775  \begin{eqnarray}  \begin{eqnarray}
1776  \label{eq:lem:ccil0:sidv0:sgdu0:02:15}  \label{eq:lem:ccil0:sidv0:sgdu0:02:15}
1777  & \displaystyle \hat{q} - q \leq  \frac{u_n b + u_{n-1}}{v_{n-1}}  & \displaystyle \hat{q} - q \leq  \frac{u_n b + u_{n-1}}{v_{n-1}}
1778                  - \frac{u}{v}                  - \frac{u}{v}
1779                  + \frac{u \bmod v}{v} & \\                  + \frac{u \bmod v}{v} & \\
1780  \nonumber &  \vworkvimp & \\  \nonumber &  \vworkvimp & \\
1781  \label{eq:lem:ccil0:sidv0:sgdu0:02:16}  \label{eq:lem:ccil0:sidv0:sgdu0:02:16}
1782  & \displaystyle \hat{q} - q \leq  \frac{u_n b + u_{n-1}}{v_{n-1}}  & \displaystyle \hat{q} - q \leq  \frac{u_n b + u_{n-1}}{v_{n-1}}
1783                  - \frac{u}{v}                  - \frac{u}{v}
1784                  + \frac{v-1}{v} , &                  + \frac{v-1}{v} , &
1785  \end{eqnarray}  \end{eqnarray}
1786
1788
1789
1790  \label{eq:lem:ccil0:sidv0:sgdu0:02:17}  \label{eq:lem:ccil0:sidv0:sgdu0:02:17}
1791  \frac{u_n b + u_{n-1}}{v_{n-1}}  \frac{u_n b + u_{n-1}}{v_{n-1}}
1792                  - \frac{u+1}{v}                  - \frac{u+1}{v}
1793                  < 2 .                  < 2 .
1794
1795
1796  In order to maximize the left-hand side of  In order to maximize the left-hand side of
1797  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:17}), we assume that we  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:17}), we assume that we
1798  estimate $u$ exactly so that $u = u_n b^n + u_{n-1} b^{n-1}$,  estimate $u$ exactly so that $u = u_n b^n + u_{n-1} b^{n-1}$,
1799  yielding  yielding
1800
1801
1802  \label{eq:lem:ccil0:sidv0:sgdu0:02:18}  \label{eq:lem:ccil0:sidv0:sgdu0:02:18}
1803  \frac{u}{v_{n-1} b^{n-1}}  \frac{u}{v_{n-1} b^{n-1}}
1804                  - \frac{u+1}{v}                  - \frac{u+1}{v}
1805                  < 2 .                  < 2 .
1806
1807
1808  We also assume that $u$ is the maximum value  We also assume that $u$ is the maximum value
1809  possible, $u=bv-1$, leading to  possible, $u=bv-1$, leading to
1810
1811
1812  \label{eq:lem:ccil0:sidv0:sgdu0:02:19}  \label{eq:lem:ccil0:sidv0:sgdu0:02:19}
1813  \frac{bv-1}{v_{n-1} b^{n-1}}  \frac{bv-1}{v_{n-1} b^{n-1}}
1814                  - b                  - b
1815                  < 2 .                  < 2 .
1816
1817
1818  Finally, we assume that $v$ is the upper limit in  Finally, we assume that $v$ is the upper limit in
1819  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:14}),  (\ref{eq:lem:ccil0:sidv0:sgdu0:02:14}),
1820  $v=b^n/2 + b^{n-1}/2 - 1$, and substitute the known value of  $v=b^n/2 + b^{n-1}/2 - 1$, and substitute the known value of
1821  $v_{n-1}$ for the case being proved, $v_{n-1} = b/2-1/2$, yielding  $v_{n-1}$ for the case being proved, $v_{n-1} = b/2-1/2$, yielding
1822
1823
1824  \label{eq:lem:ccil0:sidv0:sgdu0:02:20}  \label{eq:lem:ccil0:sidv0:sgdu0:02:20}
1825  \frac{b \left( \frac{b^n}{2} + \frac{b^{n-1}}{2} - 1 \right) - 1}  \frac{b \left( \frac{b^n}{2} + \frac{b^{n-1}}{2} - 1 \right) - 1}
1826  {\left( \frac{b}{2} - \frac{1}{2} \right) b^{n-1}}  {\left( \frac{b}{2} - \frac{1}{2} \right) b^{n-1}}
1827                  - b                  - b
1828                  < 2 .                  < 2 .
1829
1830
1831  Simplification of (\ref{eq:lem:ccil0:sidv0:sgdu0:02:20})  Simplification of (\ref{eq:lem:ccil0:sidv0:sgdu0:02:20})
1832  will establish that it is always true.  This completes the proof.  will establish that it is always true.  This completes the proof.
1833  \end{vworklemmaproof}  \end{vworklemmaproof}
1834  \vworklemmafooter{}  \vworklemmafooter{}
1835
1836  Lemmas \ref{lem:ccil0:sidv0:sgdu0:01} and  Lemmas \ref{lem:ccil0:sidv0:sgdu0:01} and
1837  \ref{lem:ccil0:sidv0:sgdu0:02},  \ref{lem:ccil0:sidv0:sgdu0:02},
1838  standing alone, lead to a good implementation of  standing alone, lead to a good implementation of
1839  division without any further results.  If it is known that  division without any further results.  If it is known that
1840  $0 \leq \hat{q} - q \leq 2$, Algorithm \ref{alg:ccil0:sidv0:sgdu0:01}  $0 \leq \hat{q} - q \leq 2$, Algorithm \ref{alg:ccil0:sidv0:sgdu0:01}
1841  can be trivially modified to only calculate  can be trivially modified to only calculate
1842  $\hat{q}$ (omitting the additional tests in Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}),  $\hat{q}$ (omitting the additional tests in Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}),
1843  and then  and then
1844  to include up to two add-back steps  to include up to two add-back steps
1845  (duplication of Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}).    (duplication of Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}).
1846  Although such an algorithm would be  Although such an algorithm would be
1848  be executed very frequently, slowing the algorithm substantially, especially  be executed very frequently, slowing the algorithm substantially, especially
1849  for long operands.  for long operands.
1850  We now show that the additional tests in Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03} of  We now show that the additional tests in Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03} of
1851  Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} can eliminate  Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} can eliminate
1852  altogether the case of $\hat{q}-q = 2$ (Lemma \ref{lem:ccil0:sidv0:sgdu0:03}), and  altogether the case of $\hat{q}-q = 2$ (Lemma \ref{lem:ccil0:sidv0:sgdu0:03}), and
1853  can with a probability close to unity eliminate the  can with a probability close to unity eliminate the
1854  case of $\hat{q}-q = 1$ (Lemmas \ref{lem:ccil0:sidv0:sgdu0:04}  case of $\hat{q}-q = 1$ (Lemmas \ref{lem:ccil0:sidv0:sgdu0:04}
1855  and \ref{lem:ccil0:sidv0:sgdu0:05}).  Together these tests, which are present in  and \ref{lem:ccil0:sidv0:sgdu0:05}).  Together these tests, which are present in
1856  the statement of Algorithm \ref{alg:ccil0:sidv0:sgdu0:01},  the statement of Algorithm \ref{alg:ccil0:sidv0:sgdu0:01},
1857  reduce add-back (Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}) to rare occurrence, and create a more  reduce add-back (Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}) to rare occurrence, and create a more
1858  efficient algorithm than would be possible with the  efficient algorithm than would be possible with the
1859  results of Lemmas \ref{lem:ccil0:sidv0:sgdu0:01} and \ref{lem:ccil0:sidv0:sgdu0:02} alone.  results of Lemmas \ref{lem:ccil0:sidv0:sgdu0:01} and \ref{lem:ccil0:sidv0:sgdu0:02} alone.
1860
1861  \begin{vworklemmastatementpar}  \begin{vworklemmastatementpar}
1862  {\mbox{\boldmath$\hat{q} v_{n-2} \leq b \hat{r} + u_{j+n-2} \vworkhimp 0 \leq \hat{q} - q \leq 1$}}  {\mbox{\boldmath$\hat{q} v_{n-2} \leq b \hat{r} + u_{j+n-2} \vworkhimp 0 \leq \hat{q} - q \leq 1$}}
1863  \label{lem:ccil0:sidv0:sgdu0:03}  \label{lem:ccil0:sidv0:sgdu0:03}
1864  If the divisor normalization requirement ($v_{n-1} \geq \lfloor b/2 \rfloor$) as specified in  If the divisor normalization requirement ($v_{n-1} \geq \lfloor b/2 \rfloor$) as specified in
1865  Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:01} of  Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:01} of
1866  Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} is met, then  Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} is met, then
1867
1868
1869  \label{eq:lem:ccil0:sidv0:sgdu0:03:01}  \label{eq:lem:ccil0:sidv0:sgdu0:03:01}
1870  \hat{q} v_{n-2} \leq b \hat{r} + u_{j+n-2} \vworkhimp 0 \leq \hat{q} - q \leq 1 .  \hat{q} v_{n-2} \leq b \hat{r} + u_{j+n-2} \vworkhimp 0 \leq \hat{q} - q \leq 1 .
1871
1872  \end{vworklemmastatementpar}  \end{vworklemmastatementpar}
1873  \begin{vworklemmaproof}  \begin{vworklemmaproof}
1874  For reference, note that:  For reference, note that:
1875
1876  \begin{eqnarray}  \begin{eqnarray}
1877  \label{eq:lem:ccil0:sidv0:sgdu0:03:02}  \label{eq:lem:ccil0:sidv0:sgdu0:03:02}
1878  u & = & u_{j+n} b^{j+n} + u_{j+n-1} b^{j+n-1} + \ldots{} + u_1 b + u_0 \\  u & = & u_{j+n} b^{j+n} + u_{j+n-1} b^{j+n-1} + \ldots{} + u_1 b + u_0 \\
1879  \label{eq:lem:ccil0:sidv0:sgdu0:03:03}  \label{eq:lem:ccil0:sidv0:sgdu0:03:03}
1880  v & = & v_{n-1} b^{n-1} + v_{n-2} b^{n-2} + \ldots{} + v_1 b + v_0  v & = & v_{n-1} b^{n-1} + v_{n-2} b^{n-2} + \ldots{} + v_1 b + v_0
1881  \end{eqnarray}  \end{eqnarray}
1882
1883  By definition, we the remainder  By definition, we the remainder
1884  $\hat{r}$ has the value  $\hat{r}$ has the value
1885  $\hat{r} = u_{j+n} b + u_{j+n-1} - \hat{q} v_{n-1}$.  Substituting this value  $\hat{r} = u_{j+n} b + u_{j+n-1} - \hat{q} v_{n-1}$.  Substituting this value
1886  into (\ref{eq:lem:ccil0:sidv0:sgdu0:03:01}) produces  into (\ref{eq:lem:ccil0:sidv0:sgdu0:03:01}) produces
1887
1888
1889  \label{eq:lem:ccil0:sidv0:sgdu0:03:04}  \label{eq:lem:ccil0:sidv0:sgdu0:03:04}
1890  \hat{q} v_{n-2} \leq b (u_{j+n} b + u_{j+n-1} - \hat{q} v_{n-1}) + u_{j+n-2} ,  \hat{q} v_{n-2} \leq b (u_{j+n} b + u_{j+n-1} - \hat{q} v_{n-1}) + u_{j+n-2} ,
1891
1892
1893  and solving for $\hat{q}$ yields  and solving for $\hat{q}$ yields
1894
1895
1896  \label{eq:lem:ccil0:sidv0:sgdu0:03:05}  \label{eq:lem:ccil0:sidv0:sgdu0:03:05}
1897  \hat{q} \leq  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}}.  \hat{q} \leq  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}}.
1898
1899
1900  It is known from Lemma \ref{lem:ccil0:sidv0:sgdu0:01} that the type of estimate  It is known from Lemma \ref{lem:ccil0:sidv0:sgdu0:01} that the type of estimate
1901  represented by the floor of the right-hand size of (\ref{eq:lem:ccil0:sidv0:sgdu0:03:05})  represented by the floor of the right-hand size of (\ref{eq:lem:ccil0:sidv0:sgdu0:03:05})
1902  can be no less than $q$, leading to  can be no less than $q$, leading to
1903
1904  \begin{eqnarray}  \begin{eqnarray}
1905  \label{eq:lem:ccil0:sidv0:sgdu0:03:06}  \label{eq:lem:ccil0:sidv0:sgdu0:03:06}
1906  & \displaystyle q  \leq  \hat{q}  & \displaystyle q  \leq  \hat{q}
1907  \leq  \leq
1908  \left\lfloor { \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} } \right\rfloor & \\  \left\lfloor { \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} } \right\rfloor & \\
1909  \nonumber & \displaystyle \leq    \nonumber & \displaystyle \leq
1910  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}}. &  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}}. &
1911  \end{eqnarray}  \end{eqnarray}
1912
1913  Because $q, \hat{q} \in \vworkintset$, it is only necessary to prove that  Because $q, \hat{q} \in \vworkintset$, it is only necessary to prove that
1914
1915
1916  \label{eq:lem:ccil0:sidv0:sgdu0:03:07}  \label{eq:lem:ccil0:sidv0:sgdu0:03:07}
1917  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -
1918  \left\lfloor \frac{u}{v} \right\rfloor  \left\lfloor \frac{u}{v} \right\rfloor
1919  < 2  < 2
1920
1921
1922  in order to prove that $\hat{q}-q \leq 1$.  Using  in order to prove that $\hat{q}-q \leq 1$.  Using
1923  (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02}),  (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02}),
1924  (\ref{eq:lem:ccil0:sidv0:sgdu0:03:07}) can be rewritten as  (\ref{eq:lem:ccil0:sidv0:sgdu0:03:07}) can be rewritten as
1925
1926
1927  \label{eq:lem:ccil0:sidv0:sgdu0:03:08}  \label{eq:lem:ccil0:sidv0:sgdu0:03:08}
1928  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -
1929  \frac{u}{v} -  \frac{u}{v} -
1930  \frac{u \bmod v}{v}  \frac{u \bmod v}{v}
1931  < 2 .  < 2 .
1932
1933
1934  In order for implication to hold, we must make the most pessimistic  In order for implication to hold, we must make the most pessimistic
1935  assumptions about $u \bmod v$ (those which maximize it).  The maximum value  assumptions about $u \bmod v$ (those which maximize it).  The maximum value
1936  of $u \bmod v$ is $v-1$, leading to  of $u \bmod v$ is $v-1$, leading to
1937
1938
1939  \label{eq:lem:ccil0:sidv0:sgdu0:03:09}  \label{eq:lem:ccil0:sidv0:sgdu0:03:09}
1940  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -  \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -
1941  \frac{u + 1}{v}  \frac{u + 1}{v}
1942  < 1 .  < 1 .
1943
1944
1945  In order to maximize the left side of (\ref{}),  In order to maximize the left side of (\ref{}),
1946  we must assume that $u$ is maximized, $v$ is minimized,  we must assume that $u$ is maximized, $v$ is minimized,
1947
1948
1949  \end{vworklemmaproof}  \end{vworklemmaproof}
1950  \vworklemmafooter{}  \vworklemmafooter{}
1951
1952  \begin{vworklemmastatementpar}  \begin{vworklemmastatementpar}
1953  {\mbox{\boldmath$\hat{q} v_{n-2} > b \hat{r} + u_{n-2} \vworkhimp q < \hat{q}$}}  {\mbox{\boldmath$\hat{q} v_{n-2} > b \hat{r} + u_{n-2} \vworkhimp q < \hat{q}$}}
1954  \label{lem:ccil0:sidv0:sgdu0:04}  \label{lem:ccil0:sidv0:sgdu0:04}
1955  Given $\hat{q} > 0$, an estimate of $q$, and the remainder  Given $\hat{q} > 0$, an estimate of $q$, and the remainder
1956  based on the estimate, $\hat{r} = u_n b + u_{n-1} - \hat{q} v_{n-1}$,  based on the estimate, $\hat{r} = u_n b + u_{n-1} - \hat{q} v_{n-1}$,
1957
1958
1959  \label{eq:lem:ccil0:sidv0:sgdu0:04:01}  \label{eq:lem:ccil0:sidv0:sgdu0:04:01}
1960  \hat{q} v_{n-2} > b \hat{r} + u_{n-2} \vworkhimp \hat{q} > q .  \hat{q} v_{n-2} > b \hat{r} + u_{n-2} \vworkhimp \hat{q} > q .
1961
1962  \end{vworklemmastatementpar}  \end{vworklemmastatementpar}
1963  \begin{vworklemmaproof}  \begin{vworklemmaproof}
1964  We make the assumption that $v_{n-1}$ and $v_{n-2}$ are not both 0.  We make the assumption that $v_{n-1}$ and $v_{n-2}$ are not both 0.
1965  $v_{n-1} > 0$ is guaranteed by  $v_{n-1} > 0$ is guaranteed by
1966  the  normalization of  the  normalization of
1967  $v$ in Algorithm \ref{alg:ccil0:sidv0:sgdu0:01}.  $v$ in Algorithm \ref{alg:ccil0:sidv0:sgdu0:01}.
1968
1969
1970  \label{eq:lem:ccil0:sidv0:sgdu0:04:02}  \label{eq:lem:ccil0:sidv0:sgdu0:04:02}
1971  \hat{q} v_{n-2} > b \hat{r} + u_{n-2}  \hat{q} v_{n-2} > b \hat{r} + u_{n-2}
1972
1973
1974
1975  \label{eq:lem:ccil0:sidv0:sgdu0:04:03}  \label{eq:lem:ccil0:sidv0:sgdu0:04:03}
1976  \hat{q} v_{n-2} > b (u_n b + u_{n-1} - \hat{q} v_{n-1}) + u_{n-2}  \hat{q} v_{n-2} > b (u_n b + u_{n-1} - \hat{q} v_{n-1}) + u_{n-2}
1977
1978
1979  Solving (\ref{eq:lem:ccil0:sidv0:sgdu0:04:03}) for $\hat{q}$ yields  Solving (\ref{eq:lem:ccil0:sidv0:sgdu0:04:03}) for $\hat{q}$ yields
1980
1981
1982  \label{eq:lem:ccil0:sidv0:sgdu0:04:04}  \label{eq:lem:ccil0:sidv0:sgdu0:04:04}
1983  \hat{q}  \hat{q}
1984  >  >
1985  \frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}  \frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}
1986  \geq  \geq
1987  \left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor  \left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor
1988  .  .
1989
1990
1991  Note that the right-hand term of  Note that the right-hand term of
1992  (\ref{eq:lem:ccil0:sidv0:sgdu0:04:04})  (\ref{eq:lem:ccil0:sidv0:sgdu0:04:04})
1993  is similar in form to the estimate $\hat{q}$ in  is similar in form to the estimate $\hat{q}$ in
1994  Lemma \ref{lem:ccil0:sidv0:sgdu0:01}, where it is proved that  Lemma \ref{lem:ccil0:sidv0:sgdu0:01}, where it is proved that
1995  $\hat{q} \geq q$.  It is possible to use identical algebraic technique  $\hat{q} \geq q$.  It is possible to use identical algebraic technique
1996  as is used in Lemma \ref{lem:ccil0:sidv0:sgdu0:01} in order to prove that  as is used in Lemma \ref{lem:ccil0:sidv0:sgdu0:01} in order to prove that
1997
1998
1999  \label{eq:lem:ccil0:sidv0:sgdu0:04:05}  \label{eq:lem:ccil0:sidv0:sgdu0:04:05}
2000  \left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor  \left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor
2001  \geq q,  \geq q,
2002
2003
2004  and it follows that  and it follows that
2005
2006
2007  \label{eq:lem:ccil0:sidv0:sgdu0:04:06}  \label{eq:lem:ccil0:sidv0:sgdu0:04:06}
2008  \hat{q}  \hat{q}
2009  >  >
2010  \frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}  \frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}
2011  \geq  \geq
2012  \left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor  \left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor
2013  \geq q,  \geq q,
2014
2015
2016  and thus $\hat{q} > q$.  and thus $\hat{q} > q$.
2017  \end{vworklemmaproof}  \end{vworklemmaproof}
2018  \vworklemmafooter{}  \vworklemmafooter{}
2019
2020  \begin{vworklemmastatementpar}  \begin{vworklemmastatementpar}
2021  {Add-back occurs with approximate probability \mbox{\boldmath$2/b$}}  {Add-back occurs with approximate probability \mbox{\boldmath$2/b$}}
2022  \label{lem:ccil0:sidv0:sgdu0:05}  \label{lem:ccil0:sidv0:sgdu0:05}
2023  The estimate of $q$ provided by (\ref{eq:ccil0:sidv0:sgdu0:01}),  The estimate of $q$ provided by (\ref{eq:ccil0:sidv0:sgdu0:01}),
2024  \end{vworklemmastatementpar}  \end{vworklemmastatementpar}
2025  \begin{vworklemmaproof}  \begin{vworklemmaproof}
2026  \end{vworklemmaproof}  \end{vworklemmaproof}
2027  \vworklemmafooter{}  \vworklemmafooter{}
2028
2029
2030  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2031  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2032  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2033  \subsection{Division Of Signed Operands}  \subsection{Division Of Signed Operands}
2034
2035
2036  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2037  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2038  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2039  \subsection{Large Dividends With Machine-Native Divisors}  \subsection{Large Dividends With Machine-Native Divisors}
2040  \label{ccil0:sidv0:sldm0}  \label{ccil0:sidv0:sldm0}
2041
2042  Division of arbitrary-sized operands (Section \ref{ccil0:sidv0:sgdu0}) is  Division of arbitrary-sized operands (Section \ref{ccil0:sidv0:sgdu0}) is
2043  a costly operation.  In many practical applications, we are able to exploit  a costly operation.  In many practical applications, we are able to exploit
2044  data sizes of operands or special relationships between the values of  data sizes of operands or special relationships between the values of
2045  operands to use the instruction set of the machine more effectively.  operands to use the instruction set of the machine more effectively.
2046  In this subsection, we investigate what optimizations we may achieve when:  In this subsection, we investigate what optimizations we may achieve when:
2047
2048  \begin{itemize}  \begin{itemize}
2049  \item We wish to calculate the quotient and remainder of  \item We wish to calculate the quotient and remainder of
2050        unsigned integers $p$ and $q$:  $p/q$ and        unsigned integers $p$ and $q$:  $p/q$ and
2051        $p \bmod{} q$; \emph{and}        $p \bmod{} q$; \emph{and}
2052  \item The machine possesses unsigned division instructions  \item The machine possesses unsigned division instructions
2053        which provide both a quotient and a remainder from        which provide both a quotient and a remainder from
2054        a division; \emph{and}        a division; \emph{and}
2055  \item The bitsize of the divisor $q$ is not larger than can  \item The bitsize of the divisor $q$ is not larger than can
2056        be accomodated (as a divisor) by machine division instructions.        be accomodated (as a divisor) by machine division instructions.
2057  \end{itemize}  \end{itemize}
2058
2059  Processors which possess integer division instructions usually  Processors which possess integer division instructions usually
2060  possess one of two types of instructions:  possess one of two types of instructions:
2061
2062  \begin{itemize}  \begin{itemize}
2063  \item Instructions where the the divisor, quotient and remainder are  \item Instructions where the the divisor, quotient and remainder are
2064        $Q$ bits, but the dividend is $2Q$ bits (we call these        $Q$ bits, but the dividend is $2Q$ bits (we call these
2065        large dividend'' instructions).  For example, an        large dividend'' instructions).  For example, an
2066        instruction which accepts a 16-bit dividend and an        instruction which accepts a 16-bit dividend and an
2067        8-bit divisor to produce an 8-bit quotient and an 8-bit remainder is        8-bit divisor to produce an 8-bit quotient and an 8-bit remainder is
2068        typical.        typical.
2069        With such instructions, overflow is possible, and is always detectable.        With such instructions, overflow is possible, and is always detectable.
2070        However, in this subsection we never describe algorithms which detect        However, in this subsection we never describe algorithms which detect
2071        overflow---instead, we arrange for data values which cannot generate an        overflow---instead, we arrange for data values which cannot generate an
2072        overflow.        overflow.
2073  \item Instructions where the dividend, divisor, quotient, and remainder  \item Instructions where the dividend, divisor, quotient, and remainder
2074        are all $Q$ bits (we call these small dividend'' instructions).          are all $Q$ bits (we call these small dividend'' instructions).
2075        With such instructions, overflow is not possible.        With such instructions, overflow is not possible.
2076  \end{itemize}  \end{itemize}
2077
2078  We call the bitsize $Q$ a \emph{chunk}.  We use \emph{chunk} rather than  We call the bitsize $Q$ a \emph{chunk}.  We use \emph{chunk} rather than
2079  \emph{word} because the chunksize and wordsize in general are not  \emph{word} because the chunksize and wordsize in general are not
2080  required to be the same.  required to be the same.
2081
2082  For the remainder of this discussion, we assume large dividend  For the remainder of this discussion, we assume large dividend
2083  instructions (the first category above).  instructions (the first category above).
2084  The algorithms developed can be implemented on small dividend machines  The algorithms developed can be implemented on small dividend machines
2085  by halving data sizes so that the divisor fills no more than half  by halving data sizes so that the divisor fills no more than half
2086  of the available bits.  of the available bits.
2087
2088  We assume that we are interested in calculating the integer quotient  We assume that we are interested in calculating the integer quotient
2089  $\lfloor{}p/q\rfloor$ and remainder $p \bmod{} q$ of two unsigned  $\lfloor{}p/q\rfloor$ and remainder $p \bmod{} q$ of two unsigned
2090  integers $p$ and $q$, where $p$ is of size $P$ bits and $q$ is  integers $p$ and $q$, where $p$ is of size $P$ bits and $q$ is
2091  of size $Q$ bits.  For simplicity and without detracting from the  of size $Q$ bits.  For simplicity and without detracting from the
2092  generality of the solution, we assume that $Q \vworkdivides{} P$.  generality of the solution, we assume that $Q \vworkdivides{} P$.
2093
2094  We then seek to calculate  We then seek to calculate
2095
2096  \begin{eqnarray}  \begin{eqnarray}
2097  \label{eq:ccil0:sidv0:sldm0:001}  \label{eq:ccil0:sidv0:sldm0:001}
2098  \frac{p}{q} & = & \frac{2^{P-Q} p_{[P/Q-1]} + 2^{P-2Q} p_{[P/Q-2]} + \ldots{} + 2^{Q} p_{[1]} + p_{[0]}}{q_{[0]}} \\  \frac{p}{q} & = & \frac{2^{P-Q} p_{[P/Q-1]} + 2^{P-2Q} p_{[P/Q-2]} + \ldots{} + 2^{Q} p_{[1]} + p_{[0]}}{q_{[0]}} \\
2099  \nonumber       & = & \frac{\sum_{i=0}^{P/Q-1} 2^{iQ} p_{[i]}}{q_{[0]}} .  \nonumber       & = & \frac{\sum_{i=0}^{P/Q-1} 2^{iQ} p_{[i]}}{q_{[0]}} .
2100  \end{eqnarray}  \end{eqnarray}
2101
2102  \noindent{}Using the integer identity  \noindent{}Using the integer identity
2103
2104
2105  \label{eq:ccil0:sidv0:sldm0:002}  \label{eq:ccil0:sidv0:sldm0:002}
2106  \frac{a}{b} = \left\lfloor{\frac{a}{b}}\right\rfloor + \frac{a \bmod b}{b} ,  \frac{a}{b} = \left\lfloor{\frac{a}{b}}\right\rfloor + \frac{a \bmod b}{b} ,
2107
2108
2109  \noindent{}we can reform (\ref{eq:ccil0:sidv0:sldm0:001}) into  \noindent{}we can reform (\ref{eq:ccil0:sidv0:sldm0:001}) into
2110
2111  \begin{eqnarray}  \begin{eqnarray}
2112  \nonumber  \nonumber
2113  \frac{p}{q} & = &   2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\  \frac{p}{q} & = &   2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\
2114  \label{eq:ccil0:sidv0:sldm0:003}  \label{eq:ccil0:sidv0:sldm0:003}
2115              & + &   2^{P-Q} \frac{p_{[P/Q-1]}\bmod q_{[0]}}{q_{[0]}}              & + &   2^{P-Q} \frac{p_{[P/Q-1]}\bmod q_{[0]}}{q_{[0]}}
2116                    + 2^{P-2Q} \frac{p_{[P/Q-2]}}{q_{[0]}}  \\                    + 2^{P-2Q} \frac{p_{[P/Q-2]}}{q_{[0]}}  \\
2117  \nonumber   & + &   \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} ,  \nonumber   & + &   \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} ,
2118  \end{eqnarray}  \end{eqnarray}
2119
2120  \noindent{}which can be polished slightly to yield  \noindent{}which can be polished slightly to yield
2121
2122  \begin{eqnarray}  \begin{eqnarray}
2123  \nonumber  \nonumber
2124  \frac{p}{q} & = &   2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\  \frac{p}{q} & = &   2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\
2125  \label{eq:ccil0:sidv0:sldm0:004}  \label{eq:ccil0:sidv0:sldm0:004}
2126              & + &   2^{P-2Q} \frac{2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}}{q_{[0]}} \\              & + &   2^{P-2Q} \frac{2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}}{q_{[0]}} \\
2127  \nonumber   & + &   \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} .  \nonumber   & + &   \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} .
2128  \end{eqnarray}  \end{eqnarray}
2129
2130  Note in (\ref{eq:ccil0:sidv0:sldm0:004}) that the first term,  Note in (\ref{eq:ccil0:sidv0:sldm0:004}) that the first term,
2131  $\lfloor{}p_{[P/Q-1]} / q_{[0]}\rfloor$, as well as  $\lfloor{}p_{[P/Q-1]} / q_{[0]}\rfloor$, as well as
2132  a portion of the second term, $p_{[P/Q-1]}\bmod q_{[0]}$, can be  a portion of the second term, $p_{[P/Q-1]}\bmod q_{[0]}$, can be
2133  calculated using a single machine division instruction with  calculated using a single machine division instruction with
2134  $p_{[P/Q-1]}$ as the dividend and $q_{[0]}$ as the divisor.  $p_{[P/Q-1]}$ as the dividend and $q_{[0]}$ as the divisor.
2135  Note also that multiplication of an integer by a power of 2  Note also that multiplication of an integer by a power of 2
2136  can be achieved by placing the integer correctly within the  can be achieved by placing the integer correctly within the
2137  result.  In this regard note that $Q=8$ or $Q=16$ are  result.  In this regard note that $Q=8$ or $Q=16$ are
2138  the most typical cases, and so the placement can be achieved simply  the most typical cases, and so the placement can be achieved simply
2139  by selecting the correct memory address.  by selecting the correct memory address.
2140
2141  It is initially unclear whether we can evaluate or reduce the fraction in the  It is initially unclear whether we can evaluate or reduce the fraction in the
2142  second term of (\ref{eq:ccil0:sidv0:sldm0:004}),  second term of (\ref{eq:ccil0:sidv0:sldm0:004}),
2143  $[2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}] / q_{[0]}$,  $[2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}] / q_{[0]}$,
2144  using a single large dividend machine instruction, because  using a single large dividend machine instruction, because
2145  the upper chunk of the dividend is populated with non-zero bits  the upper chunk of the dividend is populated with non-zero bits
2146  (specifically, $p_{[P/Q-1]}\bmod q_{[0]}$), and it seems that  (specifically, $p_{[P/Q-1]}\bmod q_{[0]}$), and it seems that
2147  a division overflow may be possible.  However, with some thought,  a division overflow may be possible.  However, with some thought,
2148  it is clear that $p_{[P/Q-1]}\bmod q_{[0]} \leq q_{[0]} - 1$ and  it is clear that $p_{[P/Q-1]}\bmod q_{[0]} \leq q_{[0]} - 1$ and
2149  $p_{[P/Q-2]} \leq 2^Q - 1$, thus the largest numerator possible  $p_{[P/Q-2]} \leq 2^Q - 1$, thus the largest numerator possible
2150  is $2^Q q_{[0]} - 1$, which, when divided by $q_{[0]}$, will result  is $2^Q q_{[0]} - 1$, which, when divided by $q_{[0]}$, will result
2151  in a quotient and remainder of $2^Q - 1$.  Thus, no division overflow  in a quotient and remainder of $2^Q - 1$.  Thus, no division overflow
2152  can occur, and the fraction in the    can occur, and the fraction in the
2153  second term of (\ref{eq:ccil0:sidv0:sldm0:004}) can be evaluated  second term of (\ref{eq:ccil0:sidv0:sldm0:004}) can be evaluated
2154  using a large divisor integer machine instruction.  using a large divisor integer machine instruction.
2155
2156  The fraction in the  The fraction in the
2157  second term of (\ref{eq:ccil0:sidv0:sldm0:004}) can be simplified  second term of (\ref{eq:ccil0:sidv0:sldm0:004}) can be simplified
2158  using (\ref{eq:ccil0:sidv0:sldm0:002}) to yield:  using (\ref{eq:ccil0:sidv0:sldm0:002}) to yield:
2159
2160  \begin{eqnarray}  \begin{eqnarray}
2161  \nonumber  \nonumber
2162  \frac{p}{q} & = &   2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\  \frac{p}{q} & = &   2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\
2163  \label{eq:ccil0:sidv0:sldm0:005}  \label{eq:ccil0:sidv0:sldm0:005}
2164              & + &   2^{P-2Q} \left\lfloor{\frac{2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}}{q_{[0]}}}\right\rfloor \\              & + &   2^{P-2Q} \left\lfloor{\frac{2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}}{q_{[0]}}}\right\rfloor \\
2165  \nonumber   & + &   2^{P-2Q} \frac{(2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}) \bmod q_{[0]}}{q_{[0]}} \\  \nonumber   & + &   2^{P-2Q} \frac{(2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}) \bmod q_{[0]}}{q_{[0]}} \\
2166  \nonumber   & + &   \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} .  \nonumber   & + &   \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} .
2167  \end{eqnarray}  \end{eqnarray}
2168
2169  The process of combining adjacent terms can be continued until all  The process of combining adjacent terms can be continued until all
2170  divisions and modulo operations necessary can be carried out using  divisions and modulo operations necessary can be carried out using
2171  long dividend division instructions.  If we envision a  long dividend division instructions.  If we envision a
2172  long-dividend division instruction as a functional block that  long-dividend division instruction as a functional block that
2173  accepts a $2Q$-bit dividend and a $Q$-bit divisor to produce a  accepts a $2Q$-bit dividend and a $Q$-bit divisor to produce a
2174  $Q$-bit quotient and a $Q$-bit remainder  $Q$-bit quotient and a $Q$-bit remainder
2175  (Figure \ref{fig:ccil0:sidv0:sldm0:00}), then we can draw the  (Figure \ref{fig:ccil0:sidv0:sldm0:00}), then we can draw the
2176  entire division as outlined by (\ref{eq:ccil0:sidv0:sldm0:005})  entire division as outlined by (\ref{eq:ccil0:sidv0:sldm0:005})
2177  as shown in Figure \ref{fig:ccil0:sidv0:sldm0:01}.  as shown in Figure \ref{fig:ccil0:sidv0:sldm0:01}.
2178
2179  \begin{figure}  \begin{figure}
2180  \centering  \centering
2181  \includegraphics[width=4.6in]{c_cil0/lddvblk.eps}  \includegraphics[width=4.6in]{c_cil0/lddvblk.eps}
2182  \caption{Long Dividend Division Machine Instruction As A Functional Block}  \caption{Long Dividend Division Machine Instruction As A Functional Block}
2183  \label{fig:ccil0:sidv0:sldm0:00}  \label{fig:ccil0:sidv0:sldm0:00}
2184  \end{figure}  \end{figure}
2185
2186  \begin{figure}  \begin{figure}
2187  \centering  \centering
2188  \includegraphics[width=4.6in]{c_cil0/ldmnblk.eps}  \includegraphics[width=4.6in]{c_cil0/ldmnblk.eps}
2189  \caption{Long Dividend/Machine-Native Divisor Division In Functional Block Form}  \caption{Long Dividend/Machine-Native Divisor Division In Functional Block Form}
2190  \label{fig:ccil0:sidv0:sldm0:01}  \label{fig:ccil0:sidv0:sldm0:01}
2191  \end{figure}  \end{figure}
2192
2193  The following example illustrates how to apply the technique.  The following example illustrates how to apply the technique.
2194
2195  \begin{vworkexamplestatement}  \begin{vworkexamplestatement}
2196  \label{ex:ccil0:sidv0:sldm0:01}  \label{ex:ccil0:sidv0:sldm0:01}
2197  Implement 32/8 unsigned division on the TMS370C8 processor, which is  Implement 32/8 unsigned division on the TMS370C8 processor, which is
2198  characterized by a 16/8 division instruction.  characterized by a 16/8 division instruction.
2199  \end{vworkexamplestatement}  \end{vworkexamplestatement}
2200  \begin{vworkexampleparsection}{Solution}  \begin{vworkexampleparsection}{Solution}
2201  It would be possible to prepare an implementation directly from  It would be possible to prepare an implementation directly from
2202  Figure \ref{fig:ccil0:sidv0:sldm0:01}:  however, it may be  Figure \ref{fig:ccil0:sidv0:sldm0:01}:  however, it may be
2203  more instructive to work through a solution without the  more instructive to work through a solution without the
2204  aid of this figure.  aid of this figure.
2205
2206  In the case of the TMS370C8, the chunk size $Q$ is 8 bits; therefore  In the case of the TMS370C8, the chunk size $Q$ is 8 bits; therefore
2207  $Q=8$.  The problem statement indicates that we must accept 32-bit dividends;  $Q=8$.  The problem statement indicates that we must accept 32-bit dividends;
2208  therefore $P=32$.  Thus  therefore $P=32$.  Thus
2209
2210
2211  \label{eq:ex:ccil0:sidv0:sldm0:01:001}  \label{eq:ex:ccil0:sidv0:sldm0:01:001}
2212  p = 2^{24} p_{[3]} + 2^{16} p_{[2]} + 2^{8} p_{[1]} + p_{[0]}  p = 2^{24} p_{[3]} + 2^{16} p_{[2]} + 2^{8} p_{[1]} + p_{[0]}
2213
2214
2215  \noindent{}and  \noindent{}and
2216
2217
2218  \label{eq:ex:ccil0:sidv0:sldm0:01:002}  \label{eq:ex:ccil0:sidv0:sldm0:01:002}
2219  q = q_{[0]} .  q = q_{[0]} .
2220
2221
2222  \noindent{}Thus the quotient and remainder we would like to determine,  \noindent{}Thus the quotient and remainder we would like to determine,
2223  $\lfloor p/q \rfloor$ and $p \bmod q$, can be obtained by repeated  $\lfloor p/q \rfloor$ and $p \bmod q$, can be obtained by repeated
2224  application of (\ref{eq:ccil0:sidv0:sldm0:002}) as shown  application of (\ref{eq:ccil0:sidv0:sldm0:002}) as shown
2225  in Equations (\ref{eq:ex:ccil0:sidv0:sldm0:01:003})  in Equations (\ref{eq:ex:ccil0:sidv0:sldm0:01:003})
2226  through (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}).  through (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}).
2227
2228
2229  \label{eq:ex:ccil0:sidv0:sldm0:01:003}  \label{eq:ex:ccil0:sidv0:sldm0:01:003}
2230  \frac{p}{q}  \frac{p}{q}
2231  =  =
2232  \frac{2^{24} p_{[3]} + 2^{16} p_{[2]} + 2^{8} p_{[1]} + p_{[0]}}{q_{[0]}}  \frac{2^{24} p_{[3]} + 2^{16} p_{[2]} + 2^{8} p_{[1]} + p_{[0]}}{q_{[0]}}
2233
2234
2235
2236  \label{eq:ex:ccil0:sidv0:sldm0:01:004}  \label{eq:ex:ccil0:sidv0:sldm0:01:004}
2237  \frac{p}{q}  \frac{p}{q}
2238  =  =
2239  2^{24} \frac{p_{[3]}}{q_{[0]}}  2^{24} \frac{p_{[3]}}{q_{[0]}}
2240  +  +
2241  2^{16} \frac{p_{[2]}}{q_{[0]}}  2^{16} \frac{p_{[2]}}{q_{[0]}}
2242  +  +
2243  2^{8} \frac{p_{[1]}}{q_{[0]}}  2^{8} \frac{p_{[1]}}{q_{[0]}}
2244  +  +
2245  \frac{p_{[0]}}{q_{[0]}}  \frac{p_{[0]}}{q_{[0]}}
2246
2247
2248
2249  \label{eq:ex:ccil0:sidv0:sldm0:01:005}  \label{eq:ex:ccil0:sidv0:sldm0:01:005}
2250  \frac{p}{q}  \frac{p}{q}
2251  =  =
2252  2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2253  +  +
2254  2^{24} \frac{(p_{[3]} \bmod q_{[0]})}{q_{[0]}}  2^{24} \frac{(p_{[3]} \bmod q_{[0]})}{q_{[0]}}
2255  +  +
2256  2^{16} \frac{p_{[2]}}{q_{[0]}}  2^{16} \frac{p_{[2]}}{q_{[0]}}
2257  +  +
2258  2^{8} \frac{p_{[1]}}{q_{[0]}}  2^{8} \frac{p_{[1]}}{q_{[0]}}
2259  +  +
2260  \frac{p_{[0]}}{q_{[0]}}  \frac{p_{[0]}}{q_{[0]}}
2261
2262
2263
2264  \label{eq:ex:ccil0:sidv0:sldm0:01:006}  \label{eq:ex:ccil0:sidv0:sldm0:01:006}
2265  \frac{p}{q}  \frac{p}{q}
2266  =  =
2267  2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2268  +  +
2269  2^{16} \frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}  2^{16} \frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}
2270  +  +
2271  2^{8} \frac{p_{[1]}}{q_{[0]}}  2^{8} \frac{p_{[1]}}{q_{[0]}}
2272  +  +
2273  \frac{p_{[0]}}{q_{[0]}}  \frac{p_{[0]}}{q_{[0]}}
2274
2275
2276  \begin{eqnarray}  \begin{eqnarray}
2277  \label{eq:ex:ccil0:sidv0:sldm0:01:007}  \label{eq:ex:ccil0:sidv0:sldm0:01:007}
2278  \frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  \frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2279  +  +
2280  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
2281  \nonumber & + &  \nonumber & + &
2282  2^{16} \frac{(2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}}{q_{[0]}}  2^{16} \frac{(2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}}{q_{[0]}}
2283  +  +
2284  2^{8} \frac{p_{[1]}}{q_{[0]}}  2^{8} \frac{p_{[1]}}{q_{[0]}}
2285  +  +
2286  \frac{p_{[0]}}{q_{[0]}}  \frac{p_{[0]}}{q_{[0]}}
2287  \end{eqnarray}  \end{eqnarray}
2288
2289  \begin{eqnarray}  \begin{eqnarray}
2290  \label{eq:ex:ccil0:sidv0:sldm0:01:008}  \label{eq:ex:ccil0:sidv0:sldm0:01:008}
2291  \frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  \frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2292  +  +
2293  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
2294  \nonumber & + &  \nonumber & + &
2295  2^{8} \frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}  2^{8} \frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}
2296  +  +
2297  \frac{p_{[0]}}{q_{[0]}}  \frac{p_{[0]}}{q_{[0]}}
2298  \end{eqnarray}  \end{eqnarray}
2299
2300  \begin{eqnarray}  \begin{eqnarray}
2301  \nonumber\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  \nonumber\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2302  +  +
2303  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
2304  \label{eq:ex:ccil0:sidv0:sldm0:01:009}  \label{eq:ex:ccil0:sidv0:sldm0:01:009}
2305   & + &   & + &
2306  2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor \\  2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor \\
2307  \nonumber & + &  \nonumber & + &
2308  2^{8} \frac{(2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}}{q_{[0]}} \\  2^{8} \frac{(2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}}{q_{[0]}} \\
2309  \nonumber & + &  \nonumber & + &
2310  \frac{p_{[0]}}{q_{[0]}}  \frac{p_{[0]}}{q_{[0]}}
2311  \end{eqnarray}  \end{eqnarray}
2312
2313  \begin{eqnarray}  \begin{eqnarray}
2314  \nonumber\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  \nonumber\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2315  +  +
2316  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
2317  \label{eq:ex:ccil0:sidv0:sldm0:01:010}  \label{eq:ex:ccil0:sidv0:sldm0:01:010}
2318   & + &   & + &
2319  2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor \\  2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor \\
2320  \nonumber & + &  \nonumber & + &
2321  \frac{2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}}{q_{[0]}}  \frac{2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}}{q_{[0]}}
2322  \end{eqnarray}  \end{eqnarray}
2323
2324  \begin{eqnarray}  \begin{eqnarray}
2325  \nonumber & \displaystyle \frac{p}{q} =  2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor  \nonumber & \displaystyle \frac{p}{q} =  2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
2326  +  +
2327  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor & \\  2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor & \\
2328  \label{eq:ex:ccil0:sidv0:sldm0:01:011}  \label{eq:ex:ccil0:sidv0:sldm0:01:011}
2329   & \displaystyle +   & \displaystyle +
2330  2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor & \\  2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor & \\
2331  \nonumber & \displaystyle +  \nonumber & \displaystyle +
2332  \left\lfloor{\frac{2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}}{q_{[0]}}}\right\rfloor + & \\  \left\lfloor{\frac{2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}}{q_{[0]}}}\right\rfloor + & \\
2333  \nonumber  \nonumber
2334  & \displaystyle \frac{(2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}) \bmod q_{[0]}}{q_{[0]}}  & \displaystyle \frac{(2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}) \bmod q_{[0]}}{q_{[0]}}
2335  &  &
2336  \end{eqnarray}  \end{eqnarray}
2337
2338  Note several things about the implementation suggested by  Note several things about the implementation suggested by
2339  (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}):  (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}):
2340
2341  \begin{itemize}  \begin{itemize}
2342  \item No addition or multiplication is required to calculate terms such as  \item No addition or multiplication is required to calculate terms such as
2343        $2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}$.  The high-order byte of the        $2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}$.  The high-order byte of the
2344        large dividend can be stuffed with $p_{[3]} \bmod q_{[0]}$ and        large dividend can be stuffed with $p_{[3]} \bmod q_{[0]}$ and
2345        the low-order byte with $p_{[2]}$.        the low-order byte with $p_{[2]}$.
2346  \item No addition or multiplication is required to calculate the  \item No addition or multiplication is required to calculate the
2347        result $\lfloor p/q \rfloor$.        result $\lfloor p/q \rfloor$.
2348        Note in (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}) that the results are        Note in (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}) that the results are
2349        conveniently grouped as bytes with multipliers of $2^{24}$,        conveniently grouped as bytes with multipliers of $2^{24}$,
2350        $2^{16}$, $2^8$, and $2^0=1$.  The terms can simply be placed into        $2^{16}$, $2^8$, and $2^0=1$.  The terms can simply be placed into
2351        the appropriate byte, as a way of multplication by the appropriate        the appropriate byte, as a way of multplication by the appropriate
2352        power of 2.  Note also that each term is guaranteed to be        power of 2.  Note also that each term is guaranteed to be
2353        $\in \{0, 1, 2, \ldots{} , 255\}$, by the argument presented        $\in \{0, 1, 2, \ldots{} , 255\}$, by the argument presented
2354        earlier for (\ref{eq:ccil0:sidv0:sldm0:004}).  Thus, the        earlier for (\ref{eq:ccil0:sidv0:sldm0:004}).  Thus, the
2355        addition will result in no carries, and each result byte can simply        addition will result in no carries, and each result byte can simply
2356        be placed directly in the correct memory location---addition is        be placed directly in the correct memory location---addition is
2357        not necessary.        not necessary.
2358  \item Four machine division instructions are required, and the remainder  \item Four machine division instructions are required, and the remainder
2359        is produced automatically by the fourth instruction.        is produced automatically by the fourth instruction.
2360  \end{itemize}  \end{itemize}
2361
2362  An implemenation for the TMS370C8 is supplied as Figure  An implemenation for the TMS370C8 is supplied as Figure
2363  \ref{fig:ex:ccil0:sidv0:sldm0:01:01}.  A block diagram of the data  \ref{fig:ex:ccil0:sidv0:sldm0:01:01}.  A block diagram of the data
2364  flow for this implementation is supplied as  flow for this implementation is supplied as
2365  Figure \ref{fig:ex:ccil0:sidv0:sldm0:01:02}.  Figure \ref{fig:ex:ccil0:sidv0:sldm0:01:02}.
2366  \end{vworkexampleparsection}  \end{vworkexampleparsection}
2367  \vworkexamplefooter{}  \vworkexamplefooter{}
2368
2369  \begin{figure}  \begin{figure}
2370  \begin{verbatim}  \begin{verbatim}
2371  ;Assume that byte memory locations p3, p2, p1, and p0 contain the  ;Assume that byte memory locations p3, p2, p1, and p0 contain the
2372  ;32-bit unsigned dividend, and byte q0 contains the 8-bit unsigned  ;32-bit unsigned dividend, and byte q0 contains the 8-bit unsigned
2373  ;divisor.  Assume also that the result quotient will be placed  ;divisor.  Assume also that the result quotient will be placed
2374  ;in byte memory locations d3, d2, d1, and d0; and that the  ;in byte memory locations d3, d2, d1, and d0; and that the
2375  ;remainder will be placed in the byte memory location r0.  Further  ;remainder will be placed in the byte memory location r0.  Further
2376  ;assume that all memory locations are in the register file (near).  ;assume that all memory locations are in the register file (near).
2377          CLR A        ;High-order chunk of large divisor          CLR A        ;High-order chunk of large divisor
2378                       ;must be 0.                       ;must be 0.
2379          MOV p3, B    ;Load the low-order chunk of divisor.          MOV p3, B    ;Load the low-order chunk of divisor.
2380          DIV q0, A    ;Perform the first division.          DIV q0, A    ;Perform the first division.
2381          MOV A,  d3   ;Quotient becomes this part of the          MOV A,  d3   ;Quotient becomes this part of the
2382                       ;result.                       ;result.
2383          MOV B,  A    ;Remainder becomes high-order chunk of          MOV B,  A    ;Remainder becomes high-order chunk of
2384                       ;next division.                       ;next division.
2385          MOV p2, B    ;Next byte becomes low-order chunk.          MOV p2, B    ;Next byte becomes low-order chunk.
2386          DIV q0, A    ;Do the second division.          DIV q0, A    ;Do the second division.
2387          MOV A,  d2   ;Quotient becomes this part of the          MOV A,  d2   ;Quotient becomes this part of the
2388                       ;result.                       ;result.
2389          MOV B,  A    ;Remainder becomes high-order chunk of          MOV B,  A    ;Remainder becomes high-order chunk of
2390                       ;next division.                       ;next division.
2391          MOV p1, B    ;Next byte becomes low-order chunk.          MOV p1, B    ;Next byte becomes low-order chunk.
2392          DIV q0, A    ;Do the third division.          DIV q0, A    ;Do the third division.
2393          MOV A,  d1   ;Quotient becomes this part of the          MOV A,  d1   ;Quotient becomes this part of the
2394                       ;result.                       ;result.
2395          MOV B,  A    ;Remainder becomes high-order chunk of          MOV B,  A    ;Remainder becomes high-order chunk of
2396                       ;next division.                       ;next division.
2397          MOV p0, B    ;Next byte becomes low-order chunk.          MOV p0, B    ;Next byte becomes low-order chunk.
2398          DIV q0, A    ;Do the fourth division.          DIV q0, A    ;Do the fourth division.
2399          MOV A,  d0   ;Quotient becomes this part of the          MOV A,  d0   ;Quotient becomes this part of the
2400                       ;result.                       ;result.
2401          MOV B,  r0   ;This is the remainder, which could be used          MOV B,  r0   ;This is the remainder, which could be used
2402                       ;for rounding.                       ;for rounding.
2403  \end{verbatim}  \end{verbatim}
2404  \caption{TMS370C8 Code Snippet Illustrating Unsigned 32/8  \caption{TMS370C8 Code Snippet Illustrating Unsigned 32/8
2405           Division Using Unsigned 16/8           Division Using Unsigned 16/8
2406           Machine Instructions (Example \ref{ex:ccil0:sidv0:sldm0:01})}           Machine Instructions (Example \ref{ex:ccil0:sidv0:sldm0:01})}
2407  \label{fig:ex:ccil0:sidv0:sldm0:01:01}  \label{fig:ex:ccil0:sidv0:sldm0:01:01}
2408  \end{figure}  \end{figure}
2409
2410  \begin{figure}  \begin{figure}
2411  \centering  \centering
2412  \includegraphics[width=4.6in]{c_cil0/t370dmp.eps}  \includegraphics[width=4.6in]{c_cil0/t370dmp.eps}
2413  \caption{Block Diagram Of Data Flow Of Figure \ref{fig:ex:ccil0:sidv0:sldm0:01:01}}  \caption{Block Diagram Of Data Flow Of Figure \ref{fig:ex:ccil0:sidv0:sldm0:01:01}}
2414  \label{fig:ex:ccil0:sidv0:sldm0:01:02}  \label{fig:ex:ccil0:sidv0:sldm0:01:02}
2415  \end{figure}  \end{figure}
2416
2417
2418  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2419  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2420  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2421  \section{Miscellaneous Integer Mappings}  \section{Miscellaneous Integer Mappings}
2422  %Section tag: MIM0  %Section tag: MIM0
2423  \label{ccil0:smim0}  \label{ccil0:smim0}
2424
2425  Embedded system work and ROM constraints often inspire a great deal  Embedded system work and ROM constraints often inspire a great deal
2426  of cleverness in the selection of instructions to perform mappings or  of cleverness in the selection of instructions to perform mappings or
2427  tests.  In this section, we discuss integer mappings (i.e. functions)  tests.  In this section, we discuss integer mappings (i.e. functions)
2428  for which economical implementations are known; and in the next section  for which economical implementations are known; and in the next section
2429  (Section \ref{ccil0:smit0})  (Section \ref{ccil0:smit0})
2430  we discuss integer tests for which economical implementations are known.  we discuss integer tests for which economical implementations are known.
2431
2432  To the best of our knowledge, there is no way to derive these mappings  To the best of our knowledge, there is no way to derive these mappings
2433  and tests---they have been collected from many software developers and  and tests---they have been collected from many software developers and
2434  come from human intuition and experience.  come from human intuition and experience.
2435
2436
2437  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2438  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2439  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2440  \subsection{Lowest-Order Bit}  \subsection{Lowest-Order Bit}
2441  %Subsection tag: LIB0  %Subsection tag: LIB0
2442  \label{ccil0:smim0:slib0}  \label{ccil0:smim0:slib0}
2443
2444  \index{lowest-order bit}  \index{lowest-order bit}
2445  \index{least significant bit}  \index{least significant bit}
2446
2447  The mapping  The mapping
2448
2450
2451  \noindent{}is the most economical way known to extract the  \noindent{}is the most economical way known to extract the
2452  lowest-order bit set in an integer \texttt{x}, or  lowest-order bit set in an integer \texttt{x}, or
2453  0 if no bits are set.\footnote{This mapping was contributed by  0 if no bits are set.\footnote{This mapping was contributed by
2454  David Baker (\texttt{bakerda@engin.umich.edu})  David Baker (\texttt{bakerda@engin.umich.edu})
2455  and Raul Selgado (\texttt{rselgado@visteon.com}).}  Since most processors have an instruction to form the  and Raul Selgado (\texttt{rselgado@visteon.com}).}  Since most processors have an instruction to form the
2456  two's complement of an integer, this mapping usually requires only  two's complement of an integer, this mapping usually requires only
2457  two arithmetic instructions.  two arithmetic instructions.
2458
2459  When implementing this mapping in assembly-language on processors without a  When implementing this mapping in assembly-language on processors without a
2460  two's complement instruction, two other possible implementations are:  two's complement instruction, two other possible implementations are:
2461
2462  \begin{itemize}  \begin{itemize}
2463  \item \texttt{mask = x \& ($\sim$x + 1)}  \item \texttt{mask = x \& ($\sim$x + 1)}
2464  \item \texttt{mask = x \& ((x \^{ } -1) + 1)}  \item \texttt{mask = x \& ((x \^{ } -1) + 1)}
2465  \end{itemize}  \end{itemize}
2466
2467
2468  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2469  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2470  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2471  \section{Miscellaneous Integer Tests}  \section{Miscellaneous Integer Tests}
2472  %Section tag: MIT0  %Section tag: MIT0
2473  \label{ccil0:smit0}  \label{ccil0:smit0}
2474
2475  \subsection{Power Of 2}  \subsection{Power Of 2}
2476  %Subsection tag: PTW0  %Subsection tag: PTW0
2477  \label{ccil0:smit0:sptw0}  \label{ccil0:smit0:sptw0}
2478
2479  \index{power of two}  \index{power of two}
2480  \index{2N@$2^N$}  \index{2N@$2^N$}
2481
2482  The test  The test
2483
2484  \texttt{(x \& (x-1) == 0) \&\& (x != 0)}  \texttt{(x \& (x-1) == 0) \&\& (x != 0)}
2485
2486  \noindent{}is the most economical way known to  \noindent{}is the most economical way known to
2487  test whether an integer is a positive power of two  test whether an integer is a positive power of two
2488  (1, 2, 4, 8, 16, etc.).\footnote{The test appeared as part of  (1, 2, 4, 8, 16, etc.).\footnote{The test appeared as part of
2489  a discussion on  a discussion on
2490  the GMP mailing list in 2002.}  the GMP mailing list in 2002.}
2491
2492
2493  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2494  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2495  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2496  \section{Exercises}  \section{Exercises}
2497
2498  \begin{vworkexercisestatement}  \begin{vworkexercisestatement}
2499  \label{exe:ccil0:sexe0:01}  \label{exe:ccil0:sexe0:01}
2500  Show that any $m$-bit two's complement integer $u_{[m-1:0]}$ except  Show that any $m$-bit two's complement integer $u_{[m-1:0]}$ except
2501  $-2^{m-1}$ can be negated by forming the one's complement, then adding one.  $-2^{m-1}$ can be negated by forming the one's complement, then adding one.
2502  \end{vworkexercisestatement}  \end{vworkexercisestatement}
2503
2504
2505
2506  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2507  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2508  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2509  \vfill  \vfill
2510  \noindent\begin{figure}[!b]  \noindent\begin{figure}[!b]
2511  \noindent\rule[-0.25in]{\textwidth}{1pt}  \noindent\rule[-0.25in]{\textwidth}{1pt}
2512  \begin{tiny}  \begin{tiny}
2513  \begin{verbatim}  \begin{verbatim}
2514  $RCSfile: c_cil0.tex,v$  $RCSfile: c_cil0.tex,v$
2515  $Source: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_cil0/c_cil0.tex,v$  $Source: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_cil0/c_cil0.tex,v$
2516  $Revision: 1.24$  $Revision: 1.24$
2517  $Author: dtashley$  $Author: dtashley$
2518  $Date: 2003/11/03 02:14:24$  $Date: 2003/11/03 02:14:24$
2519  \end{verbatim}  \end{verbatim}
2520  \end{tiny}  \end{tiny}
2521  \noindent\rule[0.25in]{\textwidth}{1pt}  \noindent\rule[0.25in]{\textwidth}{1pt}
2522  \end{figure}  \end{figure}
2523
2524  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2525  % $Log: c_cil0.tex,v$  % $Log: c_cil0.tex,v$
2526  % Revision 1.24  2003/11/03 02:14:24  dtashley  % Revision 1.24  2003/11/03 02:14:24  dtashley
2527  % All duplicate labels as flagged by LaTeX removed.  Additional files added  % All duplicate labels as flagged by LaTeX removed.  Additional files added
2528  % to Microsoft Visual Studio edit context.  % to Microsoft Visual Studio edit context.
2529  %  %
2530  % Revision 1.23  2002/08/26 17:57:03  dtashley  % Revision 1.23  2002/08/26 17:57:03  dtashley
2532  % that I've captured all changes.  % that I've captured all changes.
2533  %  %
2534  % Revision 1.22  2002/08/25 13:18:21  dtashley  % Revision 1.22  2002/08/25 13:18:21  dtashley
2535  % Edits.  % Edits.
2536  %  %
2537  % Revision 1.21  2002/08/22 00:33:33  dtashley  % Revision 1.21  2002/08/22 00:33:33  dtashley
2538  % Have made aesthetic changes in CFRY0 and CCFR0.  Checking in all before  % Have made aesthetic changes in CFRY0 and CCFR0.  Checking in all before
2539  % rebuild of book.  % rebuild of book.
2540  %  %
2541  % Revision 1.20  2002/08/21 09:12:11  dtashley  % Revision 1.20  2002/08/21 09:12:11  dtashley
2542  % Safety checkin.  % Safety checkin.
2543  %  %
2544  % Revision 1.19  2002/08/19 05:06:45  dtashley  % Revision 1.19  2002/08/19 05:06:45  dtashley
2545  % Substantial progress in integer numerics chapter.  Preparing to  % Substantial progress in integer numerics chapter.  Preparing to
2546  % work at home.  % work at home.
2547  %  %
2548  % Revision 1.18  2002/08/10 07:34:48  dtashley  % Revision 1.18  2002/08/10 07:34:48  dtashley
2549  % Checkin before resuming work on a lemma at home.  % Checkin before resuming work on a lemma at home.
2550  %  %
2551  % Revision 1.17  2002/08/09 03:13:42  dtashley  % Revision 1.17  2002/08/09 03:13:42  dtashley
2552  % Significant lemma proved, and other progress.  % Significant lemma proved, and other progress.
2553  %  %
2554  % Revision 1.16  2002/08/06 22:15:24  dtashley  % Revision 1.16  2002/08/06 22:15:24  dtashley
2555  % Substantial lemma proof progress.  % Substantial lemma proof progress.
2556  %  %
2557  % Revision 1.15  2002/08/05 01:28:33  dtashley  % Revision 1.15  2002/08/05 01:28:33  dtashley
2558  % Lemma demonstrating that 0 \leq \hat{q}-q \leq 2 completed.  % Lemma demonstrating that 0 \leq \hat{q}-q \leq 2 completed.
2559  %  %
2560  % Revision 1.14  2002/08/03 23:04:57  dtashley  % Revision 1.14  2002/08/03 23:04:57  dtashley
2561  % Safety checkin.  Having difficulty:  may have solved the wrong  % Safety checkin.  Having difficulty:  may have solved the wrong
2562  % problem.  % problem.
2563  %  %
2564  % Revision 1.13  2002/08/01 01:31:24  dtashley  % Revision 1.13  2002/08/01 01:31:24  dtashley
2565  % Safety checkin.  Having difficulty completing an alternate version of  % Safety checkin.  Having difficulty completing an alternate version of
2566  % Knuth's proof about the accuracy of quotient estimates based on first  % Knuth's proof about the accuracy of quotient estimates based on first
2567  % digits of dividend and divisor.  % digits of dividend and divisor.
2568  %  %
2569  % Revision 1.12  2002/07/31 04:38:31  dtashley  % Revision 1.12  2002/07/31 04:38:31  dtashley
2570  % Edits to basic integer algorithms chapter, safety checkin.  % Edits to basic integer algorithms chapter, safety checkin.
2571  %  %
2572  % Revision 1.11  2002/07/29 16:30:08  dtashley  % Revision 1.11  2002/07/29 16:30:08  dtashley
2573  % Safety checkin before moving work back to WSU server Kalman.  % Safety checkin before moving work back to WSU server Kalman.
2574  %  %
2575  % Revision 1.10  2002/07/26 20:39:43  dtashley  % Revision 1.10  2002/07/26 20:39:43  dtashley
2576  % Safety checkin.  % Safety checkin.
2577  %  %
2578  % Revision 1.9  2002/07/26 14:58:01  dtashley  % Revision 1.9  2002/07/26 14:58:01  dtashley
2579  % Safety checkin.  % Safety checkin.
2580  %  %
2581  % Revision 1.8  2002/07/21 16:19:28  dtashley  % Revision 1.8  2002/07/21 16:19:28  dtashley
2582  % Finalization of material about integer division when dividend is  % Finalization of material about integer division when dividend is
2583  % long but divisor is same as data size that machine can use as  % long but divisor is same as data size that machine can use as
2584  % divisor natively.  % divisor natively.
2585  %  %
2586  % Revision 1.7  2002/07/20 23:01:45  dtashley  % Revision 1.7  2002/07/20 23:01:45  dtashley
2588  % divisor.  % divisor.
2589  %  %
2590  % Revision 1.6  2002/06/11 04:34:19  dtashley  % Revision 1.6  2002/06/11 04:34:19  dtashley
2591  % Information for David Baker added.  % Information for David Baker added.
2592  %  %
2593  % Revision 1.5  2002/06/08 05:50:25  dtashley  % Revision 1.5  2002/06/08 05:50:25  dtashley
2594  % Added missing tilde.  Can't get it to be same size as other  % Added missing tilde.  Can't get it to be same size as other
2595  % characters--need to research that.  % characters--need to research that.
2596  %  %
2597  % Revision 1.4  2002/06/07 02:08:32  dtashley  % Revision 1.4  2002/06/07 02:08:32  dtashley
2598  % Section added for integer mappings and tests.  Mapping for lowest-order  % Section added for integer mappings and tests.  Mapping for lowest-order
2599  % bit set from Raul Selgado and Dave (last name presently unknown)  % bit set from Raul Selgado and Dave (last name presently unknown)
2601  %  %
2602  % Revision 1.3  2001/08/31 23:11:17  dtashley  % Revision 1.3  2001/08/31 23:11:17  dtashley
2603  % End of August 2001 safety check-in.  % End of August 2001 safety check-in.
2604  %  %
2605  % Revision 1.2  2001/08/27 20:06:08  dtashley  % Revision 1.2  2001/08/27 20:06:08  dtashley
2606  % Edits and substantial re-organization.  % Edits and substantial re-organization.
2607  %  %
2608  % Revision 1.1  2001/08/25 22:51:25  dtashley  % Revision 1.1  2001/08/25 22:51:25  dtashley
2609  % Complex re-organization of book.  % Complex re-organization of book.
2610  %  %
2611  %End of file C_CIL0.TEX  %End of file C_CIL0.TEX

Legend:
 Removed from v.139 changed lines Added in v.140