%$Header$
\chapter[\ccilzeroshorttitle{}]{\ccilzerolongtitle{}}
\label{ccil0}
\beginchapterquote{``If our ancestors had invented arithmetic by counting with
their two fists or their eight fingers, instead of their
ten `digits', we would never have to worry about
writing binary-decimal conversion routines.
(And we would perhaps never have learned as much about
number systems.)''}
{Donald E. Knuth, \cite[p. 319]{bibref:b:knuthclassic2ndedvol2}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
%Section tag: INT0
\label{ccil0:sint0}
Low-cost microcontrollers have no support for floating-point arithmetic,
and so integer arithmetic and fixed-point arithmetic are used nearly exclusively
in embedded systems. The ability to implement integer arithmetic
economically is a critical skill in the development of embedded
systems.
Integer arithmetic algorithms are critically important in embedded
systems for the following reasons:
\begin{itemize}
\item Mistakes in the implementation of arithmetic are frequently
responsible for product problems. (Mistakes are not confined
to obvious errors---errors such as filters which do not converge
on their input are also responsible for product problems.)
\item Floating-point arithmetic is not available or ill-advised
for nearly all small embedded systems for the following reasons:
\begin{itemize}
\item Low-cost microcontrollers do not possess hardware support for
floating-point arithmetic.
\item Implementation of floating-point arithmetic in software is
computationally expensive.
\item Implementation of floating-point arithmetic in software may
require large floating-point libraries, typically consuming
1K-4K of ROM.
\item Safety-critical software standards typically prohibit the
use of floating-point arithmetic.
\end{itemize}
\item Integer arithmetic algorithms (other than addition and subtraction)
are quite tedious and error-prone for a software developer to design, implement, and
unit test. The implementation of such algorithms represents
cost and risk. Cost and risk benefits are achieved if the algorithms in detail are
available in advance (thus precluding design activities), or
better yet if ready-to-use integer algorithm libraries are available.
\end{itemize}
This chapter describes the more fundamental principles and algorithms
(representation, fixed-point arithmetic, treatment of overflow, comparison,
addition, subtraction, multiplication, and division). A section
(Section \ref{ccil0:smim0}) is also included
on miscellaneous mappings involving integers which
are not numerical in intent.
Chapter %\cdtazeroxrefhyphen\cdtazerovolarabic{}
TBD
describes more complicated
integer algorithms and techniques (discrete-time operations
such as filtering, integration, and differentiation as well as more
complex functions such as square root). The split between these two chapters
is arbitrary; and in fact the material could have been divided differently
or combined.
Treatment of the topics in this chapter is largely in accordance with
Knuth \cite{bibref:b:knuthclassic2ndedvol2}. The principal issues in
the implementation of integer algorithms are:
\begin{itemize}
\item \textbf{How to use the arithmetic [or other] instructions provided by the machine to
operate on larger operands.} Microcontrollers typically provide arithmetic
instructions (comparison, shifting, addition, subtraction, and often but not
always multiplication and/or division) that operate on 8-bit or 16-bit integers.
A key question
is how small-operand instructions ``scale up''---that is, if and how they can
be used to assist in the implementation of integer arithmetic for much larger
operands.
\item \textbf{The order of the algorithm involved.} The order of algorithms
is a complicated issue when applied to microcontroller work. Many sophisticated
algorithms have a breakpoint below which they are less economical than
an inferior algorithm. Some applications (such as generating cryptographic keys
when integers thousands of bits long must be tested for primality) will
benefit from sophisticated algorithms becuase the operand sizes are large enough
to pass any such breakpoints. However, in microcontroller work, the need to manipulate
integers longer than 64 bits is very rare; thus, the breakpoints that indicate the
use of more sophisticated algorithms may not be reached. In microcontroller work,
depending on the operand sizes, there are circumstances in which an
$O(n^2)$ algorithm may be preferable to an $O(\log n)$ algorithm. Generally,
the order of algorithms must be balanced against operand sizes.
\end{itemize}
We do diverge from Knuth in some areas. The most prominent divergence is
in the proofs offered for some important theorems and lemmas. Knuth
employs contrapositive proof formats in many circumstances, whereas we prefer
to use linear proofs that are more understandable to engineers and microcontroller
software developers.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section[Paradigms And Principles]
{Paradigms And Principles Of Microcontroller Arithmetic}
%Section tag: PPM0
\label{ccil0:sppm0}
How should one think about microcontroller arithmetic? What principles
guide us in its design and implementation? In this section,
we provide some general principles and paradigms of thought.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Microcontroller Arithmetic As An Accident Of Silicon Design}
%Subsection tag: MAS0
\label{ccil0:sppm0:smas0}
In chapters
\cfryzeroxrefhyphen{}\ref{cfry0},
\ccfrzeroxrefhyphen{}\ref{ccfr0},
and
\cratzeroxrefhyphen{}\ref{crat0}
we consider rational approximation,
both in the form $h/k$ and $h/2^q$. Both forms of rational approximation
tend to be effective because we know that all modern processors possess
shift instructions, most possess integer multiply instructions, and many
possess integer divide instructions. In other words, the design
of the machine instruction set drives the strategies for implementation
of arithmetic, and makes some strategies attractive.
Similarly, the observation that all microcontrollers provide instructions
for integer arithmetic creates the attractiveness of fixed-point arithmetic.
Thus, we might view our approaches to microcontroller arithmetic as
an ``accident'' of silicon design, or as being driven by silicon
design.
Generally, we seek to determine the best way to use the primitive
operations provided by the machine (the instruction set) to
accomplish the mappings of interest.
The ``classic'' algorithms
presented by Knuth
Knuth (\cite[pp. 265-284]{bibref:b:knuthclassic2ndedvol2}) are especially
designed to use the ``small'' addition, subtraction, multiplication, and
division provided by the machine to add, subtract, multiply, and divide arbitrarily
large integers. In
\cite[pp. 265-266]{bibref:b:knuthclassic2ndedvol2}) Knuth writes:
\begin{quote}
\emph{The most important fact to understand about extended-precision numbers
is that they may be regarded as numbers written in radix-$w$ notation,
where $w$ is the computer's word size. For example, an integer that
fills 10 words on a computer whose word size is $w=10^{10}$ has 100
decimal digits; but we will consider it to be a 10-place number to
the base $10^{10}$. This viewpoint is justified for the same reason
that we may convert, say, from binary to hexadecimal notation,
simply by grouping the bits together.}
\emph{In these terms, we are given the following primitive operations to work with:}
\begin{itemize}
\item \emph{a$_0$) addition or subtraction of one-place integers, giving a one-place
answer and a carry;}
\item \emph{b$_0$) multiplication of a one-place integer by another one-place integer,
giving a two place answer;}
\item \emph{c$_0$) division of a two-place integer by a one-place integer,
provided that the quotient is a one-place integer, and yielding
also a one-place remainder.}
\end{itemize}
\noindent{}\emph{By adjusting the word size, if necessary, nearly all computers
will have these three operations available; so we will construct algorithms
(a), (b), and (c) mentioned above in terms of the primitive operations
(a$_0$), (b$_0$), and (c$_0$).}
\emph{Since we are visualizing extended-precision integers as base $b$ numbers, it is
sometimes helpful to think of the situation when $b = 10$, and to imagine
that we are doing the arithmetic by hand. Then operation (a$_0$) is analogous
to memorizing the addition table; (b$_0$) is analogous to memorizing the
multiplication table, and (c$_0$) is essentially memorizing the multiplication
table in reverse. The more complicated operations (a), (b), (c) on
high-precision numbers can now be done using simple addition, subraction,
multiplication, and long-division procedures that children are taught
in elementary school.}
\end{quote}
The critical issue for implementation of integer arithmetic with large operands
is how to use small-operand instructions to operate on larger operands---in other words,
how to ``scale up'' the capability provided by the instruction set.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Microcontroller Arithmetic As A Mapping From Quantized Domain To
Quantized Range}
%Subsection tag: MAM0
\label{ccil0:sppm0:smam0}
Microcontroller software accepts inputs which are quantized. In nearly all cases,
this involves a mapping from $\vworkrealset$ to $\vworkintset$. Often, because
microcontroller products are optimized for cost, the quantization hardware
delivers quite poor precision, frequently less than 8 bits.
When a quantized input is accepted, it defines an inquality. Knowledge of
the quantized input (an integer) confines the actual input (a real
number, before
quantization) to an interval. With a low-cost hardware design, the
interval can be fairly large. Usually, by adding cost, the
interval can be made smaller.
Microcontroller outputs tend to be quantized as well, so it is
accurate to also characterize outputs as integers. For example, a PWM signal
generated by a microcontroller or the output of a D/A converter is
controlled by data that is an integer. Like inputs, often the ``granularity''
with which outputs can be controlled is quite coarse---again, 8 bits or
less is not uncommon.
Thus, we may view microcontroller software as a mapping from poor-quality
inputs to poor-quality outputs.
In such a framework, where the nature of inputs and outputs introduces
substantial error, it is imperative not to introduce additional error
in computer arithmetic. In other words, given inputs which are
integers, the responsibility of the software is to choose the best
integers as outputs. Usually this means that calculations should be
devised so as not to lose any information (i.e.
not to lose remainders, for example). Losing information is usually
equivalent to not being able to make the most correct mapping from input
to output. ``Lossy'' arithmetic can degrade the performance of a system,
since poor arithmetic may compound an inexpensive hardware design.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Microcontroller Arithmetic As A Simulation Of Continuous Controllers}
%Subsection tag: MAE0
\label{ccil0:sppm0:smae0}
Control systems have not always employed digital controllers.
Many books and web sites (see \cite{bibref:w:historycontrol01}, for example)
discuss the historical development of feedback control. Controllers
have not historically been digital, or even electronic.
Early controllers for governing steam devices or windmills were
ultimately mechanical, and relied upon inertia or other physical
properties. It is possible to realize abstract notions
(integrators, differentiators, gains) using hydraulic systems or other mechanical systems;
and in fact hydraulic feedback controllers were used in early rockets
and aircraft. Very naturally, abstract notions (integrators,
differentiators, gains) can be implemented using analog
electronic components. The most common implementations involve
operational amplifiers, and the behavior of such implementations comes
very close to the ideal mathematical models.
Mechanical, hydraulic, and non-digital electronic controllers have
one very desirable characteristic---\emph{clipping}. If, for example,
one provides an analog differentiator with a $dV/dt$ which
is too large, the output that the differentiator can
provide is limited, usually by the supply voltage available to an operational
amplifier.
The differentiator \emph{must} clip.
Clipping often leads to behavior which is close to what
intuition would expect (i.e. we would present
clipping as an occasional advantage). For example, if an input to
an analog control system suffers a failure, the behavior
the of the controller is limited, as is its internal
state. Similarly, when the
input is restored, the controller will usually recover
in a reasonable time because the
state of the controller (typically maintained in capacitors) is limited
in the magnitude it can attain.
We might view a digital controller as an emulation of
an analog controller. We may want to cause the
controller to have limits (i.e. rails) internally, for
example to prevent excessive integrator ``windup''. We discuss
this further in Section \ref{ccil0:sode0}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Practical Design Issues}
%Section tag: PDI0
\label{ccil0:spdi0}
In this section, we consider practical issues surrounding the design
and construction of a set of integer arithmetic subroutines.
In practice, such a collection of subroutines is likely to be
arranged into a library. The purpose of the library would be to
free the clients (or callers) of the library from the complexity of
large integer calculations.
The design decisions surrounding the construction of a library vary in
the objectivity with which they can be approached. Some design decisions
(such as the best mechanism for passing parameters) can be approached
rigorously because the measures of goodness are unequivocal (minimal ROM consumption
or execution time, for example). However, other design decisions, particulary the
decision of the exact nature of the interface between an arithmetic library and
its clients, are more subjective. One size does \emph{not} fit all.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Parameter Passing And Temporary Storage Mechanisms}
%Subsection tag: PPM0
\label{ccil0:spdi0:sppm0}
In small microcontroller work, the desire to save ROM and execution time
may lead to inelegant software construction. Because an arithmetic library
used in microcontroller work may be called from many different places
throughout ROM, serious thought should be given to optimizing the
parameter passing mechanisms, even perhaps at the expense of elegance.
The way in which the arithmetic library allocates temporary storage is
also a point of concern, because the most elegant way of allocating temporary
storage (on the stack) may either not be feasible (because of the possibility
of stack overflow) or may not be efficient (because the addressing modes of
the machine make data on the stack inefficient to address). In this section
we discuss both parameter passing and temporary storage mechanisms.
In the remainder of the discussion, we make the following assumptions
about software architecture.
\begin{enumerate}
\item \textbf{The arithmetic library need not be re-entrant.}
Most ``small'' microcontroller software loads use a non-preemptive
scheduling paradigm, so this is a reasonable assumption. We also
make the reasonable assumption that ISR's may not make calls into
the arithmetic library.
\item \textbf{Dynamic memory allocation, other than on the stack,
is not allowed by the software architecture.} This is also
a reasonable assumption in ``small'' microcontroller software.
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Parameter Passing Mechanisms}
%Subsubsection tag: PPM0
\label{ccil0:spdi0:sppm0:sppm0}
If an arithmetic library exists in a microcontroller software load,
it may be called many times throughout ROM. Thus, the parameter
passing mechanisms chosen may have a large effect on ROM consumption
(due to the setup required for each subroutine call multiplied by
many instances throughout ROM) and execution time
(because in microcontroller software longer instructions nearly always
require more time). Because of the criticality of ROM consumption,
parameter passing mechanisms that lack elegance may be attractive.
In the category of parameter passing, we also include the way in which
return value(s) are passed back to the caller.
The following parameter-passing mechanisms may be employed:
\begin{enumerate}
\item \textbf{Pass by value as storage class \emph{automatic}.}
The most common scenario is that the arithmetic
library is written in assembly-language to be called from
`C', and so the assembly-language subroutines must adhere to the
parameter-passing conventions used by the compiler.
This usually means that the entire input or output value
will be passed in CPU registers or on the stack. Somewhat rarely,
a compiler will pass parameters in static locations.\footnote{The
usual reason for a `C' compiler to pass parameters in static locations
is because the instruction set of the machine was not designed for
higher-level languages, and references to [usually \emph{near}] memory
are cheaper than stack references. Such compilers also typically
analyze the calling tree of the program where possible and use this
information to overlay the parameter-passing memory areas of
subroutines that cannot be active simultaneously. Without the ability
to analyze the calling tree and make overlay decisions based on it,
memory would be exhausted, because each subroutine would need to have its
own static storage for parameters and local variables.}
\item \textbf{Pass by reference.}
Typically, it is convenient to pass pointer(s) to area(s) of memory
containing input operands, and also a pointer to an area of memory
owned by the caller which is written with the result by the arithmetic subroutine.
The efficiency of this approach depends on the compiler and the instruction
set of the machine. If the instruction set of the machine cannot
make effective use of pointers or stack frames, an arithmetic subroutine
might be constructed so that it first copies the operands to a static area of memory
reserved for the arithmetic library, then performs the necessary arithmetic
operations on the operands in the static area,
then copies the result(s) back to the area owned by the caller.
\item \textbf{Pass by common data block.}
In some cases, it may be preferable to reserve a block of memory in which to
pass parameters to arithmetic library functions, and from which to retrieve
results after an arithmetic library function returns. The allocation of such
a static memory block may be done manually\footnote{Note to self: need to
include gentleman's agreements on memory usage in my note on software architecture.}
or automatically (by development tools which can analyze the function calling tree and
manage the overlaying).
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Temporary Storage Mechanisms}
%Subsubsection tag: TSM0
\label{ccil0:spdi0:sppm0:stsm0}
Need to indicate clearly on section on software architectures the
primary temporary storage mechanisms:
\begin{itemize}
\item Stack.
\item Memory block with overlay functionality.
\end{itemize}
Need to expand software architecture section to cover this, so don't
discuss here.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Reporting Of Overflow, Underflow, And Domain Errors}
%Subsection tag: OUD0
\label{ccil0:spdi0:soud0}
Long integer data types used in microcontroller work are typically
of a static size (they cannot grow in size in as operations are
performed on them). The reason for the typical static sizes is that
dynamic allocation (except for allocation and
deallocation on the stack as subroutines
are called and return) is rarely used in small microcontroller work.
It will come about in the normal usage of an integer arithmetic
library that an attempt will be made to operate on integers in
a way which generates an overflow, generates an
underflow, or
represents a domain error (division by zero or
square root of a negative integer, for example).
An important design decision is how such normal exceptions should be
handled.
Possible design decisions in this area are:
\begin{enumerate}
\item \label{enum:ccil0:spdi0:soud0:01:01}
\textbf{To design arithmetic subroutines so that exceptions are not
possible.}
For example, multiplying an $m$-word integer by an $n$-word integer
will always generate an integer that will fit within $m+n$ words.
If a multiplication subroutine is designed so that the caller must
provide an $m$-word operand and an $n$-word operand and a pointer to
an $(m+n)$-word area of memory for the result, an overflow cannot occur.
Such a design decision essentially pushes overflow detection back up to
the callers of arithmetic subroutines.
\item \label{enum:ccil0:spdi0:soud0:01:02}
\textbf{To design arithmetic subroutines so that exceptions are possible,
but not to detect the exceptions, thus providing an implementation that
will produce incorrect results with some operand data values.}
For example, if an arithmetic subroutine is designed to add an $m$-word
operand to another $m$-word operand to produce an $m$-word result, overflow
is possible. A design decision to fail to detect such exceptions pushes
the responsibility up to the callers of the arithmetic subroutines.
Callers must devise a method for not calling arithmetic subroutines
with data values that will cause an exception, or else to detect an exception
when it has occurred.
\item \label{enum:ccil0:spdi0:soud0:01:03}
\textbf{To ``rail'' the result in response to an exception.}
It was stated earlier that analog control system functional blocks
built with operational
amplifiers typically have an output which cannot go beyond the
supply rails. One may implement similar behavior in arithmetic subroutines.
In an addition subroutine which adds two $m$-word operands to produce an
$m$-word result (with each word having $w$ bits), it would be natural to
return $2^{mw}-1$ in the event of an overflow in a positive direction and
$-2^{mw}$ in the event of an overlfow in a negative direction. Note that
the caller will not be able to distinguish a ``rail'' value which represents
a valid result from a ``rail'' value substituted to indicate an exception.
\item \label{enum:ccil0:spdi0:soud0:01:04}
\textbf{To reserve special result data values to indicate exceptions.}
Depending on the arithmetic subroutine being implemented, it may be possible
to reserve certain result data values to indicate exceptions. This approach
is often awkward, as most mathematical subroutines are naturally defined so that
all bit patterns in the memory reserved for the result are valid numbers.
Additionally, with long result data values, it may not be economical to
compare the result against the reserved exception values. Thus, this is seldom
an optimal way to deal with exceptions.
Additionally, if this approach is employed, the semantics of how exception
values combine with other exception values and data values must be decided.
\item \label{enum:ccil0:spdi0:soud0:01:05}
\textbf{To return exception codes to the caller separate from the result data.}
In the `C' language, pointers are often used to supply an arithmetic subroutine
with the input operands and to provide the arithmetic subroutine with a location
(which belongs to the caller) in which to store the result data. Thus, the return
value of the arithmetic subroutine (normally assigned through the subroutine name)
is often available to return exception codes. For example,
a `C' function may be defined as
\begin{verbatim}
unsigned char int128_add(INT128 *result,
INT128 *arg1,
INT128 *arg2),
\end{verbatim}
leaving the returned \texttt{unsigned char} value available to return
exception information. Note that this arrangement has the following advantages:
\begin{enumerate}
\item All bit patterns in the result data memory area area available
as data bit patterns.
\item The exception data is very economical to test, because it is placed
in a machine-native data type.
\item The exception data can easily be discarded or by the caller if desired.
\item All decisions about how to handle exceptions are left to the caller.
\end{enumerate}
\item \label{enum:ccil0:spdi0:soud0:01:06}
\textbf{To maintain exception data with each result integer.}
It is possible to reserve bits for exception information which are part of the
long integer data type. This exception state essentially conveys
\emph{NaN}\footnote{\emph{N}ot \emph{a} \emph{n}umber.} information---integers with
exception information set are not true numbers, but rather they are different from the
true result in some way. As with (\ref{enum:ccil0:spdi0:soud0:01:04}), the
semantics of how to combine NaN values and NaN values with ordinary non-NaN numbers
must be defined.
\item \label{enum:ccil0:spdi0:soud0:01:07}
\textbf{To maintain a global exception state variable.}
A variable or set of variables can be reserved which hold the
exception information, if any, from the most recent call to
a function in the arithmetic library.
To save CPU cycles, the arithmetic library can
be designed so that it will assign the global exception state variable only if an
exception occurs---the caller then has the responsibility of clearing the exception
state variable before making any call into the arithmetic library where the
exception result is of interest. The interface between the caller and the library
can be further optimized if the library only OR's data into the variable containing
the exception state. Using this optimization, the caller can clear the exception state
variable, make several calls into the arithmetic library, and then retrieve a meaningful
exception state variable value summarizing several arithmetic operations.
\item \label{enum:ccil0:spdi0:soud0:01:08}
\textbf{Hybrid approaches.}
The approaches (\ref{enum:ccil0:spdi0:soud0:01:01})
through (\ref{enum:ccil0:spdi0:soud0:01:07}) can be combined.
For example, approach (\ref{enum:ccil0:spdi0:soud0:01:03})
might be combined with approach (\ref{enum:ccil0:spdi0:soud0:01:05})
so that exceptions are ``railed'', but the caller may also be made
aware that an exception has occured. Many hybrid approaches are possible.
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Semantics Of Combining Overflow, Underflow, And Domain Errors}
%Subsection tag: CMB0
\label{ccil0:spdi0:scmb0}
For control system arithmetic, some form of clipping as suggested
in Section \ref{ccil0:sppm0:smae0} is probably the
best approach. Definitely, an overflow should generate a result
which is the largest representable integer, and an
underflow should generate a result which is the smallest
representable integer.
In addition to treating an overflow by clipping, it may be
advantageous to reserve a flag in the representation of a
multiple-precision integer to record that an overflow has occured and been clipped.
Some functions which accept the integer as input may be interested
in the value of such a flag, where othere---perhaps most---may
not.
The correct course of action in the event of a domain error (such
as division by zero) is less clear. It is noteworthy that in a
normal control system, domain errors cannot occur (but overflows
can).
The best approach when a domain error is involved probably
depends on the basis for the underlying calculation. For
example, if integer division is used as part of a
strategy for software ratiometric conversion, a value
of zero in the denominator probably represents extreme electrical
noise, and the most sane approach may be to replace the
denominator by one. However, in other contexts it may be appropriate
to think in terms of
\begin{equation}
\lim_{k \rightarrow 0^+} \frac{kn}{kd}
\end{equation}
\noindent{}or
\begin{equation}
\lim_{k \rightarrow 0^-} \frac{kn}{kd} .
\end{equation}
Put in other terms, there is not a clear ``one solution
meets all needs'' approach to dealing with domain
errors.
As in the case of overflows, it may be advantageous to reserve a bit
flag to signal that a domain error has occured and that the result
is not valid or not reliable. Note that floating point chips
(such as the 80x87) provide similar indications of domain errors.
It may also be advantageous to adopt conventions for how
overflow or domain error flags propagate through binary or
unary operators. For example, if two numbers are multiplied, and
one of the two has the overflow flag set, it may be wise set the
overflow flag in the result. A scheme for how warning
flags propagate may be beneficial.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Variable Versus Constant Subroutine Execution Time}
%Subsection tag: VVC0
\label{ccil0:spdi0:svvc0}
As a design goal of an embedded system, we seek to minimize the
timing variability of software components. An arithmetic subroutine
that with a high probability takes a short time to execute and with a
low probability takes a long time to execute, and where the execution
time is data dependent, is a serious risk. An embedded software product
may pass all release testing, but then fail in the field because of
specific data values used in calculations.
A very conservative design goal would be to design every arithmetic subroutine
to require exactly the same execution time, regardless of data values.
This goal is not practical because machine instructions themselves
usually have a variable execution time, particularly for multiplication
and division instructions. A fallback goal would be to avoid
large differences between minimum and maximum execution time, without
increasing the maximum execution time. A very practical step to take
(using division as an example)
is to insert artifical delays into easily detectable exception cases (such as
division by zero) so that the exception case takes as long as the
minimum time for a division with valid operands.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Fixed-Point Arithmetic}
%Section tag: FPA0
\label{ccil0:sfpa0}
\emph{Fixed-point arithmetic}\index{fixed-point arithmetic}
is a scheme for the representation
of engineering quantities (conceptually real numbers with optional
units) by integers so that calculations can be performed
on these quantitites using [usually multiple-precision]
integer arithmetic.
In discussing fixed-point arithmetic,
we must be careful to distinguish between the
\emph{represented value} (the engineering quantity)
and the \emph{representation} (the integer which represents
the engineering quantity). In most cases, we must also
be careful to devise a system to track the units of the
represented values, as, especially with control systems,
the units of represented values (due to integration and
differentiation) can become very complex
and mistakes are easy to make.
Fixed-point arithmetic is the dominant paradigm of construction
for calculations in small microcontroller systems. It may not be
clear why this should be so or what advantages it offers [over
floating-point arithmetic]. The reasons for
this predominance are:
\begin{itemize}
\item Fixed-point calculations tend to be very efficient, because
they make direct use of the integer arithmetic instructions in the
microcontroller's instruction set. On the other hand, floating-point
arithmetic operations tend to be much slower.
\item Floating-point calculations typically require a floating-point
library, which may consume at least several hundred bytes
of ROM.
\item Some safety-critical software standards prohibit the use of
floating-point arithmetic because it can result in nebulous
behavior. Fixed-point arithmetic avoids these concerns.
\end{itemize}
In order to carry out fixed-point arithmetic---that is, in order to
operate on engineering quantities as integers---we
require that the relationship between the
represented value and the representation be of the form
\begin{equation}
\label{eq:ccil0:sfpa0:00}
x = r_I u + \Psi,
\end{equation}
\noindent{}where $x \in \vworkrealset$ (possibly with units) is the represented
value, $u \in \vworkintset$ is the representation,
$r_I \in \vworkrealset$ is the scaling factor (possibly with
units), and $\Psi \in \vworkrealset$ (possibly with units) is the offset.
Note that the units of $r_I$, $\Psi$, and $x$ must match.
We further require that $r_I \vworkdivides{} \Psi$\footnote{We \emph{are}
aware that this is an abuse of nomenclature, as
`$\vworkdivides$' (``divides'') is traditionally only applied to integers.}
(i.e. that $\Psi$ be an
integral multiple of $r_I$)
so that the offset in the represented value corresponds to an integer
in the representation. Without this restriction, we could not remove the
offset from the representation with integer subtraction only. Note that
we do \emph{not} require that $r_I$ or $\Psi$ be rational, although
they must both be rational or both be irrational in order to satisfy
(\ref{eq:ccil0:sfpa0:00}).
In (\ref{eq:ccil0:sfpa0:00}), since we've required that $r_I \vworkdivides{} \Psi$,
we can replace $\Psi$ by
\begin{equation}
\label{eq:ccil0:sfpa0:00b}
\psi = \frac{\Psi}{r_I}
\end{equation}
\noindent{}to obtain
\begin{equation}
\label{eq:ccil0:sfpa0:01}
x = r_I (u + \psi),
\end{equation}
\noindent{}where $\psi \in \vworkintset$ is the offset in representational
counts.
Note that we can also easily obtain the following relationships from the
defining equations (\ref{eq:ccil0:sfpa0:00}),
(\ref{eq:ccil0:sfpa0:00b}), and (\ref{eq:ccil0:sfpa0:01}).
\begin{equation}
\label{eq:ccil0:sfpa0:02}
u = \frac{x - \Psi}{r_I} = \frac{x}{r_I} - \psi
\end{equation}
\begin{equation}
\label{eq:ccil0:sfpa0:03}
\psi = \frac{\Psi}{r_I}
\Longleftrightarrow
\Psi = r_I \psi
\Longleftrightarrow
r_I = \frac{\Psi}{\psi}
\end{equation}
For example, in a 16-bit signed integer (which inherently may range
from -32768 to 32767 inclusive), one might used a fixed-point
representation of 100 integer counts per $^\circ$C ($r_I = 0.01 \; ^\circ$C)
with an offset of 100$^\circ$C ($\Psi$ = 100$^\circ$C or equivalently
$\psi$ = 10000), giving
a representational range from -227.68$^\circ$C to 427.67$^\circ$C inclusive.
If $r_I = 2^N, \; N \in \vworkintset$, then the radix point of the represented
value is positioned
between two bits of the representation---this arrangement may have computational
advantages
if the whole and fractional parts of the represented value need to be separated
easily in the representation. Note also that if $r_I = 2^{WN}$ where $W$ is the
machine integer size of the computer (in bits), then the radix point of the represented value
occurs between two addressable machine integers of the representation, which can be convenient.
However, we do not require for our definition of a fixed-point representation that
$r_I$ be an integral power of two; and in fact we do not even require by our
definition that $r_I$ be rational. Note that in general our definition above is the
weakest set of conditions so that real-world engineering values can be manipulated
using integer operations performed upon the representation.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Representation Of Integers}
%Section Tag: ROI0
\label{ccil0:sroi0}
In this section, we discuss common representations of integers, both
\emph{machine} integers and \emph{synthetic long} integers.
By \emph{representation} we mean the mapping between the abstract
mathematical notion of an integer and the way it is stored in the computer
(voltage levels and the programming model).
Although
in Knuth's development of integer arithmetic
\cite{bibref:b:knuthclassic2ndedvol2}
it is assumed that
integers may be represented in any base, we don't require such generality
in practice and in this work we confine ourselves for the most
part to $b=2^n$, and often to $n = 8i$, $i \in \vworkintsetpos$. The assumption
of $b=2^n$ characterizes all modern digital computers, and we feel comfortable
making this assumption throughout the work. However, the assumption
$n = 8i$, $i \in \vworkintsetpos$ does not hold universally, and so we
most often do not make this assumption.
By \emph{machine} integer, we mean an integer upon which the computer
can operate in a single instruction (such as to add, increment,
load, store, etc.). For most microcontrollers, machine integers are
either 8 or 16 bits in size. The representation of a machine integer
is designed and specified by the microcontroller manufacturer. In principle,
nothing would prevent a microcontroller manufacturer from devising
and implementing a novel way of representing machine integers and supporting
this novel representation with an instruction set. However, in
practice, all machine integers are either simple unsigned integers or two's
complement signed integers. In addition to the efficiency of these
representations with respect to the design of digital logic, these representations
are so standard and so pervasive that they are universally tacitly assumed.
For example, ``\texttt{if (i \& 0x1)}'' is an accepted C-language
idiom for ``if $i$ is odd'', and it is expected that such code will
work on all platforms.
By \emph{synthetic long} integer, we mean an integer of
arbitrary\footnote{By \emph{arbitrary}, we do not necessarily mean that
the integer can grow to be arbitrarily large in magnitude, or that
its maximum size is not known at compile time. We mean \emph{arbitrarily}
longer than a machine integer. Multiple-precision arithmetic libraries
can be divided into two classes---those that fix the size of the integers
at compile time, and those that use dynamic allocation and allow integers to
grow as needed at run time. The former category
is normally used for small microcontroller work, whereas the latter category
(such as the GNU MP Library \cite{bibref:s:gnumultipleprecisionarithmeticlibrary})
is normally used in scientific and number theoretic calculation and on
more powerful platforms than microcontrollers. The representational concepts
we present here apply to both categories.}
length that is formed by concatenating machine integers. There is some
subjectivity in deciding the representation of multiple-precision integers,
and we discuss in the subsections
\ref{ccil0:sroi0:srou0} and
\ref{ccil0:sroi0:sros0} which immediately follow.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Representation Of Unsigned Integers}
%Subsection Tag: ROU0
\label{ccil0:sroi0:srou0}
Unsigned machine integers are always represented as an ordered
array of bits (Figure \ref{fig:ccil0:sroi0:srou0:00}). For an
$m$-bit unsigned integer $u$, we denote these bits $u_{[m-1]}$ through
$u_{[0]}$, with $u_{[m-1]}$ the most significant bit. The value
of $u$ is the sum of the values of each bit multiplied
by the power of 2 it represents:
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/uintrep1.eps}
\caption{Representation Of Unsigned Machine Integers}
\label{fig:ccil0:sroi0:srou0:00}
\end{figure}
\begin{equation}
\label{eq:ccil0:sroi0:srou0:00}
u = \sum_{i=0}^{m-1} u_{[i]} 2^i .
\end{equation}
In general, an $m$-bit unsigned integer can assume the values of
0 through $2^m - 1$, so that
\begin{equation}
\label{eq:ccil0:sroi0:srou0:01}
u \in \{0, \ldots{} , 2^m - 1 \} .
\end{equation}
Unsigned synthetic long integers are always represented as an array
of unsigned machine integers.
Consistent with the GMP \cite{bibref:s:gnumultipleprecisionarithmeticlibrary},
we call each element of the array a \emph{limb} and we call the size of
each limb the \emph{limbsize}. This usage is very close to what Knuth
calls the \emph{word}size $w$ in the excerpt presented in
Section \ref{ccil0:sppm0:smas0}.
Microcontroller processors are more likely than more powerful processors to have
a non-orthogonal instruction set, and so the limbsize may not be consistent between
operations in an arithmetic library.
With some processors, the appropriate limbsize may vary depending on the operation being
performed.
For example, a microcontroller processor may be able to add two 16-bit integers to
obtain a 16-bit result plus a carry, but only be able to multiply two 8-bit integers to
obtain a 16-bit result (thus, the appropriate limbsize for addition may be
16 bits while the appropriate limbsize for multiplication may be 8 bits).
In such cases, it is usually most efficient to add using 16-bit limbs but
multiply using 8-bit limbs. Adding using 8-bit limbs on a machine which will
support 16-bit additions is not normally a good design decision---even if the
machine supports an 8-bit addition instruction which is faster than the 16-bit addition
instruction, $m/2$ 16-bit additions will nearly always be faster than
$m$ 8-bit additions. Using different limbsizes within the same arithmetic library
may require some consideration of alignment and
endian issues, but these are implementation details
which are easily solved.
We view a synthetic long unsigned integer as an array of limbs (machine integers)
of some size, and we agree that we will not address the array in any other way than
by loading and storing limbs of this size.\footnote{Well \ldots{} not quite.
In software for large computers (personal computers and workstations) with an
orthogonal instruction set, we may be able to adhere to this rule. However,
with microcontrollers, arithmetic libraries which are optimized
may break this rule.} In particular, because
computers may be ``big-endian'' or ``little-endian'', loading and storing
smaller units than limbs may lead in
a worst case to software defects or in a best case to non-portable code.
Assume that $w$ is the number of bits in a limb.
Notationally, we denote an unsigned
synthetic long integer as an array of $m$ limbs
$u_{m-1}$ through $u_0$, each containing $w$ bits,
with $u_0$ the least significant machine integer.
We may also define $b=2^w$ (consistent with Knuth's
notation).
The value of
such a synthetic long machine integer is
\begin{equation}
\label{eq:ccil0:sroi0:srou0:02}
u = \sum_{i=0}^{m-1} u_{i} 2^{wi}
=
\sum_{i=0}^{m-1} u_{i} b^i.
\end{equation}
As an alternative, we may write the value as the sum of the bit-values,
\begin{equation}
\label{eq:ccil0:sroi0:srou0:03}
u = \sum_{i=0}^{wm-1} u_{[i]} 2^{i} .
\end{equation}
Naturally, the range of such a synthetic long integer is
\begin{equation}
\label{eq:ccil0:sroi0:srou0:04}
u \in \{0, \ldots{} , 2^{wm} - 1 \} .
\end{equation}
In storing an unsigned synthetic long machine integer, the most natural way
to order the array of limbs depends on whether dynamic memory allocation is
used by the arithmetic library. In microcontroller work, where arithmetic
library subroutines typically operate on fixed-size operands and produce
fixed-size results, storing
limbs most significant limb first (i.e. in `C', so that element \texttt{[0]}
of the array of limbs contains the most significant limb) may be natural
and convenient. However, this ordering would lead to computational waste in a library such
as the GMP \cite{bibref:s:gnumultipleprecisionarithmeticlibrary} where integers
may grow arbitrarily large and the library may need to reallocate long synthetic
integers to contain more limbs, as each reallocation would need to be followed
by a memory copy to align the integer's existing limbs to the end of the array.
For libraries such as the GMP, it is more practical to store limbs
least-significant limb first, as it eliminates the need to copy memory
when reallocations are done.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Representation Of Signed Integers}
%Subsection Tag: ROS0
\label{ccil0:sroi0:sros0}
Signed machine integers are always represented in two's complement form on modern
processors. This representation is universal because of the digital logic
conveniences---the same addition and subtraction mappings which are correct
for unsigned machine integers are also correct for signed machine integers,
although the criteria for overflow and comparison are
different.
Most readers are familiar with two's complement representation, so we will not
belabor it. However, we will present essential properties.
When two's complement representation is used in an $m$-bit machine integer $u$:
\begin{enumerate}
\item All bit patterns with $u_{[m-1]} = 0$ represent non-negative integers, and
represent the same integer as if the representation were unsigned.
\item All bit patterns with $u_{[m-1]} = 1$ represent negative numbers; specifically
$u_{[m-2:0]} - 2^{m-1}$; i.e.
\begin{equation}
\label{eq:ccil0:sroi0:sros0:00}
u = - u_{[m-1]} 2^{m-1} + \sum_{i=0}^{m-2} u_{[i]} 2^i .
\end{equation}
\item $u \in \{-2^{m-1}, \ldots{}, 2^{m-1}-1 \}$.
\item All bit patterns represent a unique integer.
\item For any integer except $-2^{m-1}$,
negation can be performed by forming the one's complement (complementing
every bit), then adding one. To see why this is true algebraically, note that
\end{enumerate}
However, let us observe that the value of an
$m$-bit two's complement
machine integer is
In general, an $m$-bit signed machine integer can assume the values of
$-2^{m-1}$ through $2^{m-1} - 1$, so that
\begin{equation}
\label{eq:ccil0:sroi0:sros0:01}
u \in \{-2^{m-1}, \ldots{} , 2^{m-1} - 1 \} .
\end{equation}
There are [at least] two different representations of signed
multiple-precision integers:
\begin{itemize}
\item Two's complement representation.
\item Sign-magnitude representation.
\end{itemize}
There are two different representations commonly used
for signed multiple-precision integers because two's complement
representation is not ideal for multiplication and division, although
it is ideal for addition and subtraction. For multiple-precision
integer arithmetic, sign-magnitude representation is more common.
In two's complement representation of multiple-precision integers,
the representation is the same as suggested by
(\ref{eq:ccil0:sroi0:sros0:00}), except
more bits are involved. For a two's complement representation
of a number consisting of $n$ machine integers with $W$ bits per
machine integer,
\begin{equation}
\label{eq:ccil0:sroi0:sros0:02}
u = - u_{B(Wn-1)} 2^{Wn-1} + \sum_{i=0}^{Wn-2} u_{B(i)} 2^i .
\end{equation}
Because we would like to know how to compare signed multiple-precision
integers in two's complement representation, we can gain some
insight into the representation by rewriting
(\ref{eq:ccil0:sroi0:sros0:02}) in terms of machine integers:
\begin{equation}
\label{eq:ccil0:sroi0:sros0:03}
u = - u_{B(Wn-1)} 2^{Wn-1}
+
\sum_{i=W(m-1)}^{Wn-2} u_{B(i)} 2^i
+
\sum_{i=0}^{m-2} u_{i} 2^{Wi} .
\end{equation}
(\ref{eq:ccil0:sroi0:sros0:03}) gives some insight into the
relative values of multiple-precision signed two's complement
integers with respect to the values of the machine integers
that comprise them. We discuss this further in
Section \ref{ccil0:scsi0}.
In sign-magnitude representation of multiple-precision signed two's
complement integers, an integer $u$ is represented as a sign
bit (usually a value of one indicates negativity), and a magnitude,
which is $|u|$. Unlike two's complement representation,
sign-magnitude representation, has two representations of zero---a positive
one and a negative one. Either a canonical form for zero should be
adopted, or both values should be treated identically.
Assuming that the sign bit is stored in the most significant bit position,
it is easy to deduce that the value of a multiple-precision
signed two's complement integer in sign-magnitude representation is
\begin{equation}
\label{eq:ccil0:sros0:srou0:04}
u = (-1)^{u_{B(m-1)}} \sum_{i=0}^{m-2} u_{B(i)} 2^i .
\end{equation}
Sign-magnitude representation is especially convenient because
it allows machine instructions which accept unsigned operands to be used
to make calculations involving signed integers.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Characteristics Of Practical Processors}
%Section tag: CPP
\label{ccil0:scpp0}
Before discussing specific algorithms, it is necessary to
discuss the construction of practical processors---how such a processor
manipulates machine integers. We accept as a typical processor the
TMS-370C8, an 8-bit microcontroller manufactured by
Texas Instruments.
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/t370flag.eps}
\caption{Texas Instruments TMS-370C8 Flags}
\label{fig:ccil0:scpp0:00}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/t370cjmp.eps}
\caption{Texas Instruments TMS-370C8 Conditional Jump Instructions}
\label{fig:ccil0:scpp0:01}
\end{figure}
A typical microcontroller allows operations on machine integers
in the following steps:
\begin{itemize}
\item A machine instruction is performed on one or two machine
integer operands (for example: addition, subtraction,
multiplication, division, increment, decrement, complement,
negation, or comparison). This machine instruction may
produce a result, and usually sets a number of condition flags that
reflect the nature and validity of the result (Is it zero?
Is it negative? Did the result overflow?). As an
example, the condition
flags of the TMS-370C8 are shown in Figure \ref{fig:ccil0:scpp0:00}.
.
\item A conditional branch instruction is used to branch conditionally
based on the state of the condition flags. The definition of
the condition flags and the way in which the conditional
branch instruction utilizes them is designed to provide a
way to treat both unsigned and signed machine integers.
As an example, the way in which the conditional
jump instructions of the TMS-370C8 use the flags
is shown in Figure \ref{fig:ccil0:scpp0:01}.
\end{itemize}
It is not too often necessary to understand in detail
the Boolean relationships that govern how machine integers are
added, subtracted, and compared; and how signed comparisons differ
from unsigned comparisons. In most cases, it is adequate to
rely on the design of the microcontroller. However, we do present
rudimentary observations in this section.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Comparison Of Integers}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Comparison Of Unsigned Integers}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Comparison Of Signed Integers}
%Section Tag: CSI0
\label{ccil0:scsi0}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Integer Addition}
%Section tag: IAD0
\label{ccil0:siad0}
Addition of two $m$-bit integers is a combinational function---that is,
the inputs uniquely determine the output. Addition of binary
numbers is performed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Hardware Implementation Of Addition}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Addition Of Unsigned Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Addition Of Signed Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section {Integer Subtraction}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Hardware Implementation Of Subtraction}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Subtraction Of Unsigned Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Subtraction Of Signed Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Integer Multiplication}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Hardware Implementation Of Multiplication}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Multiplication Of Unsigned Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Multiplication Of Signed Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Integer Division}
\label{ccil0:sidv0}
\index{division}\index{integer division}In this section,
we discuss the best known methods of dividing integers using
typical microcontroller instruction sets. In general, given
two arbitrary integers $p$ and $q$, we are interested in determining
their quotient $q=\lfloor{}p/q\rfloor$ and remainder
$r=p\bmod{}q$ as economically as possible.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Hardware Implementation Of Division}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{General Unsigned Division Without A Machine Division Instruction}
\label{ccil0:sidv0:sgdn0}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{General Unsigned Division With A Machine Division Instruction}
\label{ccil0:sidv0:sgdu0}
As mentioned many places in this work, efficiency in microcontroller software
involves phrasing computational problems in a way which makes good use of
the machine instruction set. In Section \ref{ccil0:sidv0:sgdn0} we discussed
the classic shift-compare-subtract algorithm for division. This algorithm
is far from efficient. A reasonable question to ask is whether we can
leverage ``small'' division capability (provided by the machine instruction set)
to accomplish ``large'' divisions (those which we require in practice).
It ends up that this is possible: the technique involved is effectively
to use machine division instructions to estimate the highest-order bits of
the quotient based on the highest-order bits of the dividend and divisor.
Knuth's discussion of division
algorithms \cite[pp. 270-275]{bibref:b:knuthclassic2ndedvol2} is the
basis for most of the material in this subsection. However, Knuth has
a gift for terseness that is sometimes a curse for the reader, and so
we take more time than Knuth to explain certain results.
First, as a starting point, we present \emph{Algorithm D} from
Knuth \cite[pp. 272-273]{bibref:b:knuthclassic2ndedvol2}. Then,
we justify the algorithm and explain why it is valid. Finally,
we supply implementation advice for microcontroller instruction sets.
\begin{vworkalgorithmstatementpar}{Arbitrary Unsigned Division Using
Machine Unsigned Division Instructions}
\label{alg:ccil0:sidv0:sgdu0:01}
(From Knuth \cite[pp. 272-273]{bibref:b:knuthclassic2ndedvol2})
Given nonnegative integers $u=(u_{m+n-1} \ldots{} u_1 u_0)_b$
and $v=(v_{n-1} \ldots{} v_1 v_0)_b$, where
$v_{n-1} \neq 0$ and $n > 1$, we form the radix-$b$ quotient
$\lfloor{}u/v\rfloor{} = (q_m q_{m-1} \ldots{} q_0)_b$ and
the remainder $u \bmod v = (r_{n-1} \ldots{} r_1 r_0)_b$. When
$n=1$, the simpler algorithm of
Subsection \ref{ccil0:sidv0:sldm0}
should be used.
\begin{algblvl0}
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:01}
[Normalize.] Set $d \gets \lfloor{}b/(v_{n-1} + 1)\rfloor$.
Then set $(u_{m+n} u_{m+n-1} \ldots{} u_1 u_0)_b$ equal to
$(u_{m+n-1} \ldots{} u_1 u_0)_b$ times $d$; similarly,
set $(v_{n-1} \ldots{} v_1 v_0)_b$ equal to
$(v_{n-1} \ldots{} v_1 v_0)_b$ times $d$. (Notice the introduction
of a new digit position $u_{m+n}$ at the left of
$u_{m+n-1}$; if $d=1$, all we need to do in this step is set
$u_{m+n} \gets 0$. On a binary computer it may be preferable
to choose $d$ to be a power of 2 instead of using the value
suggested here; any value of $d$ that results in
$v_{n-1} \geq \lfloor{}b/2\rfloor$ will suffice. See also
exercise 37.)
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:02}
[Initialize $j$.] Set $j \gets m$. (The loop on $j$,
steps
\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}
through
\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:07},
will be essentially a division of
$(u_{j+n} \ldots{} u_{j+1} u_j)_b$ by $(v_{n-1} \ldots{} v_1 v_0)_b$ to
get a single quotient digit $q_j$; see Figure \ref{fig:alg:ccil0:sidv0:sgdu0:01:01}.)
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/kdfc01.eps}
\caption{Flowchart For Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} (From \cite[p. 273]{bibref:b:knuthclassic2ndedvol2})}
\label{fig:alg:ccil0:sidv0:sgdu0:01:01}
\end{figure}
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:03}
[Calculate $\hat{q}$.] Set
$\hat{q} \gets \lfloor{}u_{j+n}b + u_{j+n-1})/v_{n-1}\rfloor$ and
let $\hat{r}$ be the remainder, $(u_{j+n}b + u_{j+n-1}) \bmod v_{n-1}$.
Now test if $\hat{q} = b$ or $\hat{q} v_{n-2} > b\hat{r} + u_{j+n-2}$;
if so, decrease $\hat{q}$ by 1, increase $\hat{r}$ by $v_{n-1}$, and repeat
this test if $\hat{r} < b$. (The test of $v_{n-2}$ determines at high
speed most of the cases in which the trial value $\hat{q}$ is one too large,
and it eliminates \emph{all} cases where $\hat{q}$ is two too large;
see exercises 19, 20, 21.)
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:04}
[Multiply and subtract.] Replace $(u_{j+n} u_{j+n-1} \ldots{} u_j)_b$ by
\begin{equation}
\nonumber
(u_{j+n} u_{j+n-1} \ldots{} u_j)_b - \hat{q} (0 v_{n-1} \ldots{} v_1 v_0)_b.
\end{equation}
This computation consists of a simple multiplication by a one-place number,
combined with a subtraction. The digits $(u_{j+n}, u_{j+n-1}, \ldots{}, u_j)$
should be kept positive; if the result of this step is actually negative,
$(u_{j+n} u_{j+n-1} \ldots{} u_j)_b$ should be left as the true
value plus $b^{n+1}$, namely as the $b$'s complement of the true value, and
a ``borrow'' to the left should be remembered.
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:05}
[Test remainder.] Set $q_j \gets \hat{q}$. If the result of step
\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:04} was negative, go to
step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}; otherwise go on to
step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:07}.
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:06}
[Add back.] (The probability that this step is necessary is very small, on the
order of only $2/b$, as shown in exercise 21; test data to activate this step
should therefore be specifically contrived when debugging.) Decrease
$q_j$ by 1, and add $(0 v_{n-1} \ldots v_1 v_0)_b$ to
$(u_{j+n} u_{j+n-1} \ldots{} u_{j+1} u_j)_b$. (A carry will occur to the left of
$u_{j+n}$, and it should be ignored since it cancels with the borrow that
occured in step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:04}.)
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:07}
[Loop on $j$.] Decrease $j$ by one. Now if $j \geq 0$, go back to step
\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}.
\item \label{enumstep:alg:ccil0:sidv0:sgdu0:01:08}
[Unnormalize.] Now $(q_m \ldots{} q_1 q_0)_b$ is the desired quotient, and
the desired remainder may be obtained by dividing
$(u_{n-1} \ldots{} u_1 u_0)_b$ by $d$.
\end{algblvl0}
\end{vworkalgorithmstatementpar}
\vworkalgorithmfooter{}
The general idea of Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} is that
digits (machine words) of the quotient $q$ can be successively estimated
based on the first digits of the dividend and divisor. Knuth
\cite[p. 271]{bibref:b:knuthclassic2ndedvol2} explores the properties of
the digit estimate
\begin{equation}
\label{eq:ccil0:sidv0:sgdu0:01}
\hat{q} = \min \left( {\left\lfloor{\frac{u_n b + u_{n-1}}{v_{n-1}}}\right\rfloor, b-1} \right).
\end{equation}
The first point to make about an estimate in the form of
(\ref{eq:ccil0:sidv0:sgdu0:01}) is that it can only be accomplished
efficiently if the machine-native division instruction supports
overflow detection, since it is possible that
$(u_n b + u_{n-1})/v_{n-1} \geq b$, even if
$u/v < b$, as is shown by the following example.
\begin{vworkexamplestatement}
\label{ex:ccil0:sidv0:sgdu0:01:01}
Assume that we wish to apply the estimate of $\hat{q}$ provided by
(\ref{eq:ccil0:sidv0:sgdu0:01})
to $u=16,776,704$ and $v=65,535$. Demonstrate that a machine division
overflow will occur when estimating the first digit, assuming a processor
that can divide a 16-bit dividend by an 8-bit divisor to produce an 8-bit
quotient.
\end{vworkexamplestatement}
\begin{vworkexampleparsection}{Solution}
Note that according to Knuth's intention, the word size on such a machine
is 8 bits. Thus, $b=256$. Note that $u/v = 255 + 255/256 < b = 256$, as
required by Knuth's precondition. However, although $u/v < b$,
$u = [255 \; 254 \; 0] [256^2 \; 256 \; 1]^T = [u_2 u_1 u_0] [b^2 b^1 b^0]^T$ and
$v = [255 \; 255] [256 \; 1]^T = [v_1 v_0] [b^1 b^0]^T$, so that calculating
an estimate $\hat{q}$ as required by (\ref{eq:ccil0:sidv0:sgdu0:01}),
$(u_n b + u_{n-1})/v_{n-1} = 65,534/255 = 256 + 254/255 \geq b$, is a division
overflow for a single machine division instruction. Thus, it follows that
a machine with division overflow detection can quickly determine that $b-1$ from
(\ref{eq:ccil0:sidv0:sgdu0:01}) is the minimum, whereas a machine without
division overflow
detection would have to use several additional machine instructions to make
this determination.
\end{vworkexampleparsection}
\vworkexamplefooter{}
The second thing to establish about $\hat{q}$ as defined by
(\ref{eq:ccil0:sidv0:sgdu0:01}) is how ``good'' of an estimate
$\hat{q}$ is---how much information, exactly, about $q$ can we
obtain by examining the first two words of $u$ and the first
word of $v$?
We first establish in the following lemma that our estimate of
$q$, $\hat{q}$, can be no less than $q$.
\begin{vworklemmastatementpar}{\mbox{\boldmath$\hat{q} \geq q$}}
\label{lem:ccil0:sidv0:sgdu0:01}
The estimate of $q$ provided by (\ref{eq:ccil0:sidv0:sgdu0:01}),
$\hat{q}$ is always at least as great as the actual value of
$q$, i.e. $\hat{q} \geq q$.
\end{vworklemmastatementpar}
\begin{vworklemmaproof}
Knowledge of $u_{n}$, $u_{n-1}$, and $v_{n-1}$ necessarily confine
the intervals in which the actual values of $u$ and $v$ may be;
specifically:\footnote{In
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:04})
and
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:07}),
we use statements of the form ``$x = x$'' as an idiom for
``$x$ is known''.}
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:01:01}
u & = & \sum_{i=0}^{n} u_i b^i \\
\label{eq:lem:ccil0:sidv0:sgdu0:01:02}
& = & u_n b^n + u_{n-1} b^{n-1} + \ldots{} + u_2 b^2 + u_1 b + u_0 \\
\label{eq:lem:ccil0:sidv0:sgdu0:01:03}
& = & (u_n b + u_{n-1}) b^{n-1} + \ldots{} + u_2 b^2 + u_1 b + u_0
\end{eqnarray}
\begin{eqnarray}
\nonumber & (u_n = u_n \wedge u_{n-1} = u_{n-1}) & \\
\label{eq:lem:ccil0:sidv0:sgdu0:01:04}
& \vworkvimp & \\
\nonumber & (u_n b + u_{n-1}) b^{n-1} \leq u \leq (u_n b + u_{n-1}) b^{n-1} + b^{n-1} - 1 &
\end{eqnarray}
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:01:05}
v & = & \sum_{i=0}^{n-1} v_i b^i \\
\label{eq:lem:ccil0:sidv0:sgdu0:01:06}
& = & v_{n-1} b^{n-1} + v_{n-2} b^{n-2} + \ldots{} + v_2 b^2 + v_1 b + v_0
\end{eqnarray}
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:01:07}
(v_{n-1} = v_{n-1})
\vworkhimp
v_{n-1} b^{n-1} \leq v \leq v_{n-1} b^{n-1} + b^{n-1} - 1
\end{equation}
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:04}) and
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:07}) reflect the uncertainties in the
values of $u$ and $v$ respectively because only the first digit(s) of
$u$ and $v$ are being considered in forming the estimate $\hat{q}$.
By definition, the actual value of $q$ is $\lfloor{}u/v\rfloor$. For a
rational function $f(u,v) = u/v$ where $u \in [u_{min}, u_{max}]$ and
$v \in [v_{min}, v_{max}]$, the minimum value of $u/v$ occurs at
$u_{min}/v_{max}$, and the maximum value of $u/v$ occurs at
$u_{max}/v_{min}$. We can therefore write that
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:01:08}
\left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1}}{v_{n-1} b^{n-1} + b^{n-1} - 1}}\right\rfloor
\leq
q
\leq
\left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1} + b^{n-1} - 1}{v_{n-1} b^{n-1}}}\right\rfloor .
\end{equation}
In other words, knowledge of $u_{n}$, $u_{n-1}$, and $v_{n-1}$ confines $q$ to the
interval indicated in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}). We must prove that,
given a specific $u_{n}$, $u_{n-1}$, and $v_{n-1}$, $\hat{q}$ is at least as large as
the upper bound in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}); otherwise we could find a
$q$ such that $q > \hat{q}$. We can algebraically manipulate the upper bound in
in (\ref{eq:lem:ccil0:sidv0:sgdu0:01:08}) to yield
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:01:09}
\left\lfloor{\frac{(u_n b + u_{n-1}) b^{n-1}}{v_{n-1} b^{n-1} + b^{n-1} - 1}}\right\rfloor
\leq
q
\leq
\left\lfloor{\frac{u_n b + u_{n-1} + \frac{b^{n-1}-1}{b^{n-1}}}{v_{n-1}}}\right\rfloor .
\end{equation}
In (\ref{eq:lem:ccil0:sidv0:sgdu0:01:09}), since $(b^{n-1}-1)/b^{n-1} < 1$ and since
$u_n b + u_{n-1}$ is an integer, we can conclude that
$\lfloor u_n b + u_{n-1} + (b^{n-1}-1)/b^{n-1} \rfloor = \lfloor u_n b + u_{n-1} \rfloor$
and hence that
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:01:10}
q
\leq
\left\lfloor{\frac{u_n b + u_{n-1} + \frac{b^{n-1}-1}{b^{n-1}}}{v_{n-1}}}\right\rfloor
=
\left\lfloor{\frac{u_n b + u_{n-1}}{v_{n-1}}}\right\rfloor
=
\hat{q} .
\end{equation}
Therefore, $\hat{q} \geq q$.
\end{vworklemmaproof}
\vworklemmafooter{}
Lemma \ref{lem:ccil0:sidv0:sgdu0:01} establishes that
a digit estimate $\hat{q}$ based on the first digit of the
divisor $v$ can be no less than the actual digit $q$, i.e.
$\hat{q}-q \geq 0$. However, we must also establish an upper bound
on $\hat{q}-q$.
Intuitively, based on
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:06}), we might guess that
if $v_{n-1}$ is small, the estimate $\hat{q}$ may be quite
poor, as the interval to which the actual value of $v$ is confined
may be quite large. This fact is the basis for the normalization
step [\ref{enumstep:alg:ccil0:sidv0:sgdu0:01:01}] in Algorithm
\ref{alg:ccil0:sidv0:sgdu0:01}. We now prove a useful result
for how much $u, v$ must be normalized so that $\hat{q}-q \leq 2$.
\begin{vworklemmastatementpar}{Normalization Requirement So That
\mbox{\boldmath$\hat{q} - q \leq 2$}}
\label{lem:ccil0:sidv0:sgdu0:02}
If $v_{n-1} \geq \lfloor b/2 \rfloor$ and $\hat{q}$ is chosen as
indicated in (\ref{eq:ccil0:sidv0:sgdu0:01}), then
$0 \leq \hat{q} - q \leq 2$.
\end{vworklemmastatementpar}
\begin{vworklemmaproof}
The lower limit on $\hat{q} - q$ is proved in Lemma \ref{lem:ccil0:sidv0:sgdu0:01}.
We now seek only to prove that $\hat{q} - q \leq 2$.
By definition of $\hat{q}$ and $q$,
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:01}
\hat{q} - q = \left\lfloor {\frac{u_n b + u_{n-1}}{v_{n-1}}} \right\rfloor
- \left\lfloor {\frac{u}{v}} \right\rfloor
\end{equation}
When only nonnegative integers are involved,
(\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02})
supplies an exact expression for the floor of a
ratio of integers. Using (\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02}),
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:01}) can be decomposed into
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:02}
\hat{q} - q = \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{(u_n b + u_{n-1}) \bmod v_{n-1}}{v_{n-1}}
- \frac{u}{v}
+ \frac{u \bmod v}{v} .
\end{equation}
\noindent{}Note that (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) is an exact
expression (rather than an
inequality).
Note in (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) that
$(u_n b + u_{n-1}) \bmod v_{n-1} \in [0, v_{n-1}-1]$, and that in general
there is no reason to expect it cannot be zero. Thus, we can assume that
it \emph{is} zero, which will maximize $\hat{q}-q$. We can thus convert
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}) into the inequality
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:03}
\hat{q} - q \leq \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u}{v}
+ \frac{u \bmod v}{v} .
\end{equation}
In (\ref{eq:lem:ccil0:sidv0:sgdu0:02:03}) we can also observe that
$(u \bmod v)/v \in [0, (v-1)/v]$. If we replace this expression with
``1'' (which is unattainable, but barely), this will change the relational
operator from ``$\leq$'' to ``$<$'':
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:04}
\hat{q} - q < \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u}{v}
+ 1 .
\end{equation}
The result we wish to show is that with $v_{n-1} \geq \lfloor b/2 \rfloor$,
$\hat{q}-q \leq 2$. To simplify the subsequent algebraic manipulations, note in
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:04}) that
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:02:05}
& \hat{q} - q \leq 2 & \\
\nonumber & \vworkvertequiv & \\
\label{eq:lem:ccil0:sidv0:sgdu0:02:06}
& \displaystyle \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u}{v}
+ 1 \leq 3 & \\
\nonumber & \vworkvertequiv & \\
\label{eq:lem:ccil0:sidv0:sgdu0:02:07}
& \displaystyle \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u}{v} \leq 2 &
\end{eqnarray}
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:06}) may be counterintuitive, so
further explanation is offered here. Since $\hat{q} \in \vworkintset$ and $q \in \vworkintset$,
$\hat{q}-q \in \vworkintset$. Thus, proving
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:06}) or
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}) proves that
$\hat{q}-q \in \{ \ldots, -1, 0, 1, 2 \}$ (however, by
Lemma \ref{lem:ccil0:sidv0:sgdu0:01}, $\hat{q}-q \geq 0$, so in fact
what would be proved is that $\hat{q}-q \in \{ 0, 1, 2 \}$). For
algebraic simplicity,
we choose to prove (\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}).
First, adjust numerator and denominator of the first term in
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:07}) by $b^{n-1}$ so that the terms more closely
resemble $u$ and $v$ in
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:02})
and
(\ref{eq:lem:ccil0:sidv0:sgdu0:01:06}):
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:08}
\frac{u_n b^n + u_{n-1} b^{n-1}}{v_{n-1} b^{n-1}}
- \frac{u}{v} \leq 2 .
\end{equation}
For logical implication to be maintained,
we must make the most pessimistic choices and assumptions possible in
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:08}) in order to maximize the value
of the left side of the inequality.
The first assumption to be made is the error in estimating
$u$ and $v$ based on their most significant digits. It can be
seen that (\ref{eq:lem:ccil0:sidv0:sgdu0:02:08}) will be maximized if:
\begin{itemize}
\item We assume that $u = u_{n} b^n + u_{n-1} b^{n-1}$ (i.e. that we estimate
$u$ precisely).
\item We assume that $v = v_{n-1} b^{n-1} + b^{n-1} - 1$ (i.e. that
we underestimate $v$ by the maximum amount possible).
\item We minimize the value of $v_{n-1} b^{n-1}$.
\end{itemize}
Assuming that $u$ is estimated precisely yields
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:09}
\frac{u}{v_{n-1} b^{n-1}}
- \frac{u}{v} \leq 2 .
\end{equation}
Assuming that $v$ is underestimated by the maximum amount possible
yields
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:10}
\frac{u}{v - b^{n-1} + 1}
- \frac{u}{v} \leq 2 .
\end{equation}
Finally, with $b$ and $v$ fixed, $u$ can be maximized by noting that
$u \leq bv - 1$ (by the problem assumption that the quotient is a single digit).
However, for algebraic simplicity, we make the substitution $u=bv$ (rather than
$u=bv-1$), since the weaker upper bound is strong enough to prove the
first result we seek.
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:11}
\frac{bv}{v - b^{n-1} + 1}
- \frac{bv}{v} \leq 2
\end{equation}
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:12}
\frac{bv}{v - b^{n-1} + 1}
- b \leq 2
\end{equation}
Solving (\ref{eq:lem:ccil0:sidv0:sgdu0:02:12})
for $v$ yields
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:13}
v \geq \frac{b^n}{2} + b^{n-1} - \frac{1}{b} - 1
\end{equation}
Again using the assumption that $v$ is underestimated by the maximum amount
possible, we may make the substitution that $v = v_{n-1} b^{n-1} + b^{n-1} -1$,
leading to
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:13b}
v_{n-1} \geq \frac{b}{2} - \frac{1}{2 b^{n-2}} .
\end{equation}
There are two cases to consider: $b$ even and $b$ odd. If $b$ is even, the
proof is complete, as $\lfloor b/2 \rfloor = b/2$ and the choice of
$v_{n-1} = \lfloor b/2 \rfloor$ will automatically satisfy
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}). However, if $b$ is odd,
$\lfloor b/2 \rfloor = b/2 - 1/2 < b/2 - 1/2b^{n-2}$, violating
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}),
and so we need to
further examine this case.
If $b$ is odd and $v_{n-1} = \lfloor b/2 \rfloor$, then
$v_{n-1} = b/2 - 1/2$, violating (\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}).
However, any larger choice of $v_{n-1}$ (such as
$\lfloor b/2 \rfloor + 1$, $\lfloor b/2 \rfloor + 2$, etc.) satisfies
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}); so that it remains only to prove
the $v_{n-1} = \lfloor b/2 \rfloor = b/2 - 1/2$
case.
If $v_{n-1} = \lfloor b/2 \rfloor = b/2 - 1/2$, then
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:02:14}
v & \in & \left[
\left( \frac{b}{2} - \frac{1}{2}\right) b^{n-1},
\left( \frac{b}{2} - \frac{1}{2}\right) b^{n-1} + b^{n-1} - 1
\right] \\
\nonumber & = &
\left[
\frac{b^n}{2} - \frac{b^{n-1}}{2},
\frac{b^n}{2} + \frac{b^{n-1}}{2} -1
\right] .
\end{eqnarray}
Note in this case that the estimation error ($v - v_{n-1}b^{n-1}$) and the
value of $v$ are not independent; and in fact it is this aspect
of the problem that has led to the violation of
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:13b}) with $b$ odd and
$v_{n-1} = \lfloor b/2 \rfloor$.
In order to prove the case of $b$ odd and $v = \lfloor b/2 \rfloor = b/2 - 1/2$,
we must reexamine some simplifying assumptions made earlier in order to obtain
tighter inequalities. In (\ref{eq:lem:ccil0:sidv0:sgdu0:02:02}), we can no
longer accept the maximum of $(u \bmod v)/v$ as one; instead we construct the
tighter inequality
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:02:15}
& \displaystyle \hat{q} - q \leq \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u}{v}
+ \frac{u \bmod v}{v} & \\
\nonumber & \vworkvimp & \\
\label{eq:lem:ccil0:sidv0:sgdu0:02:16}
& \displaystyle \hat{q} - q \leq \frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u}{v}
+ \frac{v-1}{v} , &
\end{eqnarray}
which leads to
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:17}
\frac{u_n b + u_{n-1}}{v_{n-1}}
- \frac{u+1}{v}
< 2 .
\end{equation}
In order to maximize the left-hand side of
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:17}), we assume that we
estimate $u$ exactly so that $u = u_n b^n + u_{n-1} b^{n-1}$,
yielding
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:18}
\frac{u}{v_{n-1} b^{n-1}}
- \frac{u+1}{v}
< 2 .
\end{equation}
We also assume that $u$ is the maximum value
possible, $u=bv-1$, leading to
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:19}
\frac{bv-1}{v_{n-1} b^{n-1}}
- b
< 2 .
\end{equation}
Finally, we assume that $v$ is the upper limit in
(\ref{eq:lem:ccil0:sidv0:sgdu0:02:14}),
$v=b^n/2 + b^{n-1}/2 - 1$, and substitute the known value of
$v_{n-1}$ for the case being proved, $v_{n-1} = b/2-1/2$, yielding
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:02:20}
\frac{b \left( \frac{b^n}{2} + \frac{b^{n-1}}{2} - 1 \right) - 1}
{\left( \frac{b}{2} - \frac{1}{2} \right) b^{n-1}}
- b
< 2 .
\end{equation}
Simplification of (\ref{eq:lem:ccil0:sidv0:sgdu0:02:20})
will establish that it is always true. This completes the proof.
\end{vworklemmaproof}
\vworklemmafooter{}
Lemmas \ref{lem:ccil0:sidv0:sgdu0:01} and
\ref{lem:ccil0:sidv0:sgdu0:02},
standing alone, lead to a good implementation of
division without any further results. If it is known that
$0 \leq \hat{q} - q \leq 2$, Algorithm \ref{alg:ccil0:sidv0:sgdu0:01}
can be trivially modified to only calculate
$\hat{q}$ (omitting the additional tests in Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03}),
and then
to include up to two add-back steps
(duplication of Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}).
Although such an algorithm would be
satisfactory, it has the disadvantage that the add-back steps would
be executed very frequently, slowing the algorithm substantially, especially
for long operands.
We now show that the additional tests in Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:03} of
Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} can eliminate
altogether the case of $\hat{q}-q = 2$ (Lemma \ref{lem:ccil0:sidv0:sgdu0:03}), and
can with a probability close to unity eliminate the
case of $\hat{q}-q = 1$ (Lemmas \ref{lem:ccil0:sidv0:sgdu0:04}
and \ref{lem:ccil0:sidv0:sgdu0:05}). Together these tests, which are present in
the statement of Algorithm \ref{alg:ccil0:sidv0:sgdu0:01},
reduce add-back (Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:06}) to rare occurrence, and create a more
efficient algorithm than would be possible with the
results of Lemmas \ref{lem:ccil0:sidv0:sgdu0:01} and \ref{lem:ccil0:sidv0:sgdu0:02} alone.
\begin{vworklemmastatementpar}
{\mbox{\boldmath$\hat{q} v_{n-2} \leq b \hat{r} + u_{j+n-2} \vworkhimp 0 \leq \hat{q} - q \leq 1$}}
\label{lem:ccil0:sidv0:sgdu0:03}
If the divisor normalization requirement ($v_{n-1} \geq \lfloor b/2 \rfloor$) as specified in
Step \ref{enumstep:alg:ccil0:sidv0:sgdu0:01:01} of
Algorithm \ref{alg:ccil0:sidv0:sgdu0:01} is met, then
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:03:01}
\hat{q} v_{n-2} \leq b \hat{r} + u_{j+n-2} \vworkhimp 0 \leq \hat{q} - q \leq 1 .
\end{equation}
\end{vworklemmastatementpar}
\begin{vworklemmaproof}
For reference, note that:
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:03:02}
u & = & u_{j+n} b^{j+n} + u_{j+n-1} b^{j+n-1} + \ldots{} + u_1 b + u_0 \\
\label{eq:lem:ccil0:sidv0:sgdu0:03:03}
v & = & v_{n-1} b^{n-1} + v_{n-2} b^{n-2} + \ldots{} + v_1 b + v_0
\end{eqnarray}
By definition, we the remainder
$\hat{r}$ has the value
$\hat{r} = u_{j+n} b + u_{j+n-1} - \hat{q} v_{n-1}$. Substituting this value
into (\ref{eq:lem:ccil0:sidv0:sgdu0:03:01}) produces
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:03:04}
\hat{q} v_{n-2} \leq b (u_{j+n} b + u_{j+n-1} - \hat{q} v_{n-1}) + u_{j+n-2} ,
\end{equation}
and solving for $\hat{q}$ yields
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:03:05}
\hat{q} \leq \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}}.
\end{equation}
It is known from Lemma \ref{lem:ccil0:sidv0:sgdu0:01} that the type of estimate
represented by the floor of the right-hand size of (\ref{eq:lem:ccil0:sidv0:sgdu0:03:05})
can be no less than $q$, leading to
\begin{eqnarray}
\label{eq:lem:ccil0:sidv0:sgdu0:03:06}
& \displaystyle q \leq \hat{q}
\leq
\left\lfloor { \frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} } \right\rfloor & \\
\nonumber & \displaystyle \leq
\frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}}. &
\end{eqnarray}
Because $q, \hat{q} \in \vworkintset$, it is only necessary to prove that
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:03:07}
\frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -
\left\lfloor \frac{u}{v} \right\rfloor
< 2
\end{equation}
in order to prove that $\hat{q}-q \leq 1$. Using
(\cmtnzeroxrefhyphen\ref{eq:cmtn0:sfcf0:02}),
(\ref{eq:lem:ccil0:sidv0:sgdu0:03:07}) can be rewritten as
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:03:08}
\frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -
\frac{u}{v} -
\frac{u \bmod v}{v}
< 2 .
\end{equation}
In order for implication to hold, we must make the most pessimistic
assumptions about $u \bmod v$ (those which maximize it). The maximum value
of $u \bmod v$ is $v-1$, leading to
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:03:09}
\frac{u_{j+n} b^2 + u_{j+n-1}b + u_{j+n-2}}{v_{n-1}b + v_{n-2}} -
\frac{u + 1}{v}
< 1 .
\end{equation}
In order to maximize the left side of (\ref{}),
we must assume that $u$ is maximized, $v$ is minimized,
\end{vworklemmaproof}
\vworklemmafooter{}
\begin{vworklemmastatementpar}
{\mbox{\boldmath$\hat{q} v_{n-2} > b \hat{r} + u_{n-2} \vworkhimp q < \hat{q}$}}
\label{lem:ccil0:sidv0:sgdu0:04}
Given $\hat{q} > 0$, an estimate of $q$, and the remainder
based on the estimate, $\hat{r} = u_n b + u_{n-1} - \hat{q} v_{n-1}$,
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:04:01}
\hat{q} v_{n-2} > b \hat{r} + u_{n-2} \vworkhimp \hat{q} > q .
\end{equation}
\end{vworklemmastatementpar}
\begin{vworklemmaproof}
We make the assumption that $v_{n-1}$ and $v_{n-2}$ are not both 0.
$v_{n-1} > 0$ is guaranteed by
the normalization of
$v$ in Algorithm \ref{alg:ccil0:sidv0:sgdu0:01}.
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:04:02}
\hat{q} v_{n-2} > b \hat{r} + u_{n-2}
\end{equation}
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:04:03}
\hat{q} v_{n-2} > b (u_n b + u_{n-1} - \hat{q} v_{n-1}) + u_{n-2}
\end{equation}
Solving (\ref{eq:lem:ccil0:sidv0:sgdu0:04:03}) for $\hat{q}$ yields
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:04:04}
\hat{q}
>
\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}
\geq
\left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor
.
\end{equation}
Note that the right-hand term of
(\ref{eq:lem:ccil0:sidv0:sgdu0:04:04})
is similar in form to the estimate $\hat{q}$ in
Lemma \ref{lem:ccil0:sidv0:sgdu0:01}, where it is proved that
$\hat{q} \geq q$. It is possible to use identical algebraic technique
as is used in Lemma \ref{lem:ccil0:sidv0:sgdu0:01} in order to prove that
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:04:05}
\left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor
\geq q,
\end{equation}
and it follows that
\begin{equation}
\label{eq:lem:ccil0:sidv0:sgdu0:04:06}
\hat{q}
>
\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}
\geq
\left\lfloor {\frac{u_n b^2 + u_{n-1} b + u_{n-2}}{v_{n-1} b + v_{n-2}}} \right\rfloor
\geq q,
\end{equation}
and thus $\hat{q} > q$.
\end{vworklemmaproof}
\vworklemmafooter{}
\begin{vworklemmastatementpar}
{Add-back occurs with approximate probability \mbox{\boldmath$2/b$}}
\label{lem:ccil0:sidv0:sgdu0:05}
The estimate of $q$ provided by (\ref{eq:ccil0:sidv0:sgdu0:01}),
\end{vworklemmastatementpar}
\begin{vworklemmaproof}
\end{vworklemmaproof}
\vworklemmafooter{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Division Of Signed Operands}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Large Dividends With Machine-Native Divisors}
\label{ccil0:sidv0:sldm0}
Division of arbitrary-sized operands (Section \ref{ccil0:sidv0:sgdu0}) is
a costly operation. In many practical applications, we are able to exploit
data sizes of operands or special relationships between the values of
operands to use the instruction set of the machine more effectively.
In this subsection, we investigate what optimizations we may achieve when:
\begin{itemize}
\item We wish to calculate the quotient and remainder of
unsigned integers $p$ and $q$: $p/q$ and
$p \bmod{} q$; \emph{and}
\item The machine possesses unsigned division instructions
which provide both a quotient and a remainder from
a division; \emph{and}
\item The bitsize of the divisor $q$ is not larger than can
be accomodated (as a divisor) by machine division instructions.
\end{itemize}
Processors which possess integer division instructions usually
possess one of two types of instructions:
\begin{itemize}
\item Instructions where the the divisor, quotient and remainder are
$Q$ bits, but the dividend is $2Q$ bits (we call these
``large dividend'' instructions). For example, an
instruction which accepts a 16-bit dividend and an
8-bit divisor to produce an 8-bit quotient and an 8-bit remainder is
typical.
With such instructions, overflow is possible, and is always detectable.
However, in this subsection we never describe algorithms which detect
overflow---instead, we arrange for data values which cannot generate an
overflow.
\item Instructions where the dividend, divisor, quotient, and remainder
are all $Q$ bits (we call these ``small dividend'' instructions).
With such instructions, overflow is not possible.
\end{itemize}
We call the bitsize $Q$ a \emph{chunk}. We use \emph{chunk} rather than
\emph{word} because the chunksize and wordsize in general are not
required to be the same.
For the remainder of this discussion, we assume large dividend
instructions (the first category above).
The algorithms developed can be implemented on small dividend machines
by halving data sizes so that the divisor fills no more than half
of the available bits.
We assume that we are interested in calculating the integer quotient
$\lfloor{}p/q\rfloor$ and remainder $p \bmod{} q$ of two unsigned
integers $p$ and $q$, where $p$ is of size $P$ bits and $q$ is
of size $Q$ bits. For simplicity and without detracting from the
generality of the solution, we assume that $Q \vworkdivides{} P$.
We then seek to calculate
\begin{eqnarray}
\label{eq:ccil0:sidv0:sldm0:001}
\frac{p}{q} & = & \frac{2^{P-Q} p_{[P/Q-1]} + 2^{P-2Q} p_{[P/Q-2]} + \ldots{} + 2^{Q} p_{[1]} + p_{[0]}}{q_{[0]}} \\
\nonumber & = & \frac{\sum_{i=0}^{P/Q-1} 2^{iQ} p_{[i]}}{q_{[0]}} .
\end{eqnarray}
\noindent{}Using the integer identity
\begin{equation}
\label{eq:ccil0:sidv0:sldm0:002}
\frac{a}{b} = \left\lfloor{\frac{a}{b}}\right\rfloor + \frac{a \bmod b}{b} ,
\end{equation}
\noindent{}we can reform (\ref{eq:ccil0:sidv0:sldm0:001}) into
\begin{eqnarray}
\nonumber
\frac{p}{q} & = & 2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\
\label{eq:ccil0:sidv0:sldm0:003}
& + & 2^{P-Q} \frac{p_{[P/Q-1]}\bmod q_{[0]}}{q_{[0]}}
+ 2^{P-2Q} \frac{p_{[P/Q-2]}}{q_{[0]}} \\
\nonumber & + & \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} ,
\end{eqnarray}
\noindent{}which can be polished slightly to yield
\begin{eqnarray}
\nonumber
\frac{p}{q} & = & 2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\
\label{eq:ccil0:sidv0:sldm0:004}
& + & 2^{P-2Q} \frac{2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}}{q_{[0]}} \\
\nonumber & + & \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} .
\end{eqnarray}
Note in (\ref{eq:ccil0:sidv0:sldm0:004}) that the first term,
$\lfloor{}p_{[P/Q-1]} / q_{[0]}\rfloor$, as well as
a portion of the second term, $p_{[P/Q-1]}\bmod q_{[0]}$, can be
calculated using a single machine division instruction with
$p_{[P/Q-1]}$ as the dividend and $q_{[0]}$ as the divisor.
Note also that multiplication of an integer by a power of 2
can be achieved by placing the integer correctly within the
result. In this regard note that $Q=8$ or $Q=16$ are
the most typical cases, and so the placement can be achieved simply
by selecting the correct memory address.
It is initially unclear whether we can evaluate or reduce the fraction in the
second term of (\ref{eq:ccil0:sidv0:sldm0:004}),
$[2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}] / q_{[0]}$,
using a single large dividend machine instruction, because
the upper chunk of the dividend is populated with non-zero bits
(specifically, $p_{[P/Q-1]}\bmod q_{[0]}$), and it seems that
a division overflow may be possible. However, with some thought,
it is clear that $p_{[P/Q-1]}\bmod q_{[0]} \leq q_{[0]} - 1$ and
$p_{[P/Q-2]} \leq 2^Q - 1$, thus the largest numerator possible
is $2^Q q_{[0]} - 1$, which, when divided by $q_{[0]}$, will result
in a quotient and remainder of $2^Q - 1$. Thus, no division overflow
can occur, and the fraction in the
second term of (\ref{eq:ccil0:sidv0:sldm0:004}) can be evaluated
using a large divisor integer machine instruction.
The fraction in the
second term of (\ref{eq:ccil0:sidv0:sldm0:004}) can be simplified
using (\ref{eq:ccil0:sidv0:sldm0:002}) to yield:
\begin{eqnarray}
\nonumber
\frac{p}{q} & = & 2^{P-Q} \left\lfloor{\frac{p_{[P/Q-1]}}{q_{[0]}}}\right\rfloor \\
\label{eq:ccil0:sidv0:sldm0:005}
& + & 2^{P-2Q} \left\lfloor{\frac{2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}}{q_{[0]}}}\right\rfloor \\
\nonumber & + & 2^{P-2Q} \frac{(2^Q (p_{[P/Q-1]}\bmod q_{[0]}) + p_{[P/Q-2]}) \bmod q_{[0]}}{q_{[0]}} \\
\nonumber & + & \frac{\sum_{i=0}^{P/Q-3} 2^{iQ} p_{[i]}}{q_{[0]}} .
\end{eqnarray}
The process of combining adjacent terms can be continued until all
divisions and modulo operations necessary can be carried out using
long dividend division instructions. If we envision a
long-dividend division instruction as a functional block that
accepts a $2Q$-bit dividend and a $Q$-bit divisor to produce a
$Q$-bit quotient and a $Q$-bit remainder
(Figure \ref{fig:ccil0:sidv0:sldm0:00}), then we can draw the
entire division as outlined by (\ref{eq:ccil0:sidv0:sldm0:005})
as shown in Figure \ref{fig:ccil0:sidv0:sldm0:01}.
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/lddvblk.eps}
\caption{Long Dividend Division Machine Instruction As A Functional Block}
\label{fig:ccil0:sidv0:sldm0:00}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/ldmnblk.eps}
\caption{Long Dividend/Machine-Native Divisor Division In Functional Block Form}
\label{fig:ccil0:sidv0:sldm0:01}
\end{figure}
The following example illustrates how to apply the technique.
\begin{vworkexamplestatement}
\label{ex:ccil0:sidv0:sldm0:01}
Implement 32/8 unsigned division on the TMS370C8 processor, which is
characterized by a 16/8 division instruction.
\end{vworkexamplestatement}
\begin{vworkexampleparsection}{Solution}
It would be possible to prepare an implementation directly from
Figure \ref{fig:ccil0:sidv0:sldm0:01}: however, it may be
more instructive to work through a solution without the
aid of this figure.
In the case of the TMS370C8, the chunk size $Q$ is 8 bits; therefore
$Q=8$. The problem statement indicates that we must accept 32-bit dividends;
therefore $P=32$. Thus
\begin{equation}
\label{eq:ex:ccil0:sidv0:sldm0:01:001}
p = 2^{24} p_{[3]} + 2^{16} p_{[2]} + 2^{8} p_{[1]} + p_{[0]}
\end{equation}
\noindent{}and
\begin{equation}
\label{eq:ex:ccil0:sidv0:sldm0:01:002}
q = q_{[0]} .
\end{equation}
\noindent{}Thus the quotient and remainder we would like to determine,
$\lfloor p/q \rfloor$ and $p \bmod q$, can be obtained by repeated
application of (\ref{eq:ccil0:sidv0:sldm0:002}) as shown
in Equations (\ref{eq:ex:ccil0:sidv0:sldm0:01:003})
through (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}).
\begin{equation}
\label{eq:ex:ccil0:sidv0:sldm0:01:003}
\frac{p}{q}
=
\frac{2^{24} p_{[3]} + 2^{16} p_{[2]} + 2^{8} p_{[1]} + p_{[0]}}{q_{[0]}}
\end{equation}
\begin{equation}
\label{eq:ex:ccil0:sidv0:sldm0:01:004}
\frac{p}{q}
=
2^{24} \frac{p_{[3]}}{q_{[0]}}
+
2^{16} \frac{p_{[2]}}{q_{[0]}}
+
2^{8} \frac{p_{[1]}}{q_{[0]}}
+
\frac{p_{[0]}}{q_{[0]}}
\end{equation}
\begin{equation}
\label{eq:ex:ccil0:sidv0:sldm0:01:005}
\frac{p}{q}
=
2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{24} \frac{(p_{[3]} \bmod q_{[0]})}{q_{[0]}}
+
2^{16} \frac{p_{[2]}}{q_{[0]}}
+
2^{8} \frac{p_{[1]}}{q_{[0]}}
+
\frac{p_{[0]}}{q_{[0]}}
\end{equation}
\begin{equation}
\label{eq:ex:ccil0:sidv0:sldm0:01:006}
\frac{p}{q}
=
2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{16} \frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}
+
2^{8} \frac{p_{[1]}}{q_{[0]}}
+
\frac{p_{[0]}}{q_{[0]}}
\end{equation}
\begin{eqnarray}
\label{eq:ex:ccil0:sidv0:sldm0:01:007}
\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
\nonumber & + &
2^{16} \frac{(2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}}{q_{[0]}}
+
2^{8} \frac{p_{[1]}}{q_{[0]}}
+
\frac{p_{[0]}}{q_{[0]}}
\end{eqnarray}
\begin{eqnarray}
\label{eq:ex:ccil0:sidv0:sldm0:01:008}
\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
\nonumber & + &
2^{8} \frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}
+
\frac{p_{[0]}}{q_{[0]}}
\end{eqnarray}
\begin{eqnarray}
\nonumber\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
\label{eq:ex:ccil0:sidv0:sldm0:01:009}
& + &
2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor \\
\nonumber & + &
2^{8} \frac{(2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}}{q_{[0]}} \\
\nonumber & + &
\frac{p_{[0]}}{q_{[0]}}
\end{eqnarray}
\begin{eqnarray}
\nonumber\frac{p}{q} & = & 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor \\
\label{eq:ex:ccil0:sidv0:sldm0:01:010}
& + &
2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor \\
\nonumber & + &
\frac{2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}}{q_{[0]}}
\end{eqnarray}
\begin{eqnarray}
\nonumber & \displaystyle \frac{p}{q} = 2^{24} \left\lfloor{\frac{p_{[3]}}{q_{[0]}}}\right\rfloor
+
2^{16} \left\lfloor{\frac{2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}}{q_{[0]}}}\right\rfloor & \\
\label{eq:ex:ccil0:sidv0:sldm0:01:011}
& \displaystyle +
2^{8} \left\lfloor{\frac{2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}}{q_{[0]}}}\right\rfloor & \\
\nonumber & \displaystyle +
\left\lfloor{\frac{2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}}{q_{[0]}}}\right\rfloor + & \\
\nonumber
& \displaystyle \frac{(2^8((2^8((2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}) \bmod q_{[0]}) + p_{[1]}) \bmod q_{[0]}) + p_{[0]}) \bmod q_{[0]}}{q_{[0]}}
&
\end{eqnarray}
Note several things about the implementation suggested by
(\ref{eq:ex:ccil0:sidv0:sldm0:01:011}):
\begin{itemize}
\item No addition or multiplication is required to calculate terms such as
$2^8 (p_{[3]} \bmod q_{[0]}) + p_{[2]}$. The high-order byte of the
large dividend can be stuffed with $p_{[3]} \bmod q_{[0]}$ and
the low-order byte with $p_{[2]}$.
\item No addition or multiplication is required to calculate the
result $\lfloor p/q \rfloor$.
Note in (\ref{eq:ex:ccil0:sidv0:sldm0:01:011}) that the results are
conveniently grouped as bytes with multipliers of $2^{24}$,
$2^{16}$, $2^8$, and $2^0=1$. The terms can simply be placed into
the appropriate byte, as a way of multplication by the appropriate
power of 2. Note also that each term is guaranteed to be
$\in \{0, 1, 2, \ldots{} , 255\}$, by the argument presented
earlier for (\ref{eq:ccil0:sidv0:sldm0:004}). Thus, the
addition will result in no carries, and each result byte can simply
be placed directly in the correct memory location---addition is
not necessary.
\item Four machine division instructions are required, and the remainder
is produced automatically by the fourth instruction.
\end{itemize}
An implemenation for the TMS370C8 is supplied as Figure
\ref{fig:ex:ccil0:sidv0:sldm0:01:01}. A block diagram of the data
flow for this implementation is supplied as
Figure \ref{fig:ex:ccil0:sidv0:sldm0:01:02}.
\end{vworkexampleparsection}
\vworkexamplefooter{}
\begin{figure}
\begin{verbatim}
;Assume that byte memory locations p3, p2, p1, and p0 contain the
;32-bit unsigned dividend, and byte q0 contains the 8-bit unsigned
;divisor. Assume also that the result quotient will be placed
;in byte memory locations d3, d2, d1, and d0; and that the
;remainder will be placed in the byte memory location r0. Further
;assume that all memory locations are in the register file (near).
CLR A ;High-order chunk of large divisor
;must be 0.
MOV p3, B ;Load the low-order chunk of divisor.
DIV q0, A ;Perform the first division.
MOV A, d3 ;Quotient becomes this part of the
;result.
MOV B, A ;Remainder becomes high-order chunk of
;next division.
MOV p2, B ;Next byte becomes low-order chunk.
DIV q0, A ;Do the second division.
MOV A, d2 ;Quotient becomes this part of the
;result.
MOV B, A ;Remainder becomes high-order chunk of
;next division.
MOV p1, B ;Next byte becomes low-order chunk.
DIV q0, A ;Do the third division.
MOV A, d1 ;Quotient becomes this part of the
;result.
MOV B, A ;Remainder becomes high-order chunk of
;next division.
MOV p0, B ;Next byte becomes low-order chunk.
DIV q0, A ;Do the fourth division.
MOV A, d0 ;Quotient becomes this part of the
;result.
MOV B, r0 ;This is the remainder, which could be used
;for rounding.
\end{verbatim}
\caption{TMS370C8 Code Snippet Illustrating Unsigned 32/8
Division Using Unsigned 16/8
Machine Instructions (Example \ref{ex:ccil0:sidv0:sldm0:01})}
\label{fig:ex:ccil0:sidv0:sldm0:01:01}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=4.6in]{c_cil0/t370dmp.eps}
\caption{Block Diagram Of Data Flow Of Figure \ref{fig:ex:ccil0:sidv0:sldm0:01:01}}
\label{fig:ex:ccil0:sidv0:sldm0:01:02}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Miscellaneous Integer Mappings}
%Section tag: MIM0
\label{ccil0:smim0}
Embedded system work and ROM constraints often inspire a great deal
of cleverness in the selection of instructions to perform mappings or
tests. In this section, we discuss integer mappings (i.e. functions)
for which economical implementations are known; and in the next section
(Section \ref{ccil0:smit0})
we discuss integer tests for which economical implementations are known.
To the best of our knowledge, there is no way to derive these mappings
and tests---they have been collected from many software developers and
come from human intuition and experience.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Lowest-Order Bit}
%Subsection tag: LIB0
\label{ccil0:smim0:slib0}
\index{lowest-order bit}
\index{least significant bit}
The mapping
\texttt{mask = x \& -x}
\noindent{}is the most economical way known to extract the
lowest-order bit set in an integer \texttt{x}, or
0 if no bits are set.\footnote{This mapping was contributed by
David Baker (\texttt{bakerda@engin.umich.edu})
and Raul Selgado (\texttt{rselgado@visteon.com}).} Since most processors have an instruction to form the
two's complement of an integer, this mapping usually requires only
two arithmetic instructions.
When implementing this mapping in assembly-language on processors without a
two's complement instruction, two other possible implementations are:
\begin{itemize}
\item \texttt{mask = x \& ($\sim$x + 1)}
\item \texttt{mask = x \& ((x \^{ } -1) + 1)}
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Miscellaneous Integer Tests}
%Section tag: MIT0
\label{ccil0:smit0}
\subsection{Power Of 2}
%Subsection tag: PTW0
\label{ccil0:smit0:sptw0}
\index{power of two}
\index{2N@$2^N$}
The test
\texttt{(x \& (x-1) == 0) \&\& (x != 0)}
\noindent{}is the most economical way known to
test whether an integer is a positive power of two
(1, 2, 4, 8, 16, etc.).\footnote{The test appeared as part of
a discussion on
the GMP mailing list in 2002.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Exercises}
\begin{vworkexercisestatement}
\label{exe:ccil0:sexe0:01}
Show that any $m$-bit two's complement integer $u_{[m-1:0]}$ except
$-2^{m-1}$ can be negated by forming the one's complement, then adding one.
\end{vworkexercisestatement}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\vfill
\noindent\begin{figure}[!b]
\noindent\rule[-0.25in]{\textwidth}{1pt}
\begin{tiny}
\begin{verbatim}
$RCSfile: c_cil0.tex,v $
$Source: /home/dashley/cvsrep/e3ft_gpl01/e3ft_gpl01/dtaipubs/esrgubka/c_cil0/c_cil0.tex,v $
$Revision: 1.24 $
$Author: dtashley $
$Date: 2003/11/03 02:14:24 $
\end{verbatim}
\end{tiny}
\noindent\rule[0.25in]{\textwidth}{1pt}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% $Log: c_cil0.tex,v $
% Revision 1.24 2003/11/03 02:14:24 dtashley
% All duplicate labels as flagged by LaTeX removed. Additional files added
% to Microsoft Visual Studio edit context.
%
% Revision 1.23 2002/08/26 17:57:03 dtashley
% Additional solutions chapter added. Precautionary checkin to be sure
% that I've captured all changes.
%
% Revision 1.22 2002/08/25 13:18:21 dtashley
% Edits.
%
% Revision 1.21 2002/08/22 00:33:33 dtashley
% Have made aesthetic changes in CFRY0 and CCFR0. Checking in all before
% rebuild of book.
%
% Revision 1.20 2002/08/21 09:12:11 dtashley
% Safety checkin.
%
% Revision 1.19 2002/08/19 05:06:45 dtashley
% Substantial progress in integer numerics chapter. Preparing to
% work at home.
%
% Revision 1.18 2002/08/10 07:34:48 dtashley
% Checkin before resuming work on a lemma at home.
%
% Revision 1.17 2002/08/09 03:13:42 dtashley
% Significant lemma proved, and other progress.
%
% Revision 1.16 2002/08/06 22:15:24 dtashley
% Substantial lemma proof progress.
%
% Revision 1.15 2002/08/05 01:28:33 dtashley
% Lemma demonstrating that 0 \leq \hat{q}-q \leq 2 completed.
%
% Revision 1.14 2002/08/03 23:04:57 dtashley
% Safety checkin. Having difficulty: may have solved the wrong
% problem.
%
% Revision 1.13 2002/08/01 01:31:24 dtashley
% Safety checkin. Having difficulty completing an alternate version of
% Knuth's proof about the accuracy of quotient estimates based on first
% digits of dividend and divisor.
%
% Revision 1.12 2002/07/31 04:38:31 dtashley
% Edits to basic integer algorithms chapter, safety checkin.
%
% Revision 1.11 2002/07/29 16:30:08 dtashley
% Safety checkin before moving work back to WSU server Kalman.
%
% Revision 1.10 2002/07/26 20:39:43 dtashley
% Safety checkin.
%
% Revision 1.9 2002/07/26 14:58:01 dtashley
% Safety checkin.
%
% Revision 1.8 2002/07/21 16:19:28 dtashley
% Finalization of material about integer division when dividend is
% long but divisor is same as data size that machine can use as
% divisor natively.
%
% Revision 1.7 2002/07/20 23:01:45 dtashley
% Adding material about division with large dividend and machine-native
% divisor.
%
% Revision 1.6 2002/06/11 04:34:19 dtashley
% Information for David Baker added.
%
% Revision 1.5 2002/06/08 05:50:25 dtashley
% Added missing tilde. Can't get it to be same size as other
% characters--need to research that.
%
% Revision 1.4 2002/06/07 02:08:32 dtashley
% Section added for integer mappings and tests. Mapping for lowest-order
% bit set from Raul Selgado and Dave (last name presently unknown)
% at Visteon added. Test for power of 2 added.
%
% Revision 1.3 2001/08/31 23:11:17 dtashley
% End of August 2001 safety check-in.
%
% Revision 1.2 2001/08/27 20:06:08 dtashley
% Edits and substantial re-organization.
%
% Revision 1.1 2001/08/25 22:51:25 dtashley
% Complex re-organization of book.
%
%End of file C_CIL0.TEX