How To Use The GMP And Perform Large Integer
Or High-Precision Arithmetic On SourceForge Web Sites
04/15/03: I (Dave Ashley) have several number theory CGI pages
here at SourceForge. These pages
do things like calculate the GCD of two integers, find best rational
approximations, etc.; and they do need arbitrarily large integer capability to
be useful. It was quite difficult to obtain large integer capability for
these pages. Here are the steps I went through, and the eventual solution
(in the hopes that it helps others).
- I had used the bcmath library with PHP before (it is built into the
version of PHP I have on my Linux box at home). Unfortunately, I
discovered immediately that the version of PHP used by SourceForge
does not have bcmath built in.
- I tried writing arbitrary integer functions using PHP. These proved
to be painfully inefficient. I abandoned the effort when early
observations indicated that these probably would not allow calculation of a
couple hundred rounds of Euclid's
GCD algorithm in a practical time (not too long for a user to wait, and
within the CPU time limits for a PHP script).
- I tried writing a separate program in 'C' to do the calculations using the
GMP library, and tried calling this
program from the PHP script using the exec() function. This
worked famously at home, but did not work at SourceForge. The
behavior at SourceForge was a mystery because using a shell I could
invoke the program successfully, but PHP could not invoke it.
- I completed a SourceForge support ticket. The response to the
ticket indicated that:
- The GMP is installed on the shell servers but not the web servers,
which is why I could run the compiled external program from a shell but
not from PHP.
- The solution is to compile the GMP library, place it in the SourceForge
web space, and link to it.
- After much experimentation, I discovered that I could not figure out how
to place another copy of the GMP library in the web space at SourceForge and
link to it. As best I could glean, there are security implications to
replacing libraries (perhaps one could trick a program into dynamically
linking with the a trojan library), and so dynamically linked libraries
could only exist in certain locations (all paths which belong to root).
- However, I also discovered that I could statically link my program with
the GMP (which gives a much larger executable), and place that much larger
executable on SourceForge and run it. The executable went from
about 20K to 500K when static linking was specified.
All that being said, here are the specific steps I took to use the GMP from
PHP:
- Be sure that the GMP is installed on your system (the one you are using
for compiling). You need both the library and the development
files. Here is what shows on my system at home:
[apache@khinchin apache]$ rpm -q -a | grep gmp
gmp-4.0.1-3
gmp-devel-4.0.1-3
Note that both GMP and GMP-devel are installed.
On a RedHat system, you can use rpm
to install the packages. On other systems, you may need to install
differently or compile and install them from source.
- Because static linking bloats the executable, the sanest approach (and the
one I use) is to write one executable and have it do all the number
crunching for all of your PHP scripts. (By writing one executable, the
GMP library is present in only one place--that one executable.) The
executable I wrote accepts a subfunction code, which is essentially an
indication of which PHP script is calling it. Thus, my single
executable does GCD calculation, best rational approximations, Miller-Rabin
testing, etc., depending on the command-line parameters (which are
controlled by the calling PHP script).
For example, when the GCD PHP script calls the executable, it uses the
subfunction code "gcd" and gets output like this (when finding the
GCD of 27 and 150):
[apache@khinchin phpcgibin]$ ./arith_large_cgi gcd 27 150
S
27
150
1
150
27
5
15
2
27
15
1
12
3
15
12
1
3
4
12
3
4
0
3
S
On the other hand, when doing Miller-Rabin testing to see if a certain
number is prime (101, for example), the single executable is called like
this and produces this output:
[apache@khinchin phpcgibin]$ ./arith_large_cgi gmp_prob_prime
101 25
S
101
25
2
S
In the first example above, the numbers which are emitted by the program are
data from successive rounds of Euclid's GCD algorithm (which the PHP script
formats and displays). In the second case, the output is,
respectively, the number whose primality is to be determined (an echo back),
the number of rounds of Miller-Rabin testing (again, and echo-back), and
finally a code to indicate the testing results (0 = a compsite number, 1 =
probably a prime number, and 2 = definitely a prime number). The
output of the program is meant to be interpreted by the calling PHP script
(i.e. it is not human-friendly).
Also note that the first and last lines of the output are
"S". This prevents the calling PHP script from assuming all
output is present if the program coredumps or somehow terminates. The
calling PHP script checks the number of lines and also checks for the
starting and ending "S" characters.
- Accomplishing static linking is a little bit tricky. Here are the
conditions that need to be met:
- The static GMP library has to be on the system you use for compilation
(it is by default). If it is not present, you may need to compile
and install the GMP from source code.
- When calling the gcc compiler, the -static
switch can be anywhere, but the -lgmp
switch must be near the end of the command line, after the modules that
reference the GMP. This is because gcc links from left
to right. For example, my program is built like this:
gcc -static -o arith_large_cgi arith_large_cgi.c auxfuncs.c subfunc_gcd.c subfunc_gmp_prob_prime.c -lgmp
Notice that the -lgmp switch is at the end
of the command line.
- The executable can be placed in the same directory as the PHP scripts (it
will always find it there).
- The PHP exec() function is well-documented. You can give it a
variable in which it will store the output of the called program in an
array. For example, from one of my PHP scripts, here is the code which
calls exec() and also checks the output for sanity.
//We can now run the external program to do the calculation.
$cgi_command_string = "./arith_large_cgi gcd " . $N1 . " " . $N2;
exec($cgi_command_string, $cgi_result);
$cgi_result_nelem = count($cgi_result);
//We need to perform some sanity checks on the CGI output to be sure it
//is what we want. If it seems wrong, we need to generate an error message
//and not try to display the output.
$cgi_output_is_sane = 1;
if ($cgi_result_nelem == 0)
$cgi_output_is_sane = 0;
if ($cgi_result_nelem > 0)
{
if (($cgi_result_nelem % 5) != 0)
$cgi_output_is_sane = 0;
if ($cgi_result[0] != "S")
$cgi_output_is_sane = 0;
if ($cgi_result[$cgi_result_nelem - 1] != "S")
$cgi_output_is_sane = 0;
}
if (!$cgi_output_is_sane)
{
do_err_msg("An unspecified error occurred when ...
Notice how the snippet above checks the output of the program for sanity.
- Finally, you must have the execute permission for other (the web
server is other) set in your compiled executable, or else PHP cannot
run the compiled program. You can change these permissions with the chmod
command.
Please drop me
an e-mail if I should change or enhance this documentation or if there is a
better way to do something.
This
web page is maintained by David
T. Ashley.
$Header: /cvsroot/esrg/sfesrg/esrgweba/htdocs/howtos/sourceforge/use_the_gmp.htm,v 1.2 2003/04/27 21:01:30 dtashley Exp $