M7350v1_en_gpl

2024-09-09 08:52:07 +00:00
commit f9cc65cfda
65988 changed files with 26357421 additions and 0 deletions

@@ -0,0 +1,29 @@
There seems to be a problem with exp(double) and our emulator. I haven't
been able to track it down yet. This does not occur with the emulator
supplied by Russell King.
I also found one oddity in the emulator. I don't think it is serious but
will point it out. The ARM calling conventions require floating point
registers f4-f7 to be preserved over a function call. The compiler quite
often uses an stfe instruction to save f4 on the stack upon entry to a
function, and an ldfe instruction to restore it before returning.
I was looking at some code that calculated a double result, stored it in f4,
then made a function call. Upon return from the function call, the number in
f4 had been converted to an extended value in the emulator.
This is a side effect of the stfe instruction. The double in f4 had to be
converted to extended, then stored. If an lfm/sfm combination had been used,
then no conversion would occur. This has performance considerations. The
result from the function call and f4 were used in a multiplication. If the
emulator sees a multiply of a double and extended, it promotes the double to
extended, then does the multiply in extended precision.
This code will cause this problem:
double x, y, z;
z = log(x)/log(y);
The result of log(x) (a double) will be calculated, returned in f0, then
moved to f4 to preserve it over the log(y) call. The division will be done
in extended precision, due to the stfe instruction used to save f4 in log(y).

@@ -0,0 +1,70 @@
This directory contains the version 0.92 test release of the NetWinder
Floating Point Emulator.
The majority of the code was written by me, Scott Bambrough. It is
written in C, with a small number of routines in inline assembler
where required. It was written quickly, with a goal of implementing a
working version of all the floating point instructions the compiler
emits as the first target. I have attempted to be as optimal as
possible, but there remains much room for improvement.
I have attempted to make the emulator as portable as possible. One of
the problems is with leading underscores on kernel symbols. ELF
kernels have no leading underscores; a.out compiled kernels do. I
have attempted to use the C_SYMBOL_NAME macro wherever this may be
important.
Another choice I made was in the file structure. I have attempted to
contain all operating system specific code in one module (fpmodule.*).
All the other files contain emulator specific code. This should allow
others to port the emulator to NetBSD, for instance, relatively easily.
The floating point operations are based on SoftFloat Release 2, by
John Hauser. SoftFloat is a software implementation of floating-point
that conforms to the IEC/IEEE Standard for Binary Floating-point
Arithmetic. As many as four formats are supported: single precision,
double precision, extended double precision, and quadruple precision.
All operations required by the standard are implemented, except for
conversions to and from decimal. We use only the single precision,
double precision and extended double precision formats. The port of
SoftFloat to the ARM was done by Phil Blundell, based on an earlier
port of SoftFloat version 1 by Neil Carson for NetBSD/arm32.
The file README.FPE contains a description of what has been implemented
so far in the emulator. The file TODO contains information on what
remains to be done, and other ideas for the emulator.
Bug reports, comments, and suggestions should be directed to me at
<scottb@netwinder.org>. General reports of "this program doesn't
work correctly when your emulator is installed" are useful for
determining that bugs still exist, but are virtually useless when
attempting to isolate the problem. Please report them, but don't
expect quick action. Bugs still exist. The problem remains in isolating
which instruction contains the bug. Small programs illustrating a specific
problem are a godsend.
Legal Notices
-------------
The NetWinder Floating Point Emulator is free software. Everything Rebel.com
has written is provided under the GNU GPL. See the file COPYING for copying
conditions. Excluded from the above is the SoftFloat code. John Hauser's
legal notice for SoftFloat is included below.
-------------------------------------------------------------------------------
SoftFloat Legal Notice
SoftFloat was written by John R. Hauser. This work was made possible in
part by the International Computer Science Institute, located at Suite 600,
1947 Center Street, Berkeley, California 94704. Funding was partially
provided by the National Science Foundation under grant MIP-9311980. The
original version of this code was written as part of a project to build
a fixed-point vector processor in collaboration with the University of
California at Berkeley, overseen by Profs. Nelson Morgan and John Wawrzynek.
THIS SOFTWARE IS DISTRIBUTED AS IS, FOR FREE. Although reasonable effort
has been made to avoid it, THIS SOFTWARE MAY CONTAIN FAULTS THAT WILL AT
TIMES RESULT IN INCORRECT BEHAVIOR. USE OF THIS SOFTWARE IS RESTRICTED TO
PERSONS AND ORGANIZATIONS WHO CAN AND WILL TAKE FULL RESPONSIBILITY FOR ANY
AND ALL LOSSES, COSTS, OR OTHER PROBLEMS ARISING FROM ITS USE.
-------------------------------------------------------------------------------

@@ -0,0 +1,156 @@
The following describes the current state of the NetWinder's floating point
emulator.
The following nomenclature is used to describe the floating point
instructions. It follows the conventions in the ARM manual.
<S|D|E> = <single|double|extended>, no default
{P|M|Z} = {round to +infinity,round to -infinity,round to zero},
default = round to nearest
Note: items enclosed in {} are optional.
Floating Point Coprocessor Data Transfer Instructions (CPDT)
------------------------------------------------------------
LDF/STF - load and store floating
<LDF|STF>{cond}<S|D|E> Fd, Rn
<LDF|STF>{cond}<S|D|E> Fd, [Rn, #<expression>]{!}
<LDF|STF>{cond}<S|D|E> Fd, [Rn], #<expression>
These instructions are fully implemented.
LFM/SFM - load and store multiple floating
Form 1 syntax:
<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn]
<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn, #<expression>]{!}
<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn], #<expression>
Form 2 syntax:
<LFM|SFM>{cond}<FD,EA> Fd, <count>, [Rn]{!}
These instructions are fully implemented. They store/load three words
for each floating point register into the memory location given in the
instruction. The format in memory is unlikely to be compatible with
other implementations, in particular the actual hardware. Specific
mention of this is made in the ARM manuals.
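The three-words-per-register layout described above can be sketched as
follows. This is an illustration only: the type fp_reg_image and the
function sfm_store are hypothetical names, the contents of the three words
are left opaque, and the wrap from f7 back to f0 is an assumption taken
from the ARM description of LFM/SFM, not from the emulator source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORDS_PER_FP_REG 3   /* three 32-bit words per register, per the text */

/* Hypothetical register image; the emulator's real layout is not shown here. */
typedef struct {
    uint32_t w[WORDS_PER_FP_REG];
} fp_reg_image;

/* Store `count` registers, starting at register `first`, to `mem`.
   Register numbers are assumed to wrap from f7 to f0.
   Returns the number of 32-bit words written. */
size_t sfm_store(const fp_reg_image regs[8], unsigned first,
                 unsigned count, uint32_t *mem)
{
    size_t n = 0;
    for (unsigned i = 0; i < count; i++) {
        const fp_reg_image *r = &regs[(first + i) & 7];
        memcpy(&mem[n], r->w, sizeof r->w);   /* copy all three words */
        n += WORDS_PER_FP_REG;
    }
    return n;
}
```

Saving four registers this way takes twelve words of memory, which is why
the saved image should only be read back by the implementation that wrote
it.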
Floating Point Coprocessor Register Transfer Instructions (CPRT)
----------------------------------------------------------------
Conversions, read/write status/control register instructions
FLT{cond}<S|D|E>{P,M,Z} Fn, Rd Convert integer to floating point
FIX{cond}{P,M,Z} Rd, Fn Convert floating point to integer
WFS{cond} Rd Write floating point status register
RFS{cond} Rd Read floating point status register
WFC{cond} Rd Write floating point control register
RFC{cond} Rd Read floating point control register
FLT/FIX are fully implemented.
RFS/WFS are fully implemented.
RFC/WFC are fully implemented. RFC/WFC are supervisor-only instructions; the
emulator presently checks the CPU mode and raises an invalid instruction trap
if they are not executed in supervisor mode.
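A mode check of this kind might look like the sketch below. It assumes a
32-bit ARM program status register, where the low five bits hold the
processor mode and 0x13 is supervisor mode. The function name
rfc_wfc_allowed and the idea of passing in the saved status register are
hypothetical; the emulator's own check is not shown in this document.

```c
#include <stdint.h>

#define PSR_MODE_MASK 0x1fu   /* mode field of a 32-bit ARM PSR */
#define PSR_MODE_SVC  0x13u   /* supervisor mode */

/* Returns nonzero if RFC/WFC may be emulated, zero if the caller
   should raise an invalid instruction trap instead. */
int rfc_wfc_allowed(uint32_t saved_psr)
{
    return (saved_psr & PSR_MODE_MASK) == PSR_MODE_SVC;
}
```

A user-mode status register (mode bits 0x10) fails the test, so RFC and
WFC issued from an ordinary program would be trapped.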
Compare instructions
CMF{cond} Fn, Fm Compare floating
CMFE{cond} Fn, Fm Compare floating with exception
CNF{cond} Fn, Fm Compare negated floating
CNFE{cond} Fn, Fm Compare negated floating with exception
These are fully implemented.
Floating Point Coprocessor Data Instructions (CPDO)
---------------------------------------------------
Dyadic operations:
ADF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - add
SUF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - subtract
RSF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse subtract
MUF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - multiply
DVF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - divide
RDV{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse divide
These are fully implemented.
FML{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast multiply
FDV{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast divide
FRD{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast reverse divide
These are fully implemented as well. They use the same algorithm as the
non-fast versions. Hence, in this implementation their performance is
equivalent to the MUF/DVF/RDV instructions. This is acceptable according
to the ARM manual. The manual notes these are defined only for single
operands; on the actual FPA11 hardware they do not work for double or
extended precision operands. The emulator currently does not check
for this restriction, and performs the requested operation.
RMF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - IEEE remainder
This is fully implemented.
Monadic operations:
MVF{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - move
MNF{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - move negated
These are fully implemented.
ABS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - absolute value
SQT{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - square root
RND{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - round
These are fully implemented.
URD{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - unnormalized round
NRM{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - normalize
These are implemented. URD is implemented using the same code as the RND
instruction. Since URD cannot return an unnormalized number, NRM becomes
a NOP.
Library calls:
POW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - power
RPW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse power
POL{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - polar angle (arctan2)
LOG{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base 10
LGN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base e
EXP{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - exponent
SIN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - sine
COS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - cosine
TAN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - tangent
ASN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arcsine
ACS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arccosine
ATN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arctangent
These are not implemented. They are not currently issued by the compiler,
and are handled by routines in libc. These are not implemented by the FPA11
hardware, but are handled by the floating point support code. They should
be implemented in future versions.
Signalling:
Signals are implemented. However, current ELF kernels produced by Rebel.com
have a bug in them that prevents the module from generating a SIGFPE. This
is caused by a failure to alias fp_current to the kernel variable
current_set[0] correctly.
The kernel provided with this distribution (vmlinux-nwfpe-0.93) contains
a fix for this problem and also incorporates the current version of the
emulator directly. It is possible to run with no floating point module
loaded with this kernel. It is provided as a demonstration of the
technology and for those who want to do floating point work that depends
on signals. It is not strictly necessary to use the module.
A module (either the one provided by Russell King, or the one in this
distribution) can be loaded to replace the functionality of the emulator
built into the kernel.

@@ -0,0 +1,67 @@
TODO LIST
---------
POW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - power
RPW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse power
POL{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - polar angle (arctan2)
LOG{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base 10
LGN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base e
EXP{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - exponent
SIN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - sine
COS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - cosine
TAN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - tangent
ASN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arcsine
ACS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arccosine
ATN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arctangent
These are not implemented. They are not currently issued by the compiler,
and are handled by routines in libc. These are not implemented by the FPA11
hardware, but are handled by the floating point support code. They should
be implemented in future versions.
There are a couple of ways to approach the implementation of these. One
method would be to use accurate table methods for these routines. I have
a couple of papers by S. Gal from IBM's research labs in Haifa, Israel that
seem to promise extreme accuracy (on the order of 99.8%) and reasonable speed.
These methods are used in GLIBC for some of the transcendental functions.
Another approach, which I know little about, is CORDIC. This stands for
Coordinate Rotation Digital Computer, and is a method of computing
transcendental functions using mostly shifts and adds and a few
multiplications and divisions. The ARM excels at shifts and adds,
so such a method could be promising, but requires more research to
determine if it is feasible.
Rounding Methods
The IEEE standard defines four rounding modes. Round to nearest is the
default, but round to +infinity, round to -infinity and round to zero are
also allowed.
Many architectures allow the rounding mode to be specified by modifying bits
in a control register. Not so with the ARM FPA11 architecture. To change
the rounding mode one must specify it with each instruction.
This has made porting some benchmarks difficult. It is possible to
introduce such a capability into the emulator. The FPCR contains
bits describing the rounding mode. The emulator could be altered to
examine a flag which, if set, forces it to ignore the rounding mode in
the instruction and use the mode specified in the bits in the FPCR.
This would require a method of getting/setting the flag, and the bits
in the FPCR. This requires a kernel call in ArmLinux, as WFC/RFC are
supervisor-only instructions. If anyone has any ideas or comments I
would like to hear them.
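The proposal can be sketched as a single selection function. Everything
about the FPCR here is hypothetical: the override flag in bit 0 and the
two mode bits above it are invented for illustration, since no such layout
is defined. The instruction-side mask assumes the rounding mode occupies
bits 5 and 6 of the opcode (0x00 nearest, 0x20 +infinity, 0x40 -infinity,
0x60 zero).

```c
#include <stdint.h>

/* Rounding-mode field of an FPA data instruction (assumed: bits 5-6). */
#define OPC_ROUNDING_MASK   0x00000060u

/* Hypothetical FPCR layout, invented for this sketch only. */
#define FPCR_OVERRIDE_FLAG  0x00000001u   /* bit 0: ignore instruction mode */
#define FPCR_ROUNDING_SHIFT 1             /* bits 1-2: rounding mode */

/* Returns the rounding mode to use, encoded as in the opcode field. */
uint32_t effective_rounding(uint32_t opcode, uint32_t fpcr)
{
    if (fpcr & FPCR_OVERRIDE_FLAG) {
        /* Flag set: take the mode from the FPCR and move it into
           the same bit positions the opcode field uses. */
        return ((fpcr >> FPCR_ROUNDING_SHIFT) & 3u) << 5;
    }
    /* Flag clear: behave as today and honour the instruction. */
    return opcode & OPC_ROUNDING_MASK;
}
```

With the flag clear the function is a plain mask of the opcode, so
existing programs would see no change in behaviour.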
[NOTE: pulled out from some docs on ARM floating point, specifically
for the Acorn FPE, but not limited to it:
The floating point control register (FPCR) may only be present in some
implementations: it is there to control the hardware in an implementation-
specific manner, for example to disable the floating point system. The user
mode of the ARM is not permitted to use this register (since the right is
reserved to alter it between implementations) and the WFC and RFC
instructions will trap if tried in user mode.
Hence, the answer is yes, you could do this, but then you will run a high
risk of becoming isolated if and when hardware FP emulation comes out
-- Russell].