Import BSDDB 4.7.25 (as of svn r89086)
190
libdb_java/README
Normal file
@@ -0,0 +1,190 @@
Berkeley DB's Java API
$Id: README,v 12.2 2006/08/24 14:46:10 bostic Exp $

Berkeley DB's Java API is now generated with SWIG
(http://www.swig.org). This document describes how SWIG is used:
what we trust it to do, and what we needed to work around.


Overview
========

SWIG is a tool that generates wrappers around native (C/C++) APIs for
various languages (mainly scripting languages) including Java.

By default, SWIG creates an API in the target language that exactly
replicates the native API (for example, each pointer type in the API
is wrapped as a distinct type in the language). Although this
simplifies the wrapper layer (type translation is trivial), it usually
doesn't result in a natural API in the target language.

A further constraint for Berkeley DB's Java API was backwards
compatibility. The original hand-coded Java API is in widespread use,
and included many design decisions about how native types should be
represented in Java. As an example, callback functions are
represented by Java interfaces that applications using Berkeley DB
could implement. The SWIG implementation was required to maintain
backwards compatibility for those applications.


Running SWIG
============

The simplest use of SWIG is to run it with a C include file as
input. SWIG parses the file and generates wrapper code for the target
language. For Java, this includes a Java class for each C struct and
a C source file containing the Java Native Interface (JNI) function
calls for each native method.

The s_swig shell script in db/dist runs SWIG, and then post-processes
each Java source file with the sed commands in
libdb_java/java-post.sed. The Java sources are placed in
java/src/com/sleepycat/db, and the native wrapper code is in a single
file in libdb_java/db_java_wrap.c.

The post-processing step modifies code in ways that are difficult with
SWIG (given my current level of knowledge). This includes changing
some access modifiers to hide some of the implementation methods,
selectively adding "throws" clauses to methods, and adding calls to
"initialize" methods in Db and DbEnv after they are constructed (more
below on what these calls do).
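
As a rough illustration of the kind of rewrite the post-processing
step performs, the sketch below applies two regular-expression
substitutions to a line of generated Java: one hides a zero-suffixed
implementation method, and one appends a "throws" clause. It is
written in Java rather than sed so that it can be run directly. The
patterns, the class name PostProcessSketch, and the exception name
DbException used in the usage note are assumptions made for this
example; they are not taken from java-post.sed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for two sed rules; not the contents of java-post.sed.
final class PostProcessSketch {
    // Matches "public <type> <name>0(" so that zero-suffixed
    // implementation methods can be hidden from applications.
    private static final Pattern IMPL_METHOD =
        Pattern.compile("\\bpublic(\\s+\\S+\\s+\\w+0\\()");

    // Matches the end of a method header: ")" followed by "{" at end of line.
    private static final Pattern HEADER_END =
        Pattern.compile("\\)\\s*\\{\\s*$");

    // Changes only the access modifier; other lines pass through unchanged.
    static String hideImplMethod(String line) {
        return IMPL_METHOD.matcher(line).replaceFirst("private$1");
    }

    // Appends "throws <exception>" unless the header already has a clause.
    static String addThrows(String line, String exception) {
        if (line.contains(" throws ")) {
            return line;
        }
        String replacement =
            Matcher.quoteReplacement(") throws " + exception + " {");
        return HEADER_END.matcher(line).replaceFirst(replacement);
    }
}
```

For example, PostProcessSketch.addThrows("  public void open(String file) {",
"DbException") returns the same header with "throws DbException"
inserted before the opening brace, and hideImplMethod leaves a line
such as "  public void close(int flags) {" untouched because the
method name has no trailing zero.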

In addition to the source code generated by SWIG, some of the Java
classes are written by hand, and constants and code to fill statistics
structures are generated by the script dist/s_java. The native
statistics code is in libdb_java/java_stat_auto.c, and is compiled
into the db_java_wrap object file with a #include directive. This
allows most functions in that object to be static, which encourages
compiler inlining and reduces the number of symbols we export.


The Implementation
==================

For the reasons mentioned above, Berkeley DB requires a more
sophisticated mapping between the native API and Java, so additional
SWIG directives are added to the input. In particular:

* The general intention is for db.i to contain the full DB API (just
  like db.h). As much as possible, this file is kept Java independent
  so that it can be updated easily when the API changes. SWIG doesn't
  have any built-in rules for how to handle function pointers in a
  struct, so each DB method must be added in a SWIG "%extend" block
  which includes the method signature and a call to the method.

  * SWIG's automatically generated function names happen to collide
    with Berkeley DB's naming convention. For example, in a SWIG class
    called __db, a method called "open" would result in a wrapper
    function called "__db_open", which already exists in DB. This is
    another reason why making these static functions is important.

* The main Java support starts in db_java.i - this file includes all
  Java code that is explicitly inserted into the generated classes,
  and is responsible for defining object lifecycles (handling
  allocation and cleanup).

  * Methods that need to be wrapped for special handling in Java code
    are renamed with a trailing zero (e.g., close becomes close0).
    This is invisible to applications.

  * Most DB classes that are wrapped have method calls that imply the
    cleanup of any native resources associated with the Java object
    (for example, Db.close or DbTxn.abort). These methods are wrapped
    so that if the object is accessed after the native part has been
    destroyed, an exception is thrown rather than a trap that crashes
    the JVM.

  * Db and DbEnv initialization is more complex: a global reference is
    stored in the corresponding struct so that native code can
    efficiently map back to Java code. In addition, if a Db is
    created without an environment (i.e., in a private environment),
    the initialization wraps the internal DbEnv to simplify handling
    of various Db methods that just call the corresponding DbEnv
    method (like err, errx, etc.). It is important that the global
    references are cleaned up before the DB and DB_ENV handles are
    closed, so the Java objects can be garbage collected.

  * In the case of DbLock and DbLsn, there are no such methods. In
    these cases, there is a finalize method that does the appropriate
    cleanup. No other classes have finalize methods (in particular,
    the Dbt class is now implemented entirely in Java, so no
    finalization is necessary).

* Overall initialization code, including the System.loadLibrary call,
  is in java_util.i. This includes looking up all class, field and
  method handles once so that execution is not slowed down by repeated
  runtime type queries.

* Exception handling is in java_except.i. The main non-obvious design
  choice was to create a db_ret_t type for methods that return an
  error code as an int in the C API, but return void in the Java API
  (and throw exceptions on error).

  * The only other odd case with exceptions is DbMemoryException -
    this is thrown as normal when a call returns ENOMEM, but there is
    special handling for the case where a Dbt with DB_DBT_USERMEM is
    not big enough to handle a result: in this case, the Dbt handling
    code calls the method update_dbt on the exception that is about to
    be thrown to register the failed Dbt in the exception.

* Statistics handling is in java_stat.i - this mainly just hooks into
  the automatically-generated code in java_stat_auto.c.

* Callbacks: the general approach is that Db and DbEnv maintain
  references to the objects that handle each callback, and have a
  helper method for each call. This is primarily to simplify the
  native code, and performs better than more complex native code.

  * One difference with the new approach is that the implementation is
    more careful about calling DeleteLocalRef on objects created for
    callbacks. This is particularly important for callbacks like
    bt_compare, which may be called repeatedly from native code.
    Without the DeleteLocalRef calls, the Java objects that are
    created cannot be collected until the original call returns.

* Most of the rest of the code is in java_typemaps.i. A typemap is a
  rule describing how a native type is mapped onto a Java type for
  parameters and return values. These handle most of the complexity
  of creating exactly the Java API we want.

  * One of the main areas of complexity is Dbt handling. The approach
    taken is to accept whatever data is passed in by the application,
    pass that to native code, and reflect any changes to the native
    DBT back into the Java object. In other words, the Dbt typemaps
    don't replicate DB's rules about whether Dbts will be modified or
    not - they just pass the data through.

  * As noted above, when a Dbt is "released" (i.e., no longer needed
    in native code), one of the checks is whether a DbMemoryException
    is pending, and if so, whether this Dbt might be the cause. In
    that case, the Dbt is added to the exception via the "update_dbt"
    method.

* Constant handling has been simplified by making DbConstants an
  interface. This allows the Db class to inherit the constants, and
  most can be inlined by javac.

  * The danger here is if applications are compiled against one
    version of db.jar, but run against another. This danger existed
    previously, but was partly ameliorated by a separation of
    constants into "case" and "non-case" constants (the non-case
    constants were arranged so they could not be inlined). The only
    complete solution to this problem is for applications to check the
    version returned by DbEnv.get_version* versus the Db.DB_VERSION*
    constants.
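
The version check suggested in the last point might look like the
following. This is a sketch under stated assumptions:
CompiledConstants stands in for the Db.DB_VERSION* constants an
application was compiled against, its values are illustrative, and
the runtime values are passed in as plain ints instead of being
fetched from DbEnv.get_version*, so that the example runs without
the library.

```java
// Stand-in for the version constants an application was compiled
// against. javac copies constants like these into the classes that
// use them, which is why a later change to db.jar goes unnoticed.
interface CompiledConstants {
    int DB_VERSION_MAJOR = 4; // illustrative value
    int DB_VERSION_MINOR = 7; // illustrative value
}

final class VersionCheckSketch implements CompiledConstants {
    // True when the runtime library matches the inlined constants.
    static boolean matches(int runtimeMajor, int runtimeMinor) {
        return runtimeMajor == DB_VERSION_MAJOR
            && runtimeMinor == DB_VERSION_MINOR;
    }

    // Fails fast, before any inlined flag value can be misread
    // by a library built from a different release.
    static void requireMatch(int runtimeMajor, int runtimeMinor) {
        if (!matches(runtimeMajor, runtimeMinor)) {
            throw new IllegalStateException(
                "compiled against " + DB_VERSION_MAJOR + "."
                + DB_VERSION_MINOR + " but running "
                + runtimeMajor + "." + runtimeMinor);
        }
    }
}
```

An application would call something like requireMatch once at
startup, passing the values reported by the library it actually
loaded. Whether the patch level should also be compared is left open
here.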


Application-visible changes
===========================

* The new API is around 5x faster for many operations.

* Some internal methods and constructors that were previously public
  have been hidden or removed.

* A few methods that were inconsistent have been cleaned up (e.g.,
  Db.close now returns void; it previously returned an int that was
  always zero). The synchronized attribute has been toggled on some
  methods. This is an attempt to prevent multi-threaded applications
  from shooting themselves in the foot by calling close() or similar
  methods concurrently from multiple threads.
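
The close-related behaviour described in this document (a public
close that returns void, a zero-suffixed implementation method,
exceptions in place of C error codes, and an exception rather than a
crash when a closed handle is used) can be sketched in plain Java.
This is a minimal sketch, not the generated code: FakeNative stands
in for the JNI layer, and the class names, the choice of
IllegalStateException, and the error code 22 are assumptions made
for the example. The field name swigCPtr follows SWIG's usual naming
for the native pointer in Java proxy classes.

```java
// Hypothetical stand-in for the native (JNI) layer; returns C-style codes.
final class FakeNative {
    static int closeCalls = 0;

    static int close0(long handle, int flags) {
        closeCalls++;
        return 0; // 0 means success, as in the C API
    }

    static int put0(long handle, String key, String value) {
        return key.isEmpty() ? 22 : 0; // 22 is an arbitrary demo error code
    }
}

// Assumed exception type that carries the native error code.
class SketchDbException extends Exception {
    final int errno;

    SketchDbException(String message, int errno) {
        super(message + " (error " + errno + ")");
        this.errno = errno;
    }
}

// Public methods guard the native handle and delegate to the
// zero-suffixed implementation methods.
final class SketchDb {
    private long swigCPtr; // 0 once the native part has been destroyed

    SketchDb(long nativeHandle) {
        this.swigCPtr = nativeHandle;
    }

    // Turns an int return code into "return normally or throw".
    private static void check(int ret, String what)
            throws SketchDbException {
        if (ret != 0) {
            throw new SketchDbException(what + " failed", ret);
        }
    }

    // Throws instead of handing a dead pointer to native code.
    private long handle() {
        if (swigCPtr == 0) {
            throw new IllegalStateException("call on closed handle");
        }
        return swigCPtr;
    }

    public synchronized void put(String key, String value)
            throws SketchDbException {
        check(FakeNative.put0(handle(), key, value), "put");
    }

    // synchronized, so two threads cannot both reach close0.
    public synchronized void close(int flags) throws SketchDbException {
        long h = handle();
        swigCPtr = 0; // invalidate first so the handle is never reused
        check(FakeNative.close0(h, flags), "close");
    }
}
```

With this shape, a second close() or any put() after close() fails
with an ordinary Java exception, and the native close0 is reached at
most once per handle.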