Gunther Schadow

Lightweight Data Bases in Java

JDBC is often touted as "the" solution for Java and Data Bases. However, that is only as true as saying that a full-fledged SQL RDBMS is the solution for even the tiniest need for persistent lookup tables.

There are a number of light-weight data base systems in the hands of C and C++ (and Perl, and whatever) programmers. These are all those libraries that carry a "db" or "dbm" in their name. The most popular is the GNU version of dbm, "gdbm".

Dbm, gdbm and the like are simple one-file data bases that use a b-tree or hash algorithm to store simple key-data pairs, where key and data can be any byte strings. These libraries have the advantage that they are small and fast, with no administrative overhead, no network required, no special drivers, no servers, no headache.
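
To make the key-data idea concrete, here is a minimal sketch in plain Java. It is not the file format or API of gdbm or of any package listed below. The class name TinyDbm and its store/fetch methods are made up for illustration, and I assume an append-only file with a hash index rebuilt in memory on open, which is much cruder than what a real dbm does on disk.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy store: one file, byte-string keys and data, hash index kept in memory. */
class TinyDbm implements AutoCloseable {
    private final RandomAccessFile file;
    // Key bytes (decoded as ISO-8859-1 so every byte value round-trips) -> offset of latest record.
    private final Map<String, Long> index = new HashMap<>();

    TinyDbm(String path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        // Rebuild the index by scanning records laid out as [keyLen][dataLen][key][data].
        long pos = 0;
        while (pos < file.length()) {
            file.seek(pos);
            int keyLen = file.readInt();
            int dataLen = file.readInt();
            byte[] key = new byte[keyLen];
            file.readFully(key);
            index.put(indexKey(key), pos);   // later records replace earlier ones
            pos += 8L + keyLen + dataLen;
        }
    }

    /** Appends a record; an existing key is superseded, not overwritten in place. */
    void store(byte[] key, byte[] data) throws IOException {
        long pos = file.length();
        file.seek(pos);
        file.writeInt(key.length);
        file.writeInt(data.length);
        file.write(key);
        file.write(data);
        index.put(indexKey(key), pos);
    }

    /** Returns the data stored under key, or null if the key is unknown. */
    byte[] fetch(byte[] key) throws IOException {
        Long pos = index.get(indexKey(key));
        if (pos == null) {
            return null;
        }
        file.seek(pos);
        int keyLen = file.readInt();
        int dataLen = file.readInt();
        file.seek(pos + 8L + keyLen);        // skip over the key bytes
        byte[] data = new byte[dataLen];
        file.readFully(data);
        return data;
    }

    private static String indexKey(byte[] key) {
        return new String(key, StandardCharsets.ISO_8859_1);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        try (TinyDbm db = new TinyDbm("lookup.db")) {
            db.store("color".getBytes(StandardCharsets.UTF_8), "red".getBytes(StandardCharsets.UTF_8));
            byte[] data = db.fetch("color".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(data, StandardCharsets.UTF_8));
        }
    }
}
```

The sketch never reclaims the space of superseded records and does no locking, which is exactly the kind of work the real libraries take off your hands.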

To mention the disadvantage, of course, these light-weight one-file key-data pair data bases do not scale well. When the data gets more complex, or you need a client-server structure, concurrent updates and the like, you are better off with a real RDBMS server.

My collection of lightweight, Java-usable DBMs falls into two categories. The first is JNI wrappers around other DBM packages, usually written in C or C++; this covers Berkeley DB and GDBM. The alternative is 100% pure Java. The 100% pure Java solution is more flexible, but it comes with slight penalties in efficiency: it is a little slower and produces larger files than GDBM does natively. O.K., here they are:

JDBM - Gunther Schadow's (my own) JNI wrapper around GDBM

[project home] [documentation] [source] [download]

This is my own JNI wrapper around GDBM, born from an immediate need for a lightweight DBM in Java, my inability to find JavaGDBM at the time, and not knowing about the alternatives. I like my JNI wrapper more than JavaGDBM, since it is smaller and more light-weight.

JavaGDBM - Martin Pool's JNI wrapper around GDBM

[project home] [documentation] [source] [download]

I have never tried this out. It is larger and seems more comprehensive, but I cannot say more than that. The original project home page is gone.

W3C JDBM - 100% pure Java

[project home] [documentation] [source] [download]

This is the "jdbm" part of W3C's Jigsaw release, cut out so that this very useful part can be reused without having to download the entire Jigsaw environment. I made a few additions; see the README file.

The SoLinger Java SDBM

A small/fast dbm that is ideal for embedded applications. It is probably most useful for interoperability with Perl disk hashes, as they use Sdbm by default.

Notice that SDBM uses fixed-size keys and leverages "sparse files" for space efficiency (i.e., the file looks like it is several megabytes in size, but only a few blocks are actually allocated). On platforms that don't support sparse files (e.g., Windows), large databases will actually use this space.
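
If sparse files are new to you, the effect can be sketched in a few lines of plain Java. This is not code from the SoLinger SDBM; the class SparseFileDemo, the method writeAtOffset and the file name demo.pag are made up for illustration. I assume a file system that honors the SPARSE hint; where it does not, the same code still runs but the hole is really allocated.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: write a single byte far into a new file, leaving a hole before it. */
class SparseFileDemo {

    /** Creates the file, writes one byte at the given offset, and returns the logical size. */
    static long writeAtOffset(Path path, long offset) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE_NEW,
                StandardOpenOption.WRITE,
                StandardOpenOption.SPARSE)) {      // only a hint; ignored where unsupported
            // Positional write: the file grows to offset + 1 bytes.
            channel.write(ByteBuffer.wrap(new byte[] {42}), offset);
        }
        return Files.size(path);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sparse-demo");
        Path file = dir.resolve("demo.pag");
        long size = writeAtOffset(file, 8L * 1024 * 1024);
        // The reported size includes the hole; the blocks actually used are up to the file system.
        System.out.println(file + " reports " + size + " bytes");
    }
}
```

Java has no portable call for the number of blocks actually allocated, so compare the reported size with what a tool such as "du" or "ls -ls" shows on your platform.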

This package is part of the SoLinger project, "dedicated to gathering and implementing utility classes in Java. Current projects include a Java port of Sdbm, a Java port of CrackLib, and a JavaSpaces based password cracker."

[project home] [This entry was contributed by Justin Chapweske; I have not looked closely at or used this software.]

Berkeley DB - A comprehensive DB package with Java interface.

[project home] [documentation] [source] [download]

This is far more than just GDBM. It is a sophisticated package that gets you farther, even in distributed environments with concurrent access etc. I have not tried this, but it looks good. It has C, C++ and Java interfaces. It is not 100% pure Java, of course, and you incur some administrative overhead. At this point you should consider JDBC and some SQL RDBMS. I can't help you with that tradeoff, sorry.

DISCLAIMER

I am not responsible for any of this software, except for the JNI wrapper that I have written myself. All of this comes with NO WARRANTY, NEITHER EXPRESS NOR IMPLIED, NO WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, etc., etc. Please refer to the projects' homes for the latest versions. Clicking [download] gives you my local copy, which may be out of date.

Enjoy,

-Gunther Schadow

PS: I once wrote a C++ wrapper for gdbm, called "odbm", that allows you to have limited persistent C++ objects. Odbm adds multiple indexes to gdbm, all in the same file. Thus, you can have fast lookup for any attribute of your class. There are some restrictions your objects must comply with. Object references and pointers are not preserved. See the odbm.h file or browse the entire source code. This source is part of a much larger project to implement HL7 in C++ and has some dependencies on the big source tree. See the ProtoGen/HL7 homepage.
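
For readers wondering how several indexes can share one key-data file, here is one possible sketch in Java. It does not show how odbm actually lays out its data; see odbm.h for that. The class MultiIndexSketch, the "rec:" and "idx:" key prefixes and the "|" field separator are all assumptions made for illustration, and a HashMap stands in for the single dbm file. I also assume each attribute value belongs to at most one record and contains no "|".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: records and their attribute indexes share one flat key-data table. */
class MultiIndexSketch {
    // Stands in for the single dbm file: every entry is just a key and its data.
    private final Map<String, String> table = new HashMap<>();

    /** Stores a record under its id and one index entry per attribute. */
    void put(String id, String name, String city) {
        String old = table.get("rec:" + id);
        if (old != null) {
            // Drop the index entries of the record being replaced so none go stale.
            String[] fields = old.split("\\|", 2);
            table.remove("idx:name:" + fields[0]);
            table.remove("idx:city:" + fields[1]);
        }
        table.put("rec:" + id, name + "|" + city);
        table.put("idx:name:" + name, id);
        table.put("idx:city:" + city, id);
    }

    /** Looks a record up by attribute ("name" or "city"): index entry first, then the record. */
    String findBy(String attribute, String value) {
        String id = table.get("idx:" + attribute + ":" + value);
        return id == null ? null : table.get("rec:" + id);
    }

    public static void main(String[] args) {
        MultiIndexSketch db = new MultiIndexSketch();
        db.put("1", "Ada", "London");
        db.put("2", "Linus", "Helsinki");
        System.out.println(db.findBy("city", "Helsinki"));
    }
}
```

Each lookup by attribute costs two fetches instead of one, which is the price for keeping everything in the same file.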
