NAME
       db - the DB library overview and introduction

DESCRIPTION
       The DB library is a family of groups of functions that provides a
       modular programming interface to transactions and record-oriented
       file access. The library includes support for transactions,
       locking, logging and file page caching, as well as various indexed
       access methods. Many of the functional groups (e.g., the file
       page caching functions) are useful independent of the other DB
       functions, although some functional groups are explicitly based
       on other functional groups (e.g., transactions and logging). For
       a general description of the DB package, see db_intro(3).

       The DB library does not provide user interfaces, data entry GUIs,
       SQL support or any of the other standard user-level database
       interfaces. What it does provide are the programmatic building
       blocks that allow you to easily embed database-style
       functionality and support into other objects or interfaces.

ARCHITECTURE
       The DB library supports two different models of applications:
       client-server and embedded.

       In the client-server model, a database server is created by
       writing an application that accepts requests via some form of IPC
       and issues calls to the DB functions based on those requests. In
       this model, applications are client programs that attach to the
       server and issue queries. The client-server model trades
       performance for protection, as it does not require that the
       applications share a protection domain with the server, but
       IPC/RPC is generally slower than a function call. In addition,
       this model simplifies the creation of network client-server
       applications.

       In the embedded model, an application links the DB library
       directly into its address space. This provides for faster access
       to database functionality, but means that the applications
       sharing log files, lock manager, transaction manager or memory
       pool manager have the ability to read, write, and corrupt each
       other's data.

       It is the application designer's responsibility to select the
       appropriate model for their application.

       Applications require a single include file, <db.h>, which must be
       installed in an appropriate location on the system.

C++
       The C++ classes provide a thin wrapper around the C API, with the
       major advantages being improved encapsulation and an optional
       exception mechanism for errors. The classes and methods are named
       in a fashion that directly corresponds to structures and
       functions in the C interface. Likewise, arguments to methods
       appear in the same order as in the C interface, except that the
       explicit ``this'' pointer is removed. The #defines used for flags
       are identical between the C and C++ interfaces.

       As a rule, each C++ object has exactly one structure from the
       underlying C API associated with it. The C structure is allocated
       with each constructor call and deallocated with each destructor
       call. Thus, the rules the user needs to follow in allocating and
       deallocating structures are the same between the C and C++
       interfaces.

       To ensure portability to many platforms, both new and old, we
       make few assumptions about the C++ compiler and library. For
       example, we do not expect STL, templates or namespaces to be
       available. The newest C++ feature used is exceptions, which are
       used liberally to transmit error information. Even the use of
       exceptions can be disabled at runtime, by using
       DbEnv::set_error_model() (see DbEnv(3)). For a discussion of the
       exception mechanism, see DbException(3).

       For the rest of this manual page, C interfaces are listed as the
       primary reference, with C++ interfaces following parenthetically,
       e.g., db_open (Db::open).

JAVA
       The Java classes provide a layer around the C API that is almost
       identical to the C++ layer. The classes and methods are, for the
       most part, identical to the C++ layer. Db constants and #defines
       are represented as "static final int" values.
       Error conditions appear as Java exceptions.

       As in C++, each Java object has exactly one structure from the
       underlying C API associated with it. The Java structure is
       allocated with each constructor or open call, but is deallocated
       only when the Java GC does so. Because the timing or ordering of
       GC is not predictable, the user should take care to call close()
       when finished with any object that has such a method.

SUBSYSTEMS
       The DB library is made up of five major subsystems, as follows:

       Access methods
            The access methods subsystem is made up of general-purpose
            support for creating and accessing files formatted as
            B+trees, hashed files, and fixed and variable length
            records. These modules are useful in the absence of
            transactions for processes that need fast, formatted file
            support. See db_open(3) and db_cursor(3) (Db(3) and Dbc(3))
            for more information.

       Locking
            The locking subsystem is a general-purpose lock manager
            used by DB. This module is useful in the absence of the
            rest of the DB package for processes that require a fast,
            configurable lock manager. See db_lock(3) (DbLockTab(3)
            and DbLock(3)) for more information.

       Logging
            The logging subsystem is the logging support used to
            support the DB transaction model. It is largely specific
            to the DB package, and unlikely to be used elsewhere. See
            db_log(3) (DbLog(3)) for more information.

       Memory Pool
            The memory pool subsystem is the general-purpose shared
            memory buffer pool used by DB. This module is useful
            outside of the DB package for processes that require
            page-oriented, cached, shared file access. See db_mpool(3)
            (DbMpool(3) and DbMpoolFile(3)) for more information.

       Transactions
            The transaction subsystem implements the DB transaction
            model. It is largely specific to the DB package. See
            db_txn(3) (DbTxnMgr(3) and DbTxn(3)) for more information.

       There are several stand-alone utilities that support the DB
       environment. They are as follows:

       db_archive
            The db_archive utility supports database backup, archival
            and log file administration. See db_archive(1) for more
            information.

       db_recover
            The db_recover utility runs after an unexpected DB or
            system failure to restore the database to a consistent
            state. See db_recover(1) for more information.

       db_checkpoint
            The db_checkpoint utility runs as a daemon process,
            monitoring the database log and periodically issuing
            checkpoints. See db_checkpoint(1) for more information.

       db_deadlock
            The db_deadlock utility runs as a daemon process,
            periodically traversing the database lock structures and
            aborting transactions when it detects a deadlock. See
            db_deadlock(1) for more information.

       db_dump
            The db_dump utility writes a copy of the database to a
            flat-text file in a portable format. See db_dump(1) for
            more information.

       db_load
            The db_load utility reads the flat-text file produced by
            db_dump, and loads it into a database file. See db_load(1)
            for more information.

       db_stat
            The db_stat utility displays statistics for databases and
            database environments. See db_stat(1) for more information.

NAMING AND THE DB ENVIRONMENT
       The DB application environment is described by the db_appinit(3)
       (DbEnv(3)) manual page. The db_appinit (DbEnv::appinit) function
       is used to create a consistent naming scheme for all of the
       subsystems sharing a DB environment. If db_appinit
       (DbEnv::appinit) is not called by a DB application, naming is
       performed as specified by the manual page for the specific
       subsystem.

       DB applications that run with additional privilege should always
       call the db_appinit (DbEnv::appinit) function to initialize DB
       naming for their application. This ensures that the environment
       variables DB_HOME and TMPDIR will only be used if the application
       explicitly specifies that they are safe.

ADMINISTERING THE DB ENVIRONMENT
       A DB environment consists of a database home directory and all
       the long-running daemons necessary to ensure continued
       functioning of DB and its applications. In the presence of
       transactions, the checkpoint daemon, db_checkpoint, must be run
       as long as there are applications present (see db_checkpoint(1)
       for details). When locking is being used, the deadlock detection
       daemon, db_deadlock, must be run as long as there are
       applications present (see db_deadlock(1) for details). The
       db_archive utility provides information to facilitate log
       reclamation and creation of database snapshots (see db_archive(1)
       for details). After application or system failure, the
       db_recover utility must be run before any applications are
       restarted to return the database to a consistent state (see
       db_recover(1) for details).

       The simplest way to administer a DB application environment is to
       create a single ``home'' directory that houses all the files for
       the applications that are sharing the DB environment. In this
       model, the shared memory regions (i.e., the locking, logging,
       memory pool, and transaction regions) and log files will be
       stored in the specified directory hierarchy. In addition, all
       data files specified using relative pathnames will be named
       relative to this home directory.

       When recovery needs to be run (e.g., after system or application
       failure), this directory is specified as the home directory to
       db_recover(1), and the system is restored to a consistent state,
       ready for the applications to be restarted.

       In situations where further customization is desired, such as
       placing the log files on a separate device, it is recommended
       that the application installation process create a configuration
       file named ``DB_CONFIG'' in the database home directory,
       specifying the customization. See db_appinit(3) (DbEnv(3)) for
       details on this procedure.

       The DB architecture does not support placing the shared memory
       regions on remote filesystems, e.g., the Network File System
       (NFS) and the Andrew File System (AFS). For this reason, the
       database home directory must reside on a local filesystem.
       Databases, log files and temporary files may be placed on remote
       filesystems, although the application may incur a performance
       penalty for doing so.

       It is important to realize that all applications sharing a single
       home directory implicitly trust each other. They have access to
       each other's data as it resides in the shared memory buffer pool
       and will share resources such as buffer space and locks. At the
       same time, any applications that access the same files must share
       an environment if consistency is to be maintained across the
       different applications.

ERROR RETURNS
       Except for the historic dbm and hsearch interfaces (see db_dbm(3)
       and db_hsearch(3)), DB does not use the global variable errno to
       return error values. The return values for all DB functions can
       be grouped into three categories:

       0    A return value of 0 indicates that the operation was
            successful.

       >0   A return value that is greater than 0 indicates that there
            was a system error.
            The function returns the errno value reported by the
            system; e.g., when a DB function is unable to allocate
            memory, the return value from the function will be ENOMEM.

       <0   A return value that is less than 0 indicates a condition
            that was not a system failure, but was not an unqualified
            success, either. For example, a routine to retrieve a
            key/data pair from the database may return DB_NOTFOUND when
            the key/data pair does not appear in the database, as
            opposed to the value of 0, which would be returned if the
            key/data pair were found in the database. All such special
            values returned by DB functions are less than 0 in order to
            avoid conflict with possible values of errno.

       There are two special return values that are somewhat similar in
       meaning, are returned in similar situations, and therefore might
       be confused: DB_NOTFOUND and DB_KEYEMPTY.

       The DB_NOTFOUND error return indicates that the requested
       key/data pair did not exist in the database or that start- or
       end-of-file has been reached.

       The DB_KEYEMPTY error return indicates that the requested
       key/data pair logically exists but was never explicitly created
       by the application (the recno access method will automatically
       create key/data pairs under some circumstances; see db_open(3)
       (Db(3)) for more information), or that the requested key/data
       pair was deleted and is currently in a deleted state.

SIGNALS
       When applications using DB receive signals, it is important that
       they exit gracefully, discarding any DB locks that they may hold.
       This is normally done by setting a flag when a signal arrives,
       and then checking for that flag periodically within the
       application. Specifically, the signal handler should not attempt
       to release locks and/or close the database handles itself. This
       is not guaranteed to work correctly and the results are
       undefined.
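
       The flag-based approach described above can be sketched in C as
       follows. This is only an illustration under stated assumptions:
       install_handlers(), run_until_signalled() and the demo_* helpers
       are hypothetical names that are not part of DB, and the step and
       cleanup callbacks stand in for whatever DB operations and
       lock-releasing shutdown code the application actually performs.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler and polled by the main loop. volatile sig_atomic_t
 * is the object type ISO C allows a signal handler to write safely. */
static volatile sig_atomic_t shutdown_requested = 0;

/* The handler only records that a signal arrived; it makes no DB calls. */
static void on_signal(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Install the handler for the usual termination signals.
 * Returns 0 on success, -1 on failure. */
int install_handlers(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    return 0;
}

/* Hypothetical callback types: one unit of application work, and the
 * orderly shutdown (release locks, close handles) run outside the handler. */
typedef int (*step_fn)(void *arg);
typedef void (*cleanup_fn)(void *arg);

/* Run up to max_steps units of work, checking the flag between units.
 * Cleanup always runs in normal context. Returns the number of steps
 * that completed successfully. */
int run_until_signalled(step_fn step, cleanup_fn cleanup, void *arg,
                        int max_steps)
{
    int done = 0;

    assert(step != NULL && cleanup != NULL);
    while (done < max_steps && !shutdown_requested) {
        if (step(arg) != 0)
            break;              /* application-level failure */
        done++;
    }
    cleanup(arg);
    return done;
}

/* Demo helpers (hypothetical): count steps and send ourselves SIGTERM
 * during a chosen step to exercise the shutdown path. */
struct demo_state {
    int steps;        /* steps performed so far */
    int cleaned_up;   /* set to 1 once cleanup has run */
    int raise_after;  /* raise SIGTERM during this step */
};

int demo_step(void *arg)
{
    struct demo_state *st = arg;

    st->steps++;
    if (st->steps == st->raise_after)
        raise(SIGTERM);
    return 0;
}

void demo_cleanup(void *arg)
{
    struct demo_state *st = arg;

    st->cleaned_up = 1;
}
```

       An application would call install_handlers() once at startup and
       then pass its own callbacks to run_until_signalled(). The step in
       progress when the signal arrives is allowed to finish, so the
       cleanup code never runs concurrently with a DB operation.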

       If an application exits holding a lock, the situation is no
       different than if the application crashed, and all applications
       participating in the database environment must be shut down, and
       then recovery must be performed. If this is not done, the locks
       that the application held can cause unresolvable deadlocks inside
       the database, and applications may then hang.

MULTI-THREADING
       See db_thread(3) for information on using DB in threaded
       applications.

DATABASE AND PAGE SIZES
       DB stores database file page numbers as unsigned 32-bit numbers
       and database file page sizes as unsigned 16-bit numbers. This
       results in a maximum database size of 2^48 bytes. The minimum
       database page size is 512 bytes, so the smallest possible maximum
       database size is 2^41 bytes.

       DB is potentially further limited if the host system does not
       have filesystem support for files larger than 2^32 bytes,
       including seeking to absolute offsets within such files.

       The maximum btree depth is 255.

BYTE ORDERING
       The database files created by DB can be created in either
       little-endian or big-endian format. By default, the native format
       of the machine on which the database is created will be used. A
       database of either format can be used on a machine with a
       different native format, although it is possible that the
       application will incur a performance penalty for the run-time
       conversion.

EXTENDING DB
       DB includes tools to simplify the development of
       application-specific logging and recovery. Specifically, given a
       description of the information to be logged, these tools will
       automatically create logging functions (functions that take the
       values as parameters and construct a single record that is
       written to the log), read functions (functions that read a log
       record and unmarshall the values into a structure that maps onto
       the values you chose to log), a print function (for debugging),
       templates for the recovery functions, and automatic dispatching
       to your recovery functions.
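
       To make the division of labor concrete, the following
       hand-written C sketch shows what a logging (marshalling) function
       and a read (unmarshalling) function might look like for one
       record type. It is not output of the DB tools and does not
       reflect the DB log format: struct app_rec, its fields, the
       big-endian layout and all function names are assumptions made
       purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical application log record: two fixed fields plus a
 * variable-length payload. */
struct app_rec {
    uint32_t op;                 /* application-defined operation code */
    uint32_t fileid;             /* application-defined file identifier */
    uint32_t len;                /* payload length in bytes */
    const unsigned char *data;   /* payload bytes (not owned) */
};

#define APP_REC_HDR 12u          /* three 32-bit header fields */

/* Store a 32-bit value in big-endian order (an assumed layout). */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Load a 32-bit big-endian value. */
static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* "Logging function": pack the values into one contiguous record.
 * Returns the record size in bytes, or 0 if buf is too small. */
size_t app_rec_marshal(const struct app_rec *r, unsigned char *buf,
                       size_t buflen)
{
    size_t need;

    assert(r != NULL && buf != NULL);
    need = (size_t)APP_REC_HDR + r->len;
    if (buflen < need)
        return 0;
    put_u32(buf, r->op);
    put_u32(buf + 4, r->fileid);
    put_u32(buf + 8, r->len);
    if (r->len != 0)
        memcpy(buf + APP_REC_HDR, r->data, r->len);
    return need;
}

/* "Read function": unpack a record into the structure. The data field
 * points into buf, so buf must outlive the structure's use.
 * Returns 0 on success, -1 if the record is truncated. */
int app_rec_unmarshal(const unsigned char *buf, size_t buflen,
                      struct app_rec *r)
{
    assert(buf != NULL && r != NULL);
    if (buflen < APP_REC_HDR)
        return -1;
    r->op = get_u32(buf);
    r->fileid = get_u32(buf + 4);
    r->len = get_u32(buf + 8);
    if (buflen - APP_REC_HDR < r->len)
        return -1;
    r->data = buf + APP_REC_HDR;
    return 0;
}
```

       A fixed byte order is used here so that a record written on one
       machine can be read on another; whether the generated DB
       functions make the same choice is not stated in this manual
       page.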

EXAMPLES
       There are a number of examples included with the DB library
       distribution, intended to demonstrate various ways of using the
       DB library.

       Some applications require the use of formatted files to store
       data, but do not require concurrent access and can cope with the
       loss of data due to catastrophic failure. Generally, these
       applications create short-lived databases that are discarded or
       recreated when the system fails. Such applications need only use
       the DB access methods. The DB access methods will use the memory
       pool subsystem, but the application is unlikely to do so
       explicitly. See the files examples/ex_access.c,
       examples/ex_btrec.c, examples_cxx/AccessExample.cpp and
       java/src/com/sleepycat/examples/AccessExample.java in the DB
       source distribution for C, C++, and Java language code examples
       of how such applications might use the DB library.

       Some applications require the use of formatted files to store
       data, but also need to use db_appinit(3) (DbEnv::appinit(3)) for
       environment initialization. See the files examples/ex_appinit.c,
       examples_cxx/AppinitExample.cpp or
       java/src/com/sleepycat/examples/AppinitExample.java in the DB
       source distribution for C, C++ and Java language code examples of
       how such an application might use the DB library.

       Some applications use the DB access methods, but are also
       concerned about catastrophic failure, and therefore need to
       transaction protect the underlying DB files.
       See the files examples/ex_tpcb.c, examples_cxx/TpcbExample.cpp or
       java/src/com/sleepycat/examples/TpcbExample.java in the DB source
       distribution for C, C++ and Java language code examples of how
       such an application might use the DB library.

       Some applications will benefit from the ability to buffer input
       files other than the underlying DB access method files. See the
       files examples/ex_mpool.c or examples_cxx/MpoolExample.cpp in the
       DB source distribution for C and C++ language code examples of
       how such an application might use the DB library.

       Some applications need a general-purpose lock manager separate
       from locking support for the DB access methods. See the files
       examples/ex_lock.c, examples_cxx/LockExample.cpp or
       java/src/com/sleepycat/examples/LockExample.java in the DB source
       distribution for C, C++ and Java language code examples of how
       such an application might use the DB library.

       Some applications will use the DB access methods in a threaded
       fashion, including trickle flushing of the underlying buffer pool
       and deadlock detection. See the file examples/ex_thread.c in the
       DB source distribution for a C language code example of how such
       an application might use the DB library. Note that the Java API
       assumes a threaded environment and performs all thread-specific
       initialization automatically.

COMPATIBILITY
       The DB 2.0 library provides backward compatible interfaces for
       the historic UNIX dbm(3), ndbm(3) and hsearch(3) interfaces. See
       db_dbm(3) and db_hsearch(3) for further information on these
       interfaces. It also provides a backward compatible interface for
       the historic DB 1.85 release.
       DB 2.0 does not provide database compatibility for any of the
       above interfaces, and existing databases must be converted
       manually. To convert existing databases from the DB 1.85 format
       to the DB 2.0 format, review the db_dump185(1) and db_load(1)
       manual pages.

       The name space in DB 2.0 has been changed from that of previous
       DB versions, notably version 1.85, for portability and
       consistency reasons. The only name collisions in the two
       libraries are the names used by the dbm(3), ndbm(3), hsearch(3)
       and the DB 1.85 compatibility interfaces. To include both DB
       1.85 and DB 2.0 in a single library, remove the dbm(3), ndbm(3)
       and hsearch(3) interfaces from either of the two libraries, and
       the DB 1.85 compatibility interface from the DB 2.0 library.
       This can be done by editing the library Makefiles and
       reconfiguring and rebuilding the DB 2.0 library. Obviously, if
       you use the historic interfaces, you will get the version in the
       library from which you did not remove it. Similarly, you will
       not be able to access DB 2.0 files using the DB 1.85
       compatibility interface, since you have removed that from the
       library as well.

       It is possible to simply relink applications written to the DB
       1.85 interface against the DB 2.0 library. Recompilation of such
       applications is slightly more complex. When the DB 2.0 library
       is installed, it installs two include files, db.h and db_185.h.
       The former file is likely to replace the DB 1.85 version's
       include file which had the same name. If this did not happen,
       recompiling DB 1.85 applications to use the DB 2.0 library is
       simple: recompile as done historically, and load against the DB
       2.0 library instead of the DB 1.85 library.
       If, however, the DB 2.0 installation process has replaced the
       system's db.h include file, replace the application's include of
       db.h with inclusion of db_185.h, recompile as done historically,
       and then load against the DB 2.0 library.

       Applications written using the historic interfaces of the DB
       library should not require significant effort to port to the DB
       2.0 interfaces. While the functionality has been greatly
       enhanced in DB 2.0, the historic interface and functionality are
       largely unchanged. Reviewing the application's calls into the DB
       library and updating those calls to the new names, flags and
       return values should be sufficient.

       Although loading applications that use the DB 1.85 interfaces
       against the DB 2.0 library, or converting DB 1.85 function calls
       to DB 2.0 function calls, will work, reconsidering your
       application's interface to the DB database library in light of
       the additional functionality in DB 2.0 is recommended, as it is
       likely to result in enhanced application performance.

SEE ALSO: ADMINISTRATIVE AND OTHER UTILITIES
       db_archive(1), db_checkpoint(1), db_deadlock(1), db_dump(1),
       db_load(1), db_recover(1), db_stat(1)

SEE ALSO: C API
       db_appinit(3), db_cursor(3), db_dbm(3), db_lock(3), db_log(3),
       db_mpool(3), db_open(3), db_txn(3)

SEE ALSO: C++ and Java API
       Db(3), Dbc(3), DbEnv(3), DbException(3), DbInfo(3), DbLock(3),
       DbLockTab(3), DbLog(3), DbLsn(3), DbMpool(3), DbMpoolFile(3),
       Dbt(3), DbTxn(3), DbTxnMgr(3)

SEE ALSO: ADDITIONAL REFERENCES
       LIBTP: Portable, Modular Transactions for UNIX, Margo Seltzer,
       Michael Olson, USENIX proceedings, Winter 1992.