Copyright (C) 2003, 2004 Jeffrey I Cohen.  All rights reserved, worldwide.

NAME

  Genezzo - an extensible database system

DESCRIPTION

  The Genezzo modules implement a hierarchy of persistent
  hashes using a fixed amount of memory and disk.  This system is
  designed to be easily configured and extended with custom functions,
  persistent storage representations, and novel data access methods.
  In its current incarnation it supports a limited subset of SQL with
  a command line tool [gendba] and a partial DBI-style interface.

REQUIREMENTS

 Perl 5.6    

INSTALLATION

  If you can read this, then you figured out how to uncompress and
  untar the archive.  

  perl Makefile.PL  # constructs the makefile
  make              
  make test         
  make install      # (as superuser [optional])


  If you install Genezzo, you should be able to run the line-mode tool
  directly by typing "gendba.pl" at the command line.

  You also should be able to run scripts from the directory containing
  this file, something like:

  perl -Iblib/lib lib/Genezzo/gendba.pl


USAGE

  line mode available via (/usr/bin) gendba.pl.  
  Try:

    gendba.pl -help
    gendba.pl -man 

  or:

    perl -Iblib/lib lib/Genezzo/gendba.pl -help
    perl -Iblib/lib lib/Genezzo/gendba.pl -man

  for more information.

  The GenDBI module supports a subset of DBI-style calls.  See
  dbi_example.pl for some sample usage.

  Try:

    perl -Iblib/lib lib/Genezzo/dbi_example.pl # runs an example
    perldoc Genezzo::GenDBI

  for more information.

EXAMPLES

  The simplest way to create an instance of a Genezzo database is to use:

    gendba.pl -init

  This command will create a new database and login to the
  command-line.  Some very basic SQL create/drop/describe table,
  select, insert, update and delete syntax is supported, but unlike
  standard SQL, table and column names are case-sensitive.  More
  complex SQL, such as joins, parses, but is ignored.  The only
  supported functions are count(*) and ecount, a non-blocking count
  estimation function.  The database also supports commit to force
  changes to disk, but no rollback.  NOTE: Data definition (such as
  create table or ct) must be manually committed to keep the database
  in a consistent state.  Uncommitted inserts and updates will only be
  block-consistent -- there is no guarantee that the data will get
  flushed to disk, and no guarantee whether the changes will or will
  not take effect.

    rem  Some simple SELECTs
    select * from _col1;
    select rid, rownum, tname, colname from _col1;
    select count(*) from _col1;
    select ecount from _col1;

    rem  SELECTs with WHERE, perl and SQL style.
    rem  This functionality is somewhat fragile
    rem
    rem  note use of /x in regexp - fix problem when parser adds extra space
    select * from _tab1 where tname =~ m/col/x;
    select * from _tab1 where tid < 5;
    select * from _tab1 where (numcols > 3) && (numcols < 6);
    select tid as Table_ID, tname Name from _tab1;

    rem  Basic INSERT
    insert into test1 values (a,b,c,d);
    insert into test1(col2, col1) values (a,b,c,d);

    rem CREATE TABLE and INSERT...SELECT
    create table test2(col1 char, col2 char);
    insert into test2 (col1) select col1 from test1;

    rem  DELETE with WHERE
    delete from test1 where (col1 < "bravo") && (col2 > 5);

    rem  UPDATE with WHERE (no subqueries supported)
    update test2 set col2 = "foo" where col2 is null;

    commit

  By default, the database will autostart -- the gendba.pl command will
  connect to an active, updateable instance.  The shutdown command will
  reset the database to a read-only mode that only provides read access
  to certain dictionary tables.  The startup command will restart the
  database.

  The Genezzo parser also supports the Feeble query language, as
  described by gendba.pl -man.  Briefly, the commands are ct (create
  table), dt (drop table), i (insert), u (update), d (delete), s
  (select).  The following example creates a table EMP with two
  columns, ENAME (as character) and ID (as number), inserts five rows,
  and counts them.

    ct EMP ENAME=c ID=n
    i EMP bob 1 orville 2 wilbur 3 harry 4 luke 5
    s EMP count
    s EMP ecount
    commit

  While the SQL update and delete commands support a WHERE clause, the
  Feeble "u" and "d" commands only work via rowid.  For example:

    gendba 10> s EMP rid rownum *
    rid     rownum  ENAME   ID
    ___     ______  _____   __

    1/33/1  1       bob     1
    1/33/2  2       orville 2
    1/33/3  3       wilbur  3
    1/33/4  4       harry   4
    1/33/5  5       luke    5

    5 rows selected.

    gendba 11> d EMP 1/33/1
    deleted 1 rows from table EMP.

    gendba 12> u EMP 1/33/5 margo 66
    updated 1 rows in table EMP.

    gendba 13> !10
    s EMP rid rownum *
    rid     rownum  ENAME   ID
    ___     ______  _____   __

    1/33/2  1       orville 2
    1/33/3  2       wilbur  3
    1/33/4  3       harry   4
    1/33/5  4       margo   66

    4 rows selected.

  The default database is stored in a fixed-size 160K file.  You can
  change the default filesize and the blocksize at database
  initialization time.  For example:

    gendba.pl -init -define blocksize=8k -define dbsize=10M

  to create a database with a 10 megabyte datafile using 8K blocks.
  By default, this file is a fixed size, but it can be set to grow
  using the "increase_by" settings added in version 0.18.  Also, you can
  add additional files to your database using the "addfile" command.  By
  default, each file is double the size of the previous.  Type "addfile
  help" for more options.


NEW FEATURES - Indexed Tables

  Version 0.13 introduces an indexed-table type.  An indexed table
  treats the first column in the table definition as a unique, not-null
  primary key.  

    gendba 9> ct EMP index id=n name=c
    Create Table : EMP 
    with unique index option
    tablename : EMP
    column id : n (primary key)
    column name : c
    table EMP created 

    gendba 10> i EMP 1 bob 2 orville 3 wilbur 4 harry 5 luke
    inserted 5 rows into table EMP.

    gendba 11> s EMP rid *
    rid     id      name
    ___     __      ____

    1       1       bob
    2       2       orville
    3       3       wilbur
    4       4       harry
    5       5       luke

    5 rows selected.

    gendba 12> i EMP 1 bob 2 orville 3 wilbur 4 harry 5 luke
    whisper: duplicate key 1
    Failed to insert row 1 in table EMP at lib/Genezzo/gendba.pl line 230
    inserted 0 rows into table EMP.

    gendba 13> select tid, tname, otype from _tab1 where tname = "EMP"
    tid     tname   otype
    ___     _____   _____

    14      EMP     IDXTAB

    1 row selected.


  Notice that the first column is marked as a primary key when the
  table is created, and the rids for indexed tables are the primary key,
  not a physical block address.  

  Indexed tables have an object type of IDXTAB in the data dictionary
  -- conventional tables are marked as TABLE.

  WHERE clause processing has been enhanced to use equality predicates
  (e.g. "SELECT * from EMP where id = 2") to drive index scans.  Any
  series of "AND"ed predicates is a candidate for an efficient index
  search, but the use of an OR currently requires a full table scan.

  CAVEATS:  The index performance is pretty poor, for a couple reasons.
  It should improve a lot.  

NEW FEATURES - Primary Key Indexes

  Version 0.14 introduces "ci" - create index.
  Users can now define their own primary key (unique and not-null)
  indexes.  The dictionary automatically maintains indexes on its own
  tables.  In order to create an index called EMP_PK on the ID column
  of table EMP, the command is:
  
      ci EMP_PK EMP ID

  Version 0.15 supports SQL-style Create Index.  The alternative method
  to create the EMP_PK index is:

      Create Index EMP_PK on EMP (ID)

  You can have arbitrary combinations of numeric and character keys.
  RESTRICTIONS: only one index per table currently.  Index keys must
  be less than one-half of the db blocksize.  

  In version 0.15, the query engine will use the index to drive the
  row fetch for a SELECT if the WHERE clause specifies an equality 
  predicate on the primary key.  For example:

      Select * from EMP where ID = 5
 
  uses an index.  In general, additional predicates may accompany the
  equality expression as long as they are ANDed:

      Select * from EMP where ID = 5 and ENAME =~ m/BOB/x

  However, the presence of ORs causes the plan to revert to a full
  table scan:

      Select * from EMP where ID = 5 or ID = 6


NEW FEATURES - automatic tablespace and file growth

  In version 0.18, datafiles are no longer a fixed size.  The column
  "increase_by" in table _tsfiles controls the behavior: if this column
  is blank, null, or zero, the datafile size is fixed.  If
  "increase_by" is a non-zero number, the datafile will attempt to grow by 
  that many bytes if it runs out of free extents.  If "increase_by" is a
  percentage like "50%", the datafile is extended by the current size
  multiplied by this percentage.  The AddFile command now takes
  "increase_by" as an optional argument.  

  Also in version 0.18, each tablespace can be configured to
  automatically create new datafiles when the current datafile runs out
  of free extents.  The column "addfile" in table _tspace controls the
  behavior: if the column is blank or null, then the tablespace must be
  extended manually.  If the "addfile" column contains a valid argument 
  list for the AddFile command, then this command will get invoked when 
  all of the current datafiles are full.  Note that the datafiles can be 
  of a fixed size, or use the new "increase_by" parameter to grow to some
  maximum size (currently 2G).

  The script "bigdb.sql" can be used to reset your database for
  automatic growth.  It can be invoked as "@bigdb.sql" from the
  gendba command line.
  
NEW FEATURES - Non-Unique Indexes

  In version 0.20, all user-defined indexes are non-unique.  Unique
  indexes should be created as unique constraints (fixed in version 0.22).

NEW FEATURES - ALTER TABLE ADD CONSTRAINT

  Version 0.20 supports ALTER TABLE ADD CONSTRAINT <name> CHECK (<check-text>).

  The check-text is any basic where clause predicate.  Example:

    gendba 3> create table testcon (col1 c, col2 n);
    gendba 4> alter table testcon add constraint cn1 
               check ((col2 > 5) && col1 =~ m/(a|b|c)/x )

    gendba 5> insert into testcon values ("a", 7);
    inserted 1 row into table testcon.

    gendba 6> insert into testcon values ("b", 8);
    inserted 1 row into table testcon.

    gendba 7> insert into testcon values ("d", 8);
    violated constraint cn1
    whisper: undo insert!!
    Failed to insert row 1 in table testcon at lib/Genezzo/gendba.pl line 244
    inserted 0 rows into table testcon.

    gendba 8> insert into testcon values ("a", 1);
    violated constraint cn1
    whisper: undo insert!!
    Failed to insert row 1 in table testcon at lib/Genezzo/gendba.pl line 244
    inserted 0 rows into table testcon.

  Version 0.22 supports 
    ALTER TABLE ADD CONSTRAINT <name> UNIQUE (<column-list>)
    and
    ALTER TABLE ADD CONSTRAINT <name> PRIMARY KEY (<column-list>

  Version 0.22 only supports a single constraint or single index per
  table, but version 0.23 supports multiple constraints and indexes
  on a table.
  
  Need to extend to fully support NULL/NOT NULL, FOREIGN KEY, plus
  ENABLE/DISABLE.

NEW FEATURES - System/User Extensibility

  Version 0.25 of Genezzo supports database extensibility.  See
  Genezzo::Havok for more details.  The script "havok.sql" can be used
  add extensibility tables to your database.  It can be invoked as
  "@havok.sql" from the gendba command line.  In particular, the table
  user_extend can be used to load new functions or perl modules into
  Genezzo.  For example, the havok.sql script loads "Text::Soundex", so
  queries can use Text::Soundex::soundex in the WHERE clause.

  Version 0.28 of Genezzo slightly modifies the havok and user_extend
  tables, and it provides the capability to import functions into the
  default Genezzo namespace.  For example, if the "soundex" function is
  imported, then the query can just call soundex(), versus using the 
  fully-qualified name Text::Soundex::soundex.

INCOMPATIBLE CHANGES

  Version 0.29 dictionary and block format is incompatible with
  previous releases.  No upgrade path is supported.

SEE ALSO

  The project homepage at http://www.genezzo.com


AUTHOR

  Original author: Jeffrey I Cohen <jcohen@genezzo.com>

  Copyright (c) 2003, 2004 Jeffrey I Cohen.  All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

Address bug reports and comments to: jcohen@genezzo.com

For more information, please visit the Genezzo homepage 
at http://www.genezzo.com