Quick Guide to Perl DBI module

DBI is the Database independent interface for Perl. It is a Perl module available on CPAN to deal with database interactions with various database engines.

This page may be fleshed out a bit more sometime, but the quick synopsis is:

use DBI;
# get a connection:
$dbh = DBI->connect('DBI:mysql:database_name:server', 'dbuser', 'dbpass');
 
$dbh = DBI->connect($dsn, $user, $password,
                      { RaiseError => 1, AutoCommit => 1 });
 
 
# prepare a statement and execute it:
$sth = $dbh->prepare('SELECT * FROM users WHERE active = 1');
$sth->execute();
 
# loop through data:
while (my $row = $sth->fetchrow_hashref()) {
    printf "User: %s\n", $row->{username};
}
 
# prepare a statement with placeholders, and use it multiple
# times.  This is well worth doing for statements which will
# be reused, as it saves the overhead of preparing the 
# statement every time:
$sth = $dbh->prepare('SELECT * FROM users WHERE id = ?');
 
my @ids = (5,6,10);
 
for my $id (@ids) {
    my $row = $sth->execute($id);
}
 
# do a one-off query with no results to fetch:
$dbh->do('DELETE FROM users WHERE active = 0');
 
# using transactions: (for DBs which support them only, obviously)
# set autocommit off in the connect, call, then:
 
$dbh->begin_work;  # starts the transaction
# ... do some queries ....
$dbh->commit;      # commits the changes, or
$dbh->rollback;    # rolls back

One day, I'll flesh this out a lot more.

Fetch Methods

Most (OK, practically all) of the following is ripped from the fine DBI documentation on CPAN.

fetchrow_arrayref

$ary_ref = $sth->fetchrow_arrayref;
$ary_ref = $sth->fetch;    # alias

Fetches the next row of data and returns a reference to an array holding the field values. Null fields are returned as undef values in the array. This is the fastest way to fetch data, particularly if used with $sth→bind_columns.

If there are no more rows or if an error occurs, then fetchrow_arrayref returns an undef. You should check $sth→err afterwards (or use the RaiseError attribute) to discover if the undef returned was due to an error.

Note that the same array reference is returned for each fetch, so don't store the reference and then use it after a later fetch. Also, the elements of the array are also reused for each row, so take care if you want to take a reference to an element. See also “bind_columns”.

fetchrow_array

@ary = $sth->fetchrow_array;

An alternative to fetchrow_arrayref. Fetches the next row of data and returns it as a list containing the field values. Null fields are returned as undef values in the list.

If there are no more rows or if an error occurs, then fetchrow_array returns an empty list. You should check $sth→err afterwards (or use the RaiseError attribute) to discover if the empty list returned was due to an error.

If called in a scalar context for a statement handle that has more than one column, it is undefined whether the driver will return the value of the first column or the last. So don't do that. Also, in a scalar context, an undef is returned if there are no more rows or if an error occurred. That undef can't be distinguished from an undef returned because the first field value was NULL. For these reasons you should exercise some caution if you use fetchrow_array in a scalar context.

fetchrow_hashref

$hash_ref = $sth->fetchrow_hashref;
$hash_ref = $sth->fetchrow_hashref($name);

An alternative to fetchrow_arrayref. Fetches the next row of data and returns it as a reference to a hash containing field name and field value pairs. Null fields are returned as undef values in the hash.

If there are no more rows or if an error occurs, then fetchrow_hashref returns an undef. You should check $sth→err afterwards (or use the RaiseError attribute) to discover if the undef returned was due to an error.

The optional $name parameter specifies the name of the statement handle attribute. For historical reasons it defaults to “NAME”, however using either “NAME_lc” or “NAME_uc” is recomended for portability.

The keys of the hash are the same names returned by $sth→{$name}. If more than one field has the same name, there will only be one entry in the returned hash for those fields.

Because of the extra work fetchrow_hashref and Perl have to perform, it is not as efficient as fetchrow_arrayref or fetchrow_array.

(Dave: however, I find fetchrow_hashref leads to easily understandable, maintainable code, so it's my preference. It's easy to follow something like:

# (prepare + execute a query)
while (my $row = $sth->fetchrow_hashref) {
    print $row->{fieldname};
}

By default a reference to a new hash is returned for each row. It is likely that a future version of the DBI will support an attribute which will enable the same hash to be reused for each row. This will give a significant performance boost, but it won't be enabled by default because of the risk of breaking old code.

fetchall_arrayref

$tbl_ary_ref = $sth->fetchall_arrayref;
$tbl_ary_ref = $sth->fetchall_arrayref( $slice );
$tbl_ary_ref = $sth->fetchall_arrayref( $slice, $max_rows  );

The fetchall_arrayref method can be used to fetch all the data to be returned from a prepared and executed statement handle. It returns a reference to an array that contains one reference per row.

If there are no rows to return, fetchall_arrayref returns a reference to an empty array. If an error occurs, fetchall_arrayref returns the data fetched thus far, which may be none. You should check $sth→err afterwards (or use the RaiseError attribute) to discover if the data is complete or was truncated due to an error.

If $slice is an array reference, fetchall_arrayref uses “fetchrow_arrayref” to fetch each row as an array ref. If the $slice array is not empty then it is used as a slice to select individual columns by perl array index number (starting at 0, unlike column and parameter numbers which start at 1).

With no parameters, or if $slice is undefined, fetchall_arrayref acts as if passed an empty array ref.

If $slice is a hash reference, fetchall_arrayref uses “fetchrow_hashref” to fetch each row as a hash reference. If the $slice hash is empty then fetchrow_hashref() is simply called in a tight loop and the keys in the hashes have whatever name lettercase is returned by default from fetchrow_hashref. (See “FetchHashKeyName” attribute.) If the $slice hash is not empty, then it is used as a slice to select individual columns by name. The values of the hash should be set to 1. The key names of the returned hashes match the letter case of the names in the parameter hash, regardless of the “FetchHashKeyName” attribute.

For example, to fetch just the first column of every row:

$tbl_ary_ref = $sth->fetchall_arrayref([0]);

To fetch the second to last and last column of every row:

$tbl_ary_ref = $sth->fetchall_arrayref([-2,-1]);

To fetch all fields of every row as a hash ref:

$tbl_ary_ref = $sth->fetchall_arrayref({});

To fetch only the fields called “foo” and “bar” of every row as a hash ref (with keys named “foo” and “BAR”):

$tbl_ary_ref = $sth->fetchall_arrayref({ foo=>1, BAR=>1 });

The first two examples return a reference to an array of array refs. The third and forth return a reference to an array of hash refs.

fetchall_hashref

$hash_ref = $sth->fetchall_hashref($key_field);

The fetchall_hashref method can be used to fetch all the data to be returned from a prepared and executed statement handle. It returns a reference to a hash containing a key for each distinct value of the $key_field column that was fetched. For each key the corresponding value is a reference to a hash containing all the selected columns and their values, as returned by fetchrow_hashref().

If there are no rows to return, fetchall_hashref returns a reference to an empty hash. If an error occurs, fetchall_hashref returns the data fetched thus far, which may be none. You should check $sth→err afterwards (or use the RaiseError attribute) to discover if the data is complete or was truncated due to an error.

The $key_field parameter provides the name of the field that holds the value to be used for the key for the returned hash. For example:

$dbh->{FetchHashKeyName} = 'NAME_lc';
$sth = $dbh->prepare("SELECT FOO, BAR, ID, NAME, BAZ FROM TABLE");
$sth->execute;
$hash_ref = $sth->fetchall_hashref('id');
print "Name for id 42 is $hash_ref->{42}->{name}\n";

The $key_field parameter can also be specified as an integer column number (counting from 1). If $key_field doesn't match any column in the statement, as a name first then as a number, then an error is returned.

For queries returing more than one 'key' column, you can specify multiple column names by passing $key_field as a reference to an array containing one or more key column names (or index numbers). For example:

$sth = $dbh->prepare("SELECT foo, bar, baz FROM table");
$sth->execute;
$hash_ref = $sth->fetchall_hashref( [ qw(foo bar) ] );
print "For foo 42 and bar 38, baz is $hash_ref->{42}->{38}->{baz}\n";

The fetchall_hashref() method is normally used only where the key fields values for each row are unique. If multiple rows are returned with the same values for the key fields then later rows overwrite earlier ones.

rows

$rows = $sth->rows;

Returns the number of rows affected by the last row affecting command, or -1 if the number of rows is not known or not available.

Generally, you can only rely on a row count after a non-SELECT execute (for some specific operations like UPDATE and DELETE), or after fetching all the rows of a SELECT statement.

For SELECT statements, it is generally not possible to know how many rows will be returned except by fetching them all. Some drivers will return the number of rows the application has fetched so far, but others may return -1 until all rows have been fetched. So use of the rows method or $DBI::rows with SELECT statements is not recommended.

One alternative method to get a row count for a SELECT is to execute a “SELECT COUNT(*) FROM ...” SQL statement with the same ”...” as your query and then fetch the row count from that. (Dave: although, of course, unless you're in a transaction, the data could change between that query and your actual fetching query, so you may get a different number of rows).

Using Transactions

To use transaction support in DBI, first disable auto-commit by setting AutoCommit to 0:

$dbh->{AutoCommit} = 0;

If you don't have DBI's RaiseError flag enabled, you'll have to then check whether AutoCommit was successfully changed with something like:

if ($database->dbh->{AutoCommit}) {
    warn 'Could not disable AutoCommit, transactions are unavailable';
}

Once you've done that, you can do a bunch of queries as normal, and then commit or rollback with:

$dbh->commit();
# or
$dbh->rollback();

WARNING - the DBD::mysql documentation says:

If you detect an error while changing the AutoCommit mode, you should
no longer use the database handle. In other words, you should disconnect
and reconnect again, because the transaction mode is unpredictable.
Alternatively you may verify the transaction mode by checking the value
of the server variable autocommit. However, such behaviour isn't portable.

If you're using MySQL, transaction support was introduced in MySQL 4. The table type (engine) in use must be InnoDB or another engine which supports transactions - the default MySQL engine MyISAM does not. Also, MySQL must be in the right mood to do a transaction (rather than silently ignoring you) and the wind must be blowing from a precise south-south-east direction at around 17mph.

In other words, MySQL might do a transaction for you, alternatively it might not.

Getting last inserted row ID

DBI offers a last_insert_id() method to get the ID of the last row you inserted (assuming that there is one column which is set to auto_increment).

$rv = $dbh->last_insert_id($catalog, $schema, $table, $field);
# or giving optional attributes:
$rv = $dbh->last_insert_id($catalog, $schema, $table, $field, \%attr);

DBI docs say:

Returns a value 'identifying' the row just inserted, if possible. Typically this would be a value assigned by the database server to a column with an auto_increment or serial type. Returns undef if the driver does not support the method or can't determine the value.

The $catalog, $schema, $table, and $field parameters may be required for some drivers (see below). If you don't know the parameter values and your driver does not need them, then use undef for each.

There are several caveats to be aware of with this method if you want to use it for portable applications:

  • For some drivers the value may only available immediately after the insert statement has executed (e.g., mysql, Informix).
  • For some drivers the $catalog, $schema, $table, and $field parameters are required, for others they are ignored (e.g., mysql).
  • Drivers may return an indeterminate value if no insert has been performed yet.
  • For some drivers the value may only be available if placeholders have not been used (e.g., Sybase, MS SQL). In this case the value returned would be from the last non-placeholder insert statement.
  • Some drivers may need driver-specific hints about how to get the value. For example, being told the name of the database 'sequence' object that holds the value. Any such hints are passed as driver-specific attributes in the \%attr parameter.
  • If the underlying database offers nothing better, then some drivers may attempt to implement this method by executing “select max($field) from $table”. Drivers using any approach like this should issue a warning if AutoCommit is true because it is generally unsafe - another process may have modified the table between your insert and the select. For situations where you know it is safe, such as when you have locked the table, you can silence the warning by passing Warn ⇒ 0 in \%attr.

Further documentation

For the full details, see the comprehensive documentation on CPAN:

  • DBI - http:search.cpan.org/~timb/DBI/DBI.pm * DBD::mysql - http:search.cpan.org/~capttofu/DBD-mysql/lib/DBD/mysql.pm
  • DBD::Pg (for PostgreSQL) - http:search.cpan.org/~dbdpg/DBD-Pg/Pg.pm * DBD::Oracle - http:search.cpan.org/~pythian/DBD-Oracle/Oracle.pm
  • DBD::DBM - http:search.cpan.org/~timb/DBI-1.53/lib/DBD/DBM.pm * DBD::SQLite - http:search.cpan.org/~msergeant/DBD-SQLite/lib/DBD/SQLite.pm
  • DBD::Sybase - http:search.cpan.org/~mewp/DBD-Sybase/Sybase.pm * DBD::CSV - http:search.cpan.org/~jzucker/DBD-CSV/lib/DBD/CSV.pm

Other amusing related modules:

  • DBD::Excel - http:search.cpan.org/~kwitknr/DBD-Excel/Excel.pm * DBD::Google - http:search.cpan.org/~darren/DBD-Google/lib/DBD/Google.pm

~~DISCUSSION~~

 
perl/dbi.txt · Last modified: 2010/02/26 10:45 (external edit)
 
Recent changes RSS feed Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki