RODBC-package {RODBC} | R Documentation |
Package RODBC implements ODBC database connectivity.
ODBC aims to provide a common API for access to SQL-based database management systems such as MySQL, PostgreSQL, Microsoft Access and SQL Server, Oracle and SQLite. It originated on Windows, but ODBC driver managers unixODBC (http://www.unixODBC.org) and iODBC (http://www.iODBC.org) are nowadays available on a wide range of platforms (and iODBC ships with Mac OS X by default). The connection to the particular DBMS needs an ODBC driver: these may come with the DBMS or the ODBC driver manager or be provided separately by the DBMS developers, and there are third-party developers such as Access Technologies, Easysoft and OpenLink.
Microsoft provides drivers on Windows for non-SQL database systems such as DBase and FoxPro, and even for flat files and Excel spreadsheets.
Two groups of functions are provided. The mainly internal
odbc*
commands implement low-level access to the ODBC functions
of similar name. The sql*
functions operate at a higher level
to read, save, copy and manipulate data between data frames and sql
tables. Many connections can be open at once to any combination of
DSN/hosts.
The functions try to cope with the peculiar way the Excel ODBC
driver handles table names. However, SQL expects both table and
column names to be alphanumeric plus _
, and RODBC does not
support vendor extensions. Most of the functions will drop
other characters from table and column names.
options(dec=)
can be used to set the decimal point to be
used when reading numbers from character data on the database: the
default is taken from the current locale via
Sys.localeconv
.
‘Schemas’ are collections of tables within a database that are
supported by some DBMSs: often a separate schema is associated with
each user (and ‘schema’ in ODBC3 replaced ‘owner’ in
ODBC2). In SQL 92, schemas are collected in a ‘catalog’ which
is often inplemented as a database. Where schemas are implemented,
there is a current schema used to find unqualified table names,
and tables in other schemas can be referred to as
schema.table
.
Note that ‘schema’ is used in another sense in the database literature, for the design of a database and in particular of tables, views and privileges.
We give details of some ODBC drivers' interpretations of
catalog
and schema
current at the time of writing (mid
2009). MySQL does not use schemas (except as an alias for
‘database’), and its ODBC driver uses the catalog
to
refer to a database. SQL Server 2008 uses both catalog
and
schema
, catalog
for the database and schema
for
the type of object, e.g. "sys"
for most of the system
tables/views and "dbo"
for user tables. PostgreSQL introduced
schemas in version 7.3 in a slightly different way: they need to be
created with a CREATE SCHEMA
query: tables are by default in
the public
schema, and unqualified table names are searched for
amongst a ‘search path’ of schemas (by default, containing
public
). Oracle uses schemas as synonymous with ‘owner’
(also known as ‘user’). SQLite uses neither. The Excel and
Access ODBC drivers do not use schemas, but do use catalog
to
refer to other database/spreadsheet files.
RODBC usually works with tables in the current schema, and so
tables in other schemas can only be used in a few functions
(sqlClear
, sqlDrop
and sqlFetch
) and in SQL
queries passed to sqlQuery
. What the ‘dotted name’
notation means depends on the DBMS: the SQL 92 meaning is
schema.table
and this is accepted by Oracle, SQL
Server and PostgreSQL. However, MySQL uses
database.table
. Microsoft SQL Server allows
(depending on the version) up to four components:
linked_server.catalog.schema.table
.
PostgreSQL does allow database.schema.table
,
but this is not useful as database
must be the currently
connected database.
Functions sqlTables
, sqlColumns
and
sqlPrimaryKeys
have arguments catalog
and schema
which in principle allow tables in other schema to be listed or
examined: however these are ignored in many current ODBC drivers.
For other uses, the trick is to select the schema(s) you want to use,
which is done via an SQL statement sent by sqlQuery
.
For Oracle you can set the schema (owner) in use by
"ALTER SESSION SET CURRENT_SCHEMA = schema"
whereas for PostgreSQL the incantation is
"SET search_path TO schema1,schema2"
.
In MySQL a database/schema can be selected by a
"USE database"
query. In DB2, creating an alias in
the current schema can be used to access tables in other schemas.
Michael Lapsley and Brian Ripley
odbcConnect
, sqlFetch
,
sqlSave
, sqlTables
, odbcGetInfo
## Not run: channel <- odbcConnect("test") ## some drivers need column names lower-cased or upper-cased here. sqlSave(channel, USArrests, rownames = "State", verbose = TRUE) ## the Oracle driver needs column and table names double-quoted. sqlQuery(channel, paste("select State, Murder from USArrests", "where Rape > 30 order by Murder")) sqlFetch(channel, "USArrests", rownames = "State") sqlDrop(channel, "USArrests") close(channel) ## End(Not run)