RODBC-package {RODBC}R Documentation

ODBC Database Connectivity

Description

Package RODBC implements ODBC database connectivity.

ODBC aims to provide a common API for access to SQL-based database management systems such as MySQL, PostgreSQL, Microsoft Access and SQL Server, Oracle and SQLite. It originated on Windows, but ODBC driver managers unixODBC (http://www.unixODBC.org) and iODBC (http://www.iODBC.org) are nowadays available on a wide range of platforms (and iODBC ships with Mac OS X by default). The connection to the particular DBMS needs an ODBC driver: these may come with the DBMS or the ODBC driver manager or be provided separately by the DBMS developers, and there are third-party developers such as Access Technologies, Easysoft and OpenLink.

Microsoft provides drivers on Windows for non-SQL database systems such as DBase and FoxPro, and even for flat files and Excel spreadsheets.

Details

Two groups of functions are provided. The mainly internal odbc* commands implement low-level access to the ODBC functions of similar name. The sql* functions operate at a higher level to read, save, copy and manipulate data between data frames and sql tables. Many connections can be open at once to any combination of DSN/hosts.

The functions try to cope with the peculiar way the Excel ODBC driver handles table names. However, SQL expects both table and column names to be alphanumeric plus _, and RODBC does not support vendor extensions. Most of the functions will drop other characters from table and column names.

options(dec=) can be used to set the decimal point to be used when reading numbers from character data on the database: the default is taken from the current locale via Sys.localeconv.

Schemas and catalogs

‘Schemas’ are collections of tables within a database that are supported by some DBMSs: often a separate schema is associated with each user (and ‘schema’ in ODBC3 replaced ‘owner’ in ODBC2). In SQL 92, schemas are collected in a ‘catalog’ which is often inplemented as a database. Where schemas are implemented, there is a current schema used to find unqualified table names, and tables in other schemas can be referred to as schema.table.

Note that ‘schema’ is used in another sense in the database literature, for the design of a database and in particular of tables, views and privileges.

We give details of some ODBC drivers' interpretations of catalog and schema current at the time of writing (mid 2009). MySQL does not use schemas (except as an alias for ‘database’), and its ODBC driver uses the catalog to refer to a database. SQL Server 2008 uses both catalog and schema, catalog for the database and schema for the type of object, e.g. "sys" for most of the system tables/views and "dbo" for user tables. PostgreSQL introduced schemas in version 7.3 in a slightly different way: they need to be created with a CREATE SCHEMA query: tables are by default in the public schema, and unqualified table names are searched for amongst a ‘search path’ of schemas (by default, containing public). Oracle uses schemas as synonymous with ‘owner’ (also known as ‘user’). SQLite uses neither. The Excel and Access ODBC drivers do not use schemas, but do use catalog to refer to other database/spreadsheet files.

RODBC usually works with tables in the current schema, and so tables in other schemas can only be used in a few functions (sqlClear, sqlDrop and sqlFetch) and in SQL queries passed to sqlQuery. What the ‘dotted name’ notation means depends on the DBMS: the SQL 92 meaning is schema.table and this is accepted by Oracle, SQL Server and PostgreSQL. However, MySQL uses database.table. Microsoft SQL Server allows (depending on the version) up to four components:
linked_server.catalog.schema.table. PostgreSQL does allow database.schema.table, but this is not useful as database must be the currently connected database.

Functions sqlTables, sqlColumns and sqlPrimaryKeys have arguments catalog and schema which in principle allow tables in other schema to be listed or examined: however these are ignored in many current ODBC drivers.

For other uses, the trick is to select the schema(s) you want to use, which is done via an SQL statement sent by sqlQuery. For Oracle you can set the schema (owner) in use by
"ALTER SESSION SET CURRENT_SCHEMA = schema" whereas for PostgreSQL the incantation is
"SET search_path TO schema1,schema2". In MySQL a database/schema can be selected by a "USE database" query. In DB2, creating an alias in the current schema can be used to access tables in other schemas.

Author(s)

Michael Lapsley and Brian Ripley

See Also

odbcConnect, sqlFetch, sqlSave, sqlTables, odbcGetInfo

Examples

## Not run: 
channel <- odbcConnect("test")
## some drivers need column names lower-cased or upper-cased here.
sqlSave(channel, USArrests, rownames = "State", verbose = TRUE)
## the Oracle driver needs column and table names double-quoted.
sqlQuery(channel, paste("select State, Murder from USArrests",
                        "where Rape > 30 order by Murder"))
sqlFetch(channel, "USArrests", rownames = "State")
sqlDrop(channel, "USArrests")
close(channel)
## End(Not run)

[Package RODBC version 1.2-6 Index]