API Reference

pymapd.connect(uri=None, user=None, password=None, host=None, port=9091, dbname=None, protocol='binary')

Create a new Connection.

Parameters:
uri : str
user : str
password : str
host : str
port : int
dbname : str
protocol : {‘binary’, ‘http’, ‘https’}
Returns:
conn : Connection

Examples

You can either pass a string uri or all of the individual components:

>>> connect('mapd://mapd:HyperInteractive@localhost:9091/mapd?'
...         'protocol=binary')
Connection(mapd://mapd:***@localhost:9091/mapd?protocol=binary)
>>> connect(user='mapd', password='HyperInteractive', host='localhost',
...         port=9091, dbname='mapd')
class pymapd.Connection(uri=None, user=None, password=None, host=None, port=9091, dbname=None, protocol='binary')

Connect to your OmniSci database.

close()

Disconnect from the database

commit()

This is a noop, as OmniSci does not provide transactions.

Implemented to comply with the DB API 2.0 specification.

create_table(table_name, data, preserve_index=False)

Create a table from a pandas.DataFrame

Parameters:
table_name : str
data : DataFrame
preserve_index : bool, default False

Whether to create a column in the table for the DataFrame index
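
Examples

A minimal sketch, assuming an open connection con and a hypothetical table name 'foo' that does not exist yet; create_table creates the table from the DataFrame's schema, so follow it with load_table to insert the rows:

>>> import pandas as pd
>>> df = pd.DataFrame({"a": [1, 2, 3], "b": ['x', 'y', 'z']})
>>> con.create_table('foo', df, preserve_index=False)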

cursor()

Create a new Cursor object attached to this connection.
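
Examples

A minimal sketch, assuming an open connection con; arraysize defaults to 1, as documented on the Cursor class below:

>>> c = con.cursor()
>>> c.arraysize
1
>>> c.close()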

deallocate_ipc(df, device_id=0)

Deallocate a DataFrame using CPU shared memory.

Parameters:
device_id : int

GPU which contains TDataFrame
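
Examples

A minimal sketch, assuming an open connection con and that df was returned by select_ipc, so its columns are backed by CPU shared memory that this call releases:

>>> df = con.select_ipc("select symbol, qty from stocks")
>>> con.deallocate_ipc(df)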

deallocate_ipc_gpu(df, device_id=0)

Deallocate a DataFrame using GPU memory.

Parameters:
device_id : int

GPU which contains TDataFrame
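
Examples

A minimal sketch of the GPU variant, assuming an open connection con, a CUDA-capable device 0, and that gdf was returned by select_ipc_gpu:

>>> gdf = con.select_ipc_gpu("select symbol, qty from stocks", device_id=0)
>>> con.deallocate_ipc_gpu(gdf, device_id=0)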

execute(operation, parameters=None)

Execute a SQL statement

Parameters:
operation : str

A SQL statement to execute

Returns:
c : Cursor
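
Examples

A minimal sketch, assuming an open connection con and the stocks table used in the other examples; the returned Cursor can be iterated directly:

>>> c = con.execute("select symbol, qty from stocks")
>>> list(c)
[('RHAT', 100.0), ('IBM', 1000.0), ('MSFT', 1000.0), ('IBM', 500.0)]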
get_table_details(table_name)

Get the column names and data types associated with a table.

Parameters:
table_name : str
Returns:
details : List[ColumnDetails]

Examples

>>> con.get_table_details('stocks')
[ColumnDetails(name='date_', type='STR', nullable=True, precision=0,
               scale=0, comp_param=32),
 ColumnDetails(name='trans', type='STR', nullable=True, precision=0,
               scale=0, comp_param=32),
 ...
]
get_tables()

List all the tables in the database

Examples

>>> con.get_tables()
['flights_2008_10k', 'stocks']
load_table(table_name, data, method='infer', preserve_index=False, create='infer')

Load data into a table

Parameters:
table_name : str
data : pyarrow.Table, pandas.DataFrame, or iterable of tuples
method : {‘infer’, ‘columnar’, ‘rows’}

Method to use for loading the data. Three options are available

  1. pyarrow and Apache Arrow loader
  2. columnar loader
  3. row-wise loader

The Arrow loader is typically the fastest, followed by the columnar loader, followed by the row-wise loader. If a DataFrame or pyarrow.Table is passed and pyarrow is installed, the Arrow-based loader is used. If pyarrow isn't available, the columnar loader is used. Finally, if data is an iterable of tuples, the row-wise loader is used.

preserve_index : bool, default False

Whether to keep the index when loading a pandas DataFrame

create : {“infer”, True, False}

Whether to issue a CREATE TABLE before inserting the data.

  • infer : check to see if the table already exists, and create a table if it does not
  • True : attempt to create the table, without checking if it exists
  • False : do not attempt to create the table
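
Examples

A minimal sketch, assuming an open connection con and a hypothetical table name 'foo'; with the defaults shown, both the loader and the CREATE TABLE decision are inferred from the input:

>>> import pandas as pd
>>> df = pd.DataFrame({"a": [1, 2, 3], "b": ['d', 'e', 'f']})
>>> con.load_table('foo', df, method='infer', create='infer')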
load_table_arrow(table_name, data, preserve_index=False)

Load a pandas.DataFrame or a pyarrow Table or RecordBatch to the database using Arrow columnar format for interchange

Parameters:
table_name : str
data : pandas.DataFrame, pyarrow.RecordBatch, pyarrow.Table
preserve_index : bool, default False

Whether to include the index of a pandas DataFrame when writing.

Examples

>>> df = pd.DataFrame({"a": [1, 2, 3], "b": ['d', 'e', 'f']})
>>> con.load_table_arrow('foo', df, preserve_index=False)
load_table_columnar(table_name, data, preserve_index=False, chunk_size_bytes=0)

Load a pandas DataFrame to the database using OmniSci’s Thrift-based columnar format

Parameters:
table_name : str
data : DataFrame
preserve_index : bool, default False

Whether to include the index of a pandas DataFrame when writing.

chunk_size_bytes : integer, default 0

Chunk the loading of columns to prevent large Thrift requests. A value of 0 means do not chunk; the DataFrame is sent as a single request.

Examples

>>> df = pd.DataFrame({"a": [1, 2, 3], "b": ['d', 'e', 'f']})
>>> con.load_table_columnar('foo', df, preserve_index=False)
load_table_rowwise(table_name, data)

Load data into a table row-wise

Parameters:
table_name : str
data : Iterable of tuples

Each element of data should be a row to be inserted

Examples

>>> data = [(1, 'a'), (2, 'b'), (3, 'c')]
>>> con.load_table_rowwise('bar', data)
render_vega(vega, compression_level=1)

Render vega data on the database backend, returning the image as a PNG.

Parameters:
vega : dict

The vega specification to render.

compression_level : int

The level of compression for the rendered PNG. Ranges from 0 (low compression, faster) to 9 (high compression, slower).
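
Examples

A sketch only, assuming an open connection con, a backend with rendering enabled, and an abbreviated, hypothetical Vega specification over a table named tweets; per the description above, the return value is the rendered PNG:

>>> vega = {
...     "width": 384,
...     "height": 564,
...     "data": [{"name": "points",
...               "sql": "select goog_x as x, goog_y as y from tweets limit 1000"}],
...     "marks": [{"type": "points", "from": {"data": "points"},
...                "properties": {"x": {"field": "x"}, "y": {"field": "y"}}}],
... }
>>> png = con.render_vega(vega, compression_level=1)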

select_ipc(operation, parameters=None, first_n=-1)

Execute a SELECT operation using CPU shared memory

Parameters:
operation : str

A SQL select statement

parameters : dict, optional

Parameters to insert for a parametrized query

Returns:
df : pandas.DataFrame

Notes

This method requires pandas and pyarrow to be installed
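
Examples

A minimal sketch, assuming an open connection con and the stocks table used in the other examples; the result is an ordinary pandas.DataFrame, and the underlying shared memory can later be released with deallocate_ipc:

>>> df = con.select_ipc("select symbol, qty from stocks where qty <= :max_qty",
...                     parameters={"max_qty": 500})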

select_ipc_gpu(operation, parameters=None, device_id=0, first_n=-1)

Execute a SELECT operation using GPU memory.

Parameters:
operation : str

A SQL statement

parameters : dict, optional

Parameters to insert into a parametrized query

device_id : int

GPU to return results to

Returns:
gdf : pygdf.GpuDataFrame

Notes

This method requires the optional pygdf and libgdf libraries. An ImportError is raised if they aren't available.
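
Examples

A minimal sketch, assuming an open connection con, the optional pygdf and libgdf libraries, and a CUDA-capable device 0; the result stays in GPU memory until it is released with deallocate_ipc_gpu:

>>> gdf = con.select_ipc_gpu("select symbol, qty from stocks where qty >= :min_qty",
...                          parameters={"min_qty": 500}, device_id=0)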

class pymapd.Cursor(connection, columnar=True)

A database cursor.

arraysize

The number of rows to fetch at a time with fetchmany. Default 1.

See also

fetchmany

close()

Close this cursor.

description

Read-only sequence describing columns of the result set. Each column is an instance of Description describing

  • name
  • type_code
  • display_size
  • internal_size
  • precision
  • scale
  • null_ok

We only use name, type_code, and null_ok; the rest are always None.
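
Examples

A minimal sketch, assuming a cursor c and the stocks table used in the other examples:

>>> c.execute("select symbol, qty from stocks")
>>> [d.name for d in c.description]
['symbol', 'qty']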

execute(operation, parameters=None)

Execute a SQL statement.

Parameters:
operation : str

A SQL query

parameters : dict

Parameters to substitute into operation.

Returns:
self : Cursor

Examples

>>> c = conn.cursor()
>>> c.execute("select symbol, qty from stocks")
>>> list(c)
[('RHAT', 100.0), ('IBM', 1000.0), ('MSFT', 1000.0), ('IBM', 500.0)]

Passing in parameters:

>>> c.execute("select symbol qty from stocks where qty <= :max_qty",
...           parameters={"max_qty": 500})
[('RHAT', 100.0), ('IBM', 500.0)]
executemany(operation, parameters)

Execute a SQL statement for many sets of parameters.

Parameters:
operation : str
parameters : list of dict
Returns:
results : list of lists
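
Examples

A minimal sketch, assuming a cursor c and a hypothetical table bar with an integer column and a text column; each dict in parameters is bound and executed in turn:

>>> c.executemany("insert into bar values (:a, :b)",
...               parameters=[{"a": 1, "b": "x"}, {"a": 2, "b": "y"}])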
fetchmany(size=None)

Fetch size rows from the result set.

fetchone()

Fetch a single row from the result set.
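
Examples

A minimal sketch of the fetch methods, assuming a cursor c and the stocks table used in the other examples; fetchmany falls back to arraysize when size is not given:

>>> c.execute("select symbol, qty from stocks")
>>> c.fetchone()
('RHAT', 100.0)
>>> c.fetchmany(size=2)
[('IBM', 1000.0), ('MSFT', 1000.0)]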

Exceptions

Define exceptions as specified by the DB API 2.0 spec.

Includes some helper methods for translating Thrift exceptions to the ones defined here.

exception pymapd.exceptions.Error

Base class for all pymapd errors.

exception pymapd.exceptions.InterfaceError

Raised whenever you use the pymapd interface incorrectly.

exception pymapd.exceptions.DatabaseError

Raised when the database encounters an error.

exception pymapd.exceptions.OperationalError

Raised for non-programmer related database errors, e.g. an unexpected disconnect.

exception pymapd.exceptions.IntegrityError

Raised when the relational integrity of the database is affected.

exception pymapd.exceptions.InternalError

Raised for errors internal to the database, e.g. an invalid cursor.

exception pymapd.exceptions.ProgrammingError

Raised for programming errors, e.g. syntax errors, table already exists.

exception pymapd.exceptions.NotSupportedError

Raised when an API not supported by the database is used.