DB-API X¶
dbapix is a unification of, and extension to, several DB-API 2.0 drivers.
We currently wrap:
"postgres"
via psycopg2"sqlite"
via sqlite3"mysql"
via PyMySQL or MySQL-Python
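Each of those names is the first argument to create_engine. For example, a minimal sketch for the "mysql" driver (the connection kwargs here are placeholder values, following the same pattern as the examples below):

from dbapix import create_engine

# Illustrative values only; use your own server's details.
engine = create_engine('mysql',
    host='localhost',
    database='example',
)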
The first goal is to provide a pool of normalized DB-API 2.0 connections and cursors, and for them to have the same Python API and semantics regardless of the driver.
The second goal is to provide some common niceties:
- f-string-like parameter binding;
- high-level functions, e.g. insert, update, etc. (sketched below);
- dict-like rows;
- interaction with 3rd-party libraries, e.g. Pandas;
- automatic SSH tunnels;
- a registry of connections.
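As a taste of the high-level helpers, here is a minimal sketch, assuming an engine created as in the Basics example below. The exact signatures (table name first, column values as a dict, a where mapping for update) are illustrative assumptions, not the confirmed API:

with engine.connect() as con:
    # Hypothetical signatures; the real API may differ.
    con.insert('foo', dict(bar='hello'))
    con.update('foo', dict(bar='world'), where=dict(id=123))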
Examples¶
Basics¶
Create a connection pool, and “borrow” a connection from it (which will be automatically returned when exiting the context):
from dbapix import create_engine
# Create an engine with the name of the driver. Here we're connecting to
# a PostgreSQL database via the `psycopg2` package.
engine = create_engine('postgres',
    host='localhost',
    database='example',
)
# Context managers provide reliable resource management; the connection
# will be returned to the pool when exiting this context.
with engine.connect() as con:

    foo_id = 123

    # If not provided explicitly, parameters are pulled
    # from the calling scope for an f-string-like experience,
    # but where Bobby Tables won't cause trouble.
    cur = con.cursor()
    cur.execute('''SELECT bar FROM foo WHERE id = {foo_id}''')

    for row in cur:
        # Values are accessible by index or name.
        print(row['bar'])
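The comments above suggest parameters can also be supplied explicitly instead of being pulled from the calling scope. A minimal sketch, assuming they are accepted as a mapping in the standard DB-API second argument to execute (this is an assumption, not confirmed above):

cur.execute('''SELECT bar FROM foo WHERE id = {foo_id}''', dict(foo_id=456))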
You can also manually manage connections, and directly turn results into Pandas dataframes:
from dbapix import create_engine
# Let's use SQLite this time, via the `sqlite3` driver.
engine = create_engine('sqlite', 'mydatabase.db')
# Take ownership of a connection. This is now our responsibility to
# either close, or return to the pool via `engine.put_connection(con)`.
con = engine.get_connection()
# Connections have an `execute` method which creates a cursor for us.
cur = con.execute('''
    SELECT foo, sum(bar) AS bar
    FROM myawesometable
    GROUP BY foo
''')
# Turn the result into a Pandas DataFrame!
df = cur.as_dataframe()
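Since we took ownership of this connection, hand it back when we're done, using the engine.put_connection counterpart mentioned in the comment above:

# Return the connection to the pool (or `con.close()` it outright).
engine.put_connection(con)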
SSH Tunnels¶
If you need an SSH tunnel to connect, you can pass a set of kwargs to be forwarded to an SSHTunnelForwarder for the Postgres and MySQL engines:
from dbapix import create_engine
engine = create_engine('postgres',
    database='only_on_remote',
    tunnel=dict(
        host='database.example.com',
    ),
)
# The tunnel will be created at the first connection.
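# For example, the first borrow opens the tunnel on demand. This is a
# sketch; `connect()` and `execute()` behave as in the Basics example.
with engine.connect() as con:
    con.execute('SELECT 1')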
engine.close() # Shut it down explicitly.
See SocketEngine for more.
Registries¶
If you have multiple databases that you're shifting between, e.g. staging and production, we have a Registry that you can use to contain the connection details.
I might make a common file with that registry, e.g. mydbs.py:
import os

from dbapix.registry import Registry

registry = Registry()

registry.register('staging', 'postgres',
    host='stage.example.com',
    database='myapp-stage',
    user='devuser',
    password='hunter2',
)

registry.register('production', 'postgres',
    # Need an SSH tunnel to production!
    tunnel=dict(
        host='gateway.prod.example.com',  # The SSH host.
        remote_bind_host='database.prod.example.com',  # The database host.
        ssh_pkey=os.path.expanduser('~/.ssh/id_rsa'),  # Key isn't pulled in by default.
    ),
    database='myapp',
    user='appuser',
    password='HuNt3r2',
)
# "Export" this registry method as the API of my module.
create_engine = registry.create_engine
Then in my scripts, e.g. awesome_data_science.py:
import mydbs

engine = mydbs.create_engine('staging')

with engine.connect() as con:
    cur = con.execute('select 1 as foo')
    df = cur.as_dataframe()
    # Do some science!