Thursday, June 21, 2007

Postgres Schemas: Seeing The Light

So last night I finally got around to pondering Postgres schemas (hey, you have your idea of fun reading, I have mine). I've seen them mentioned in the docs before, and they looked important, yet I've always managed to avoid them.

Here's what the docs basically say about them:

A database contains one or more named schemas, which in turn contain tables. Schemas also contain other kinds of named objects, including data types, functions, and operators. The same object name can be used in different schemas without conflict; for example, both schema1 and myschema may contain tables named mytable. Unlike databases, schemas are not rigidly separated: a user may access objects in any of the schemas in the database he is connected to, if he has privileges to do so.

Basically, schemas are a namespace mechanism. It allows you to have two tables named foo without them stomping on each other.

Then it hit me, in other languages a scoping mechanism is absolutely critical. You couldn't reasonably develop Java without a packaging structure? After all, what if I wanted to name an object Url and other code in a large system wanted to use that name too?

So, if namespace mechanisms are so key for other languages, why can I totally ignore the one that comes with Postgres?

I think that boils down to the basic way I've always approach database development (as have others around me): each project deserves its own clean set of tables. You already know that every web app you build is going to need a users table. And probably a groups table. And probably a login_history table. Yet, my natural approach would be to re-design these tables into the schema of each system.

This completely violates the DRY principle, the nearly one sacred rule that most developers can agree on.

Using schemas, the right thing to do in the above case is to have a pre-built set of tables and stored procedures all in auth schema. So, starting a new project should be as easy as:

psql postgres
CREATE DATABASE my_new_project
\c my_new_project
\i shared_schemas/auth.sql
-- lots of noisy messages go by

When the above is complete, you would have access to an auth.users table, and auth.groups table, and functions like auth.authenticate(username).

The bottom line: same re-use game can and should be played with all database objects, including tables, views and functions and schemas are the path to getting there.

This, I got to try.

1 comment:

  1. Anonymous12:00 AM

    I pishposhed your statements on the importance of namespaces this morning.
    I just got done with a 30 minute debugging and renaming session thanks to the fact that two JavaScript includes both defined the function "isNum". Suffice it to say now there is a "ben_isNum".

    I now am convinced beyond a shadow of a doubt that you write your blog for nobody else but me. Everyone else can stop reading now.