
Relational data model to Google datastore mapping

First off, I come from an RDBMS/SQL/C++/Java/Python background and I'm a newbie

to Gaelyk, the Google API and the Google datastore.

I like to model (using flowcharts for code and DB modeling tools for the database)

before I code.

I've used Erwin heavily in the past to do DB modeling.

In Erwin, I've designed a logical / physical data model of a database I'd like to

implement using the Google datastore and Gaelyk with the Google AppEngine SDK.

I wanted to design the data layout before coding anything.

My design tool of choice has been Erwin Data Modeler.

When I looked at the Google datastore, I saw that there

are no relational constraints, and joins are done via

WHERE clause :bind variables.

How can I map my existing model (with PKs/FKs, dependent entities, heavy relational links) to the Google datastore?

Is there a modeling tool that will allow me to design for the Google datastore?

Is the DB design supposed to flow from the Gaelyk MVC pattern and direct coding?

I'm not used to this as I come from an RDBMS background where you model heavily

and all good things come from good relational design.

Also, before coding a database client app in an imperative language (C++, C, Java, Python),

I like to write pseudocode, BUT first and foremost comes the DB design (if the app

has a DB back-end).

Am I doing this all wrong? It looks like there's a set of tools available to me

to start coding, but the design tool set is not there.

Addendum:

Here is the logical model I'm trying to map

(Diagram: Relational data model to Google datastore mapping)

How would I map a circular relationship

account --(1:m)-- following --(m:1)-- following_account_id --(1:1)-- account_id?


In general, the guiding principle of the App Engine datastore, and of all nonrelational databases, is "optimize for reads". In short, that means denormalize, denormalize, denormalize. Sometimes that makes updates harder: for example, if you make the username the primary key of your accounts table, a user who wants to change usernames becomes a problem. Sometimes it requires duplicating data, such as storing persistent counts. All of this is worthwhile, though, since it gives much better read performance and scalability, and in a typical webapp reads outnumber writes by a factor of hundreds to one.
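To make the "persistent counts" idea concrete, here is a minimal sketch in plain Python. It assumes nothing about the real datastore API: the dict and set are hypothetical in-memory stand-ins for two entity kinds, and the names `AccountWithCount`, `add_account`, `follow` and `follower_count` are made up for illustration. The point is only that the count is maintained at write time, so reading it never requires scanning the follow relationships.

```python
from dataclasses import dataclass


@dataclass
class AccountWithCount:
    username: str
    follower_count: int = 0  # denormalized: duplicated data kept current on write


# Hypothetical in-memory stand-ins for two entity kinds; not the datastore API.
accounts = {}          # username -> AccountWithCount
follow_edges = set()   # (follower, followee) pairs


def add_account(username):
    accounts[username] = AccountWithCount(username)


def follow(follower, followee):
    """Record a follow and update the stored count in the same write path."""
    edge = (follower, followee)
    if edge in follow_edges:
        return  # already following; the stored count must not drift
    follow_edges.add(edge)
    accounts[followee].follower_count += 1


def follower_count(username):
    """Read path: a single lookup, no scan over follow_edges."""
    return accounts[username].follower_count
```

The trade-off is the one described above: every write does a little more work (and in a real system the two updates would need to happen together, for example in a transaction) so that the far more frequent reads stay cheap.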

Looking at your model in particular, it's very normalized - more so than most RDBMS models I've seen, even. Some suggestions:

  • Roll up things like 'user_name_id' into your main accounts table.
  • For things like 'following', use a list property if the number of people someone follows is typically small (fewer than about 1000), or the fan-out pattern otherwise.
  • Pick a reasonable primary key for each table where practical, such as username or email, and use that as a key name. This allows looking up records with get operations instead of queries, which are substantially faster.
  • When a lookup table such as 'account type' is necessary, make sure the foreign key is descriptive enough that you only have to look up the corresponding record for administrative actions. Better, store small, infrequently changing details like this outside the datastore, so they can be accessed instantly.
  • For things like tags, use list properties to reduce the number of times you have to look up related entities, and to make indexing easier.
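As a rough illustration of several of these suggestions together, the sketch below uses plain Python rather than the datastore SDK. `FakeDatastore` is a hypothetical in-memory stand-in, and the field names (`account_type`, `following`, `tags`) are assumptions based on the question, not taken from the actual model. It shows the username serving as the key name, the user name rolled into the account record, a descriptive account type stored directly, and the account/following circular relationship collapsed into a list of account keys. The fan-out pattern for very large follow lists is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Account:
    username: str                     # doubles as the key name
    email: str
    account_type: str = "standard"    # descriptive value, no lookup table needed
    following: List[str] = field(default_factory=list)  # list of account key names
    tags: List[str] = field(default_factory=list)


class FakeDatastore:
    """Hypothetical in-memory stand-in; the real datastore API differs."""

    def __init__(self) -> None:
        self._entities: Dict[str, Account] = {}

    def put(self, account: Account) -> None:
        self._entities[account.username] = account

    def get(self, key_name: str) -> Optional[Account]:
        # Direct fetch by key name: the cheap path the answer recommends.
        return self._entities.get(key_name)

    def followers_of(self, username: str) -> List[str]:
        # Stand-in for a query that filters on the 'following' list property.
        return sorted(
            a.username for a in self._entities.values() if username in a.following
        )


store = FakeDatastore()
store.put(Account("alice", "alice@example.com", following=["bob", "carol"]))
store.put(Account("bob", "bob@example.com", following=["carol"]))
store.put(Account("carol", "carol@example.com", tags=["admin"]))
```

Here `store.get("alice").following` answers "who does alice follow?" with a single fetch and no join table, while `store.followers_of("carol")` plays the role of the reverse query on the list property.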

This only scratches the surface, of course, and there's a lot of collected wisdom here on Stack Overflow, in the groups, and on blogs like mine. Feel free to come back and ask specific questions about data modelling!

To answer your other questions, no, there are no GAE-specific data modelling tools I'm aware of, but you can use a standard diagramming tool as you already are. Models are indeed defined in code, since the datastore is schemaless, but that doesn't have to be a barrier to the order in which you implement things.
