
Understanding Google App Engine datastore

I am in the early stages of designing a VERY large system (it's an enterprise-level point of sale system). As some of you know, the data models on these things can get very complicated. I want to run this thing on Google App Engine because I want to put more of my resources into developing the software rather than building and maintaining an infrastructure.

In that spirit, I've been doing a lot of reading on GAE and the Datastore. I'm an old-school relational database modeler, and I've seen several different concepts of what a schemaless database is. I think I've figured out what the Datastore is, but I want to make sure I have it right.

So, if I'm right, GAE is a sort of table-based system. So if I create a Java entity

class User {
    public String firstname;
    public String lastname;
}

and deploy it, the "table" user is automatically created and running. Then, in subsequent releases, if I modify class User

import java.util.Date;

class User {
    public String firstname;
    public String lastname;
    public Date addDate;
}

and deploy it, the "table" user is automatically updated with the new field.

Now, in relating data, as I understand it, it's very similar to some of the massively complex systems like SAP, where the data is in fact very organized but, due to the volume, referential integrity is a function of the application, not the database engine. So I would have code that looks like this

class User {
    public long id;
    public String firstname;
    public String lastname;
}

class Phone {
    public String phonenumber;
    public User userentity;
}

And to pull up the phone numbers for a user from scratch, instead of

select phone from phone inner join user as phone.userentity = user where user.id = 5
(Lay off, I know the syntax is incorrect, but you get the point.)

I would do something like

select user from user where user.id = 5
then
select phone from phone where phone.userentity = user

And that would retrieve all the phone numbers for the user.
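The two-step lookup described above can be sketched in plain Java. This is only an in-memory illustration, not Datastore code: the class ManualJoinDemo, its collections, and the choice to store the user's id on Phone (rather than a User reference) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain in-memory stand-ins, not Datastore calls: the point is only that
// the application, not the storage engine, stitches User and Phone together.
class ManualJoinDemo {
    static final class User {
        final long id;
        final String firstname;
        final String lastname;

        User(long id, String firstname, String lastname) {
            this.id = id;
            this.firstname = firstname;
            this.lastname = lastname;
        }
    }

    static final class Phone {
        final String phonenumber;
        final long userId; // hypothetical: reference kept as a plain id, nothing enforces it

        Phone(String phonenumber, long userId) {
            this.phonenumber = phonenumber;
            this.userId = userId;
        }
    }

    final Map<Long, User> users = new HashMap<>();
    final List<Phone> phones = new ArrayList<>();

    // Step 1: "select user from user where user.id = ?"
    User findUser(long id) {
        return users.get(id);
    }

    // Step 2: "select phone from phone where phone.userentity = user"
    List<String> phoneNumbersFor(User user) {
        List<String> result = new ArrayList<>();
        for (Phone p : phones) {
            if (p.userId == user.id) {
                result.add(p.phonenumber);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ManualJoinDemo db = new ManualJoinDemo();
        db.users.put(5L, new User(5L, "Joe", "Smith"));
        db.phones.add(new Phone("555-0100", 5L));
        db.phones.add(new Phone("555-0199", 7L)); // belongs to another user
        User user = db.findUser(5L);
        System.out.println(db.phoneNumbersFor(user));
    }
}
```

Nothing stops a Phone from pointing at a user id that does not exist, which is the "referential integrity is a function of the application" part.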

So, as I understand it, it's not so much a huge change in how to think about structuring and organizing data as it is a big change in how to access it. I do joins manually in code instead of letting the database engine do them automatically. Beyond that, it's the same. Am I correct, or am I clueless?


There are really no tables at all. If you make some users with only a first and last name, and then later add addDate, then your original entities will still not have an addDate property. None of the user entities are connected at all, in any way. They are not in a table of Users.
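To make that concrete, here is a minimal sketch that treats an entity as nothing more than a kind name plus a bag of properties. It is a toy model in plain Java, not the App Engine API; SchemalessDemo, Entity, oldStyleUser and newStyleUser are names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model, not the App Engine API: an entity is just a kind
// name plus a bag of properties, and nothing enforces a shared shape.
class SchemalessDemo {
    static final class Entity {
        final String kind;
        final Map<String, Object> properties = new HashMap<>();

        Entity(String kind) {
            this.kind = kind;
        }
    }

    // Entities written by the first release: first and last name only.
    static Entity oldStyleUser(String first, String last) {
        Entity e = new Entity("User");
        e.properties.put("firstname", first);
        e.properties.put("lastname", last);
        return e;
    }

    // Entities written after the release that added addDate.
    static Entity newStyleUser(String first, String last, Date addDate) {
        Entity e = oldStyleUser(first, last);
        e.properties.put("addDate", addDate);
        return e;
    }

    public static void main(String[] args) {
        List<Entity> stored = new ArrayList<>();
        stored.add(oldStyleUser("Joe", "Smith"));
        stored.add(newStyleUser("Ann", "Jones", new Date()));
        for (Entity e : stored) {
            // The old entity was never migrated, so the property is simply absent.
            System.out.println(e.properties.get("firstname")
                    + " has addDate: " + e.properties.containsKey("addDate"));
        }
    }
}
```

Deploying the new class changes only what gets written from then on. Code that reads User entities has to cope with addDate being missing on the older ones.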

You can access all of the objects you wrote to the database that have the name "User" because App Engine keeps big, long lists (indexes) of all of the objects that have each name. So, any object you put in there that has the name (kind) "User" will get an entry in this list. Later, you can read that index to get the location of each of your objects, and use those locations (keys) to fetch the objects. They are not in a table; they're just floating around. Some of them have some properties in common, but this is a coincidence, and not a requirement.
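A rough way to picture the kind index, again as a toy model in plain Java rather than the real (distributed) implementation. KindIndexDemo, put and fetchAll are hypothetical names, and using a long as the key is a simplification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only; the real datastore is distributed and far more involved.
class KindIndexDemo {
    // Entities live wherever their key says; there is no per-kind table.
    private final Map<Long, Map<String, Object>> entitiesByKey = new HashMap<>();
    // One list of keys per kind name: the only thing grouping the "User" entities.
    private final Map<String, List<Long>> kindIndex = new HashMap<>();
    private long nextKey = 1;

    // Store an entity and record its key in the index for its kind.
    long put(String kind, Map<String, Object> properties) {
        long key = nextKey++;
        entitiesByKey.put(key, new HashMap<>(properties));
        kindIndex.computeIfAbsent(kind, k -> new ArrayList<>()).add(key);
        return key;
    }

    // Read the kind index to find keys, then fetch each entity by key.
    List<Map<String, Object>> fetchAll(String kind) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (long key : kindIndex.getOrDefault(kind, new ArrayList<>())) {
            result.add(entitiesByKey.get(key));
        }
        return result;
    }

    public static void main(String[] args) {
        KindIndexDemo store = new KindIndexDemo();
        Map<String, Object> joe = new HashMap<>();
        joe.put("firstname", "Joe");
        Map<String, Object> oddOne = new HashMap<>();
        oddOne.put("nickname", "JJ"); // shares no properties with joe
        store.put("User", joe);
        store.put("User", oddOne);
        System.out.println(store.fetchAll("User").size());
    }
}
```

Both entities come back from fetchAll("User") even though they have no properties in common; the kind name is the only thing they share.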

If you want to fetch all of the User objects that have a certain name (Select * from User where firstname="Joe"), then another big, long index of keys has to be maintained. This index has the firstname property as well as the key of an entity on each row. Later you can scan the index for a certain firstname, get all the keys, and then go look up the actual entities you stored with those keys. All of THOSE entities will have the firstname property (because an entity without a firstname property would not get a row in the firstname index), but they may not have any other fields in common, because they are not in a table that enforces any data structure at all.
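The index scan followed by a key lookup might be pictured like this. It is a sketch under the same toy-model assumptions as before: PropertyIndexDemo, put and queryByFirstname are invented names, and a sorted in-memory map stands in for the firstname index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a sorted map stands in for one property index.
class PropertyIndexDemo {
    private final Map<Long, Map<String, Object>> entitiesByKey = new HashMap<>();
    // Each index row pairs a firstname value with the keys of entities that have it.
    private final TreeMap<String, List<Long>> firstnameIndex = new TreeMap<>();
    private long nextKey = 1;

    long put(Map<String, Object> properties) {
        long key = nextKey++;
        entitiesByKey.put(key, new HashMap<>(properties));
        Object firstname = properties.get("firstname");
        // Entities without a firstname get no row, so this index never returns them.
        if (firstname instanceof String) {
            firstnameIndex.computeIfAbsent((String) firstname, v -> new ArrayList<>()).add(key);
        }
        return key;
    }

    // Roughly: Select * from User where firstname = ?
    List<Map<String, Object>> queryByFirstname(String firstname) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (long key : firstnameIndex.getOrDefault(firstname, new ArrayList<>())) {
            result.add(entitiesByKey.get(key)); // second hop: key -> entity
        }
        return result;
    }

    public static void main(String[] args) {
        PropertyIndexDemo store = new PropertyIndexDemo();
        Map<String, Object> joe = new HashMap<>();
        joe.put("firstname", "Joe");
        joe.put("lastname", "Smith");
        Map<String, Object> noName = new HashMap<>();
        noName.put("nickname", "JJ"); // never appears in the firstname index
        store.put(joe);
        store.put(noName);
        System.out.println(store.queryByFirstname("Joe").size());
    }
}
```

Every write that touches firstname also has to update the index, which is one way to see why each extra index adds cost.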

These complications affect the way data is accessed pretty dramatically, and really affect things like transactions and complex queries. You're basically right that you don't have to change your thinking too much, but you should definitely understand how indexes and transactions work before planning your data structures. It is not always simple to efficiently tack on extra queries that you didn't think of before you got started, and it's pretty expensive to maintain these indexes, so the fewer you can get by with the better.


A great introduction to the Google datastore was written by the creator of the Objectify framework: Fundamental Concepts of the Datastore
