开发者

Using synchronization in Data Access Layer

Suppose we are developing class which implements simple CRUD operations for working with DB. This class also maintain cache for increasing performance.

public class FooTableGateway {
   Map<Integer, Foo> id2foo = new HashMap<Integer, Foo> ();
   public void getFoo (int id) {
      if (id2foo.containsKey (id) {
          return id2foo.get (id);
      }
      String query = "select ...";
      Connection cn = null;
      Statement st = null;
      ResultSet rs = null;
      try {
          cn = DBUtils.getConnection ();
          st = cn.createStatement ();
          rs = st.executeQuery (query);

          if (!rs.next ()) {
              return null;
          }
          Foo foo = new Foo (rs.getString (1)...);
          id2foo.put (id, foo);
          return foo;
      } catch (SQLException e) {
          ..
      } finally {
          ..
      }
   }

   public boolean addFoo (Foo foo) {
      if (id2foo.values ().contains (foo) {
           return false;
      }
      String query = "insert into ...";
      Connection cn = null;
      Statement st = null;
      ResultSet rs = null;
      try {
          cn = DBUtils.getConnection ();
          st = cn.createStatement ();
          int num = st.executeUpdate (query.toString (开发者_开发百科),
                  Statement.RETURN_GENERATED_KEYS);
          rs = st.getGeneratedKeys ();
          rs.next ();  
          foo.setId (rs.getInt (1);
          id2foo.put (foo.getId (), foo);
          return true;
      } catch (SQLException e) {
          ..
          return false;
      } finally {
          ..
      }    
   }

   public void updateFoo (Foo foo) {
      //something similar
      ..
   }

   public boolean deleteFoo (int id) {
      //something similar
      ..
   }

}

The question is: which part of code should be synchronized? (we are developing web-application, of course).

If I'll synchronize all calls to the cache-collection then I'm not even sure that using cache would improve performance.


The question is: which part of code should be synchronized?

As always, you need to synchronize access to data, which is modified by one thread and simultaneous either modified or even just read by another thread.

In this example, that shared data is your id2foo Dictionary. So, put a lock around the following statements:

  • One here:

      if (id2foo.containsKey (id) {
          return id2foo.get (id);
      }
    
  • Another here:

      id2foo.put (id, foo);
    

To maximize concurrency (i.e. to minimize lock contention) you should make the lifetime of these locks as short as possible: i.e. only around the few statements which I listed above, and not around the entire getFoo and addFoo methods.

Beware however that doing this with a cache might get you stale data; that can happen anyway with a database (depending on the 'transaction isolation level'), but beware.

If I'll synchronize all calls to the cache-collection then I'm not even sure that using cache would improve performance.

In my opinion it is likely to improve performance: assuming you're not storing too much data in the cache, it can take much less time to read from the cache than to read from the database, even if you need to wait for a lock on the cache, especially if the lock on the cache is short-lived as I've suggested it should be.

If you want to be fancy, you can use a multi-reader/single-writer lock, so that there's no contention when multiple threads are reading from the cache while no-one is writing to the cache.


ChrisW nailed it with his answer - you need to protect the shared state from being accessed + modified by multiple threads. Your shared state in this example is the instance level Map

Map<Integer, Foo> id2foo = new HashMap<Integer, Foo> ();

that you are using as a cache. Synchronnizing the access and modification of this will make it thread safe.

Another approach you could take is to use some of the higher level non-blocking utilities available in the Java Concurrent Utils api.

Specifically, take a look at ConcurrentHashMap which allows concurrent reads without blocking and adjustably concurrent updates.

In your case, this would be a drop in replacement for HashMap. ConcurrentMap defines the atomic non blocking V putiFAbsent(T key, V value) method for adding to the cache, and you can read from it safely from multiple threads without locking.


Wow... that is a lot of code for a single method. I would really advice to break it down into methods and objects doing on thing at a time.

In the code given above you should synchronize on the cache-collection while reading, writing and removing; which would lock the cache so parallel reading isn't possible.

Writing a performant thread-safe cache isn't easy (especially if you're going to need it to be clustered now or in the future). You should really have a look at existing ones like EHCache (http://ehcache.org/) or JBoss Cache (http://jboss.org/jbosscache/).


You could also have look JDK 5+ RWLs. Citing Wikipedia:

In this pattern, multiple readers can read the data in parallel but an exclusive lock is needed while writing the data. When a writer is writing the data, readers will be blocked until the writer is finished writing.

Make sure to have a look at the potential pitfalls using R/W Locks though e.g. this Java Specialists' Newsletter.


There are too many problems with this code as written.

I don't think a DAO should have anything to do with acquiring a Connection to the database; it should be passed in or injected into the class. A DAO has no way of knowing whether it's being used in a larger transaction context. A separate service layer, whose methods correspond to use cases that know about units of work, should be the one responsible for acquiring the connection, setting the transaction and isolation, marshaling DAOs and business entities to fulfill the use case, commit or rollback the transaction, and clean up the resources.

You've got a lot going on here: persistence, caching, etc. Your life will be better if you can start to peel away some of these responsibilities and put them elsewhere. I think your Gateway is doing too much.

UPDATE:

The Map you've tossed into your class tells me that this is a huge mistake. I don't see any SoftReferences to help the garbage collector out. I don't see any effort to limit the size of the cache or refresh values when they've been updated. This is an approach that's begging for trouble. Writing a cache is a big endeavor. If you don't believe me, download the source for EhCache and compare it to your Map. It's not trivial.

No logic for declarative transactions - another huge mistake.

With all due respect, I'd reconsider this implementation.

A better suggestion would be to learn Spring and/or Hibernate.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜