Why are circular references considered harmful? [closed]
Why is it a bad design for an object to refer to another object that refers back to the first one?
Circular dependencies between classes are not necessarily harmful. Indeed, in some cases they are desirable. For example, if your application dealt with pets and their owners, you would expect the Pet class to have a method to get the pet's owner, and the Owner class to have a method that returns the list of pets. Sure, this can make memory management more difficult (in a non-GC'ed language). But if the circularity is inherent in the problem, then trying to get rid of it is probably going to lead to more problems.
On the other hand, circular dependencies between modules are harmful. It is generally indicative of a poorly thought-out module structure, and/or failure to stick to the original modularization. In general, a code-base with uncontrolled cross-dependencies will be harder to understand and harder to maintain than one with a clean, layered module structure. Without decent modules, it can be much harder to predict the effects of a change. And that makes maintenance harder, and leads to "code decay" resulting from ill-conceived patching.
(Also, build tools like Maven won't handle modules (artefacts) with circular dependencies.)
Circular references are not always harmful - there are some use cases where they can be quite useful. Doubly-linked lists, graph models, and computer language grammars come to mind. However, as a general practice, there are several reasons why you may want to avoid circular references between objects.
Data and graph consistency. Updating objects with circular references can create challenges in ensuring that at all points in time the relationships between objects are valid. This type of problem often arises in object-relational modeling implementations, where it's not uncommon to find bidirectional, circular references between entities.
Ensuring atomic operations. Ensuring that changes to both objects in a circular reference are atomic can become complicated - particularly when multithreading is involved. Ensuring consistency of an object graph that is accessible from multiple threads requires special synchronization structures and locking operations to ensure that no thread sees an incomplete set of changes.
Physical separation challenges. If two different classes A and B reference each other in a circular fashion, it can become challenging to separate these classes into independent assemblies. It's certainly possible to create a third assembly with interfaces IA and IB that A and B implement; allowing each to reference the other through those interfaces. It's also possible to use weakly typed references (e.g. object) as a way to break the circular dependency, but then access to the method and properties of such an object couldn't be easily accessed - which can defeat the purpose of having a reference.
Enforcing immutable circular references. Languages like C# and VB provide keywords to allow references within an object to be immutable (readonly). Immutable references allow a program to ensure that a reference refers to the same object for the lifetime of the object. Unfortunately, it's not easy to use the compiler-enforced immutability mechanism to ensure that circular references cannot be changes. It can only be done if one object instantiates the other (see C# example below).
class A { private readonly B m_B; public A( B other ) { m_B = other; } } class B { private readonly A m_A; public A() { m_A = new A( this ); } }
Program readibility and maintainability. Circular references are inherently fragile and easy to break. This stems partly from the fact that reading and understanding code that includes circular references is harder than code that avoids them. Ensuring that your code is easy to understand and maintain helps to avoid bugs and allows changes to be made more easily and safely. Objects with circular references are more difficult to unit test because they cannot be tested in isolation from one another.
Object lifetime management. While .NET's garbage collector is capable of identifying and dealing with circular references (and correctly disposing such objects), not all languages/environments can. In environments that use reference counting for their garbage collection scheme (e.g. VB6, Objective-C, some C++ libraries) it is possible for circular references to result in memory leaks. Since each object holds on to the other, their reference counts will never reach zero, and hence will never become candidates for collection and cleanup.
Because now they're really one single object. You can't test either one in isolation.
If you modify one, it's likely that you affect its companion as well.
From Wikipedia:
Circular dependencies can cause many unwanted effects in software programs. Most problematic from a software design point of view is the tight coupling of the mutually dependent modules which reduces or makes impossible the separate re-use of a single module.
Circular dependencies can cause a domino effect when a small local change in one module spreads into other modules and has unwanted global effects (program errors, compile errors). Circular dependencies can also result in infinite recursions or other unexpected failures.
Circular dependencies may also cause memory leaks by preventing certain very primitive automatic garbage collectors (those that use reference counting) from deallocating unused objects.
Such an object can be difficult to be created and destroyed, because in order to do either non-atomicly you have to violate referential integrity to first create/destroy one, then the other (for example, your SQL database might balk at this). It might confuse your garbage collector. Perl 5, which uses simple reference counting for garbage collection, cannot (without help) so its a memory leak. If the two objects are of different classes now they are tightly coupled and cannot be separated. If you have a package manager to install those classes the circular dependency spreads to it. It must know to install both packages before testing them, which (speaking as a maintainer of a build system) is a PITA.
That said, these can all be overcome and its often necessary to have circular data. The real world is not made up of neat directed graphs. Many graphs, trees, hell, a double-linked list is circular.
It hurts code readability. And from circular dependencies to spaghetti code there is just a tiny step.
Here are a couple of examples that may help illustrate why circular dependencies are bad.
Problem #1: What gets initialized/constructed first?
Consider the following example:
class A
{
public A()
{
myB.DoSomething();
}
private B myB = new B();
}
class B
{
public B()
{
myA.DoSomething();
}
private A myA = new A();
}
Which constructor is called first? There's really no way to be sure because it's completely ambiguous. One or the other of the DoSomething methods is going to be called on an object that's uninitialized., resulting in incorrect behavior and very likely an exception being raised. There are ways around this problem, but they're all ugly and they all require non-constructor initializers.
Problem #2:
In this case, I've changed to a non-managed C++ example because the implementation of .NET, by design, hides the problem away from you. However, in the following example the problem will become pretty clear. I'm well aware that .NET doesn't really use reference counting under the hood for memory management. I'm using it here solely to illustrate the core problem. Note also that I've demonstrated here one possible solution to problem #1.
class B;
class A
{
public:
A() : Refs( 1 )
{
myB = new B(this);
};
~A()
{
myB->Release();
}
int AddRef()
{
return ++Refs;
}
int Release()
{
--Refs;
if( Refs == 0 )
delete(this);
return Refs;
}
B *myB;
int Refs;
};
class B
{
public:
B( A *a ) : Refs( 1 )
{
myA = a;
a->AddRef();
}
~B()
{
myB->Release();
}
int AddRef()
{
return ++Refs;
}
int Release()
{
--Refs;
if( Refs == 0 )
delete(this);
return Refs;
}
A *myA;
int Refs;
};
// Somewhere else in the code...
...
A *localA = new A();
...
localA->Release(); // OK, we're done with it
...
At first glance, one might think that this code is correct. The reference counting code is pretty simple and straightfoward. However, this code results in a memory leak. When A is constructed, it initially has a reference count of "1". However, the encapsulated myB variable increments the reference count, giving it a count of "2". When localA is released, the count is decremented, but only back to "1". Hence, the object is left hanging and never deleted.
As I mentioned above, .NET doesn't really use reference counting for its garbage collection. But it does use similar methods to determine if an object is still being used or if it's OK to delete it, and almost all such methods can get confused by circular references. The .NET garbage collector claims to be able to handle this, but I'm not sure I trust it because this is a very thorny problem. Go, on the other hand, gets around the problem by simply not allowing circular references at all. Ten years ago I would have preferred the .NET approach for its flexibility. These days, I find myself preferring the Go approach for its simplicity.
It is completely normal to have objects with circular references e.g. in a domain model with bidirectional associations. An ORM with a properly written data access component can handle that.
Refer to Lakos’s book, in C++ software design, cyclic physical dependency is undesirable. There are several reasons:
- It makes them hard to test and impossible to reuse independently.
- It makes them difficult for people to understand and maintain.
- It will increase the link-time cost.
Circular references seem to be a legitimate domain modelling scenario. An example is Hibernate and many other ORM tools encourage this cross association between entities to enable bi-directional navigation. Typical example in an online Auction system, a Seller entitiy may maintain a reference to the List of entities He/She is selling. And every Item can maintain a reference to it's corresponding seller.
The .NET garbage collector can handle circular references so there is no fear of memory leaks for applications working on the .NET framework.
精彩评论