In Python, objects with an explicit __del__ method need their reference cycles broken by hand

Python has an automatic gc, and in the general case this gc can also reclaim objects that form reference cycles.
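As a minimal sketch of that claim (the class name `Node` and the variable names are made up for illustration), two objects that point at each other are reclaimed by `gc.collect()` once nothing outside the cycle refers to them. The weak reference is there only to observe whether the object is still alive.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class without a __del__ method."""
    pass


gc.disable()  # keep the automatic collector out of the way so the demo is deterministic

a = Node()
b = Node()
a.other = b   # a -> b
b.other = a   # b -> a, which closes the cycle

probe = weakref.ref(a)  # lets us check liveness without keeping `a` alive
del a, b

alive_before = probe() is not None  # reference counting alone cannot free the cycle
gc.collect()                        # the cycle detector can
alive_after = probe() is not None

gc.enable()
print("alive before collect: %s, after: %s" % (alive_before, alive_after))
```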

There is one exception, though: an object that explicitly defines a __del__ method.

Take the following code, for example:

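#!/usr/bin/env python
# Python 2 syntax (print statement).

class Foo:
    def __init__(self):
        # self.test is a bound method holding a reference back to self,
        # so self -> self._bar -> self.test -> self is a reference cycle.
        self._bar = {"test": self.test}
        print "construct"

    def test(self):
        print "test"

    def __del__(self):
        print "del"

f = Foo()
del f  # only drops the name f; the cycle keeps the object alive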

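Running it never prints "del". The cycle comes from self._bar: it stores the bound method self.test, and a bound method holds a reference back to self.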
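The script below is a sketch for checking this on whatever interpreter is at hand. It reuses the structure of `Foo` above but records events in a list instead of printing; the names `events`, `ran_after_del`, `ran_after_collect` and `left_in_garbage` are made up here. One thing to keep in mind: the behaviour described in this post is that of Python 2 and of Python 3 before 3.4. From Python 3.4 on (PEP 442) the collector is able to finalize such cycles, so the second check should come out differently there.

```python
import gc

events = []


class Foo(object):
    def __init__(self):
        # self.test is a bound method that refers back to self:
        # self -> self._bar -> bound method -> self
        self._bar = {"test": self.test}

    def test(self):
        events.append("test")

    def __del__(self):
        events.append("del")


gc.disable()   # no automatic collection while the demo runs
f = Foo()
del f          # the cycle keeps the reference count above zero
ran_after_del = "del" in events

gc.collect()   # one explicit pass of the cycle detector
ran_after_collect = "del" in events
left_in_garbage = len(gc.garbage)
gc.enable()

print("__del__ ran after del f: %s" % ran_after_del)
print("__del__ ran after gc.collect(): %s" % ran_after_collect)
print("objects left in gc.garbage: %d" % left_in_garbage)
```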

The documentation mentions this as well:
http://docs.python.org/library/gc.html#gc.garbage

gc.garbage
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. [1] Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods. If you know a safe order, you can force the issue by examining the garbage list, and explicitly breaking cycles due to your objects within the list. Note that these objects are kept alive even so by virtue of being in the garbage list, so they should be removed from garbage too. For example, after breaking cycles, do del gc.garbage[:] to empty the list. It’s generally better to avoid the issue by not creating cycles containing objects with __del__() methods, and garbage can be examined in that case to verify that no such cycles are being created.
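The procedure the documentation describes might look roughly like the following. This is a sketch only: `Foo` mirrors the class above, `break_foo_cycles` and `finalized` are made-up names, and clearing `_bar` counts as a "safe order" only because we know how `Foo` is built. On Python 3.4 and later `gc.garbage` normally stays empty for this case, so the loop simply finds nothing to do.

```python
import gc

finalized = []


class Foo(object):
    def __init__(self):
        self._bar = {"test": self.test}  # bound method -> reference cycle

    def test(self):
        pass

    def __del__(self):
        finalized.append("del")


def break_foo_cycles():
    """Break the cycle of every Foo found in gc.garbage, then empty the list.

    Returns the number of Foo objects that were handled.
    """
    handled = 0
    for obj in gc.garbage:
        if isinstance(obj, Foo):
            obj._bar.clear()  # drops the bound method, which breaks the cycle
            handled += 1
    del gc.garbage[:]         # the list itself was keeping the objects alive
    return handled


f = Foo()
del f
gc.collect()                  # on old interpreters this parks f in gc.garbage
handled = break_foo_cycles()
gc.collect()

print("handled from gc.garbage: %d, finalized: %d" % (handled, len(finalized)))
```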

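When an object explicitly defines a __del__ method and takes part in a reference cycle, Python will not reclaim it automatically. If this is not handled correctly, it leaks memory. (This holds for Python 2 and for Python 3 before 3.4; since Python 3.4, PEP 442 lets the collector finalize such cycles.)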

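The fix is to break the cycle by hand, or simply to avoid writing code that creates the cycle in the first place. Note that the cycle has to be broken before the last outside reference goes away, for example in an explicit cleanup method: doing it inside __del__ does not help, because __del__ is never called while the cycle exists.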
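Both options can be sketched as follows, assuming the object can be given an explicit cleanup step. `ClosableFoo`, `AcyclicFoo`, `close`, `call` and `events` are illustrative names, not part of the original code. The first class breaks the cycle by hand before the last reference is dropped; the second never creates one, because it stores the method name instead of the bound method.

```python
events = []


class ClosableFoo(object):
    def __init__(self):
        self._bar = {"test": self.test}  # cycle: self -> _bar -> bound method -> self

    def test(self):
        events.append("closable test")

    def close(self):
        # Must be called while the object is still reachable: reference
        # counting never triggers __del__ as long as the cycle exists.
        self._bar.clear()

    def __del__(self):
        events.append("closable del")


class AcyclicFoo(object):
    def __init__(self):
        self._bar = {"test": "test"}     # method name only, no reference to self

    def call(self, key):
        # Look the method up at call time instead of storing it.
        return getattr(self, self._bar[key])()

    def test(self):
        events.append("acyclic test")

    def __del__(self):
        events.append("acyclic del")


f = ClosableFoo()
f.close()
del f   # CPython: the reference count hits zero, so __del__ runs right here

g = AcyclicFoo()
g.call("test")
del g   # no cycle was ever created, so the same applies

print(events)
```

The explicit `close()` has the side benefit that releasing something like a database connection no longer depends on when, or whether, `__del__` gets called.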

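This problem kept database connections from being released, and on a weekend night the database finally ran out of connections. How embarrassing.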

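PS: This incident shows once again how important the "layered architecture, mutual distrust, avalanche prevention" strategy is. The DB that went down was a slave whose max_connection and max_user_connection were set to the same value, and whose connection timeout was, of all things, 8 hours... Had max_user_connection been set lower, at worst the systems using that one user would have gone down, instead of every system using this DB, and the impact would have stayed contained.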