Using C++ RAII to Reclaim Heap Memory Automatically

Date: June 8, 2015

In C++ we often need to allocate a block of dynamic memory and release it once we are done with it. The code usually looks like this:
    void func()
    {
        // allocate dynamic memory
        int *ptr = new int;

        // use ptr

        // release the allocated memory
        delete ptr;
        ptr = NULL;
    }
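If func() is simple, this is fine. But if the code in the middle can throw an exception, the function leaks memory: in the version below, the delete statement and the reset of ptr that follow the throw are never executed. One could wrap the body in try-catch, and that does work, but try-catch blocks scattered all over a code base are widely frowned upon:
    void func()
    {
        // allocate dynamic memory
        int *ptr = new int;

        throw "error"; // just an example

        // use ptr

        // release the allocated memory (never reached)
        delete ptr;
        ptr = NULL;
    }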
Moreover, when a function has several return paths, delete has to be called before every return, which makes the code repetitive and inelegant.
    void func()
    {
        // allocate dynamic memory
        int *ptr = new int;

        if (...)
        {
            //...a

            // release the allocated memory
            delete ptr;
            ptr = NULL;
            return;
        } else if (....)
        {
            //...b

            // release the allocated memory
            delete ptr;
            ptr = NULL;
            return;
        }

        // use ptr

        // release the allocated memory
        delete ptr;
        ptr = NULL;
    }
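With that in mind, we want to use C++ language features so that dynamic memory allocated locally is released automatically when the function's stack frame is unwound. As C++ programmers know, when an object leaves the scope in which it is defined, its destructor is called automatically. So if we define a local object inside a function, its destructor runs before the function returns, and also when an exception leaves the function. If we hand the cleanup code to that destructor, the resource is reclaimed automatically. This technique is commonly called RAII (Resource Acquisition Is Initialization).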
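To make the point about scope exit concrete, here is a minimal sketch. The names int_guard, use_int and count_releases are made up for this illustration; the counter exists only so that the destructor calls can be observed on all three exit paths (normal return, early return, exception).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard: owns one heap-allocated int and deletes it on scope exit.
class int_guard {
public:
    int_guard(int* p, int* freed_counter) : ptr_(p), freed_(freed_counter) {}
    ~int_guard() { delete ptr_; ++*freed_; }   // runs on every exit path
    int* get() const { return ptr_; }
private:
    int_guard(const int_guard&);               // non-copyable (C++03 style)
    int_guard& operator=(const int_guard&);
    int* ptr_;
    int* freed_;
};

// mode 0: normal return, mode 1: early return, mode 2: exception
int use_int(int mode, int* freed_counter) {
    int_guard g(new int(42), freed_counter);
    if (mode == 1) return -1;
    if (mode == 2) throw std::runtime_error("error");
    return *g.get();
}

// Returns how many times the destructor ran across the three exit paths.
int count_releases() {
    int freed = 0;
    int normal = use_int(0, &freed);
    int early = use_int(1, &freed);
    try { use_int(2, &freed); } catch (const std::runtime_error&) {}
    assert(normal == 42 && early == -1);
    return freed;
}
```

No delete appears in use_int at all, yet the int is released on each of the three paths.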
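What RAII is, with a few examples
In object-oriented languages such as C++, Bjarne Stroustrup, the creator of C++, invented a technique called "Resource Acquisition Is Initialization" (RAII, also known as Scope-Bound Resource Management) to manage the allocation and deallocation of local resources, achieve exception safety, and avoid problems such as memory leaks. Put simply, a local object acquires the resource in its constructor and releases it in its destructor, so the resource is released automatically when the object leaves its scope. An implementation usually has three traits: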
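Create a dedicated class and acquire the resource in its constructor;
Wrap the target object: the object holding the acquired resource becomes a member variable of that class;
Release the resource in the destructor of that class.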
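The three traits can be sketched in a few lines. The class byte_buffer and the helper fill_and_measure below are hypothetical names chosen for this illustration, and a malloc'ed buffer is just one possible resource.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical RAII wrapper around a malloc'ed byte buffer.
class byte_buffer {
public:
    // Trait 1: the constructor acquires the resource.
    explicit byte_buffer(std::size_t size) : data_(0), size_(size) {
        assert(size > 0);
        data_ = static_cast<char*>(std::malloc(size));
        if (!data_) throw std::bad_alloc();
        std::memset(data_, 0, size);
    }
    // Trait 3: the destructor releases it.
    ~byte_buffer() { std::free(data_); }

    char* data() { return data_; }
    std::size_t size() const { return size_; }
private:
    byte_buffer(const byte_buffer&);             // non-copyable (C++03 style)
    byte_buffer& operator=(const byte_buffer&);
    // Trait 2: the resource is held as a member variable.
    char* data_;
    std::size_t size_;
};

// Uses the buffer; no explicit free is needed on any exit path.
std::size_t fill_and_measure() {
    byte_buffer buf(16);
    std::strcpy(buf.data(), "hello");
    return std::strlen(buf.data());
}
```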
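A typical example is the standard library class template std::auto_ptr, as described on page 327 of the Chinese edition of "The C++ Programming Language, Special Edition" (《C++程序设计语言》, by Bjarne Stroustrup, translated by Qiu Zongyan (裘宗燕)):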
    template <class X>
    class std::auto_ptr {

    public:
        // the constructor takes ownership of the target pointer
        explicit auto_ptr(X *p = 0) throw() { ptr = p; }
        // the destructor releases the target pointer
        ~auto_ptr() throw() { delete ptr; }

        //...

        // overload * and -> so that an auto_ptr object can be used like the target pointer ptr
        X& operator*() const throw() { return *ptr; }
        X* operator->() const throw() { return ptr; }

        // give up ownership of the target pointer
        X* release() throw() { X* t = ptr; ptr = 0; return t; }

    private:
        X *ptr;
    };
Using it is straightforward, for example:
    #include <memory>

    void func()
    {
        std::auto_ptr<int> p(new int);

        // use p just like ptr

        return;
    }
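A note for readers on newer compilers: std::auto_ptr was deprecated in C++11 and removed in C++17, and std::unique_ptr is the standard replacement for this use. A minimal sketch of the same function, assuming a C++11 compiler (the function name func_with_unique_ptr is made up here):

```cpp
#include <cassert>
#include <memory>

int func_with_unique_ptr() {
    std::unique_ptr<int> p(new int(7));   // p owns the int
    *p += 1;                              // use p just like a raw pointer
    assert(p.get() != nullptr);
    return *p;                            // the int is deleted when p leaves scope
}
```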
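Another example uses the cleanup attribute in GCC, which names a function to be run when the variable leaves its scope. Wikipedia, for instance, gives the following macro (it defines a nested function, a GNU extension that GCC accepts only when compiling C):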
    #define RAII_VARIABLE(vartype,varname,initval,dtor) \
        void _dtor_ ## varname (vartype * v) { dtor(*v); } \
        vartype varname __attribute__((cleanup(_dtor_ ## varname))) = (initval)
It can be used like this:
    void example_usage() {
        RAII_VARIABLE(FILE*, logfile, fopen("logfile.txt", "w+"), fclose);
        fputs("hello logfile!", logfile);
    }
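The macro above is C. As a rough sketch of the same attribute in C++, assuming GCC or Clang (the attribute is a compiler extension, not standard C++), an ordinary function can serve as the cleanup handler. The names g_cleanup_calls, note_cleanup, scoped_value and run_cleanup_demo are invented for the demonstration.

```cpp
#include <cassert>

static int g_cleanup_calls = 0;   // hypothetical counter, only for the demonstration

// A cleanup function receives a pointer to the variable that is leaving scope.
static void note_cleanup(int* value) { g_cleanup_calls += *value; }

void scoped_value() {
    int v __attribute__((cleanup(note_cleanup))) = 5;
    assert(g_cleanup_calls == 0);   // the handler has not run while v is alive
    (void)v;
}                                   // note_cleanup(&v) runs here

int run_cleanup_demo() {
    g_cleanup_calls = 0;
    scoped_value();
    return g_cleanup_calls;
}
```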
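Yet another example comes from the "Resource Management" section of Liu Weipeng's (刘未鹏) blog post "C++11 (及现代C++风格)和快速迭代式开发" ("C++11 (and Modern C++ Style) and Rapid Iterative Development"), where he builds this facility on top of C++11's std::function. Interested readers can find the details on his blog.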
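For readers who have not seen the idea, here is a minimal sketch of what a std::function-based guard might look like. This is not Liu Weipeng's code; scope_guard, dismiss and guard_runs are names assumed for this illustration, and C++11 is required.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <utility>

// Hypothetical scope guard: runs the stored callable when it leaves scope.
class scope_guard {
public:
    explicit scope_guard(std::function<void()> on_exit)
        : on_exit_(std::move(on_exit)), dismissed_(false) {}
    ~scope_guard() { if (!dismissed_ && on_exit_) on_exit_(); }
    void dismiss() { dismissed_ = true; }   // give up the automatic cleanup

    scope_guard(const scope_guard&) = delete;
    scope_guard& operator=(const scope_guard&) = delete;
private:
    std::function<void()> on_exit_;
    bool dismissed_;
};

// Returns true if the cleanup lambda ran when the inner scope ended.
bool guard_runs(bool dismiss) {
    bool released = false;
    {
        void* block = std::malloc(32);
        assert(block != nullptr);
        scope_guard guard([&] { std::free(block); released = true; });
        if (dismiss) {
            guard.dismiss();
            std::free(block);               // released by hand instead
        }
    }
    return released;
}
```

Because the cleanup is an arbitrary callable, the same guard works for free, delete, delete[], fclose and so on.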
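The author's approach
For new/delete, the std::auto_ptr mentioned above is enough. But for a dynamic one-dimensional array allocated with new[] and released with delete[], let alone a two-dimensional one, auto_ptr is of no help, because its destructor always calls delete. In addition, some projects, especially long-lived code bases, mix malloc and new. I therefore designed an auto_free_ptr class that reclaims the target resource automatically. The implementation is simple and uses only the third trait of RAII, releasing the resource in the class destructor, but it has one advantage: it can be set up before the code that allocates the heap memory.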
The code is as follows:
    #include <cstddef>   // NULL
    #include <cstdlib>   // free

    // Assumed definition: the original post does not show this macro. It declares
    // the copy constructor and copy assignment without defining them.
    #define DISABLE_COPY_AND_ASSIGN(cls) \
        cls(const cls&);                 \
        cls& operator=(const cls&)

    // auto_free_ptr is only used to free memory automatically
    template <class T>
    class auto_free_ptr
    {
    public:
        typedef enum {invalid, new_one, new_array, alloc_mem} EFLAG;
        auto_free_ptr() { initialize(); }
        ~auto_free_ptr() { free_ptr(); }

        /// set the pointer that should be freed automatically
        inline void set_ptr(T** new_ptr_address, EFLAG new_eflag)
        { free_ptr(); p_ptr = new_ptr_address; eflag = new_eflag; }

        /// give up freeing the memory automatically
        inline void give_up() { initialize(); }

    protected:
        inline void initialize() { p_ptr = NULL; eflag = invalid; }
        inline void free_ptr() throw()
        {
            if (!p_ptr || !(*p_ptr)) return;

            // release with the function that matches the allocation
            switch (eflag)
            {
            case alloc_mem: free(*p_ptr);     break;
            case new_one:   delete *p_ptr;    break;
            case new_array: delete[] *p_ptr;  break;
            default:        return;           // invalid: nothing to release
            }
            *p_ptr = NULL;
            p_ptr = NULL;
        }

    protected:
        T** p_ptr;   //!< address of the pointer that should be freed automatically
        EFLAG eflag; //!< how the memory was allocated

    private:
        DISABLE_COPY_AND_ASSIGN(auto_free_ptr);
    };
For convenience, two macros wrap it:
    // The auto-free macros are mainly used to free memory that local variables
    // allocate inside a function body.
    #define AUTO_FREE_ENABLE( type, ptrName, ptrType ) \
        auto_free_ptr<type> auto_free_##ptrName; \
        auto_free_##ptrName.set_ptr(&ptrName, auto_free_ptr<type>::ptrType)

    #define AUTO_FREE_DISABLE( ptrName ) auto_free_##ptrName.give_up()
Usage is simple, for example:
    void func(int nLftCnt, int nRhtCnt)
    {
        if (!nLftCnt && !nRhtCnt)
            return;

        unsigned *pLftHashs = NULL;
        unsigned *pRhtHashs = NULL;

        // set up auto_free_ptr before allocating any heap memory
        AUTO_FREE_ENABLE(unsigned, pLftHashs, new_array);
        AUTO_FREE_ENABLE(unsigned, pRhtHashs, new_array);

        //....

        if (nLftCnt)
        {
            pLftHashs = new unsigned[nLftCnt];
            //...a
        }

        if (nRhtCnt)
        {
            pRhtHashs = new unsigned[nRhtCnt];
            //...b
        }

        //....

        if (...)
        {
            // the function below releases the resources itself, so give up
            // management of the target pointers before calling it
            AUTO_FREE_DISABLE(pLftHashs);
            AUTO_FREE_DISABLE(pRhtHashs);

            // this function releases the resources
            free_hash_arrays(pLftHashs, pRhtHashs);
        }
    }
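To check the "register first, allocate later" behaviour in isolation, here is a condensed, array-only stand-in. The names tracked, array_auto_free and allocate_after_registering are invented for this sketch; it is not the full auto_free_ptr above, and the instance counter exists only to observe delete[].

```cpp
#include <cassert>
#include <cstddef>

// Counts live instances so that the effect of delete[] can be observed.
struct tracked {
    static int live;
    tracked() { ++live; }
    ~tracked() { --live; }
};
int tracked::live = 0;

// Condensed, array-only stand-in for auto_free_ptr (hypothetical name).
template <class T>
class array_auto_free {
public:
    explicit array_auto_free(T** address) : p_ptr(address) {}
    ~array_auto_free() {
        if (p_ptr && *p_ptr) { delete[] *p_ptr; *p_ptr = NULL; }
    }
    void give_up() { p_ptr = NULL; }
private:
    array_auto_free(const array_auto_free&);
    array_auto_free& operator=(const array_auto_free&);
    T** p_ptr;   // address of the caller's pointer, read at destruction time
};

// Returns the number of live objects seen while the array was in use.
int allocate_after_registering(int count, bool give_up) {
    assert(count >= 0);
    tracked* items = NULL;
    array_auto_free<tracked> guard(&items);   // registered before the allocation
    if (count > 0) items = new tracked[count];
    int seen = tracked::live;
    if (give_up) {
        guard.give_up();
        delete[] items;                       // the caller releases by hand
    }
    return seen;
}
```

Storing the address of the pointer rather than its value is what allows the guard to be created while the pointer is still NULL.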
Similarly, we sometimes need to allocate a dynamic two-dimensional array, so there is a matching auto_free_2D_ptr:
    // auto_free_2D_ptr is only used to free the memory of a 2D array automatically
    template <class T>
    class auto_free_2D_ptr
    {
    public:
        typedef enum {invalid, new_one, new_array, alloc_mem} EFLAG;
        auto_free_2D_ptr() { initialize(); }
        ~auto_free_2D_ptr() { free_ptr(); }

        /// set the pointer that should be freed automatically
        inline void set_ptr(T*** new_ptr_address, EFLAG new_eflag, int new_length_row)
        { free_ptr(); p_ptr = new_ptr_address; eflag = new_eflag; length_row = new_length_row; }

        /// give up freeing the memory automatically
        inline void give_up() { initialize(); }

    protected:
        inline void initialize() { p_ptr = NULL; eflag = invalid; length_row = 0; }
        inline void free_ptr() throw()
        {
            if (!p_ptr || !(*p_ptr)) return;

            // free every row first
            for (int i = 0; i < length_row; i++)
            {
                if (!(*p_ptr)[i]) continue;
                switch (eflag)
                {
                case alloc_mem: free((*p_ptr)[i]);     break;
                case new_one:   delete (*p_ptr)[i];    break;
                case new_array: delete[] (*p_ptr)[i];  break;
                default:                               break;
                }
                (*p_ptr)[i] = NULL;
            }
            // then free the array of row pointers
            switch (eflag)
            {
            case alloc_mem: free(*p_ptr);     break;
            default:        delete[] *p_ptr;  break;
            }
            *p_ptr = NULL;
            p_ptr = NULL;
        }

    protected:
        T*** p_ptr;     //!< address of the T** that should be freed automatically
        EFLAG eflag;    //!< how the memory was allocated
        int length_row; //!< number of rows, as in ptr[length_row][length_col]

    private:
        DISABLE_COPY_AND_ASSIGN(auto_free_2D_ptr);
    };

    #define AUTO_FREE_2D_ENABLE( type, ptrName, ptrType, rowNum ) \
        auto_free_2D_ptr<type> auto_free_##ptrName; \
        auto_free_##ptrName.set_ptr(&ptrName, auto_free_2D_ptr<type>::ptrType, rowNum)

    #define AUTO_FREE_2D_DISABLE( ptrName ) AUTO_FREE_DISABLE( ptrName )
Here is an example:
    void func(int row, int col)
    {
        if (!row && !col)
            return;

        int **ptr = new int*[ row ];
        for( int r = 0; r < row; ++r ) { ptr[r] = new int[ col ]; }

        AUTO_FREE_2D_ENABLE( int, ptr, new_array, row );

        //....
    }
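That is all. Some readers may ask why go to this trouble when Boost and the standard library offer plenty of smart pointers: why not just pick one of shared_ptr, scoped_ptr, scoped_array, unique_ptr or auto_ptr? Fair enough. If the code you are working on is allowed to use Boost, and the relevant interfaces consistently use smart pointers and never need the underlying raw pointer, those should be your first choice. But if, for historical reasons, some interfaces cannot be changed, new/delete and malloc/free both appear, and most of the work still has to be done through the raw pointer, you may want to try this stripped-down scoped_ptr/scoped_array of mine. In short, choose the solution that fits your situation, and if the standard ones do not fit, write your own.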
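For comparison, here is a rough sketch of how the standard smart pointers might cover the mixed new[]/malloc case when C++11 is available, while still handing a raw pointer to an interface that cannot change. legacy_sum stands in for such an interface; it and the two sum_with_* functions are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Stand-in for a legacy interface that only accepts raw pointers (hypothetical).
unsigned legacy_sum(const unsigned* values, int count) {
    unsigned total = 0;
    for (int i = 0; i < count; ++i) total += values[i];
    return total;
}

unsigned sum_with_new_array(int count) {
    assert(count >= 0);
    // The array form of unique_ptr releases the memory with delete[].
    std::unique_ptr<unsigned[]> hashes(new unsigned[count]);
    for (int i = 0; i < count; ++i) hashes[i] = static_cast<unsigned>(i + 1);
    return legacy_sum(hashes.get(), count);   // raw pointer for the old interface
}

unsigned sum_with_malloc(int count) {
    assert(count > 0);
    // A custom deleter makes unique_ptr release malloc'ed memory with free.
    std::unique_ptr<unsigned, decltype(&std::free)> hashes(
        static_cast<unsigned*>(std::malloc(sizeof(unsigned) * count)), &std::free);
    if (!hashes) return 0;
    for (int i = 0; i < count; ++i) hashes.get()[i] = static_cast<unsigned>(i + 1);
    return legacy_sum(hashes.get(), count);
}
```

Unlike auto_free_ptr, these own the memory from the moment it is allocated, so they suit code where the allocation site itself can be edited.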