The Nature of Category

I. Preface

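Most of us have used a Category. Its main use is to split an object's functionality into blocks by kind. Take a person object, for example: we could create three categories for it, covering study, work and rest.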

Below we create a Person class and two categories of Person, namely Person+Test and Person+Eat. The class and each of the two categories declare one method.

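// Person class (assumed to inherit from NSObject)
@interface Person : NSObject
- (void)run;
@end

@implementation Person
- (void)run {
    NSLog(@"run");
}
@end

// Person+Test category
@interface Person (Test)
- (void)test;
@end

@implementation Person (Test)
- (void)test {
    NSLog(@"test");
}
@end

// Person+Eat category
@interface Person (Eat)
- (void)eat;
@end

@implementation Person (Eat)
- (void)eat {
    NSLog(@"eat");
}
@end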

To use these categories, we only need to import their header files:

#import "Person+Test.h"
#import "Person+Eat.h"

Person *person = [[Person alloc] init];
[person run];
[person test];
[person eat];

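As we all know, a method call is at bottom a message send. [person run] boils down to objc_msgSend(person, @selector(run)). This is easy to follow: instance methods are stored in the class object, so sending a message to person means following person's isa pointer to its class object and then finding the instance method there.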

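[person test] and [person run] both call instance methods, so they should work the same way. [person test] boils down to objc_msgSend(person, @selector(test)): a message is sent to the instance, person follows its isa pointer to its class object, and the instance method is looked up there. This raises a question: does the Person class object store this instance method from the category? Or does the Person+Test category have a class object of its own that stores the category's instance methods?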

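The key point is that every class has exactly one class object, whether or not the class has categories. So the instance methods of a category also end up stored in the Person class object. We will confirm this from the source code later.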
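The lookup described above can be sketched in plain C. This is only a toy model, not the runtime's code: struct method, struct class_object, struct instance and lookup are hypothetical names invented for illustration, and the real runtime also uses method caches and walks the superclass chain, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the runtime's structures. */
typedef void (*imp_fn)(void);

struct method {
    const char *selector;   /* method name, e.g. "run" */
    imp_fn imp;             /* function that implements it */
};

struct class_object {
    const char *name;
    const struct method *methods;   /* every instance method of the class */
    size_t method_count;
};

struct instance {
    const struct class_object *isa; /* each instance points at its one class object */
};

/* Follow isa to the class object, then search its method list by name. */
static imp_fn lookup(const struct instance *obj, const char *selector) {
    assert(obj != NULL && obj->isa != NULL);
    const struct class_object *cls = obj->isa;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].selector, selector) == 0) {
            return cls->methods[i].imp;
        }
    }
    return NULL; /* not found in this toy model */
}

static void person_run(void) {}
static void person_test(void) {}
static void person_eat(void) {}

/* One class object holds run together with the category methods test and eat. */
static const struct method person_methods[] = {
    {"test", person_test},
    {"eat", person_eat},
    {"run", person_run},
};

static const struct class_object person_class = {"Person", person_methods, 3};
```

With an instance whose isa points at person_class, lookup finds run, test and eat in the same place, which is the behavior the article goes on to confirm in the runtime source.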

II. Underlying structure

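Part I established that a category's instance methods end up in the class. More precisely, a category's instance methods are merged into the class object, and a category's class methods are merged into the metaclass object. When does this merge happen? Does the compiler do it for us at compile time? In fact, the merge happens at runtime.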

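Next we convert the Objective-C code to C++ source to take a look at the underlying structure of a Category. In a terminal, change into the folder that contains Person+Test.m and run clang -rewrite-objc Person+Test.m. This converts Person+Test.m into the C++ source file Person+Test.cpp.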

Open the .cpp file. It is very long, so scroll straight to the bottom and find the _category_t struct. This struct is the layout of every category:

struct _category_t {
    const char *name; // class name
    struct _class_t *cls;
    const struct _method_list_t *instance_methods; // instance method list
    const struct _method_list_t *class_methods; // class method list
    const struct _protocol_list_t *protocols;  // protocol list
    const struct _prop_list_t *properties;  // property list
};

Further down we find the initialization of this struct:

static struct _category_t _OBJC_$_CATEGORY_Person_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = 
{
    "Person",
    0, // &OBJC_CLASS_$_Person,
    (const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_Person_$_Test,
    (const struct _method_list_t *)&_OBJC_$_CATEGORY_CLASS_METHODS_Person_$_Test,
    0,
    0,
};

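The variable name _OBJC_$_CATEGORY_Person_$_Test tells us that this is the initializer for the Person+Test category. The name field is "Person", the instance method list is &_OBJC_$_CATEGORY_INSTANCE_METHODS_Person_$_Test, the class method list is &_OBJC_$_CATEGORY_CLASS_METHODS_Person_$_Test, and the remaining fields are initialized to 0.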

Next we find the _OBJC_$_CATEGORY_INSTANCE_METHODS_Person_$_Test structure:

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_INSTANCE_METHODS_Person_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    1,
    {{(struct objc_selector *)"test", "v16@0:8", (void *)_I_Person_Test_test}}
};

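We can see that this structure contains one instance method, test, which is exactly the instance method declared in the Person+Test category.

Then we find the _OBJC_$_CATEGORY_CLASS_METHODS_Person_$_Test structure: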

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_CLASS_METHODS_Person_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    1,
    {{(struct objc_selector *)"test2", "v16@0:8", (void *)_C_Person_Test_test2}}
};

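This structure likewise contains one class method, test2, which is a class method of Person+Test (the listing in part I showed only the instance method test).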

III. Merging via the runtime

Since the whole merge is carried out by the runtime, understanding it means reading the runtime source. The path through the source is as follows:

1. Find the file objc-os.mm. This file is the entry point of the runtime.
2. In objc-os.mm, find the function _objc_init(void). This function initializes the runtime.
3. _objc_init(void) calls _dyld_objc_notify_register(&map_images, load_images, unmap_image);. One of the arguments is map_images, so we step into it.
4. Inside map_images we see a call to map_images_nolock(count, paths, mhdrs);, so we step into that function.
5. map_images_nolock(unsigned mhCount, const char * const mhPaths[], const struct mach_header * const mhdrs[]) is very long. Scroll to the bottom of the function, find the call _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses); and step into it.
6. void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses) roughly means "read the images", that is, the loaded modules.

This function is also very long. Roughly in the middle we find this comment:

// Discover categories.

This is essentially the part we are looking for: the code that handles categories.

Below the comment we find these lines:

if (cls->isRealized()) {
    remethodizeClass(cls);
    classExists = YES;
}

if (cls->ISA()->isRealized()) {
    remethodizeClass(cls->ISA());  // the class's isa pointer points to the metaclass object
}

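The key function here is remethodizeClass. Judging by its name, it reorganizes the methods of a class: when it is passed a class it reorganizes the instance methods, and when it is passed a metaclass it reorganizes the class methods.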

7. Then we step into this function:

static void remethodizeClass(Class cls)
{
    category_list *cats;
    bool isMeta;
    
    runtimeLock.assertWriting();
    
    isMeta = cls->isMetaClass();
    
    // Re-methodizing: check for more categories
    if ((cats = unattachedCategoriesForClass(cls, false/*not realizing*/))) {
        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s", 
                         cls->nameForLogging(), isMeta ? "(meta)" : "");
        }
        
        attachCategories(cls, cats, true /*flush caches*/);        
        free(cats);
    }
}

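The core of this code is the call attachCategories(cls, cats, true /*flush caches*/);. It is passed a class, cls, and all of that class's not-yet-attached categories, cats.

8. We step into attachCategories(cls, cats, true /*flush caches*/);. This is essentially the function that does the real work.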

static void 
attachCategories(Class cls, category_list *cats, bool flush_caches)
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);
    
    bool isMeta = cls->isMetaClass();
    
    // fixme rearrange to remove these intermediate allocations
    // array of method lists
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists));
    // array of property lists
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists));
    // array of protocol lists
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists));
        
    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int propcount = 0;
    int protocount = 0;
    int i = cats->count;
    bool fromBundle = NO;
    while (i--) {
        // take one category
        auto& entry = cats->list[i];
        // pick instance methods or class methods, depending on isMeta
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            mlists[mcount++] = mlist;
            fromBundle |= entry.hi->isBundle();
        }
        
        property_list_t *proplist = 
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            proplists[propcount++] = proplist;
        }
        
        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) {
            protolists[protocount++] = protolist;
        }
    }
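    // get the data stored in the class object
    auto rw = cls->data();
    
    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    // attach the methods of all categories to the method list of the class (or metaclass)
    rw->methods.attachLists(mlists, mcount);
    free(mlists);
    if (flush_caches  &&  mcount > 0) flushCaches(cls);
    // attach the properties of all categories to the property list
    rw->properties.attachLists(proplists, propcount);
    free(proplists);
    
    // attach the protocols of all categories to the protocol list
    rw->protocols.attachLists(protolists, protocount);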
    free(protolists);
}

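bool isMeta = cls->isMetaClass(); determines whether cls is a class or a metaclass.
Next, the function allocates the overall arrays for methods, properties and protocols: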

    // array of method lists
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists));
    // array of property lists
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists));
    // array of protocol lists
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists));

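Here mlists, proplists and protolists are each declared with two asterisks (**), so each one is an array of pointers to lists, which can be pictured as a two-dimensional array. The first-level elements of the three arrays are method lists, property lists and protocol lists respectively. Every Category has one method list, one property list and one protocol list: the method list holds that category's methods, and the property list holds that category's properties. So mlists ends up holding all the methods of all the categories.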

Filling the arrays created above:

 while (i--) {
        // take one category
        auto& entry = cats->list[i];
        // pick instance methods or class methods, depending on isMeta
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            mlists[mcount++] = mlist;
            fromBundle |= entry.hi->isBundle();
        }
        
        property_list_t *proplist = 
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            proplists[propcount++] = proplist;
        }
        
        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) {
            protolists[protocount++] = protolist;
        }
    }

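This code is straightforward: a while loop walks through all the categories, takes each category's method list (and likewise its property list and protocol list), and stores it in the arrays created earlier. Because the loop counts backwards, the last category in cats comes first in the arrays.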
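To make the backwards walk concrete, here is a minimal sketch in plain C. It is a toy model rather than the runtime's code: struct method_list, struct category, methods_for_meta and collect_lists are hypothetical names, and a method list is reduced to a label so that the resulting order is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in: a method list is reduced to a label. */
struct method_list {
    const char *label;
};

struct category {
    const struct method_list *instance_methods; /* may be NULL */
    const struct method_list *class_methods;    /* may be NULL */
};

/* Pick the list that matters for a class (is_meta == 0) or a metaclass. */
static const struct method_list *methods_for_meta(const struct category *cat,
                                                  int is_meta) {
    return is_meta ? cat->class_methods : cat->instance_methods;
}

/*
 * Walk the categories from last to first and gather their non-NULL lists.
 * Returns a malloc'd array of pointers (the caller frees it) and stores
 * the number of gathered lists in *out_count.
 */
static const struct method_list **collect_lists(const struct category *cats,
                                                size_t count, int is_meta,
                                                size_t *out_count) {
    const struct method_list **lists = (const struct method_list **)
        malloc((count ? count : 1) * sizeof(*lists));
    assert(lists != NULL);

    size_t n = 0;
    size_t i = count;
    while (i--) {  /* count backwards, as the runtime loop does */
        const struct method_list *ml = methods_for_meta(&cats[i], is_meta);
        if (ml) lists[n++] = ml;
    }
    *out_count = n;
    return lists;
}
```

Given two categories stored in the order Person+Test, Person+Eat, collect_lists with is_meta set to 0 returns the Person+Eat list first and the Person+Test list second. A category without class methods simply contributes nothing when is_meta is 1.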

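auto rw = cls->data(); fetches the data stored in the class object.
rw->methods.attachLists(mlists, mcount); attaches the methods of all the categories to the class's method list.
9. We step into attachLists to see how it is implemented: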

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;
        
        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;
            memmove(array()->lists + addedCount, array()->lists, 
                    oldCount * sizeof(array()->lists[0]));
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }

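The addedLists parameter is the collection gathered above: the instance method lists (or class method lists) of all the categories of this class. addedCount is the number of lists in addedLists. Suppose the class has two categories and each category has two methods; addedLists then looks roughly like this: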

[

[method, method],

[method, method]

]

addedCount = 2

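Now look at the structure of the class's method-list array before the merge:

[Figure: the class's method-list array before the merge, holding only the class's own method list]

So oldCount = 1.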

setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
array()->count = newCount;

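This statement reallocates memory. The categories' method lists are about to be merged in, so the previously allocated block is no longer large enough. After reallocation there is room for newCount = oldCount + addedCount = 3 entries:

[Figure: the array after realloc, with three slots and the class's own method list still in the first slot]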

memmove(array()->lists + addedCount, array()->lists, 
                    oldCount * sizeof(array()->lists[0]));

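memmove copies the data at its second argument to the location given by its first argument. Here it moves the class's original method list from the first slot to the third slot (index addedCount = 2).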

memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));

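memcpy likewise copies from its second argument to its first. Here it copies addedLists into the front of the array. After the copy the memory looks like this:

[Figure: the array after memcpy, with the two category method lists first and the class's own method list last]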

At this point the categories' method lists have been merged into the class's method list.
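The realloc, memmove and memcpy sequence above can be tried out in isolation. The following is a minimal sketch in plain C, assuming a simplified container: struct list_array and attach_lists are hypothetical names, each method list is represented by a string label, and only the "many lists -> many lists" branch is modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the class's growable array of method lists. */
struct list_array {
    const char **lists;  /* each entry stands for one method list */
    size_t count;
};

/* Prepend added_count entries, keeping the existing entries in order behind them. */
static void attach_lists(struct list_array *arr,
                         const char *const *added, size_t added_count) {
    if (added_count == 0) return;

    size_t old_count = arr->count;
    size_t new_count = old_count + added_count;

    /* Grow the block; the old entries are still at the front afterwards. */
    const char **grown = (const char **)
        realloc(arr->lists, new_count * sizeof(*grown));
    assert(grown != NULL);
    arr->lists = grown;
    arr->count = new_count;

    /* Shift the old entries toward the end. Source and destination can
       overlap inside the same block, so memmove is the safe call here. */
    memmove(arr->lists + added_count, arr->lists,
            old_count * sizeof(*arr->lists));

    /* Copy the new entries into the vacated front slots. The two blocks
       are distinct, so memcpy is enough. */
    memcpy(arr->lists, added, added_count * sizeof(*arr->lists));
}
```

Starting from an array that holds only "Person" and attaching {"Person+Eat", "Person+Test"} leaves the array as "Person+Eat", "Person+Test", "Person": the added lists in front, the original list last.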

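The merge process also explains what happens when a category and its class define the same method. The class's method is not overwritten. The category's method is simply placed ahead of the class's method, so the lookup finds the category's method first, and that is the one that runs.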
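A first-match search over the merged lists shows why the category wins without anything being overwritten. This is again a hedged sketch in plain C, not the runtime's lookup code: struct method, struct method_list and find_method are hypothetical names, and the owner field exists only so we can see which implementation was found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration. */
struct method {
    const char *selector;
    const char *owner;   /* where the implementation came from */
};

struct method_list {
    const struct method *methods;
    size_t count;
};

/* Search the lists front to back and return the first matching method. */
static const struct method *find_method(const struct method_list *const *lists,
                                        size_t list_count,
                                        const char *selector) {
    assert(lists != NULL || list_count == 0);
    for (size_t i = 0; i < list_count; i++) {
        for (size_t j = 0; j < lists[i]->count; j++) {
            if (strcmp(lists[i]->methods[j].selector, selector) == 0) {
                return &lists[i]->methods[j];
            }
        }
    }
    return NULL;  /* no list contains the selector */
}
```

If the category list (placed first) and the class list (placed last) both contain run, find_method returns the category's entry, while the class's entry is still present further back in the array.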

IV. Summary

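1. The runtime loads all the Category data for a class.

2. The methods, properties and protocols of all the categories are gathered into arrays. A Category that is compiled later sits nearer the front of the array.

3. The merged category data (methods, properties, protocols) is inserted in front of the class's original data.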

Author: 雪山飞狐_91ae

Link: https://www.jianshu.com/p/da463f413de7
