Friday Q&A 2009-03-13: Intro to the Objective-C Runtime

source link: https://www.mikeash.com/pyblog/friday-qa-2009-03-13-intro-to-the-objective-c-runtime.html
by Mike Ash  

Welcome back to another Friday Q&A, on another Friday the 13th. This week I'm going to take Oliver Mooney's suggestion and talk about the Objective-C runtime, how it works, and what it can do for you.

Many Cocoa programmers are only vaguely aware of the Objective-C runtime. They know it's there (and some don't even know this!), that it's important, and that you can't run Objective-C without it, but that's about where their knowledge stops.

Today I want to run through exactly how Objective-C works at the runtime level and what kinds of things you can do with it.

(Note: I'll be talking only about Apple's runtime on 10.5 and later. The runtime on 10.4 and earlier is missing many APIs, instead forcing direct structure access, and the runtimes for GNU and Cocotron are different beasts entirely.)

Objects
In Objective-C we work with objects all the time, but what is an object? Well, let's take a look and construct something that will tell us about them.

First, we know that objects are referred to using pointers, like NSObject *. And we know that we create them using the +alloc method. The documentation for that just says that it calls +allocWithZone:. Following the chain of documentation a bit further, we discover NSDefaultMallocZone and see that they're just allocated using malloc. Easy!

But what do they look like when they're allocated? Let's find out:

    #import <Foundation/Foundation.h>
    #import <malloc/malloc.h>   // declares malloc_size
    
    // Three classes in a chain, each adding one public ivar.
    @interface A : NSObject { @public int a; } @end
    @implementation A @end
    @interface B : A { @public int b; } @end
    @implementation B @end
    @interface C : B { @public int c; } @end
    @implementation C @end
    
    int main(int argc, char **argv)
    {
        [NSAutoreleasePool new];
        
        C *obj = [[C alloc] init];
        // Recognizable bit patterns so each ivar stands out in the dump.
        obj->a = 0xaaaaaaaa;
        obj->b = 0xbbbbbbbb;
        obj->c = 0xcccccccc;
        
        // Copy the object's entire malloc block and print it as hex.
        NSData *objData = [NSData dataWithBytes:obj length:malloc_size(obj)];
        NSLog(@"Object contains %@", objData);
        
        return 0;
    }

We construct a class hierarchy that just has some instance variables, then we put obvious values into each ivar. Then we extract the data in nice printable form using malloc_size to get the right length, and use NSData to print a nice hex representation. Here's what we get:

    2009-01-27 15:58:04.904 a.out[22090:10b] Object contains <20300000 aaaaaaaa bbbbbbbb cccccccc>

We can see here that the object's ivars just get laid out sequentially in memory. First you have A's ivar, then B's, then C's. Easy!
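The same layout can also be read from the runtime's own metadata instead of a raw memory dump. The following is a minimal sketch, assuming the 10.5+ runtime API; `PrintIvarLayout` is a hypothetical helper name, not something from the article.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: logs every ivar of cls and its superclasses, root
// class first, with the byte offset the runtime reports for each one.
// Returns the total number of ivars found.
static unsigned int PrintIvarLayout(Class cls)
{
    if (cls == Nil)
        return 0;

    // Recurse first so the root class's ivars are printed first.
    unsigned int total = PrintIvarLayout(class_getSuperclass(cls));

    unsigned int count = 0;
    Ivar *ivars = class_copyIvarList(cls, &count);   // cls's own ivars only
    for (unsigned int i = 0; i < count; i++) {
        NSLog(@"%s.%s at offset %ld",
              class_getName(cls),
              ivar_getName(ivars[i]),
              (long)ivar_getOffset(ivars[i]));
    }
    free(ivars);   // the caller owns the copied list
    return total + count;
}
```

Calling `PrintIvarLayout([C class])` with the classes above should list isa first, followed by a, b and c at increasing offsets. The exact offsets depend on whether the program is built 32-bit or 64-bit.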

But what's this 20300000 thing at the beginning? Well, it comes before A's ivar, so it must be NSObject's. Let's look at NSObject's definition:

    /***********	Base class		***********/
    
    @interface NSObject  {
        Class	isa;
    }

Sure enough, there's another ivar. But what's this Class business? If we tell Xcode to take us to the definition we find ourselves in /usr/include/objc/objc.h, which contains:

    typedef struct objc_class *Class;

And following it further we get to /usr/include/objc/runtime.h which contains:

    struct objc_class {
        Class isa;
    
    #if !__OBJC2__
        Class super_class                                        OBJC2_UNAVAILABLE;
        const char *name                                         OBJC2_UNAVAILABLE;
        long version                                             OBJC2_UNAVAILABLE;
        long info                                                OBJC2_UNAVAILABLE;
        long instance_size                                       OBJC2_UNAVAILABLE;
        struct objc_ivar_list *ivars                             OBJC2_UNAVAILABLE;
        struct objc_method_list **methodLists                    OBJC2_UNAVAILABLE;
        struct objc_cache *cache                                 OBJC2_UNAVAILABLE;
        struct objc_protocol_list *protocols                     OBJC2_UNAVAILABLE;
    #endif
    
    } OBJC2_UNAVAILABLE;

So a Class is a pointer to a structure which... starts with another Class.

Let's look at another root class, NSProxy:

    @interface NSProxy  {
        Class	isa;
    }

It's there too. Let's look in one more place, the definition of id, the Objective-C type for "any object":

    typedef struct objc_object {
        Class isa;
    } *id;

There it is again. Clearly every single Objective-C object must start with Class isa, even class objects. But what is it?

As the name and type imply, the isa ivar indicates what class a particular object is. Every Objective-C object must begin with an isa pointer, otherwise the runtime won't know how to work with it. Everything about a particular object's type is wrapped up in that one little pointer. The remainder of an object is basically just a big blob and as far as the runtime is concerned, it is irrelevant. It's up to the individual classes to give that blob meaning.
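To see the isa pointer at work without touching the structures directly, one can ask the runtime to follow it. This is a small sketch, assuming the 10.5+ runtime functions `object_getClass` and `class_isMetaClass`; `DescribeIsaChain` is a made-up name. It goes one step beyond the text above by following the class object's own isa, which leads to what the runtime calls a metaclass.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: follows isa from an instance to its class, then
// from the class object to whatever its own isa points at.
static void DescribeIsaChain(id obj)
{
    Class cls = object_getClass(obj);     // the instance's isa
    Class meta = object_getClass(cls);    // the class object's isa
    NSLog(@"instance of %s; its class object is an instance of a %s named %s",
          class_getName(cls),
          class_isMetaClass(meta) ? "metaclass" : "class",
          class_getName(meta));
}
```

Passing any object works, for example `DescribeIsaChain([[NSObject alloc] init])`. For class clusters such as NSString the reported class is likely to be a private subclass and not the public one.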

Classes
What exactly do classes contain, then? The "unavailable" structure members give a good clue. (They're there for compatibility with the pre-Leopard runtime, and you shouldn't use them if you're targeting Leopard, but they still tell us what kind of information is there.) First comes the isa, which allows a class to act like an object as well. There's a pointer to the superclass, giving the proper class hierarchy. Some other basic information about the class follows. At the end is the really interesting stuff. There's a list of instance variables, a list of methods, and a list of protocols. All of this stuff is accessible at runtime, and can be modified at runtime too.
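As a rough illustration of reading that information through functions and not through the struct, here is a sketch that dumps a class's superclass, instance size, and method list. `DumpClass` is a hypothetical name; the runtime calls themselves are part of the 10.5+ API.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: logs basic facts about cls plus the instance methods
// it implements itself (inherited methods are not in this list).
// Returns the number of methods found.
static unsigned int DumpClass(Class cls)
{
    Class superclass = class_getSuperclass(cls);
    NSLog(@"%s : %s, instances are %lu bytes",
          class_getName(cls),
          superclass ? class_getName(superclass) : "(root class)",
          (unsigned long)class_getInstanceSize(cls));

    unsigned int count = 0;
    Method *methods = class_copyMethodList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        NSLog(@"  -%s  (type encoding %s)",
              sel_getName(method_getName(methods[i])),
              method_getTypeEncoding(methods[i]));
    }
    free(methods);   // the caller owns the copied list
    return count;
}
```

Class methods live on the class object's own class, so `DumpClass(object_getClass([NSObject class]))` lists those. Protocols can be walked the same way with `class_copyProtocolList`.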

I skipped right over the cache member because it's not really useful for runtime manipulation, but it's an interesting exposure of an implementation detail. Every time you send a message ([foo bar]) the runtime has to look up the actual code to invoke by rummaging through the list of methods in the target object's class. However, methods are stored in big linear lists by default, so this is really slow. The cache is just a hash table mapping selectors to code. The first time you send a message you'll get a slow, time-consuming lookup, but the result is put in the hash table. Subsequent calls will find the entry in the hash table, making the process go much faster.
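The idea can be sketched with a toy lookup that puts a small hash table in front of a linear search. This is not the runtime's actual implementation: the real cache is per-class and far more careful, whereas this sketch uses one global, fixed-size table with made-up names (`SlowLookup`, `CachedLookup`, `gCache`) and simply overwrites entries on collision. It assumes a build without ARC, like the listing earlier in the article.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

enum { kCacheSize = 64 };   // power of two so the hash can be masked

typedef struct { Class cls; SEL sel; IMP imp; } CacheEntry;
static CacheEntry gCache[kCacheSize];   // toy: one table shared by all classes

// Linear search through the method lists of cls and its superclasses.
static IMP SlowLookup(Class cls, SEL sel)
{
    for (Class c = cls; c != Nil; c = class_getSuperclass(c)) {
        unsigned int count = 0;
        Method *methods = class_copyMethodList(c, &count);
        IMP found = NULL;
        for (unsigned int i = 0; i < count && found == NULL; i++) {
            if (sel_isEqual(method_getName(methods[i]), sel))
                found = method_getImplementation(methods[i]);
        }
        free(methods);
        if (found != NULL)
            return found;
    }
    return NULL;
}

// Checks the hash table first and falls back to the slow path on a miss.
static IMP CachedLookup(Class cls, SEL sel)
{
    uintptr_t hash = ((uintptr_t)cls ^ (uintptr_t)sel) >> 4;
    CacheEntry *entry = &gCache[hash & (kCacheSize - 1)];
    if (entry->cls == cls && entry->sel == sel)
        return entry->imp;              // hit: no list walking needed

    IMP imp = SlowLookup(cls, sel);     // miss: do the expensive search
    entry->cls = cls;
    entry->sel = sel;
    entry->imp = imp;                   // remember it for next time
    return imp;
}
```

The first `CachedLookup([NSObject class], @selector(description))` walks the method lists; repeating the call is answered from the table.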

Looking at the rest of runtime.h you'll see a lot of functions for accessing and manipulating these properties. Each function is prefixed with what it operates on. General runtime functions start with objc_, functions that operate on a class start with class_, and so forth. For example, you can call class_getInstanceMethod to get information about a particular method, like the argument/return types. Or you can call class_addMethod to add a new method to an existing class at runtime. You can even create a whole new class at runtime by using objc_allocateClassPair.
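Here is a minimal sketch of the last two functions working together: it builds a new NSObject subclass at runtime and attaches one method to it. The class name `RuntimeGreeter`, the selector `greeting`, and both function names are invented for this example.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// A plain C function used as a method body. The first two parameters are
// the hidden self and _cmd arguments every Objective-C method receives.
static NSString *GreeterGreeting(id self, SEL _cmd)
{
    return [NSString stringWithFormat:@"Hello from %s",
            class_getName(object_getClass(self))];
}

// Creates (or returns the previously created) RuntimeGreeter class.
static Class MakeGreeterClass(void)
{
    Class cls = objc_allocateClassPair([NSObject class], "RuntimeGreeter", 0);
    if (cls == Nil)
        return objc_getClass("RuntimeGreeter");   // name already registered

    // Type encoding "@@:" means: returns an object, takes self and _cmd.
    class_addMethod(cls, sel_registerName("greeting"),
                    (IMP)GreeterGreeting, "@@:");

    objc_registerClassPair(cls);   // the class is usable only after this
    return cls;
}
```

Because the compiler has never seen a declaration for `greeting`, the simplest way to call it is dynamically: `[[[MakeGreeterClass() alloc] init] performSelector:sel_registerName("greeting")]`.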

Practical Applications
There are tons of useful things that can be done with this kind of runtime meta-information, but here are some ideas.

  1. Automatic ivar/method searches. Apple's Key-Value Coding does this kind of thing already: you give it a name, and it looks up a method or ivar based on that name and does some stuff with it. You can do that kind of thing yourself, in case you need to look up an ivar based on a name or something of the sort.
  2. Automatically register/invoke subclasses. Using objc_getClassList you can get a list of all classes currently known to the runtime, and by tracing out the class hierarchy, you can identify which ones subclass a given class. This can let you write subclasses to handle specialized data formats or other such situations and let the superclass look them up without having to tediously register every subclass manually.
  3. Automatically call a method on every class. This can be useful for custom unit testing frameworks and the like. Similar to #2, but look for a method being implemented rather than a particular class hierarchy.
  4. Override methods at runtime. The runtime provides a complete set of tools for re-pointing methods to custom implementations so that you can change what classes do without touching their source code.
  5. Automatically deallocate synthesized properties. The @synthesize keyword is handy for making the compiler generate setters/getters but it still forces you to write cleanup code in -dealloc. By reading meta-information about the class's properties, you can write code that will go through and clean up all synthesized properties automatically instead of having to write code for each case.
  6. Bridging. By dynamically generating classes at runtime, and by looking up the necessary properties on demand, you can create a bridge between Objective-C and another (sufficiently dynamic) language.
  7. Much more. Don't feel limited to the above; come up with your own ideas!
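Two of the ideas above, finding subclasses (#2) and overriding methods (#4), can be sketched in a few lines each. The helper names `SubclassesOf` and `SwapInstanceMethods` are hypothetical, and the code assumes a build without ARC, like the listing earlier in the article.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Idea 2: names of every registered class that descends from parent.
static NSArray *SubclassesOf(Class parent)
{
    int count = objc_getClassList(NULL, 0);      // first call: how many?
    Class *classes = (Class *)malloc(sizeof(Class) * count);
    int total = objc_getClassList(classes, count);
    if (total < count)
        count = total;   // never read past what was actually filled in

    NSMutableArray *result = [NSMutableArray array];
    for (int i = 0; i < count; i++) {
        // Walk superclass pointers instead of messaging the class, since
        // not every class in the list descends from NSObject.
        Class c = class_getSuperclass(classes[i]);
        while (c != Nil && c != parent)
            c = class_getSuperclass(c);
        if (c != Nil)
            [result addObject:NSStringFromClass(classes[i])];
    }
    free(classes);
    return result;
}

// Idea 4: swap the code behind two instance methods of cls.
static BOOL SwapInstanceMethods(Class cls, SEL first, SEL second)
{
    Method a = class_getInstanceMethod(cls, first);
    Method b = class_getInstanceMethod(cls, second);
    if (a == NULL || b == NULL)
        return NO;       // one of the methods does not exist
    method_exchangeImplementations(a, b);
    return YES;
}
```

`SubclassesOf([NSArray class])` should include NSMutableArray along with a number of private classes. One caution about `SwapInstanceMethods`: `class_getInstanceMethod` also finds inherited methods, so swapping a method that cls merely inherits changes the superclass for everyone.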

Wrapping Up
Objective-C is a powerful language and the comprehensive runtime API is an extremely useful part of it. While it may be a bit ugly groveling around in all that C code, it's really not that difficult to work with, and it's well worth the power it provides.

That's it for this week's Friday Q&A. Please send in your suggestions, either by posting them below or by e-mail (tell me if you don't want me to use your name). Friday Q&A runs on your suggestions so please write in!

Have a favorite use of the ObjC runtime? Something you dislike about it? Have a tip to share? Post it all below.


Comments:

Nice post Mike! I would be lax in my duties if I did not mention that while the Cocotron runtime is a different implementation, the goal is to be compatible with Apple's API. The function-based interface of ObjC 2.0 makes this all nicer and easier as we can be a little different under the hood if needed.
If anyone is interested, I have an implementation of "#5, Automatically deallocating synthesized properties."

http://vgable.com/blog/2008/12/20/automatically-freeing-every-property/

In retrospect I was probably overly conservative with pointers, but I've yet to use pointer-properties.
Thanks for the note about Cocotron. Your web site sure has a lot of stuff on it.
So where is version 2 of the runtime storing all the data about the class if not in the objc_class struct?
The 2.0 runtime is still storing everything in the objc_class struct. This is a necessity in 32-bit mode, as old code which directly manipulates these structs still has to work, so all of the data still has to be in the same place. In 64-bit mode it could be moved around, but I'll bet that it's still there. They've simply removed it from the definition to make all of this stuff "private".
It seems to me that the #if !__OBJC2__ ... #endif would erase all those declarations in the objc_class struct if compiled with the version 2 runtime. Am I missing something?
What you're missing is that they're just declarations. If you remove them, the data is still there, the compiler just won't let you access it by name anymore. The declarations in the public headers are not related to the actual structures created and used by the runtime in its implementation.
"The declarations in the public headers are not related to the actual structures created and used by the runtime in its implementation."
Doesn't this mean that the struct available in runtime.h is a different size than whatever the runtime is using internally? Won't this break pointer arithmetic for struct objc_class* types? Is that just something that nothing outside of the runtime itself should ever need to do?
Yes to both. But it's not an issue, because you'll never have an array of struct objc_class. The semantics are like ObjC object pointers, and should be treated as opaque references rather than something you can do arithmetic with. You can't do pointer arithmetic on NSObject * either, and it doesn't matter for the same reasons.
Mike, thanks for writing this article! It answered a lot of questions I had. Apple seem to have done a fair bit of work getting the runtime to be more accessible to other languages. The provision of functions to interface with runtime structures, and BridgeSupport coupled with libffi, makes interfacing much easier than in previous OS X incarnations.
