Weak 弱引用实现原理
Objective-C __weak
弱引用实现原理
前言
为什么要理解weak修饰符的原理?这是在阅读源码和其它博客之前我的疑问,有的时候,问题的答案只有在做了之后才会出现
- 考虑YYCache这段代码
@interface _YYLinkedMapNode : NSObject {
__unsafe_unretained _YYLinkedMapNode *_prev; // retained by dic
__unsafe_unretained _YYLinkedMapNode *_next; // retained by dic
id _key;
id _value;
NSUInteger _cost;
NSTimeInterval _time;
}
@end
为什么要用__unsafe_unretained修饰?在学习完weak源码后,其实就有了答案: __weak修饰符的性能会弱于__unsafe_unretained,因为runtime会维护一系列的函数、数据结构和锁来实现weak,而__unsafe_unretained则不用 所以下面的场景可以考虑使用__unsafe_unretained:
- 大量调用,性能敏感
- 生命周期可控,确定不会出现访问悬挂指针(对象在被释放后指针还有可能被访问)
1. objc_initWeak
__weak Object *obj = foo;
obj = bar;
// 第一行代码会被Clang编译器解析为objc_initWeak(&obj, foo);
// 第二行代码会被Clang编译器解析为objc_storeWeak(&obj, bar);
id
objc_initWeak(id *location, id newObj)
{
if (!newObj) {
*location = nil;
return nil;
}
return storeWeak<DontHaveOld, DoHaveNew, DoCrashIfDeallocating>
(location, (objc_object*)newObj);
}
- initWeak很简单,判断新值是不是nil,调用storeWeak
- 和直接调用storeWeak的区别是,这里的haveOld会不同,initWeak函数调用的时候haveOld == false,而storeWeak的时候haveOld为true(这样要注意,initWeak的参数名不同,haveOld是storeWeak函数的参数名)
- C++模版函数声明,这三个参数都是编译器常量,会进行CPU条件预测优化
2. objc_storeWeak
enum CrashIfDeallocating {
DontCrashIfDeallocating = false, DoCrashIfDeallocating = true
};
template <HaveOld haveOld, HaveNew haveNew,
enum CrashIfDeallocating crashIfDeallocating>
static id
storeWeak(id *location, objc_object *newObj)
{
ASSERT(haveOld || haveNew);
if (!haveNew) ASSERT(newObj == nil);
Class previouslyInitializedClass = nil;
id oldObj;
SideTable *oldTable;
SideTable *newTable;
// Acquire locks for old and new values.
// Order by lock address to prevent lock ordering problems.
// Retry if the old value changes underneath us.
retry:
if (haveOld) {
oldObj = *location;
oldTable = &SideTables()[oldObj];
} else {
oldTable = nil;
}
if (haveNew) {
newTable = &SideTables()[newObj];
} else {
newTable = nil;
}
SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);
if (haveOld && *location != oldObj) {
SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
goto retry;
}
// Prevent a deadlock between the weak reference machinery
// and the +initialize machinery by ensuring that no
// weakly-referenced object has an un-+initialized isa.
if (haveNew && newObj) {
Class cls = newObj->getIsa();
if (cls != previouslyInitializedClass &&
!((objc_class *)cls)->isInitialized())
{
SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
class_initialize(cls, (id)newObj);
// If this class is finished with +initialize then we're good.
// If this class is still running +initialize on this thread
// (i.e. +initialize called storeWeak on an instance of itself)
// then we may proceed but it will appear initializing and
// not yet initialized to the check above.
// Instead set previouslyInitializedClass to recognize it on retry.
previouslyInitializedClass = cls;
goto retry;
}
}
// Clean up old value, if any.
if (haveOld) {
weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
}
// Assign new value, if any.
if (haveNew) {
newObj = (objc_object *)
weak_register_no_lock(&newTable->weak_table, (id)newObj, location,
crashIfDeallocating ? CrashIfDeallocating : ReturnNilIfDeallocating);
// weak_register_no_lock returns nil if weak store should be rejected
// Set is-weakly-referenced bit in refcount table.
if (!_objc_isTaggedPointerOrNil(newObj)) {
newObj->setWeaklyReferenced_nolock();
}
// Do not set *location anywhere else. That would introduce a race.
*location = (id)newObj;
}
else {
// No new value. The storage is not changed.
}
SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
// This must be called without the locks held, as it can invoke
// arbitrary code. In particular, even if _setWeaklyReferenced
// is not implemented, resolveInstanceMethod: may be, and may
// call back into the weak reference machinery.
callSetWeaklyReferenced((id)newObj);
return (id)newObj;
}
-
创建oldTable和newTable的自动变量,根据判断把oldObj和newObj分别存入对应的table,这里有错误,后面解释
- 这里要注意的是,当OldObj有值的时候,这是一次赋值操作(非initWeak进入),所以此时location指向的还是旧值的地址,因为新的赋值还没有执行完成
- 对oldTable和newTable加锁,防止竞态
- 判断isa指针是否初始化完成,判断initialize方法有没有调用完成;如果没有,先解锁,防止initialize的时候执行了storeWeak,initialize可能会和storeWeak用到同样的SideTable,这个时候就会发生死锁;解锁后执行initialize方法,然后retry
- 不能直接从lockTwo开始,因为如果其它线程替换了obj,可能拿到错误的值
- 如果haveOld,解绑
- 有新值,把新的对象注册进sideTable里,如果对象不是Tagged Pointer,把对象refcount表中被weak引用标记位置1
- 发送对象被设置为weak的通知
3. SideTable
struct SideTable {
spinlock_t slock;
RefcountMap refcnts;
weak_table_t weak_table;
void lock() { slock.lock(); }
void unlock() { slock.unlock(); }
void reset() { slock.reset(); }
// Address-ordered lock discipline for a pair of side tables.
template<HaveOld, HaveNew>
static void lockTwo(SideTable *lock1, SideTable *lock2);
template<HaveOld, HaveNew>
static void unlockTwo(SideTable *lock1, SideTable *lock2);
};
- 三个字段分别是:自旋锁,引用计数的hash表,weak引用计数的全局表
在之前的代码可以看到,SideTable的赋值是这样的
oldTable = &SideTables()[oldObj];
在objc4的源码中,SideTables的实现,和参考文章不同的是,objc4的实现已经删除了关键字reinterpret_cast,直接返回SideTablesMap.get()
static objc::ExplicitInit<StripedMap<SideTable>> SideTablesMap;
OBJC_EXTERN void *const objc_debug_side_tables_map = &SideTablesMap;
static StripedMap<SideTable>& SideTables() {
return SideTablesMap.get();
}
抛开C++关键字,和debug相关的代码,这段代码将会是这样
static <StripedMap<SideTable>> SideTablesMap;
static StripedMap<SideTable>& SideTables() {
return SideTablesMap.get();
}
可以看到,SideTablesMap和名称一样,就是一个SripedMap,里面存储多个SideTable
template<typename T>
class StripedMap {
#if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
enum { StripeCount = 8 };
#else
enum { StripeCount = 64 };
#endif
PaddedT array[StripeCount];
static unsigned int indexForPointer(const void *p) {
uintptr_t addr = reinterpret_cast<uintptr_t>(p);
return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
}
public:
T& operator[] (const void *p) {
return array[indexForPointer(p)].value;
}
const T& operator[] (const void *p) const {
return const_cast<StripedMap<T>>(this)[p];
}
};
- StripedMap
:分片数组,可以根据void \*也就是对象指针,自动哈希到某一个分片,每个分片存储一个SideTable - 分片数量由平台决定,比如这里iOS是8
- 每个分片都是PaddedT,内部有alignas(CacheLineSize)保证每个分片独占cache line,防止伪共享
- 通过重载operator[],给定一个对象指针,stripedMap就能直接返回对应分片的SideTable引用
- indexForPoint:给对象指针(内存地址)做移位,异或,取模运算,把类似地址的对象均匀分配到不同的分片
3.1 根据SideTable原理纠正之前的错误理解,加深记忆
oldTable = &SideTables()[oldObj];
之前对这行代码的解释是把oldObj存储到对应的SideTable,看完完整代码后,这行代码的实际用处是
- 把oldObj作为key,从全局的StripedMap
分片哈希表中,取出oldObj映射到的分片的SideTable的引用,然后返回该引用的地址
weak_register_no_lock(&newTable->weak_table, (id)newObj, location,
crashIfDeallocating ? CrashIfDeallocating : ReturnNilIfDeallocating);
- 而这行代码,才是根据newObj获取到的SideTable进行注册
3.2 StripedMap分片SideTable的意义
- 将对象映射到对应分片的SideTable,实现高并发下的数据隔离和高效访问
- 通过对单个分片SideTable加锁,而不是全局锁提高性能
3.3 RefcountMap
- refcountMap是一个哈希表,用来记录每个对象的扩展引用计数,就是对象地址 -> 引用计数的映射表,这些情况会用到refcountMap
- 当引用计数超过isa能承受的最大范围,会溢出到SideTable
- 对象被标记了has_assoc等属性,不能再依靠isa bits管理的时候
- refcountMap并不直接参与weak对象的创建,因为weak对象的创建不会增加引用计数
3.4 weak_table_t和weak_entry_t
struct weak_table_t {
weak_entry_t *weak_entries;
size_t num_entries;
uintptr_t mask;
uintptr_t max_hash_displacement;
};
- weak_table_t:SideTable里面专门用来存储所有weak引用关系的哈希表
- weak_entries:指向weak_entry_t的指针数组
- num_entries:记录有多少entry
- max/max_has_displacement:哈希表的优化字段
weak_table_t用于查找/管理所有活跃的weak引用的对象,内部为哈希表,保证高并发,高性能的插入,查找和删除
struct weak_entry_t {
DisguisedPtr<objc_object> referent;
union {
struct {
weak_referrer_t *referrers;
uintptr_t out_of_line_ness : 2;
uintptr_t num_refs : PTR_MINUS_2;
uintptr_t mask;
uintptr_t max_hash_displacement;
};
struct {
// out_of_line_ness field is low bits of inline_referrers[1]
weak_referrer_t inline_referrers[WEAK_INLINE_COUNT];
};
};
bool out_of_line() {
return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
}
weak_entry_t& operator=(const weak_entry_t& other) {
memcpy(this, &other, sizeof(other));
return *this;
}
weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
: referent(newReferent)
{
inline_referrers[0] = newReferrer;
for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
inline_referrers[i] = nil;
}
}
};
-
weak_entry_t:表示一组指向同一个对象的所有weak指针的集合
- referent被指向的对象
3.5 场景
- 假设有对象A,有4个地方声明了__weak,实际上就是4个变量内存地址指向A
- weak_table_t里有一个指向A的weak_entry_t
- weak_entry_t的referrers数组存储着这4个指针变量的地址
- 对象A被释放的时候,遍历referrers,把4个指针全部置为nil
4. weak_unregister_no_lock
void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
weak_entry_t *entry;
if (!referent) return;
if ((entry = weak_entry_for_referent(weak_table, referent))) {
remove_referrer(entry, referrer);
bool empty = true;
if (entry->out_of_line() && entry->num_refs != 0) {
empty = false;
}
else {
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
if (entry->inline_referrers[i]) {
empty = false;
break;
}
}
}
if (empty) {
weak_entry_remove(weak_table, entry);
}
}
// Do not set *referrer = nil. objc_storeWeak() requires that the
// value not change.
}
// 以这行代码辅助理解
// Clean up old value, if any.
if (haveOld) {
weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
}
- referent:weak引用的目标对象,类型为objc_object
- referrer:weak变量的位置,也就是这个对象的地址
-
流程如下:
- 先找到对象对应的entry,也就是之前场景举例的弱引用表
- 在entry的referrer列表中移除当前这个weak变量的地址
- 检查要不要保留entry,如果不需要就删除
5. weak_register_no_lock
id
weak_register_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id, WeakRegisterDeallocatingOptions deallocatingOptions)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
if (_objc_isTaggedPointerOrNil(referent)) return referent_id;
// ensure that the referenced object is viable
if (deallocatingOptions == ReturnNilIfDeallocating ||
deallocatingOptions == CrashIfDeallocating) {
bool deallocating;
if (!referent->ISA()->hasCustomRR()) {
deallocating = referent->rootIsDeallocating();
}
else {
// Use lookUpImpOrForward so we can avoid the assert in
// class_getInstanceMethod, since we intentionally make this
// callout with the lock held.
auto allowsWeakReference = (BOOL(*)(objc_object *, SEL))
lookUpImpOrForwardTryCache((id)referent, @selector(allowsWeakReference),
referent->getIsa());
if ((IMP)allowsWeakReference == _objc_msgForward) {
return nil;
}
deallocating =
! (*allowsWeakReference)(referent, @selector(allowsWeakReference));
}
if (deallocating) {
if (deallocatingOptions == CrashIfDeallocating) {
_objc_fatal("Cannot form weak reference to instance (%p) of "
"class %s. It is possible that this object was "
"over-released, or is in the process of deallocation.",
(void*)referent, object_getClassName((id)referent));
} else {
return nil;
}
}
}
// now remember it and where it is being stored
weak_entry_t *entry;
if ((entry = weak_entry_for_referent(weak_table, referent))) {
append_referrer(entry, referrer);
}
else {
weak_entry_t new_entry(referent, referrer);
weak_grow_maybe(weak_table);
weak_entry_insert(weak_table, &new_entry);
}
// Do not set *referrer. objc_storeWeak() requires that the
// value not change.
return referent_id;
}
- 对taggedPointer等小对象不做weak管理
- 检查referent对象是不是处于不能被处理的状态,比如对象正在释放,还有对象是不是有自己的引用计数定义,比如Proxy
- 创建/查找entry,并追踪这个referent对象
6. 总结
- weak有两种创建方式,首次赋值,赋值,分别对应initWeak和storeWeak
- storeWeak会处理和initialize方法可能的死锁情况,确保initialize流程和weak写入之间的安全顺序
- StripedMap:给对象地址做哈希分片,映射到多个SideTable分片,减少了多线程下的锁竞争和并发瓶颈,提升性能
- SideTable维护了多个哈希表,其中一个是weak_entry_t的哈希表,用于存储referent(对象)到weak_entry_t中的映射
- weak_entry_t维护
了referent对象全部的weak引用
- register/unregister:建立entry的映射/删除entry的映射
7. 指针,对象地址,SideTable和weak_entry_t的关系
weak_unregister_no_lock函数,分析一下这几个关键对象的关系
NSObject *objc = [[NSObject alloc] init];
在上面的代码中,obj是NSObject *类型的指针,存储着一块objc_object对象的地址;
void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
weak_entry_t *entry;
if (!referent) return;
if ((entry = weak_entry_for_referent(weak_table, referent))) {
remove_referrer(entry, referrer);
bool empty = true;
if (entry->out_of_line() && entry->num_refs != 0) {
empty = false;
}
else {
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
if (entry->inline_referrers[i]) {
empty = false;
break;
}
}
}
if (empty) {
weak_entry_remove(weak_table, entry);
}
}
}
struct weak_entry_t {
DisguisedPtr<objc_object> referent;
struct {
weak_referrer_t inline_referrers[WEAK_INLINE_COUNT];
};
};
static void remove_referrer(weak_entry_t *entry, objc_object **old_referrer)
{
if (! entry->out_of_line()) {
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
if (entry->inline_referrers[i] == old_referrer) {
entry->inline_referrers[i] = nil;
return;
}
}
REPORT_WEAK_ERROR("Attempted to unregister unknown __weak variable "
"at %p. This is probably incorrect use of "
"objc_storeWeak() and objc_loadWeak().",
old_referrer);
return;
}
size_t begin = w_hash_pointer(old_referrer) & (entry->mask);
size_t index = begin;
size_t hash_displacement = 0;
while (entry->referrers[index] != old_referrer) {
index = (index+1) & entry->mask;
if (index == begin) bad_weak_table(entry);
hash_displacement++;
if (hash_displacement > entry->max_hash_displacement) {
REPORT_WEAK_ERROR("Attempted to unregister unknown __weak variable "
"at %p. This is probably incorrect use of "
"objc_storeWeak() and objc_loadWeak().",
old_referrer);
return;
}
}
entry->referrers[index] = nil;
entry->num_refs--;
}
- referent_id:在这里是指针,存储着对象的地址
- referrer_id:指向指针的指针,也就是referent_id这个指针的地址
- 在代码里可以看到SideTable根据对象地址,找到对应的entry,删除remove_referrer就是在entry_t中存储的referrers数组中,删除这个指针
- 最后如果referrers为空,这个时候没有指向这个entry的指针了,就删除整个entry