Golang的内存管理(下篇)

mob604756f0bbf4 · 大约5小时之前 · 48 次点击 · 预计阅读时间 7 分钟 · 大约8小时之前开始浏览

4.3 size 介于 16 和 32K
对于 size 介于 16 ~ 32K byte 的内存分配先计算应该分配的 sizeclass，然后去 mcache 里面 alloc[sizeclass] 申请，如果 mcache.alloc[sizeclass] 不足以申请，则 mcache 向 mcentral 申请，然后再分配。mcentral 给 mcache 分配完之后会判断自己需不需要扩充，如果需要则想 mheap 申请。


func mallocgc(...) {
      ...
      } else {            
          var sizeclass uint8

          //计算 sizeclass
            if size <= smallSizeMax-8 {
                sizeclass = size_to_class8[(size+smallSizeDiv-1)/smallSizeDiv]
            } else {
                sizeclass = size_to_class128[(size-smallSizeMax+largeSizeDiv-1)/largeSizeDiv]
            }
            size = uintptr(class_to_size[sizeclass])
            span := c.alloc[sizeclass]          
            //从对应的 span 里面分配一个 object 
            v := nextFreeFast(span)           
            if v == 0 {
                v, span, shouldhelpgc = c.nextFree(sizeclass)
            }
            x = unsafe.Pointer(v)            
            if needzero && span.needzero != 0 {
                memclrNoHeapPointers(unsafe.Pointer(v), size)
            }
        }
}

我们首先看一下如何计算 sizeclass 的，预先定义了两个数组：size_to_class8 和 size_to_class128。数组 size_to_class8，其第 i 个值表示地址区间 ( (i-1)8, i8 ] (smallSizeDiv = 8) 对应的 sizeclass，size_to_class128 类似。小于 1024 - 8 = 1016 （smallSizeMax=1024），使用 size_to_class8，否则使用数组 size_to_class128。看一下数组具体的值：0, 1, 2, 3, 3, 4, 4…。举个例子，比如要分配 17 byte 的内存（16 byte 以下的使用 mcache.tiny 分配），sizeclass = size_to_calss8[(17+7)/8] = size_to_class8[3] = 3。不得不说这种用空间换时间的策略确实提高了运行效率。

计算出 sizeclass，那么就可以去 mcache.alloc[sizeclass] 分配了，注意这是一个 mspan 指针，真正的分配函数是 nextFreeFast() 函数。如下。

// nextFreeFast returns the next free object if one is quickly available.// Otherwise it returns 0.func nextFreeFast(s *mspan) gclinkptr {
    theBit := sys.Ctz64(s.allocCache) // Is there a free object in the allocCache?
    if theBit < 64 {
        result := s.freeindex + uintptr(theBit)        
            if result < s.nelems {
            freeidx := result + 1
            if freeidx%64 == 0 && freeidx != s.nelems {                return 0
            }
            s.allocCache >>= (theBit + 1)
            s.freeindex = freeidx
            v := gclinkptr(result*s.elemsize + s.base())
            s.allocCount++       
            return v
        }
    }    
    return 0
}

allocCache 这里是用位图表示内存是否可用，1 表示可用。然后通过 span 里面的 freeindex 和 elemsize 来计算地址即可。

如果 mcache.alloc[sizeclass] 已经不够用了，则从 mcentral 申请内存到 mcache。

// nextFree returns the next free object from the cached span if one is available.
// Otherwise it refills the cache with a span with an available object and
// returns that object along with a flag indicating that this was a heavy
// weight allocation. If it is a heavy weight allocation the caller must
// determine whether a new GC cycle needs to be started or if the GC is active
// whether this goroutine needs to assist the GC.
func (c *mcache) nextFree(sizeclass uint8) (v gclinkptr, s *mspan, shouldhelpgc bool) {
    s = c.alloc[sizeclass]
    shouldhelpgc = false
    freeIndex := s.nextFreeIndex()
    if freeIndex == s.nelems {        // The span is full.
        if uintptr(s.allocCount) != s.nelems {            
            println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
            throw("s.allocCount != s.nelems && freeIndex == s.nelems")
        }
        systemstack(func() {  // 这个地方 mcache 向 mcentral 申请
            c.refill(int32(sizeclass))
        })
        shouldhelpgc = true
        s = c.alloc[sizeclass]        
            // mcache 向 mcentral 申请完之后，再次从 mcache 申请
        freeIndex = s.nextFreeIndex()
    }

    ...
}
// nextFreeIndex returns the index of the next free object in s at
// or after s.freeindex.
// There are hardware instructions that can be used to make this
// faster if profiling warrants it.
// 这个函数和 nextFreeFast 有点冗余了
func (s *mspan) nextFreeIndex() uintptr {
    ...
}

mcache 向 mcentral，如果 mcentral 不够，则向 mheap 申请。

func (c *mcache) refill(sizeclass int32) *mspan {
    ...    // 向 mcentral 申请
    s = mheap_.central[sizeclass].mcentral.cacheSpan()
    ...  return s
}
// Allocate a span to use in an MCache.
func (c *mcentral) cacheSpan() *mspan {
  ...  // Replenish central list if empty.
    s = c.grow()
}
func (c *mcentral) grow() *mspan {
    npages := uintptr(class_to_allocnpages[c.sizeclass])
    size := uintptr(class_to_size[c.sizeclass])
    n := (npages << _PageShift) / size  //这里想 mheap 申请
    s := mheap_.alloc(npages, c.sizeclass, false, true)
    ...  return s
}

如果 mheap 不足，则想 OS 申请。接上面的代码 mheap_.alloc()


func (h *mheap) alloc(npage uintptr, sizeclass int32, large bool, needzero bool) *mspan {
    ...    var s *mspan
    systemstack(func() {
        s = h.alloc_m(npage, sizeclass, large)
    })
  ...
}
func (h *mheap) alloc_m(npage uintptr, sizeclass int32, large bool) *mspan {
    ... 
  s := h.allocSpanLocked(npage)
  ...
}
func (h *mheap) allocSpanLocked(npage uintptr) *mspan {
  ...
    s = h.allocLarge(npage)
    if s == nil {
        if !h.grow(npage) {
            return nil
        }
        s = h.allocLarge(npage)        
        if s == nil {            
            return nil
        }
    }
  ...
}
func (h *mheap) grow(npage uintptr) bool {    
    // Ask for a big chunk, to reduce the number of mappings
    // the operating system needs to track; also amortizes
    // the overhead of an operating system mapping.
    // Allocate a multiple of 64kB.
    npage = round(npage, (64<<10)/_PageSize)
    ask := npage << _PageShift    
    if ask < _HeapAllocChunk {
        ask = _HeapAllocChunk
    }

    v := h.sysAlloc(ask)
    ...
}

整个函数调用链如上所示，最后 sysAlloc 会调用系统调用（mmap 或者 VirtualAlloc，和初始化那部分有点类似）去向操作系统申请。

5. 内存回收
5.1 mcache 回收
mcache 回收可以分两部分：第一部分是将 alloc 中未用完的内存归还给对应的 mcentral。

func freemcache(c *mcache) {
    systemstack(func() {
        c.releaseAll()
        ...

        lock(&mheap_.lock)
        purgecachedstats(c)
        mheap_.cachealloc.free(unsafe.Pointer(c))
        unlock(&mheap_.lock)
    })
}
func (c *mcache) releaseAll() {
    for i := 0; i < _NumSizeClasses; i++ {
        s := c.alloc[i]        if s != &emptymspan {
            mheap_.central[i].mcentral.uncacheSpan(s)
            c.alloc[i] = &emptymspan
        }
    }    // Clear tinyalloc pool.
    c.tiny = 0
    c.tinyoffset = 0}

函数 releaseAll() 负责将 mcache.alloc 中各个 sizeclass 中的 mspan 归还给 mcentral。这里需要注意的是归还给 mcentral 的时候需要加锁，因为 mcentral 是全局的。除此之外将剩下的 mcache （基本是个空壳）归还给 mheap.cachealloc，其实就是把 mcache 插入 free list 表头。


func (f *fixalloc) free(p unsafe.Pointer) {
    f.inuse -= f.size
    v := (*mlink)(p)
    v.next = f.list
    f.list = v
}

5.2 mcentral 回收
当 mspan 没有 free object 的时候，将 mspan 归还给 mheap。

func (c *mcentral) freeSpan(s *mspan, preserve bool, wasempty bool) bool {
    ...
    lock(&c.lock)
    ...   
    if s.allocCount != 0 {
        unlock(&c.lock)        
         return false
    }

    c.nonempty.remove(s)
    unlock(&c.lock)
    mheap_.freeSpan(s, 0)
    return true}

5.3 mheap
mheap 并不向操作系统归还，但是会对 span 做一些操作，比如合并相邻的 span。

3. 总结
tcmalloc 是一种理论，运用到实践中还要考虑工程实现的问题。学习 Golang 源码的过程中，除了知道它是如何工作的之外，还可以学习到很多有趣的知识，比如使用变量填充 CacheLine 避免 False Sharing，利用 debruijn 序列求解 Trailing Zero（在函数中 sys.Ctz64 使用）等等。我想这就是读源码的意义所在吧。

4. 参考：

tcmalloc 介绍
TCMalloc : Thread-Caching Malloc
《Go 语言学习笔记》
False Sharing- wikipedia

有疑问加站长微信联系（非本文作者）

本文来自：51CTO博客

感谢作者：mob604756f0bbf4

查看原文：Golang的内存管理(下篇)

入群交流（和以上内容无关）：加入Go大咖交流群，或添加微信：liuxiaoyan-s 备注：入群；或加QQ群：701969077

Golang的内存管理(下篇)

Golang的内存管理(下篇)

Recommend

华中首家欧瑞博超级全宅智能体验店在武昌揭幕

净利最高暴增9倍定制家居一季度业绩果然爆发！

建材家居市场恢复加快，3月BHI大幅上涨

欧派首发净醛抗菌爱芯板

Go开发工程师：迎接上升风口，踏入蓝海行业！

同比增长超4倍！我乐2021Q1净利润1164.68万元

杭州方太华东幸福家体验店开业

大亚圣象：大亚集团解押及再质押250万股

Introducing .NET Multi-platform App UI: Q&A With David Ortinau

What Is Yield Farming in Decentralized Finance (DeFi)?

About Joyk