2

PerfView专题 (第十篇):洞察 C# 终结队列引发的内存泄漏

 1 year ago
source link: https://www.cnblogs.com/huangxincheng/p/16618860.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

C# 程序内存泄漏的诱发因素有很多,但从顶层原理上来说,就是该销毁的 用户根 对象没有被销毁,从而导致内存中意料之外的对象无限堆积,导致内存暴涨,最终崩溃,这其中的一个用户根就是 终结器队列,这一篇我们就来看下如何让 PerfView 配合 WinDbg 双剑合璧。

二:如何洞察

1. 终结器内存泄漏

为了模拟 终结器内存泄漏,我们故意在 析构函数 中执行复杂的逻辑,让析构过程足够的慢,这样可以实现 分配速度 远大于 销毁速度 ,达到消费能力不足引发的内存暴涨, 参考如下代码:


    internal class Program
    {
        static void Main(string[] args)
        {
            Task.Run(Add);

            Console.ReadLine();
        }

        static void Add()
        {
            for (int i = 0; i < 1000000; i++)
            {
                var person = new Person() { Name = $"jack{i}", Age = i };
            }
        }
    }

    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
        ~Person()
        {
            Thread.Sleep(new Random().Next(0,3000));
            Console.WriteLine($"name={Name} 已析构了哦 ...");
        }
    }

当分配操作结束后,用 WinDbg 附加到进程中,使用 !fq 查看内存情况,输出如下:


0:015> !fq
SyncBlocks to be cleaned up: 0
Free-Threaded Interfaces to be released: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 28423 finalizable objects (000000001BF5B108->000000001BF92940)
generation 1 has 4 finalizable objects (000000001BF5B0E8->000000001BF5B108)
generation 2 has 21 finalizable objects (000000001BF5B040->000000001BF5B0E8)
Ready for finalization 971560 objects (000000001BF92940->000000001C6FC280)
Statistics for all finalizable objects (including all objects ready for finalization):
              MT    Count    TotalSize Class Name
00007ffdbaa2f410        4           96 System.WeakReference
00007ffdbaa4f368        2          112 System.Threading.ThreadPoolWorkQueueThreadLocals
00007ffdbaa4c640        1          168 System.Diagnostics.Tracing.FrameworkEventSource
00007ffdbaa417b8        1          168 System.Diagnostics.Tracing.NativeRuntimeEventSource
00007ffdbaa4a158        1          176 System.Threading.Tasks.TplEventSource
00007ffdbaa05650        3          216 System.Threading.Thread
00007ffdbaa240a8        1          320 System.Diagnostics.Tracing.RuntimeEventSource
00007ffdbaa24ac8        8          896 System.Diagnostics.Tracing.EventSource+OverideEventProvider
00007ffdbaa4fb58   999987     31999584 ConsoleApp2.Person
Total 1000008 objects

从上面的 Ready for finalization 971560 objects 中可以看到,当前有 9.7w 的对象正在排队等待 Finalizer 线程执行,既然它可以执行,为什么执行这么慢呢? 这时候就需要调查下 Finalizer Thread 此时正在干嘛。


0:001> !t
ThreadCount:      7
UnstartedThread:  0
BackgroundThread: 6
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                                                            Lock  
 DBG   ID     OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
   0    1     4f98 000000000058C270    2a020 Preemptive  0000000000000000:0000000000000000 00000000005814e0 1     MTA 
   5    2     3f4c 000000001AA94090  202b220 Preemptive  00000000088A9D60:00000000088A9FD0 00000000005814e0 0     MTA (Finalizer) 
   7    3     24b8 000000001AA986D0  102a220 Preemptive  0000000000000000:0000000000000000 00000000005814e0 0     MTA (Threadpool Worker) 
  11    4     5520 000000000056F580  1029220 Preemptive  00000000088A5100:00000000088A5FD0 00000000005814e0 0     MTA (Threadpool Worker) 
  12    5     1004 000000001AB26160  1029220 Preemptive  0000000000000000:0000000000000000 00000000005814e0 0     MTA (Threadpool Worker) 
  13    6     58a8 000000001B6D35D0    21220 Preemptive  0000000000000000:0000000000000000 00000000005814e0 0     Ukn 
  14    7      5b8 000000001B650820  1029220 Preemptive  0000000000000000:0000000000000000 00000000005814e0 0     MTA (Threadpool Worker) 
0:001> ~~[3f4c]s
ntdll!NtDelayExecution+0x14:
00007ffe`8908c634 c3              ret
0:005> !clrstack 
OS Thread Id: 0x3f4c (5)
        Child SP               IP Call Site
000000001ACEF868 00007ffe8908c634 [HelperMethodFrame: 000000001acef868] System.Threading.Thread.SleepInternal(Int32)
000000001ACEF960 00007ffe19f0c46b System.Threading.Thread.Sleep(Int32) [/_/src/System.Private.CoreLib/src/System/Threading/Thread.CoreCLR.cs @ 259]
000000001ACEF990 00007ffdba986e15 ConsoleApp2.Person.Finalize() [D:\net6\ConsoleApp1\ConsoleApp2\Program.cs @ 31]
000000001ACEFCE0 00007ffe1a4a6c06 [DebuggerU2MCatchHandlerFrame: 000000001acefce0] 

从输出中可以看到,终结器线程正在 Sleep() 函数,如果你有源码的话,可以看下 ConsoleApp2.Person.Finalize() 中的具体业务逻辑,如果没有源码的话,可以使用 !U 00007ffdba986e15 反汇编下方法源码。


0:005> !U 00007ffdba986e15
Normal JIT generated code
ConsoleApp2.Person.Finalize()
ilAddr is 00000000023920E0 pImport is 0000000002FFF460
Begin 00007FFDBA986DA0, size e9

D:\net6\ConsoleApp1\ConsoleApp2\Program.cs @ 31:
00007ffd`ba986dd4 48b998b4a6bafd7f0000 mov rcx,7FFDBAA6B498h (MT: System.Random)
00007ffd`ba986dde e85d0ab25f      call    coreclr!JIT_TrialAllocSFastMP_InlineGetThread (00007ffe`1a4a7840)
00007ffd`ba986de3 488945f8        mov     qword ptr [rbp-8],rax
00007ffd`ba986de7 488b4df8        mov     rcx,qword ptr [rbp-8]
00007ffd`ba986deb e848fdffff      call    00007ffd`ba986b38 (System.Random..ctor(), mdToken: 00000000060015AB)
00007ffd`ba986df0 488b4df8        mov     rcx,qword ptr [rbp-8]
00007ffd`ba986df4 33d2            xor     edx,edx
00007ffd`ba986df6 41b8b80b0000    mov     r8d,0BB8h
00007ffd`ba986dfc 488b45f8        mov     rax,qword ptr [rbp-8]
00007ffd`ba986e00 488b00          mov     rax,qword ptr [rax]
00007ffd`ba986e03 488b4040        mov     rax,qword ptr [rax+40h]
00007ffd`ba986e07 ff5030          call    qword ptr [rax+30h]
00007ffd`ba986e0a 8945f4          mov     dword ptr [rbp-0Ch],eax
00007ffd`ba986e0d 8b4df4          mov     ecx,dword ptr [rbp-0Ch]
00007ffd`ba986e10 e833e7feff      call    00007ffd`ba975548 (System.Threading.Thread.Sleep(Int32), mdToken: 0000000006001CD5)
>>> 00007ffd`ba986e15 90              nop

D:\net6\ConsoleApp1\ConsoleApp2\Program.cs @ 32:
00007ffd`ba986e16 488b0c25f0305d12 mov     rcx,qword ptr [125D30F0h] ("name=")
00007ffd`ba986e1e 48894de8        mov     qword ptr [rbp-18h],rcx
00007ffd`ba986e22 488b4d10        mov     rcx,qword ptr [rbp+10h]
00007ffd`ba986e26 e89dcfffff      call    00007ffd`ba983dc8 (ConsoleApp2.Person.get_Name(), mdToken: 0000000006000004)
00007ffd`ba986e2b 488945e0        mov     qword ptr [rbp-20h],rax
00007ffd`ba986e2f 4c8b0425f8305d12 mov     r8,qword ptr [125D30F8h] ("")
00007ffd`ba986e37 488b4de8        mov     rcx,qword ptr [rbp-18h]
00007ffd`ba986e3b 488b55e0        mov     rdx,qword ptr [rbp-20h]
00007ffd`ba986e3f e864d7feff      call    00007ffd`ba9745a8 (System.String.Concat(System.String, System.String, System.String), mdToken: 0000000006000705)
00007ffd`ba986e44 488945d8        mov     qword ptr [rbp-28h],rax
00007ffd`ba986e48 488b4dd8        mov     rcx,qword ptr [rbp-28h]
00007ffd`ba986e4c e8cf99ffff      call    00007ffd`ba980820 (System.Console.WriteLine(System.String), mdToken: 0000000006000081)
00007ffd`ba986e51 90              nop
00007ffd`ba986e52 90              nop
00007ffd`ba986e53 eb00            jmp     00007ffd`ba986e55
00007ffd`ba986e55 488bcc          mov     rcx,rsp
00007ffd`ba986e58 e808000000      call    00007ffd`ba986e65 (ConsoleApp2.Person.Finalize(), mdToken: 0000000006000008)
00007ffd`ba986e5d 90              nop

D:\net6\ConsoleApp1\ConsoleApp2\Program.cs @ 33:
00007ffd`ba986e5e 90              nop
00007ffd`ba986e5f 488d6500        lea     rsp,[rbp]
00007ffd`ba986e63 5d              pop     rbp
00007ffd`ba986e64 c3              ret

最终我们找到了问题原因,在真实项目中肯定不会这么简单的,往往会执行一个复杂的逻辑,接下来我们就有一个好奇点了,那个 复杂的逻辑 会大概执行多久呢?

因为 dump 只是一个静态快照,所以从 dump 中寻找的路子就封死了,那有没有方案呢? 肯定有啦,让 PerfView 大威天龙。

2. Finalize() 到底有多慢

在 CoreCLR 中有一些监控 Finalizer Thread 线程的 ETW 事件,具体是:

1)FinalizersStart 事件
2)FinalizerObject 事件
3)FinalizersStop 事件

当一个对象准备析构时,会触发 FinalizerObject ETW事件,所以观察对象之间的析构间隔,大概就能看出大致的 耗费时间

知道原理之后,接下来打开 PerfView,使用默认设置,启用 Collect -> Collect 收集,然后把应用程序跑起来,运行一段时间后,点击 Stop Collection ,在生成的 zip 面板中点击 Event ,搜索 Finalize 关键词,截图如下:

214741-20220824111224664-1766195352.png

从图中可以看到,TypeName 列都是 Person 对象,而且从 Time MSec 时间戳上可以观察到 Person 和 Person 之间相隔 s 级以上,起码说明析构函数 执行真的很慢。


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK