1

v3dv: fix double free error when releasing sems_info resources

 1 year ago
source link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14736
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

v3dv: fix double free error when releasing sems_info resources (!14736) · Merge requests · Mesa

Release event_wait.sems_info when destroying event_wait resources. The previous approach was crashing in the testcase: dEQP-VK.api.command_buffers.record_simul_use_primary and memory was leaking when we copy semaphores info to a not-null instance of event_wait.sems_info

Fixes: d5bd18fb ("v3dv: store wait semaphores in event_wait_cpu_job_info")

Edited 6 months ago by Melissa Wen
Detached merge request pipeline #497854 passed for db9098f2 6 months ago
Approval is optional

Test summary results are being parsed

View full report

Merged by 6 months ago (Jan 31, 2022 11:21pm UTC) 6 months ago

Merge details
Pipeline #497867 waiting for manual action for db9098f2 on main

Reviewed-by: Alejandro Piñeiro <[email protected]>

  • Why do we have a double free exactly? When we have a wait events job, we supposedly make a new copy of the semaphores for that, that we then destroy when the thread ends, so I presume the double-free means that when the we are done waiting for the events and before the thread wait ends and we call:

    VkResult result = queue_submit_job(queue, pjob, info->sems_info, NULL);

    for the remaining jobs, some of those jobs are holding a reference info->sems_info and eventually free it, which I don't think is supposed to happen. So what job is trying to do that exactly? I think we need to narrow it down to understand what is happening exactly.

    I was thinking the case where we have 2 wait for event jobs in the same command buffer, but I think that should work fine: the first one spawns the wait thread and copies the semaphores, then when the wait is done, we execute the second wait for events job in the same wait thread (it won't copy the semaphores) and the semaphores are freed once after the threads ends, so there should not be a double-free.

  • From what I tracked in this test case (using the commit that introduced the double free but before multisync support):

    • We have two wait_event_cpu_job in different command_buffer_batch (therefore different command buffers)
    • In the first wait_event_cpu_job from the first cmd_buf_batch, (cpu.event_wait) -> info->sems_info == NULL and we copy semaphores and spawn a wait_thread, as expected.
    • In the second wait_event_cpu_job from the second cmd_buf_batch, info->sems_info is not NULL, we copy semaphores again, spawn another wait_thread (because we are in a different cmd buffer) and lose the sems_info reference.
    • Finally, the wait_thread to both events finish, the first free the sems_info and the second crashes by trying to double free it.

    I'll dig a bit more.

Please register or sign in to reply

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK