Average weights of two PyTorch models

After reading this paper, I began an experiment on it. Referencing this snippet, I wrote my code:

    net1 = model_builder.build_model()
    net2 = model_builder.build_model()
    output = model_builder.build_model()
    net1.load_state_dict(torch.load(args.model1, map_location="cpu"))
    net2.load_state_dict(torch.load(args.model2, map_location="cpu"))
    
    # Average
    sd1 = net1.named_parameters()
    sd2 = net2.named_parameters()
    sdo = dict(sd2)
    for name, param in sd1:
        sdo[name].data.copy_(0.5*param.data + 0.5*sdo[name].data)

    output.load_state_dict(sdo)
    torch.save(output, args.output)
    
    # here is a test
    output.load_state_dict(torch.load(args.output))

But after I generated the new averaged-weights model, PyTorch failed to load it:

    Traceback (most recent call last):
      File "average_models.py", line 43, in <module>
        output.load_state_dict(torch.load(args.output))
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1534, in load_state_dict
        state_dict = state_dict.copy()
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1186, in __getattr__
        raise AttributeError("'{}' object has no attribute '{}'".format(
    AttributeError: 'RegNet' object has no attribute 'copy'
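
The last frames show `load_state_dict` calling `.copy()` on a `RegNet`, which suggests it received a whole module where it expected a dict-like state dict. Here is a minimal sketch of that situation, using a small `nn.Linear` as a hypothetical stand-in for the `RegNet` model. The exact exception type is an assumption that depends on the PyTorch version, so both likely types are caught:

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for the RegNet model in the post.
model = nn.Linear(4, 2)
target = nn.Linear(4, 2)

# Handing a module to load_state_dict() fails, because it expects a dict-like
# mapping of names to tensors. Older releases raise AttributeError (as in the
# traceback above); newer ones are assumed to raise TypeError instead.
try:
    target.load_state_dict(model)
    module_rejected = False
except (AttributeError, TypeError):
    module_rejected = True

# Saving the state dict and loading it back gives a mapping that
# load_state_dict() accepts.
buffer = io.BytesIO()  # in-memory file, used here instead of args.output
torch.save(model.state_dict(), buffer)
buffer.seek(0)
loaded = torch.load(buffer, map_location="cpu")
target.load_state_dict(loaded)

print(module_rejected, type(loaded).__name__)
```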

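The reason for the failure is quite simple: `torch.save(output, ...)` stored the whole model object, so `torch.load` returned a `RegNet` rather than a state_dict. We only need to save the state_dict of the model instead of all the information (since I am using FP16 format). Therefore the correct code should be: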

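    net1 = model_builder.build_model()
    net2 = model_builder.build_model()
    net1.load_state_dict(torch.load(args.model1, map_location="cpu"))
    net2.load_state_dict(torch.load(args.model2, map_location="cpu"))

    # Average
    # named_parameters() yields only learnable parameters; buffers are not included.
    sd1 = net1.named_parameters()
    sd2 = net2.named_parameters()
    sdo = dict(sd2)  # references net2's parameters, so copy_() updates them in place
    for name, param in sd1:
        sdo[name].data.copy_(0.5*param.data + 0.5*sdo[name].data)

    # Save only the name -> parameter mapping, not the whole model object
    torch.save(sdo, args.output)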
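
A possible variation, sketched here under assumptions rather than taken from the post: average the `state_dict()` of each model instead of `named_parameters()`, so that buffers such as BatchNorm running statistics are included and the result can be passed straight to `load_state_dict`. The helper `average_state_dicts`, its `alpha` weight, and the toy `make_net` model are hypothetical names introduced for illustration; `make_net` stands in for `model_builder.build_model()`.

```python
import io

import torch
import torch.nn as nn


def average_state_dicts(sd_a, sd_b, alpha=0.5):
    """Return alpha * sd_a + (1 - alpha) * sd_b, key by key (hypothetical helper)."""
    if set(sd_a) != set(sd_b):
        raise ValueError("state dicts have different keys")
    averaged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b[name]
        if tensor_a.is_floating_point():
            averaged[name] = alpha * tensor_a + (1.0 - alpha) * tensor_b
        else:
            # Integer buffers (e.g. BatchNorm's num_batches_tracked) are
            # copied from the first model instead of being averaged.
            averaged[name] = tensor_a.clone()
    return averaged


def make_net():
    # Toy model; an assumption standing in for model_builder.build_model().
    return nn.Sequential(
        nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(), nn.Linear(8, 2)
    )


torch.manual_seed(0)
net1, net2, output = make_net(), make_net(), make_net()

sdo = average_state_dicts(net1.state_dict(), net2.state_dict())

# Save only the averaged state dict, then load it back as a check.
buffer = io.BytesIO()  # in-memory file, used here instead of args.output
torch.save(sdo, buffer)
buffer.seek(0)
output.load_state_dict(torch.load(buffer, map_location="cpu"))
```

Copying the non-floating-point entries from the first model is only one possible choice; the post itself does not say how such buffers should be handled.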

BTW, in my experiment, averaging my models did not raise accuracy the way the paper suggests.

July 14, 2022 - 23:40 RobinDong machine learning
PyTorch