
Optimize unchecked indexing into chunks and chunks_mut by the8472 · Pull Request...

 4 years ago
source link: https://github.com/rust-lang/rust/pull/86823


the8472 commented 21 days ago

Fixes #53340
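For readers who want a concrete picture of the kind of code involved, here is a minimal sketch of a loop that zips `chunks_mut` with `chunks`. It is an illustrative assumption, not the `chunks.rs` benchmark measured below (that file is not shown here); the function name `add_chunked`, the element type, and the sizes are all hypothetical. Zipped slice iterators are one place where the standard library may index into the underlying iterators instead of stepping them one at a time, which appears to be the path the PR title refers to.

```rust
// Hypothetical workload: NOT the chunks.rs used for the perf numbers below.
// Adds `src` into `dst` chunk by chunk, zipping the two chunk iterators.
fn add_chunked(dst: &mut [u32], src: &[u32], chunk_size: usize) {
    // `chunks` and `chunks_mut` panic if `chunk_size` is 0.
    for (d, s) in dst.chunks_mut(chunk_size).zip(src.chunks(chunk_size)) {
        // The last chunk may be shorter than `chunk_size`; the inner zip
        // stops at the shorter of the two chunks.
        for (x, y) in d.iter_mut().zip(s) {
            *x = x.wrapping_add(*y);
        }
    }
}

fn main() {
    let src: Vec<u32> = (0..10).collect();
    let mut dst = vec![1u32; 10];
    add_chunked(&mut dst, &src, 4);
    println!("{:?}", dst);
}
```

Whether a given loop benefits depends on what the optimizer already does with it, so measuring before and after, as below, is the only reliable check.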

# BEFORE

$ rustc +nightly -Copt-level=3 -Ccodegen-units=1 -Clto=fat chunks.rs
$ perf stat ./chunks

 Performance counter stats for './chunks':

          3,177.03 msec task-clock                #    1.000 CPUs utilized
                 4      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           984,006      page-faults               #    0.310 M/sec
    13,092,199,322      cycles                    #    4.121 GHz                      (83.29%)
       384,543,475      stalled-cycles-frontend   #    2.94% frontend cycles idle     (83.35%)
     7,414,280,722      stalled-cycles-backend    #   56.63% backend cycles idle      (83.38%)
    50,493,980,662      instructions              #    3.86  insn per cycle
                                                  #    0.15  stalled cycles per insn  (83.29%)
     6,625,375,297      branches                  # 2085.396 M/sec                    (83.38%)
         3,087,652      branch-misses             #    0.05% of all branches          (83.31%)

       3.178079469 seconds time elapsed

       2.327156000 seconds user
       0.762041000 seconds sys

# AFTER

$ ./build/x86_64-unknown-linux-gnu/stage1/bin/rustc -Copt-level=3 -Ccodegen-units=1 -Clto=fat chunks.rs
$ perf stat ./chunks

 Performance counter stats for './chunks':

          2,705.76 msec task-clock                #    1.000 CPUs utilized
                 4      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           984,005      page-faults               #    0.364 M/sec
    11,156,763,039      cycles                    #    4.123 GHz                      (83.26%)
       342,198,882      stalled-cycles-frontend   #    3.07% frontend cycles idle     (83.37%)
     6,486,263,637      stalled-cycles-backend    #   58.14% backend cycles idle      (83.37%)
    40,553,476,617      instructions              #    3.63  insn per cycle
                                                  #    0.16  stalled cycles per insn  (83.37%)
     6,668,429,113      branches                  # 2464.532 M/sec                    (83.37%)
         3,099,636      branch-misses             #    0.05% of all branches          (83.26%)

       2.706725288 seconds time elapsed

       1.782083000 seconds user
       0.848424000 seconds sys
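
To put the two runs side by side, the relative change can be computed directly from the counters above. The snippet below is a small sketch of that arithmetic; the helper name `percent_reduction` is made up for illustration, and the inputs are copied from the BEFORE and AFTER output. It works out to roughly 15% fewer cycles, 20% fewer instructions, and 15% less elapsed time, while the branch count stays nearly flat (it rises by under 1%).

```rust
// Percentage by which `after` is lower than `before`.
// A negative result means the value increased.
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // (name, BEFORE, AFTER), copied from the perf stat output above.
    let counters = [
        ("cycles", 13_092_199_322.0, 11_156_763_039.0),
        ("instructions", 50_493_980_662.0, 40_553_476_617.0),
        ("branches", 6_625_375_297.0, 6_668_429_113.0),
        ("seconds elapsed", 3.178079469, 2.706725288),
    ];
    for (name, before, after) in counters {
        println!("{name}: {:.1}% reduction", percent_reduction(before, after));
    }
}
```

These are single runs on one machine, so the figures are indicative rather than a precise measurement.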
