TVM Development Report - October 2020

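In October, the community forum (discuss.tvm.ai) received about 142,000 page views and more than 2,300 unique user visits. The community also welcomed one new committer (@junrushao1994) and one new reviewer (@areusch). On the feature side, we added affine map utilities to TIR to support loop optimization and layout operations, and added TensorRT support to BYOC. We also made many additions and improvements to the auto-scheduler, the TVM command-line driver, the Rust bindings, operators, and models.

See below for details and PRs: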

Relay IR and TIR

[Relay] Mix mode type inference #6704
[Relay] Change some passes to mix mode #6695
[Relay] support i64 indices #6143
[ManifestAlloc] Handle TupleType inputs in CheckReshapeOnly #6776
[ARITH] Introduce iterator (quasi)affine map detection. #6667
[ARITH] Tight bound for floormod #6771
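
As an illustration of what a "tight bound for floormod" means in interval analysis, here is a minimal pure-Python sketch. It is not the code from #6771 and does not use TVM; the function name `floormod_bounds` is hypothetical, and the sketch assumes a positive constant divisor and a closed integer interval. When the whole interval falls inside one period of the divisor, the result range can be narrowed from the generic [0, c - 1] to [lo mod c, hi mod c].

```python
def floormod_bounds(lo, hi, c):
    """Return (min, max) of x % c over all integers lo <= x <= hi, for c > 0.

    Hypothetical helper for illustration only; not a TVM API.
    """
    if c <= 0:
        raise ValueError("this sketch only handles a positive constant divisor")
    if lo > hi:
        raise ValueError("empty interval")
    # Python's // and % are floor division and floor modulo.
    if lo // c == hi // c:
        # The interval lies inside a single period, so x % c increases with x.
        return lo % c, hi % c
    # The interval crosses a multiple of c, so both 0 and c - 1 are reached.
    return 0, c - 1


print(floormod_bounds(10, 13, 8))  # interval inside one period
print(floormod_bounds(6, 9, 8))    # interval crossing a multiple of 8
```

The first call can report a range narrower than [0, 7], which is the kind of information a later simplification or bounds check can exploit.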

Operator support

[RELAY][OP] Dynamic conv2d batch size for cuda #6598
[Relay, TOPI] Complete rewrite of where op to support broadcasting #6759
[Topi] Allow batch_matmul to broadcast along batch dimension. #6616
[Relay][Training] Add more missing gradients #6767
[RELAY][OP] roi_pool operator alter layout #6516
Add dot product support for quantized convolution. #6445
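
To make "broadcast along batch dimension" concrete for batch_matmul (#6616), the following is a pure-Python sketch on nested lists. It is an illustration, not the TOPI implementation. It assumes the operand layout x: [batch, M, K] and y: [batch, N, K] (second operand stored transposed), and it assumes that broadcasting means an operand with batch size 1 is reused for every batch of the other operand. The function name `batch_matmul_broadcast` is hypothetical.

```python
def batch_matmul_broadcast(x, y):
    """Batched x @ y^T on nested lists, broadcasting a batch size of 1.

    x: [Bx, M, K], y: [By, N, K]; returns [max(Bx, By), M, N].
    Hypothetical helper for illustration only; not a TVM API.
    """
    bx, by = len(x), len(y)
    if bx != by and 1 not in (bx, by):
        raise ValueError("batch sizes must match or one of them must be 1")
    out = []
    for b in range(max(bx, by)):
        xb = x[b if bx > 1 else 0]  # reuse the single batch when Bx == 1
        yb = y[b if by > 1 else 0]  # reuse the single batch when By == 1
        # out[b][i][j] = sum_k xb[i][k] * yb[j][k]
        out.append([[sum(p * q for p, q in zip(row, col)) for col in yb]
                    for row in xb])
    return out


x = [[[1, 2]]]                            # shape [1, 1, 2]
y = [[[1, 0]], [[0, 1]], [[1, 1]]]        # shape [3, 1, 2]
print(batch_matmul_broadcast(x, y))       # result has shape [3, 1, 1]
```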

Backend

[LLVM] Create fixed vector size according to latest LLVM12+ changes #6717
[Hexagon] Use nullptr instead of 0 in hexagon_device_sim.cc #6718
[LLVM] Avoid warnings when compiling getNumElements with LLVM12+ #6738
[Hexagon] Remove use of designated initializers from hexagon_module.cc #6055
[LLVM/CPU] Terminate basic block after “ret” instruction #6036
[LLVM] Add target feature string to function attributes #6763
[OpenCL] Only use thrust for cuda targets #6722
Adjust Vulkan queue selection and creation logic #6662
[VTA] quant support for alu-only op #6191

BYOC

[BYOC] Configurable optimize pass for PartitionGraph #6777
[BYOC][TensorRT] TensorRT BYOC integration #6395
[BYOC] Allow custom codegens to register their own constant updater #6697
[BYOC][ACL] Support add operation #6532
[BYOC] Added default_tuples parameter to AnnotateTarget pass #6655
[BYOC] Support control flow in annotate_target #6641

Ansor, Autoscheduler and AutoTVM

[AutoSchedule] Support multiple cache read and fix bugs #6686
[Ansor] Support multiple output ops and fix Python API printing #6584
[AutoTVM] Load configs even it has no entity #6100
[AutoScheduler] Improve the rule of mutating parallel granularity #6568
[AutoScheduler] Add task scheduler #6663

MicroTVM

[µTVM] Avoid use of builtin math functions #6630
Add µTVM Zephyr support + QEMU regression test #6603
[µTVM] Add serial transport, parameterize µTVM Zephyr test, run on physical HW #6789

Performance

Faster sparse_dense on GPUs #6580
[Relay] A set of utilities that allows a model to be run efficiently on tensorcores. #6748
[QNN] Optimize requantize for power of 2 and fix dequantize for per-channel quantized input #6675
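
The requantize entry (#6675) mentions a power-of-2 special case. One common reason such a case is cheaper, sketched here in pure Python, is that scaling by 2**-n needs no fixed-point multiplication: an add and an arithmetic right shift are enough. This is a sketch of the general idea, not the code from the PR. The round-half-up rounding mode and the function name `rounding_right_shift` are assumptions made for illustration.

```python
import math


def rounding_right_shift(q, n):
    """Compute round-half-up(q / 2**n) with integer operations only.

    Hypothetical helper for illustration only; not a TVM API.
    """
    if n < 0:
        raise ValueError("shift amount must be non-negative")
    if n == 0:
        return q
    # Adding half of the divisor before shifting turns floor into round-half-up.
    # Python's >> on negative integers is an arithmetic (floor) shift.
    return (q + (1 << (n - 1))) >> n


# Cross-check against the floating-point definition on a small range.
for q in range(-64, 65):
    for n in range(0, 6):
        assert rounding_right_shift(q, n) == math.floor(q / 2 ** n + 0.5)
```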

Runtime

[RELAY][VM] Enable heterogeneous execution for Relay VM #6337
[Relay][VM] Add support for references. #6798
Updated runtime to run under FreeBSD. #6600

Tvmc

[tvmc] unify all logs on a single logger ‘TVMC’ #6577
[TVMC] use common function to obtain target from --target value on ‘tvmc compile’ #6788
[TVMC] ‘tvmc run’ --rpc-tracker and --rpc-tracker fail due to argparse misconfiguration #6762
[tvmc] Introduce ‘run’ subcommand (part 4/4) #6578
[tvmc] fix command line argument variable name in ‘compile’ #6574
[tvmc] command line driver ‘compile’ (part 2/4) #6302
[TVMC] fail gracefully in case no subcommand is provided #6625

Frontend

Add more Rust bindings #6678
[Rust] Improve NDArray, GraphRt, and Relay bindings #6563

Torch

[Torch, Quantization] Necessary workaround to prepare for 1.6 update #6602
[Torch, QNN] Support dynamic quantization flow to enable importing quantized transformer models #6782
[Torch] Object detection support update for PyTorch 1.6 #6659
[Torch] Support bincount and scatter_add ops #6740

Tensorflow

TF argmax - handling int64 datatype #6674
[TFLite, QNN] Slice op #6217
[TFLite] Fix detection of crop in convert_batch_to_space_nd #6670
[TENSORFLOW]TF Addons activations support added #5472
[Frontend][Tensorflow] Fix TF 1.15 conv2d_transpose parsing #6589
TF frontend: add expm1 op #6783

ONNX

[Relay][Frontend][Onnx] Allow A to B broadcasting of batch_matmul and reverse strided slice #6681
[Relay][Frontend][Onnx] Loop Support #6700

MXNet

[Relay][MXNet] Support broadcast_like #6561
[TOPI][RELAY][MXNET]Reverse/Flip operator #5513
[RELAY][TOPI][MXNET]Sequence_last op support added #5994

Refactor and API changes

[REFACTOR] Remainings of util => utils #6778
Migrate IntImm & FloatImm ObjectRef to not-null #5788
[REFACTOR][Relay] Migrate Id ObjectRef to not-null #5748
[Diagnostics][Relay][InferType] Refactor InferType to work on whole module, and use new diagnostics. #6274
Refactor diagnostic to avoid circular dependencies #6692
Replace CHECK* with ICHECK* #6745
[API] Added remove_global_func to the Python API #6787
[RELAY] Refactor FoldConstant to skip TNonComputationalOps #6720
[TVMScript] refactor #6734

Build and CI

[CI] Update wasm emcc to latest #6755
[CI] Move to use main as the default #6665
[CI] CI docker staging update to latest #6708
[CI] Introduce all platform test for windows/mac/linux. #6756
[CI] Update ci-wasm to latest #6772
[CI] add python environment setup as part of cpp unittest runner script #6639
[TEST][TEDD] improve TEDD tests to also run on CPU Docker image #6643
[CI] Pin h5py version to < 3.0 to workaround issues with TF/Keras #6808
[TEST][CI] make sure graphviz is on both ci-cpu and ci-gpu images #6645
properly pass through command-line args in docker/bash.sh #6599
add black-format to docker/lint.sh, support in-place format #6601
[apps/bundle_deploy] Link demo_* targets with LDFLAGS and also with -lm. #6636
Add qemu build step to CI #6644
Add ci_qemu docker image #6485
Improve interactive docker/bash.sh #6344
Add cloudpickle dependency to docker images #6701
[Bugfix] Auto scheduler tutorial failure on CI #6723
[CI] Add m6g instance (ARM64) to CI #6781
[CI] fix cpp test #6796
[CI] Add m6g instance (ARM64) to CI #6780
[CI] Keras version upgraded from 2.3.1 to 2.4.3 #6793
[CI] Tensorflow version support upgrade from 2.1.0 to 2.3.1 #6706
[Docker] Turn on Rust docs and MxNet based ResNet #6640
[Docker] Fix tutorial broken by Docker build #6694
[Docker] Update CI CPU and GPU images based on new Docker build files. #6690
[CI] Install xgboost>=1.1.0 in CI container #6679
More CHECK to ICHECK #6758
Fix example code #6627
[Relay] Fix Strided Slice Infer Layout #6621
[CI] Update ci-cpu to the latest #6632
Add pytest-xdist and pytest-profiling to the base installation packages. #6736
[BUG_FIX] Fixes #6608: CHECK(data != nullptr) causes type checking to fail #6610
[Docker][CI][BYODT] add universal to Docker image #6654
update pyxir version to 0.1.3 #6769
Update to 20.08 version of the ethosn-driver. #6606
[CI] Set main as default in github actions #6669

Doc

[docs] Missing documentation dependency ‘autodocsumm’ on docs/README.txt #6595
[tvmc][docs] Getting started tutorial for TVMC #6597
[DOCS] Update has_dtype/has_shape to pattern lang doc #5847
[AutoScheduler] Use tempfile in tutorials #6728
[AutoScheduler] Improve the GPU tutorial by deleting measure_ctx earlier #6660
[AutoScheduler] Re-organize logs files for tutorials #6768
[Tutorial - QNN] Prequantized MXNet model compilation. #5362

Improvement and Bugfix

[PYTHON][WINDOWS] More robust dll loading behavior after python3.8 #6707
[LLVM][WINDOWS] Recover windows support for the latest LLVM #6698
Resolve more warnings in msvc #6702
[CONDA] Revamp conda recipe. #6732
[FFI][BUGFIX] Fix memory leak when Pac callback argument is NDArray #6744
[WASM] Update support for latest emcc, add ffi test. #6751
[FIX,MICROTVM] Skip microtvm tests if microtvm is not built #6693
[FIX,AUTOTVM] Print warning when all autotvm tasks fail with errors #6612
[FIX,MICROTVM] Add requires_micro decorators to microtvm tests #6747
[FIX,AUTOSCHEDULER] Fix auto_scheduler to run with multiprocessing’s spawn start method #6671
[FIX,PYLINT] Fix pylint errors on MacOS with Python 3.8 #6746
[TVMSCRIPT] Add synr dependency in preparation for tvmscript diagnostic overhaul #6795
[FIX,AUTOTVM] More descriptive error message when an autotvm task is not found #6652
[FIX][AUTOTVM] Make autotvm work with spawn #6790
[FIX,CMAKE] Use set_property with append flag instead of set_target_properties #6725
[AutoScheduler] Improve test cases #6657
[AutoScheduler] Fix a bug in thread binding #6683
[AutoScheduler] Fix mutate auto unroll #6807
[MKL] Fix offloading of batch_matmul to MKL #6752
[ConvertLayout] Fix Strided Slice #6619
[Topi][Cuda] Tiny bug fix for non-fp32 datatypes in conv2d_transpose. #6593
[Contrib][TRT] Fix Conv2D construction when channels attribute is not available. #6805
TFLite failures resulted from TF latest version upgrade resolved #6774
[Relay] Minor fix for some TF OD models #6729
[Relay] Fix dynamic case for Squeeze and Split #6739
[FIX] Fix cublas batch matmul #6715
[Frontend][Relay] Fix MXNet frontend to support NLP backbones in GluonNLP #6699
[Bugfix] Simplify reduce expression in te.gradient #6611
[ARITH] iter_affine_map bug fix, stride generalize #6753
Fix version check bug #6784
Fix leakyReLU support for CoreML #6651
fix a bug in convertSSA. #6785
Fix the Type bug in ConvertSSA. #6709
Fix format error in integrate.rst #6677
[Fix,Conda] update conda download url #6760
[CODEGEN][COREML] Call InferType explicitly in coreml test #6676
[AutoTVM][TOPI] Fix bifrost spatial packing conv2d auto tune #5684
Fix typographical error. #6664
Missing header for GraphRuntimeFactory in android_rpc #6648
[BUGFIX] Fix topi matrix multiplication using tensorcore to run faster #6749

People Who Reviewed Pull Requests:

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view of the significance of contributions.

tqchen (98), zhiics (34), comaniac (33), junrushao1994 (33), jroesch (22), tmoreau89 (21), ZihengJiang (17), leandron (16), masahi (15), mbrookhart (12), merrymercy (9), anijain2305 (9), kevinthesun (9), u99127 (9), icemelon9 (8), FrozenGene (8), jwfromm (8), MarisaKirisame (7), siju-samuel (5), mbaret (5), tkonolige (5), jcf94 (5), Laurawly (4), trevor-m (4), giuseros (4), electriclilies (4), liangfu (3), areusch (3), lhutton1 (3), cbalint13 (3), rkimball (3), manupa-arm (3), hogepodge (3), yzhliu (2), wweic (2), t-vi (2), yongwww (2), ANSHUMAN87 (2), maheshambule (2), hypercubestart (2), csullivan (2), tom-gall (2), altanh (2), vinx13 (1), nhynes (1), lixiaoquan (1), kparzysz-quic (1), Huyuwei (1), slyubomirsky (1), vegaluisjose (1), soiferj (1), ajtulloch (1), weberlo (1), antinucleon (1), spectrometerHBH (1), adityaatluri (1), jmorrill (1), yongfeng-nv (1), Hzfengsy (1), ptrendx (1), yuluny2 (1), Shawn-Inspur (1)

People Whose Pull Requests are Updated:

Note: The format is name (number of activities)

tqchen (19), leandron (13), areusch (12), tkonolige (11), comaniac (9), zhiics (8), mbrookhart (8), merrymercy (7), masahi (7), anijain2305 (7), jwfromm (6), kparzysz-quic (6), ANSHUMAN87 (6), jroesch (5), trevor-m (5), siju-samuel (4), rkimball (4), zhanghaohit (4), kevinthesun (3), lixiaoquan (3), sxjscience (3), ZihengJiang (2), yzhliu (2), tmoreau89 (2), mbaret (2), u99127 (2), d-smirnov (2), electriclilies (2), hypercubestart (2), spectrometerHBH (2), gussmith23 (2), codeislife99 (2), hzfan (2), ishitatsuyuki (2), lsy643 (2), Presburger (2), qixiuai (2), shibuiwilliam (2), jtuyls (2), rohanmukh (2), MarisaKirisame (1), kazum (1), slyubomirsky (1), yongwww (1), cchung100m (1), cbalint13 (1), wpan11nv (1), giuseros (1), maheshambule (1), jmorrill (1), altanh (1), cloud-mxd (1), mwillsey (1), dpankratz (1), tristan-arm (1), ptrendx (1), Meteorix (1), ghostplant (1), Beya2019 (1), alter-xp (1), cylinbao (1), cgyurgyik (1), hogepodge (1), nolanliou (1), qiangxu1996 (1), MasterJH5574 (1), zhiqwang (1), anilmartha (1), chinakook (1), iiahim (1)
