
Efficient Neural Architecture Search via Parameter Sharing

Authors' implementation of "Efficient Neural Architecture Search via Parameter Sharing" (2018) in TensorFlow.

Includes code for CIFAR-10 image classification and Penn Treebank language modeling tasks.

Paper: https://arxiv.org/abs/1802.03268

Authors: Hieu Pham*, Melody Y. Guan*, Barret Zoph, Quoc V. Le, Jeff Dean

This is not an official Google product.

Penn Treebank

The Penn Treebank dataset is included at data/ptb. Depending on your system, you may want to run the script data/ptb/process.py to create the pkl version of the data. All hyper-parameters are specified in the scripts described below.

To run the ENAS search process on Penn Treebank, please use the script:

./scripts/ptb_search.sh

To run ENAS with a determined architecture, you have to specify the architecture using a string. The following is an example script that uses the architecture we described in our paper:

./scripts/ptb_final.sh

An architecture for a recurrent cell with N + 1 nodes (indexed 0 through N) can be specified using a sequence a of 2N + 1 tokens:

  • a[0] is a number in [0, 1, 2, 3], specifying the activation function at node 0: tanh, ReLU, identity, or sigmoid.
  • For each i in 1, ..., N, a[2*i - 1] specifies the index of a previous node and a[2*i] specifies the activation function at the i-th node.

For a concrete example, the following sequence specifies the architecture we visualize in our paper (a small decoding sketch follows the figure):

0 0 0 1 1 2 1 2 0 2 0 5 1 1 0 6 1 8 1 8 1 8 1

(Figure: enas_rnn_cell.png, a visualization of this recurrent cell.)
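
As a rough illustration, the following minimal Python sketch (not part of this repository) decodes such a token sequence into per-node choices, using the sequence above as its example.

# Minimal decoder for the RNN-cell architecture string described above.
# Illustration of the token layout only; not code from the repository.
ACTIVATIONS = ["tanh", "ReLU", "identity", "sigmoid"]

def decode_rnn_cell(arc_str):
    a = [int(tok) for tok in arc_str.split()]
    assert len(a) % 2 == 1, "expected 2N + 1 tokens"
    # Node 0 only picks an activation function.
    nodes = [{"node": 0, "prev": None, "activation": ACTIVATIONS[a[0]]}]
    # Node i (1 <= i <= N) picks a previous node and an activation function.
    for i in range(1, (len(a) - 1) // 2 + 1):
        prev_idx, act = a[2 * i - 1], a[2 * i]
        nodes.append({"node": i, "prev": prev_idx, "activation": ACTIVATIONS[act]})
    return nodes

# The sequence visualized in the paper:
for node in decode_rnn_cell("0 0 0 1 1 2 1 2 0 2 0 5 1 1 0 6 1 8 1 8 1 8 1"):
    print(node)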

CIFAR-10

To run the experiments on CIFAR-10, please first download the dataset. Again, all hyper-parameters are specified in the scripts that we describe below.

To run the ENAS experiments on the macro search space as described in our paper, please use the following scripts:

./scripts/cifar10_macro_search.sh
./scripts/cifar10_macro_final.sh

A macro architecture for a neural network with N layers consists of N parts, indexed 1, 2, 3, ..., N. Part i consists of:

  • A number in [0, 1, 2, 3, 4, 5] that specifies the operation at the i-th layer: conv_3x3, separable_conv_3x3, conv_5x5, separable_conv_5x5, average_pooling, or max_pooling.
  • A sequence of i - 1 numbers, each either 0 or 1, indicating whether a skip connection should be formed from the corresponding earlier layer to the current layer.

A concrete example can be found in our script ./scripts/cifar10_macro_final.sh; a small decoding sketch follows below.
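
As a rough illustration, here is a minimal Python sketch (not from the repository) that decodes a macro architecture string following the layout above. The 3-layer example string at the bottom is hypothetical, not the architecture from the paper.

# Minimal decoder for a macro architecture string, following the layout
# described above: for layer i, one operation id followed by i - 1 skip bits.
# Illustration only; not code from the repository.
OPS = ["conv_3x3", "separable_conv_3x3", "conv_5x5",
       "separable_conv_5x5", "average_pooling", "max_pooling"]

def decode_macro(arc_str, num_layers):
    tokens = [int(tok) for tok in arc_str.split()]
    layers, pos = [], 0
    for i in range(1, num_layers + 1):
        op = OPS[tokens[pos]]
        skips = tokens[pos + 1 : pos + i]  # i - 1 skip-connection bits
        skip_from = [j + 1 for j, bit in enumerate(skips) if bit == 1]
        layers.append({"layer": i, "op": op, "skip_from": skip_from})
        pos += i
    assert pos == len(tokens), "token count does not match num_layers"
    return layers

# Hypothetical 3-layer example (not from the paper): conv_3x3, then
# max_pooling with a skip from layer 1, then separable_conv_5x5 with a
# skip from layer 1 but not from layer 2.
for layer in decode_macro("0 5 1 3 1 0", num_layers=3):
    print(layer)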

To run the ENAS experiments on the micro search space as described in our paper, please use the following scripts:

./scripts/cifar10_micro_search.sh
./scripts/cifar10_micro_final.sh

A micro cell with B + 2 blocks can be specified using B blocks, numbered 2, 3, ..., B + 1 (blocks 0 and 1 are the cell's two inputs). Each block consists of 4 numbers:

index_1, op_1, index_2, op_2

Here, index_1 and index_2 can be the index of any previous block, and op_1 and op_2 are numbers in [0, 1, 2, 3, 4], corresponding to separable_conv_3x3, separable_conv_5x5, average_pooling, max_pooling, and identity.

A micro architecture can be specified by concatenating the sequences for two such cells (the convolution cell and the reduction cell), as shown in our script ./scripts/cifar10_micro_final.sh; a decoding sketch for a single cell follows below.
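
As a rough illustration, here is a minimal Python sketch (not from the repository) that decodes one cell's block sequence. The 2-block example string is hypothetical, not the cell from the paper.

# Minimal decoder for one micro cell, following the layout described above:
# each block is 4 numbers (index_1, op_1, index_2, op_2), and blocks 0 and 1
# are the cell's inputs. Illustration only; not code from the repository.
OPS = ["separable_conv_3x3", "separable_conv_5x5",
       "average_pooling", "max_pooling", "identity"]

def decode_micro_cell(arc_str):
    tokens = [int(tok) for tok in arc_str.split()]
    assert len(tokens) % 4 == 0, "expected 4 numbers per block"
    blocks = []
    for b in range(len(tokens) // 4):
        index_1, op_1, index_2, op_2 = tokens[4 * b : 4 * b + 4]
        blocks.append({"block": b + 2,
                       "inputs": [(index_1, OPS[op_1]), (index_2, OPS[op_2])]})
    return blocks

# Hypothetical cell with B = 2 blocks (not from the paper): block 2 combines
# separable_conv_3x3(block 0) with identity(block 1); block 3 combines
# max_pooling(block 1) with separable_conv_5x5(block 2).
for block in decode_micro_cell("0 0 1 4 1 3 2 1"):
    print(block)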

Citations

If you happen to use our work, please consider citing our paper.

@article{enas,
  title   = {Efficient Neural Architecture Search via Parameter Sharing},
  author  = {Pham, Hieu and
             Guan, Melody Y. and
             Zoph, Barret and
             Le, Quoc V. and
             Dean, Jeff},
  journal = {arXiv preprint arXiv:1802.03268},
  year    = {2018}
}
