


Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning
Overview | Abstract | Installation | Examples | Citation
Overview
Hi, good to see you here!
Thanks for checking out the code for Non-Parametric Transformers (NPTs).
This codebase will allow you to reproduce experiments from the paper as well as use NPTs for your own research.
Abstract
We challenge a common assumption underlying most supervised deep learning: that a model makes a prediction depending only on its parameters and the features of a single input. To this end, we introduce a general-purpose deep learning architecture that takes as input the entire dataset instead of processing one datapoint at a time. Our approach uses self-attention to reason about relationships between datapoints explicitly, which can be seen as realizing non-parametric models using parametric attention mechanisms. However, unlike conventional non-parametric models, we let the model learn end-to-end from the data how to make use of other datapoints for prediction. Empirically, our models solve cross-datapoint lookup and complex reasoning tasks unsolvable by traditional deep learning models. We show highly competitive results on tabular data, early results on CIFAR-10, and give insight into how the model makes use of the interactions between points.
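To make the core idea concrete, here is a minimal, illustrative sketch (in PyTorch; not code from this repository) of attention applied across the datapoint axis: every row of a dataset attends to every other row, so a prediction can depend on the whole input set. The class name, shapes, and hyperparameters below are ours for illustration only; the actual NPT additionally alternates attention between datapoints and attributes and trains with masking objectives.

import torch
import torch.nn as nn

class DatapointAttention(nn.Module):
    # Illustrative only: multi-head self-attention over the datapoint axis.
    def __init__(self, dim, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        # x: (n_datapoints, dim), one embedding per row of the dataset.
        h = x.unsqueeze(0)                    # (1, n, dim): the datapoint axis
                                              # plays the role of the sequence axis.
        out, _ = self.attn(h, h, h)           # each row attends to all other rows.
        return self.norm(x + out.squeeze(0))  # residual connection + layer norm.

# Toy usage: a "dataset" of 128 datapoints with 64-dimensional embeddings.
x = torch.randn(128, 64)
layer = DatapointAttention(dim=64)
print(layer(x).shape)  # torch.Size([128, 64])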
Installation
Set up and activate the Python environment by executing
conda env create -f environment.yml
conda activate npt
For now, we recommend installing CUDA <= 10.2; see the linked issue for known problems with CUDA >= 11.0.
If you are running this on a system without a GPU, use the above with environment_no_gpu.yml instead.
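After activating the environment, a quick sanity check (our suggestion, not part of the repository; it assumes the environment installs PyTorch, which the CUDA note above implies) is to confirm the build and GPU visibility:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
With environment_no_gpu.yml, the final value being False is expected; with the GPU environment, torch.version.cuda should report the CUDA toolkit your build targets (e.g. 10.2 if you followed the recommendation above).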
Examples
We now give some basic examples of running NPT.
NPT downloads all supported datasets automatically, so you don't need to worry about that.
We use wandb to log experimental results.
Wandb allows us to conveniently track run progress online.
If you do not want wandb enabled, you can run wandb off in the shell where you execute NPT.
For example, run the following to explore NPT with the default configuration on the Breast Cancer dataset:
python run.py --data_set breast-cancer
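An alternative (our suggestion, not from the repository's own instructions, and dependent on your wandb version) is to disable logging for a single run via the WANDB_MODE environment variable:
WANDB_MODE=disabled python run.py --data_set breast-cancer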
Another example: a run on the poker-hand dataset may look like this:
python run.py --data_set poker-hand \
--exp_batch_size 4096 \
--exp_print_every_nth_forward 100
You can find all possible config arguments and their descriptions in NPT/configs.py or by running python run.py --help.
In scripts/ we provide the run commands and hyperparameter configurations for the experiments presented in the paper.
We hope you enjoy using the code, and please feel free to reach out with any questions.
Citation
If you find this code helpful for your work, please cite our paper as
@article{kossen2021self,
  title={Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning},
  author={Kossen, Jannik and Band, Neil and Gomez, Aidan N. and Lyle, Clare and Rainforth, Tom and Gal, Yarin},
  journal={arXiv:2106.02584},
  year={2021}
}