Data Science In Go: A Cheat Sheet by chewxy

source link: https://www.cheatography.com/chewxy/cheat-sheets/data-science-in-go-a/

Introduction

Go is the future for doing data science. In this cheatsheet, we look at two families of libraries that will allow you to do that.

They are gorgonia.org/tensor and gonum.org/v1/gonum/mat. The gonum library will be referred to as gonum/mat.

For this cheatsheet, assume the following:

import ts "gorgonia.org/tensor"
import "gonum.org/v1/gonum/mat"
Note on panic and error behaviour:
1. Most tensor operations return an error.
2. gonum has a consistent policy on when it returns errors and when it panics; a shape mismatch, for example, panics.

What To Use

I only ever want a float64 matrix or vector
use gonum/mat
I want to focus on doing statistical/scientific work
use gonum/mat
I want to focus on doing machine learning work
use gonum/mat or gorgonia.org/tensor
I want to focus on deep learning work
use gorgonia.org/tensor
I want multidimensional arrays
use gorgonia.org/tensor, or []mat.Matrix
I want to work with different data types
use gorgonia.org/tensor
I want to wrangle data like in Pandas or R, with data frames
use kniren/gota

Default Values

Numpy
a = np.zeros((2, 3))
gonum/mat
a := mat.NewDense(2, 3, nil)
tensor
a := ts.New(ts.Of(ts.Float32), ts.WithShape(2, 3))
A Range...
Numpy
a = np.arange(0, 9).reshape(3, 3)
gonum/mat
a := mat.NewDense(3, 3, floats.Span(make([]float64, 9), 0, 8)) // floats is gonum.org/v1/gonum/floats
tensor
a := ts.New(ts.WithBacking(ts.Range(ts.Int, 0, 9)), ts.WithShape(3, 3))
Identity Matrices
Numpy
a = np.eye(3, 3)
gonum/mat
a := mat.NewDiagDense(3, []float64{1, 1, 1}) // named mat.NewDiagonal in older gonum releases
tensor
a := ts.I(ts.Float64, 3, 3, 0)

Elementwise Arithmetic Operations

Addition
Numpy
c = a + b
c = np.add(a, b)
a += b              # in-place
np.add(a, b, out=c) # reuse array
gonum/mat
c.Add(a, b)
a.Add(a, b) // in-place
tensor
var c *ts.Dense; c, err = a.Add(b)
var c ts.Tensor; c, err = ts.Add(a, b)
a.Add(b, ts.UseUnsafe())        // in-place
a.Add(b, ts.WithReuse(c))       // reuse tensor
ts.Add(a, b, ts.UseUnsafe())    // in-place
ts.Add(a, b, ts.WithReuse(c))   // reuse
Note: All of these operations return a result and an error, omitted here for brevity. It is good practice to check the error.
Subtraction
Numpy
c = a - b
c = np.subtract(a, b)
gonum/mat
c.Sub(a, b)
tensor
c, err := a.Sub(b)
c, err := ts.Sub(a, b)
Multiplication
Numpy
c = a * b
c = np.multiply(a, b)
gonum/mat
c.MulElem(a, b)
tensor
c, err := a.Mul(b)
c, err := ts.Mul(a, b)
Division
Numpy
c = a / b
c = np.divide(a, b)
gonum/mat
c.DivElem(a, b)
tensor
c, err := a.Div(b)
c, err := ts.Div(a, b)
Note: When a division by 0 is encountered for non-float types, an error is returned and the offending element is set to 0 in the result.
Note: All arithmetic operations support the same variations (in-place, reuse) shown for Addition.
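The division-by-zero note above is easier to picture with a concrete case. The following is a plain-Go sketch of the described behaviour only; it is not tensor's implementation and does not call the tensor package. divInts is a hypothetical name, and reporting only the first offending index is an assumption made here for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

// divInts (hypothetical) divides a by b elementwise. Where b[i] is 0 the
// result element is left at 0 and an error is returned alongside the result,
// instead of letting Go's integer division panic.
func divInts(a, b []int) ([]int, error) {
	if len(a) != len(b) {
		return nil, errors.New("divInts: length mismatch")
	}
	out := make([]int, len(a))
	var err error
	for i := range a {
		if b[i] == 0 {
			if err == nil {
				// Assumption: only the first offending index is reported.
				err = fmt.Errorf("divInts: division by zero at index %d", i)
			}
			continue // out[i] stays 0
		}
		out[i] = a[i] / b[i]
	}
	return out, err
}

func main() {
	out, err := divInts([]int{6, 7, 8}, []int{3, 0, 2})
	fmt.Println(out, err)
}
```

The caller receives both a usable result and an error, so it can decide whether a zeroed element is acceptable or whether the computation should stop.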

Note on Shapes
In all of these functions, a and b must have the same shape. In Numpy, operations on shapes that cannot be broadcast together raise an exception. With gonum/mat, a shape mismatch panics. With tensor, it is returned as an error.

Aggregation

Sum
Numpy
s = a.sum()
s = np.sum(a)
gonum/mat
var s float64 = mat.Sum(a)
tensor
var s *ts.Dense; s, err = a.Sum()
var s ts.Tensor; s, err = ts.Sum(a)
Note: The result, which is a scalar value in this case, can be retrieved by calling s.ScalarValue()
Sum Along An Axis
Numpy
s = a.sum(axis=0)
s = np.sum(a, axis=0)
gonum/mat
Write a loop, with help from mat.Col and mat.Row
Note: Writing the loop yourself costs no performance. Arguably there is even a cognitive gain in being aware of what the code is doing.
tensor
var s *ts.Dense; s, err = a.Sum(0)
var s ts.Tensor; s, err = ts.Sum(a, 0)
Argmax/Argmin
Numpy
am = a.argmax()
am = np.argmax(a)
gonum/mat
Write a loop, using mat.Col and mat.Row
tensor
var am *ts.Dense; am, err = a.Argmax(ts.AllAxes)
var am ts.Tensor; am, err = ts.Argmax(a, ts.AllAxes)
Argmax/Argmin Along An Axis
Numpy
am = a.argmax(axis=0)
am = np.argmax(a, axis=0)
gonum/mat
Write a loop, using mat.Col and mat.Row
tensor
var am *ts.Dense; am, err = a.Argmax(0)
var am ts.Tensor; am, err = ts.Argmax(a, 0)

Data Structure Creation

Numpy
a = np.array([1, 2, 3])
gonum/mat
a := mat.NewDense(1, 3, []float64{1, 2, 3})
tensor
a := ts.New(ts.WithBacking([]int{1, 2, 3}))
Creating a float64 matrix
Numpy
a = np.array([[0, 1, 2], [3, 4, 5]], dtype='float64')
gonum/mat
a := mat.NewDense(2, 3, []float64{0, 1, 2, 3, 4, 5})
tensor
a := ts.New(ts.WithBacking([]float64{0, 1, 2, 3, 4, 5}), ts.WithShape(2, 3))
Creating a float32 3-D array
Numpy
a = np.array([[[0, 1, 2], [3, 4, 5]], [[100, 101, 102], [103, 104, 105]]], dtype='float32')
tensor
a := ts.New(ts.WithShape(2, 2, 3), ts.WithBacking([]float32{0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105}))
Note: The tensor package is imported as ts.
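The flat backing slice in the 3-D example maps onto coordinates in row-major order, so the last axis varies fastest. The plain-Go sketch below shows that mapping without calling the tensor package; it assumes tensor's default row-major layout, and offset is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// offset (hypothetical helper) returns the position in a row-major backing
// slice of the element at the given coordinates.
func offset(shape, coords []int) int {
	off := 0
	for i, c := range coords {
		off = off*shape[i] + c
	}
	return off
}

func main() {
	backing := []float32{0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105}
	shape := []int{2, 2, 3}
	// Coordinates (1, 0, 2) map to offset 1*6 + 0*3 + 2 = 8.
	fmt.Println(backing[offset(shape, []int{1, 0, 2})]) // prints 102
}
```

This matches the Numpy literal above, where a[1][0][2] is also 102.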
Additionally, gonum/mat offers many different data structures, each useful for a particular subset of computations. The examples given in this document mainly assume a dense matrix.

gonum Types

mat.Matrix
Abstract data type representing any float64 matrix
*mat.Dense
Data type representing a dense float64 matrix

tensor Types

tensor.Tensor
An abstract data type representing any kind of tensor. Package functions work on this type.
*tensor.Dense
A representation of a densely packed multidimensional array. Methods return *tensor.Dense instead of tensor.Tensor.
*tensor.CS
A representation of compressed sparse row/column matrices.
*tensor.MA
Coming soon - representation of a masked multidimensional array. Methods return *tensor.MA instead of tensor.Tensor.
tensor.DenseTensor
Utility type that represents densely packed multidimensional arrays
tensor.MaskedTensor
Utility type that represents densely packed multidimensional arrays that are masked by a slice of bool
tensor.Sparse
Utility type that represents any sparsely packed multidimensional array (for now: only *CS)
