Data Science In Go: A Cheat Sheet by chewxy
source link: https://www.cheatography.com/chewxy/cheat-sheets/data-science-in-go-a/
Introduction

Go is the future for doing data science. In this cheatsheet we look at two families of libraries that allow you to do that: gorgonia.org/tensor and gonum.org/v1/gonum/mat. The gonum libraries will be referred to as gonum/mat. For this cheatsheet, assume the following import:

import ts "gorgonia.org/tensor"
Note on panic and error behaviour:

1. Most tensor operations return an error.
2. gonum has a good policy of when errors are returned and when panics happen.

What To Use
I only ever want a float64 matrix or vector: use gonum/mat
I want to focus on doing statistical/scientific work: use gonum/mat
I want to focus on doing machine learning work: use gonum/mat or gorgonia.org/tensor
I want to focus on doing deep learning work: use gorgonia.org/tensor
I want multidimensional arrays: use gorgonia.org/tensor, or []mat.Matrix
I want to work with different data types: use gorgonia.org/tensor
I want to wrangle data like in Pandas or R - with data frames: use kniren/gota
Default Values

Numpy:     a = np.zeros((2,3))
gonum/mat: a := mat.NewDense(2, 3, nil)
tensor:    a := ts.New(ts.Of(ts.Float32), ts.WithShape(2, 3))
A Range...

Numpy:  a = np.arange(0, 9).reshape(3,3)
gonum:  a := mat.NewDense(3, 3, floats.Span(make([]float64, 9), 0, 8))
tensor: a := ts.New(ts.WithBacking(ts.Range(ts.Int, 0, 9)), ts.WithShape(3, 3))
Identity Matrices

Numpy:     a = np.eye(3,3)
gonum/mat: a := mat.NewDiagonal(3, []float64{1, 1, 1})
tensor:    a := ts.I(3, 3, 0)
Elementwise Arithmetic Operations

Addition

Numpy:
c = a + b
c = np.add(a, b)
a += b               # in-place
np.add(a, b, out=c)  # reuse array

gonum/mat:
c.Add(a, b)
a.Add(a, b) // in-place

tensor:
var c *ts.Dense; c, err = a.Add(b)
var c ts.Tensor; c, err = ts.Add(a, b)
a.Add(b, ts.UseUnsafe())      // in-place
a.Add(b, ts.WithReuse(c))     // reuse tensor
ts.Add(a, b, ts.UseUnsafe())  // in-place
ts.Add(a, b, ts.WithReuse(c)) // reuse tensor

Note: These operations all return a result and an error, omitted here for brevity. It is a good habit to check for errors.
Subtraction

Numpy:     c = a - b; c = np.subtract(a, b)
gonum/mat: c.Sub(a, b)
tensor:    c, err := a.Sub(b); c, err = ts.Sub(a, b)
Multiplication

Numpy:     c = a * b; c = np.multiply(a, b)
gonum/mat: c.MulElem(a, b)
tensor:    c, err := a.Mul(b); c, err = ts.Mul(a, b)
Division

Numpy:     c = a / b; c = np.divide(a, b)
gonum/mat: c.DivElem(a, b)
tensor:    c, err := a.Div(b); c, err = ts.Div(a, b)

Note: When dividing by 0 on non-float types, an error is returned, and the offending position in the result is set to 0.
Note: All the arithmetic operations follow the patterns shown under Addition.
Note on Shapes

In all of these functions, a and b have to be of the same shape. In Numpy, operations on dissimilar shapes throw an exception. With gonum/mat, it'd panic. With tensor, an error is returned.

Aggregation
Sum

Numpy:     s = a.sum(); s = np.sum(a)
gonum/mat: var s float64 = mat.Sum(a)
tensor:    var s *ts.Dense = a.Sum(); var s ts.Tensor = ts.Sum(a)

Note: The result, a scalar value in this case, can be retrieved by calling s.ScalarValue().
Sum Along An Axis

Numpy:     s = a.sum(axis=0); s = np.sum(a, axis=0)
gonum/mat: Write a loop, with manual aid from mat.Col and mat.Row.
tensor:    var s *ts.Dense = a.Sum(0); var s ts.Tensor = ts.Sum(a, 0)

Note: There is no performance loss in writing a loop. In fact, there is arguably a cognitive gain in being aware of what one is doing.
Argmax/Argmin

Numpy:  am = a.argmax(); am = np.argmax(a)
gonum:  Write a loop, using mat.Col and mat.Row.
tensor: var am *ts.Dense; am, err = a.Argmax(ts.AllAxes); var am ts.Tensor; am, err = ts.Argmax(a, ts.AllAxes)
Argmax/Argmin Along An Axis

Numpy:  am = a.argmax(axis=0); am = np.argmax(a, axis=0)
gonum:  Write a loop, using mat.Col and mat.Row.
tensor: var am *ts.Dense; am, err = a.Argmax(0); var am ts.Tensor; am, err = ts.Argmax(a, 0)
Data Structure Creation

Creating a vector

Numpy:     a = np.array([1, 2, 3])
gonum/mat: a := mat.NewDense(1, 3, []float64{1, 2, 3})
tensor:    a := ts.New(ts.WithBacking([]int{1, 2, 3}))
Creating a float64 matrix

Numpy:     a = np.array([[0, 1, 2], [3, 4, 5]], dtype='float64')
gonum/mat: a := mat.NewDense(2, 3, []float64{0, 1, 2, 3, 4, 5})
tensor:    a := ts.New(ts.WithBacking([]float64{0, 1, 2, 3, 4, 5}), ts.WithShape(2, 3))
Creating a float32 3-D array

Numpy:  a = np.array([[[0, 1, 2], [3, 4, 5]], [[100, 101, 102], [103, 104, 105]]], dtype='float32')
tensor: a := ts.New(ts.WithShape(2, 2, 3), ts.WithBacking([]float32{0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105}))
Note: The tensor package is imported as ts. Additionally, gonum/mat actually offers many different data structures, each being useful for a particular subset of computations. The examples given in this document mainly assume a dense matrix.

gonum Types
mat.Matrix   Abstract data type representing any float64 matrix
*mat.Dense   Data type representing a dense float64 matrix
tensor Types

tensor.Tensor         An abstract data type representing any kind of tensor. Package functions work on these types.
*tensor.Dense         A representation of a densely packed multidimensional array. Methods return *tensor.Dense instead of tensor.Tensor.
*tensor.CS            A representation of compressed sparse row/column matrices.
*tensor.MA            Coming soon - a representation of a masked multidimensional array. Methods return *tensor.MA instead of tensor.Tensor.
tensor.DenseTensor    Utility type that represents densely packed multidimensional arrays.
tensor.MaskedTensor   Utility type that represents densely packed multidimensional arrays that are masked by a slice of bool.
tensor.Sparse         Utility type that represents any sparsely packed multidimensional arrays (for now: only *tensor.CS).