# m2cgen
m2cgen (Model 2 Code Generator) is a lightweight library which provides an easy way to transpile trained statistical models into native code (Python, C, Java).
## Installation
```
pip install m2cgen
```
## Supported Languages
- Python
- Java
- C
## Supported Models
|  | Classification | Regression |
| --- | --- | --- |
| **Linear** | LogisticRegression, LogisticRegressionCV, RidgeClassifier, RidgeClassifierCV, SGDClassifier, PassiveAggressiveClassifier | LinearRegression, HuberRegressor, ElasticNet, ElasticNetCV, TheilSenRegressor, Lars, LarsCV, Lasso, LassoCV, LassoLars, LassoLarsIC, OrthogonalMatchingPursuit, OrthogonalMatchingPursuitCV, Ridge, RidgeCV, BayesianRidge, ARDRegression, SGDRegressor, PassiveAggressiveRegressor |
| **SVM** | LinearSVC | LinearSVR |
| **Tree** | DecisionTreeClassifier, ExtraTreeClassifier | DecisionTreeRegressor, ExtraTreeRegressor |
| **Random Forest** | RandomForestClassifier, ExtraTreesClassifier | RandomForestRegressor, ExtraTreesRegressor |
| **Boosting** | XGBClassifier (gbtree/dart booster only), LGBMClassifier (gbdt/dart booster only) | XGBRegressor (gbtree/dart booster only), LGBMRegressor (gbdt/dart booster only) |
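For example, a LightGBM classifier trained with the supported gbdt booster is exported like any scikit-learn estimator. Below is a minimal sketch; the iris dataset, the hyperparameters, and the choice of the C exporter are illustrative:

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import load_iris
import m2cgen as m2c

# gbdt and dart are the only LightGBM booster types m2cgen supports.
X, y = load_iris(return_X_y=True)
clf = LGBMClassifier(boosting_type="gbdt", n_estimators=10)
clf.fit(X, y)

code = m2c.export_to_c(clf)
```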
## Classification Output
|  | Binary | Multiclass | Comment |
| --- | --- | --- | --- |
| **Linear** | Scalar value; signed distance of the sample to the hyperplane for the second class | Vector value; signed distance of the sample to the hyperplane per each class | The output is consistent with the output of LinearClassifierMixin.decision_function |
| **Tree/Random Forest/XGBoost/LightGBM** | Vector value; class probabilities | Vector value; class probabilities | The output is consistent with the output of the predict_proba method of DecisionTreeClassifier/ForestClassifier/XGBClassifier/LGBMClassifier |
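Put differently, the generated `score` function returns the raw model output, and mapping it to a class label is left to the caller. A minimal sketch of that mapping in Python (the `predict_class` helper is hypothetical, not part of m2cgen):

```python
import numpy as np

def predict_class(raw_output):
    # Hypothetical helper: map the raw output of a generated
    # scoring function to a class index.
    scores = np.atleast_1d(np.asarray(raw_output))
    if scores.size == 1:
        # Binary linear model: a single signed distance to the
        # hyperplane; a positive value means the second class.
        return int(scores[0] > 0)
    # Vector output (class probabilities or per-class distances):
    # take the highest-scoring class.
    return int(np.argmax(scores))
```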
## Usage
Here's a simple example of how a trained linear model can be represented in Java code:
```python
from sklearn.datasets import load_boston
from sklearn import linear_model
import m2cgen as m2c

boston = load_boston()
X, y = boston.data, boston.target

estimator = linear_model.LinearRegression()
estimator.fit(X, y)

code = m2c.export_to_java(estimator)
```
An example of the generated code:
```java
public class Model {
    public static double score(double[] input) {
        return (((((((((((((36.45948838508965)
            + ((input[0]) * (-0.10801135783679647)))
            + ((input[1]) * (0.04642045836688297)))
            + ((input[2]) * (0.020558626367073608)))
            + ((input[3]) * (2.6867338193449406)))
            + ((input[4]) * (-17.76661122830004)))
            + ((input[5]) * (3.8098652068092163)))
            + ((input[6]) * (0.0006922246403454562)))
            + ((input[7]) * (-1.475566845600257)))
            + ((input[8]) * (0.30604947898516943)))
            + ((input[9]) * (-0.012334593916574394)))
            + ((input[10]) * (-0.9527472317072884)))
            + ((input[11]) * (0.009311683273794044)))
            + ((input[12]) * (-0.5247583778554867));
    }
}
```
You can find more examples of generated code for different models/languages here.
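Each supported language has its own exporter. A quick sketch continuing the example above; the `export_to_<language>` naming for Python and C mirrors `export_to_java` and is an assumption here:

```python
# Continuing the Usage example: one exporter per supported language
# (assumed to follow the same export_to_<language> naming as
# export_to_java).
python_code = m2c.export_to_python(estimator)
c_code = m2c.export_to_c(estimator)
```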
## CLI
`m2cgen` can be used as a CLI tool to generate code using serialized model objects (pickle protocol):
```
$ m2cgen <pickle_file> --language <language> [--indent <indent>]
         [--class_name <class_name>] [--package_name <package_name>]
         [--recursion-limit <recursion_limit>]
```
Piping is also supported:
```
$ cat <pickle_file> | m2cgen --language <language>
```
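The CLI consumes a pickled estimator, so the handoff from Python looks roughly like this (a minimal sketch; the `model.pickle` filename and the flag values are illustrative):

```python
import pickle

from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression

# Train an estimator and serialize it with the pickle protocol
# the CLI expects.
X, y = load_boston(return_X_y=True)
estimator = LinearRegression().fit(X, y)

with open("model.pickle", "wb") as f:
    pickle.dump(estimator, f)

# Then, from the shell:
#   m2cgen model.pickle --language java --class_name Model
```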
## FAQ
**Q: Generation fails with a `RuntimeError: maximum recursion depth exceeded` error.**

**A:** If this error occurs while generating code using an ensemble model, try to reduce the number of trained estimators within that model. Alternatively, you can increase the maximum recursion depth with `sys.setrecursionlimit(<new_depth>)`.
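Programmatically, that fix looks like the sketch below (the limit of 10000 is an arbitrary example; the CLI exposes the same knob through its `--recursion-limit` flag):

```python
import sys
import m2cgen as m2c

# Code generation recurses through the ensemble's trees, so large
# ensembles can exceed Python's default recursion limit.
sys.setrecursionlimit(10000)  # arbitrary example value

code = m2c.export_to_java(estimator)  # estimator trained as above
```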