GitHub - sql-machine-learning/sqlflow: Brings SQL and AI together.
source link: https://github.com/sql-machine-learning/sqlflow
SQLFlow
What is SQLFlow?
SQLFlow is a bridge that connects a SQL engine, e.g. MySQL, Hive, SparkSQL or SQL Server, with TensorFlow and other machine learning toolkits. SQLFlow extends the SQL language to enable model training, prediction and inference.
Motivation
The current experience of developing ML-based applications requires a team of data engineers, data scientists, and business analysts, as well as a proliferation of advanced languages and programming tools like Python, SQL, SAS, SASS, Julia, and R. The fragmentation of tooling and development environments brings additional engineering difficulties to model training and tuning. What if we marry the most widely used data management/processing language, SQL, with ML/system capabilities and let engineers with SQL skills develop advanced ML-based applications?
There is already some work in progress in the industry. We can write simple machine learning prediction (or scoring) algorithms in SQL using operators like DOT_PRODUCT. However, this requires copying and pasting model parameters from the training program into SQL statements. In the commercial world, we see some proprietary SQL engines providing extensions to support machine learning capabilities.
- Microsoft SQL Server: Microsoft SQL Server has the machine learning service that runs machine learning programs in R or Python as an external script.
- Teradata SQL for DL: Teradata also provides a RESTful service, which is callable from the extended SQL SELECT syntax.
- Google BigQuery: Google BigQuery enables machine learning in SQL by introducing the CREATE MODEL statement.
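To make the copy-and-paste problem with hand-written SQL scoring concrete, here is a minimal Python sketch using the standard-library sqlite3 module. It assumes a linear model whose weights were produced elsewhere; the weight values, the bias, the `flowers` table, and the helper names are all hypothetical and chosen only for illustration. SQLite has no DOT_PRODUCT operator, so the dot product is spelled out term by term.

```python
import sqlite3

# Hypothetical parameters, as if copied by hand from a separate training program.
WEIGHTS = {
    "sepal_length": 0.5,
    "sepal_width": -1.0,
    "petal_length": 2.0,
    "petal_width": 1.5,
}
BIAS = -3.0


def build_scoring_sql(table):
    """Build a SELECT that computes the dot product of features and weights."""
    terms = " + ".join(f"({w!r} * {col})" for col, w in WEIGHTS.items())
    return f"SELECT id, {terms} + ({BIAS!r}) AS score FROM {table} ORDER BY id"


def score_rows(rows):
    """Load rows into an in-memory table and score them entirely in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flowers (id INTEGER, sepal_length REAL, "
        "sepal_width REAL, petal_length REAL, petal_width REAL)"
    )
    conn.executemany("INSERT INTO flowers VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(build_scoring_sql("flowers")).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(build_scoring_sql("flowers"))
    print(score_rows([(1, 2.0, 1.0, 1.0, 0.0)]))
```

Every retraining changes the weights, so the generated SQL has to be rebuilt and redeployed each time. That manual step is the friction the paragraph above describes.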
None of the existing solutions solves our pain point. Instead, we want a solution that is fully extensible:
- This solution should be compatible with many SQL engines, instead of a specific version or type.
- It should support sophisticated machine learning models, including TensorFlow for deep learning and xgboost for trees.
- We also want the flexibility to configure and run cutting-edge ML algorithms, including specifying feature crosses. At the least, no Python or R code should be embedded in the SQL statements, and the solution should be fully integrated with hyperparameter estimation.
Quick Overview
Here are examples of training a TensorFlow DNNClassifier model on the sample data iris.train and running prediction with the trained model. You can see how cool it is to write some elegant ML code using SQL:
sqlflow> SELECT * FROM iris.train
TRAIN DNNClassifier
WITH n_classes = 3, hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;
...
Training set accuracy: 0.96721
Done training

sqlflow> SELECT * FROM iris.test
PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;
...
Done predicting. Predict table : iris.predict
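Both statements above have the same shape: an ordinary SELECT followed by an extension clause that starts with TRAIN or PREDICT. The Python sketch below illustrates that shape by splitting a statement into the two parts. It is not SQLFlow's actual parser; the function name `split_extended_sql` is hypothetical, and the sketch assumes the extension keywords are written in upper case and do not appear inside string literals.

```python
import re

# Upper-case only, so the table name "iris.train" is not mistaken for the keyword.
_EXTENSION = re.compile(r"\b(TRAIN|PREDICT)\b")


def split_extended_sql(statement):
    """Split a statement into its standard SELECT part and its extension clause."""
    text = statement.strip().rstrip(";").strip()
    match = _EXTENSION.search(text)
    if match is None:
        # Plain SQL: nothing beyond the standard SELECT.
        return {"select": text, "kind": None, "clause": ""}
    return {
        "select": text[: match.start()].strip(),
        "kind": match.group(1),
        "clause": text[match.end():].strip(),
    }


if __name__ == "__main__":
    parts = split_extended_sql(
        "SELECT * FROM iris.test PREDICT iris.predict.class "
        "USING sqlflow_models.my_dnn_model;"
    )
    print(parts)
```

Under this reading, the SELECT part could be handed unchanged to the underlying SQL engine to fetch the data, while the extension clause carries the model name, its attributes, and the destination table.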
How to use SQLFlow
Contributions
Feedback
Your feedback is our motivation to move on. Please let us know your questions, concerns, and issues by filing GitHub Issues.
License