磨人的小妖精 Apache Superset 之绝对不改版
source link: https://mp.weixin.qq.com/s/dSpItu0a51K_icAb5p5l7A
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Apache Superset
2018年夏天,经过调研之后最终选择将BI平台定为使用Apache Superset实现。Superset在GitHub上有20K+Star,其综合能力可见一斑,但是经过深入了解和使用之后,只觉得并不能达到该有的预期,倒是很符合 (incubating)
的附加标签,同时也从某种程度上说明好用又免费的BI产品少之又少。毕竟Superset是免费开源产品,虽然对它不是很满意,但是本着不白嫖的原则也给Superset贡献了Star,同时将Superset这一阶段的二次开发告一段落,整理并分享本篇文章。
二次开发概览
版本 内容 初始版 Apache Superset 0.25.6源码 修改版 [改进]定制SQLAlchemy 1.2.2,支持对PostgreSQL 8.3的兼容 确定版 [改进]增加权限控制,用户只能查看和编辑自己创建的对象 最终版 [修复]修复Pivot类型图表自定义配置不生效的BUG 最终版2 [新增]SQL Editor左侧表结构增加表注释、字段注释的展示 最终版3 [新增]封装创建用户接口,封装创建数据源接口 完成版 [修复]修复SQL中包含百分号时查询报错问题 完成版2 [改进]修改SQL Editor中表数据预览策略,支持手动预览 完成版3 [新增]增加图表:普通折线图(XY-Line) 最终完成版 [改进]SQL Editor左侧Database、Schema、Table增加懒加载机制 最终完成版2 [新增]图表展示支持自定义排序 最终完成版3 [修复]修复数据库密码中包含特殊字符时无法连接数据库的问题 绝对不改版 [改进]修复日志记录时间差8小时的问题 绝对不改版2 [改进]改进每次查询新开数据库连接的问题 绝对不改版3 [修复]修复查询结果导出乱码问题 绝对不改版4 [新增]查询结果导出支持自定义文件名 绝对不改版5 [新增]查询结果展示数据条数 绝对不改版6 [改进]解决图表边界遮挡坐标轴刻度值问题 绝对不改版7 [改进]解决SQL Lab中字段包含中文报错问题 绝对不改版8 [新增]查询结果集支持复制 绝对不改版9 [改进]查询强制LIMIT,防止大数据量结果集造成内存飙升二次开发实现
[改进]定制SQLAlchemy 1.2.2,支持对PostgreSQL 8.3的兼容
背景
使用基于PostgreSQL 8.3devel版本的Greenplum数据源时,在SQL Editor页面左侧选择一张表,会出现表结构无法正常加载的情况。
展开表结构但是展示为空:
且Superset会出现 ERROR OCCURRED WHILE FETCHING TABLE METADATA
的提示:
打开浏览器控制台发现请求异常信息:
{"error": "(psycopg2.ProgrammingError) relation "pg_catalog.pg_enum" does not exist LINE 10: LEFT JOIN pg_catalog.pg_enum e ON t.oid = e..."}
解决方案
Superset使用SQLAlchemy查询Greenplum数据库,获取表的字段、索引、注释等信息,SQLAlchemy中对于PostgreSQL不同的版本使用了不同的查询分支,测试结果表明,将SQLAlchemy-1.2.2/lib/sqlalchemy/dialects/postgresql/base.py文件做如下修改,可以修复上述问题。
修改SQLAlchemy-1.2.2源码后重新打包安装即可。
[改进]增加权限控制,用户只能查看和编辑自己创建的对象
背景
Superset中的权限过于开放,虽然有用户、角色等概念,但是对于数据源、SQL查询记录、可视化图表等敏感模块,每个人都可以看到其他人创建的对象,无权限控制可言。
解决方案
实现 每个用户只能查看和编辑自己创建的对象。
具体实现
如:用户只能处理自己的数据源
from flask import g from flask_appbuilder.models.sqla.filters import FilterEqualFunction def get_user(): return g.user class DatabaseView(SupersetModelView, DeleteMixin, YamlExportMixin): ······ # 过滤规则 base_filters = [['created_by', FilterEqualFunction, get_user]] ······
常用模块改法概览:
模块 过滤 DatabaseView base_filters = [['created_by', FilterEqualFunction, get_user]] SliceModelView base_filters = [['id', SliceFilter, lambda: []], ['created_by', FilterEqualFunction, get_user]] DashboardModelView base_filters = [['slice', DashboardFilter, lambda: []], ['created_by', FilterEqualFunction, get_user]] search_queries search_user_id = g.user.id SavedQueryView base_filters = [['created_by', FilterEqualFunction, get_user]][修复]修复Pivot类型图表自定义配置不生效的BUG
背景
在使用Pivot Table时,如果自定义Metrics设置,则图表无法展示。报错信息提示 unhashable type: 'dict'
。
Traceback (most recent call last): File "/data/Superset/superset-0.25.6/superset/views/core.py", line 1107, in generate_json payload = viz_obj.get_payload() File "/data/Superset/superset-0.25.6/superset/viz.py", line 329, in get_payload payload['data'] = self.get_data(df) File "/data/Superset/superset-0.25.6/superset/viz.py", line 628, in get_data margins=self.form_data.get('pivot_margins'), File "/home/.local/share/virtualenvs/Superset-VKnIdCi7/lib/python2.7/site-packages/pandas/core/frame.py", line 4468, in pivot_table margins_name=margins_name) File "/home/.local/share/virtualenvs/Superset-VKnIdCi7/lib/python2.7/site-packages/pandas/core/reshape/pivot.py", line 57, in pivot_table if i not in data: File "/home/.local/share/virtualenvs/Superset-VKnIdCi7/lib/python2.7/site-packages/pandas/core/generic.py", line 1075, in __contains__ return key in self._info_axis File "/home/.local/share/virtualenvs/Superset-VKnIdCi7/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 1694, in __contains__ hash(key) TypeError: unhashable type: 'dict'
解决方案
在PivotTableViz中对于metrics参数改为区分处理即可。
具体实现
class PivotTableViz(BaseViz): ······ def get_data(self, df): if ( self.form_data.get('granularity') == 'all' and DTTM_ALIAS in df): del df[DTTM_ALIAS] metrics = self.form_data.get('metrics') values = [] for metric in metrics: if isinstance(metric, dict): values.append(metric['label']) else: values.append(metric) df = df.pivot_table( index=self.form_data.get('groupby'), columns=self.form_data.get('columns'), values=values, aggfunc=self.form_data.get('pandas_aggfunc'), margins=self.form_data.get('pivot_margins'), ) ······
[新增]SQL Editor左侧表结构增加表注释、字段注释的展示
背景
SQL Editor左侧表结构默认只加载字段名称、字段类型、主键、外键,而无法得知某个表的注释和字段注释,如果需要查看此类信息则无法满足,给日常使用带来一定不便。
解决方案
效果展示
具体实现
superset/views/core.py
修改 /table/<database_id>/<table_name>/<schema>/
接口,增加获取表注释操作,并将注释信息拼到接口结果集。
superset/static/assets/src/SqlLab/components/ColumnElement.jsx
修改字段名称,增加字段注释展示。
superset/static/assets/src/SqlLab/components/TableElement.jsx
修改表名,增加表注释展示。
[新增]封装创建用户接口,封装创建数据源接口
背景
Superset管理员可以在后台添加用户、给用户设置权限、配置数据源等,但是如果需要与其他平台打通,比如有新员工入职时,自动开通Superset账户,或者用户在数据库权限管理平台申请一个数据库的权限之后,自动设置好Superset数据源,则无法直接实现。
解决方案
封装创建用户接口和创建数据源接口,用于与其他平台做集成。
具体实现
superset/views/core.py
@app.route('/add_superset_user', methods=['POST']) @csrf.exempt def add_superset_user(): """ 封装Superset创建用户接口 Args: { 'username':, # 账户用户名 'first_name':, 'last_name':, 'email':, # 邮箱 'role_name':, # 角色名称,若有多个可按英文逗号分隔 'password': # 账户密码 } :return: { 'status' : '', # True/False 'message' : '' # 描述 } """ result = { 'status' : '', 'message' : '', } try: username = request.form.get('username') first_name = request.form.get('first_name') last_name = request.form.get('last_name') email = request.form.get('email') role_name = request.form.get('role_name') role = appbuilder.sm.find_role('Public') # default password = request.form.get('password') user = appbuilder.sm.add_user(username, first_name, last_name, email, role, password) if user: if role_name: role_names = role_name.split(',') for rn in role_names: role = appbuilder.sm.find_role(rn) user.roles.append(role) appbuilder.get_session().commit() result['status'] = True result['message'] = 'SUCCESS' else: result['status'] = False result['message'] = '用户创建失败' except Exception as e: logging.exception(e) result['status'] = False result['message'] = str(e) return json.dumps(str(result)) @app.route('/add_superset_dbs', methods=['POST']) @csrf.exempt def add_superset_dbs(): """ 封装Superset添加数据库接口 一次可添加一个或多个数据库 Args: { 'superset_user_id':, # 给superset端的哪个用户创建数据库,superset_user_id、superset_user_name至少提供一个 'superset_user_name':, # 给superset端的哪个用户创建数据库,superset_user_id、superset_user_name至少提供一个 'db_configs': [ # 数据库配置 { "db_type" : , "db_name" : , "db_host" : , "db_port" : , "db_username" : , "db_password" : }, { ... }, ... ] } :return: { 'status' : '', # True/False 'message' : '', # 描述 } """ result = { 'status' : '', 'message' : '', } try: superset_user_name = request.form.get('superset_user_name') superset_user_id = request.form.get('superset_user_id') if superset_user_name and not superset_user_id: # 如果传入用户ID为空,则使用传入用户名称查找对应用户 create_by_user = appbuilder.sm.find_user(username=superset_user_name) superset_user_id = create_by_user.id if not superset_user_id: raise Exception('没法关联到创建用户') if not superset_user_name: create_by_user = appbuilder.sm.get_user_by_id(superset_user_id) superset_user_name = create_by_user.username db_configs = request.form.get('db_configs') db_configs = eval(db_configs) time_now = datetime.now() dbs = [] for db_config in db_configs: db_type = db_config['db_type'] db_name = db_config['db_name'] db_host = db_config['db_host'] db_port = db_config['db_port'] db_username = db_config['db_username'] db_password = db_config['db_password'] # populate_db sqlalchemy_uri_template = 'postgresql://%s:%s@%s:%s/%s' sqlalchemy_uri = '' if db_type == 2: # postgresql sqlalchemy_uri = sqlalchemy_uri_template % (db_username, db_password, db_host, db_port, db_name) else: raise Exception('暂未支持的数据库类型') db = models.Database( created_on = time_now, changed_on = time_now, database_name = superset_user_name + '_' + db_name, sqlalchemy_uri = sqlalchemy_uri, created_by_fk = superset_user_id, changed_by_fk = superset_user_id, password = db_password, cache_timeout = None, extra = """{ "metadata_params": {}, "engine_params": {} }""", select_as_create_table_as = 0, allow_ctas = 0, expose_in_sqllab = 1, force_ctas_schema = None, allow_run_async = 0, allow_run_sync = 1, allow_dml = 0, verbose_name = None, impersonate_user = 0, allow_multi_schema_metadata_fetch = 1 ) # pre_add db.set_sqlalchemy_uri(db.sqlalchemy_uri) security_manager.merge_perm('database_access', db.perm) for schema in db.all_schema_names(): security_manager.merge_perm( 'schema_access', security_manager.get_schema_perm(db, schema)) # add appbuilder.get_session().add(db) appbuilder.get_session().commit() dbs.append(db) result['status'] = True result['message'] = 'SUCCESS' except Exception as e: logging.exception(e) result['status'] = False result['message'] = str(e) return json.dumps(str(result))
[修复]修复SQL中包含百分号时查询报错问题
背景
受MySQLdb影响,Superset在查询MySQL数据库时,如果SQL中含有百分号 %
,则查询出错。
报错日志
unsupported format character ''' (0x27) at index 49 Traceback (most recent call last): File "/data/Superset/superset-0.25.6/superset/sql_lab.py", line 220, in execute_sql **db_engine_spec.cursor_execute_kwargs) File "/home/.local/share/virtualenvs/Superset-VKnIdCi7/lib/python2.7/site-packages/MySQLdb/cursors.py", line 238, in execute query = query % args ValueError: unsupported format character ''' (0x27) at index 49
解决方案
将百分号 %
进行替换。
具体实现
superset/connectors/sqla/models.py
[改进]修改SQL Editor中表数据预览策略,支持手动预览
背景
在SQL Editor左侧选中一张表,默认会进行表数据预览,而表数据预览并非总是需要的,而且假如这张表很大,则表数据预览会比较耗时。
解决方案
改为默认不进行表数据预览,同时支持手动执行预览。
效果展示
具体实现
superset/static/assets/src/SqlLab/components/ResultSet.jsx
superset/static/assets/src/SqlLab/actions.js
[新增]增加图表:普通折线图(XY-Line)
背景
Superset现有图表中只有两种折线图:时序折线图与双轴折线图,如果需要一种普通的(X轴&Y轴)折线图,其中X轴同时支持数值、时间、日期、文字标签等类型,则无法实现。 (注:Bar Chart可以实现类似需求,但其非折线图)
解决方案
新增图表 XY - Line Chart
。数据复用Bar Chart的实现,展现形式使用Line Chart即可。
效果展示
具体实现
superset/static/assets/src/explore/visTypes.js
xy_line: { label: t('XY - Line Chart'), showOnExplore: true, controlPanelSections: [ { label: t('Query'), expanded: true, controlSetRows: [ ['metrics'], ['groupby'], ['row_limit'], ['contribution'], ], }, { label: t('Chart Options'), expanded: true, controlSetRows: [ ['color_scheme'], ['show_legend'], ['y_axis_format', 'y_axis_label'], ], }, { label: t('X Axis'), expanded: true, controlSetRows: [ ['x_axis_label', 'bottom_margin'], ['x_ticks_layout', 'reduce_x_ticks'], ], }, ], controlOverrides: { groupby: { label: t('Series'), }, }, },
superset/static/assets/src/visualizations/index.js
superset/static/assets/src/visualizations/nvd3_vis.js
superset/viz.py
from superset.utils import string_to_num class XYLineViz(DistributionBarViz): """A good old xy-line chart""" viz_type = 'xy_line' verbose_name = _('XY - Line Chart') is_timeseries = False def query_obj(self): return super(XYLineViz, self).query_obj() def get_data(self, df): fd = self.form_data pt = self.sort_df_values(df) if fd.get('contribution'): pt = pt.fillna(0) pt = pt.T pt = (pt / pt.sum()).T chart_data = [] for name, ys in pt.items(): if pt[name].dtype.kind not in 'biufc' or name in self.groupby: continue if isinstance(name, string_types): series_title = name elif len(self.metrics) > 1: series_title = ', '.join(name) else: l = [str(s) for s in name[1:]] # noqa: E741 series_title = ', '.join(l) values = [] x_i = 0 import datetime for i, v in ys.items(): x = i if isinstance(x, (tuple, list)): x = ', '.join([text_type(s) for s in x]) if isinstance(x, datetime.date): x = str(x) else: xn = string_to_num(x) x = xn if xn else text_type(x) values.append({ 'x': x_i, 'y': v, 'label': x }) x_i = x_i + 1 d = { 'key': series_title, 'values': values, } chart_data.append(d) return chart_data
superset/static/assets/images/viz_thumbnails/xy_line.png
[改进]SQL Editor左侧Database、Schema、Table增加懒加载机制
背景
SQL Editor中每次打开一个Tab时,Superset都会重新加载数据源中的Database、Schema、Table三个列表,使用时存在等待时间,如果列表项过多或者加载过慢,会比较影响体验。
解决方案载机制
列表增加缓存机制,如果缓存中没有列表项则查库获取最新列表项,否则直接载入缓存中的列表项。同时支持手动获取最新列表项。
效果展示
具体实现
superset/static/assets/src/SqlLab/components/SqlEditorLeftBar.jsx
······ class SqlEditorLeftBar extends React.PureComponent { ······ componentWillMount() { const editorId = this.props.queryEditor.id; const storage = window.sessionStorage; const schemaOptionsCached = storage.getItem('schemaOptionsCached' + editorId); const tableOptionsCached = storage.getItem('tableOptionsCached' + editorId); if(schemaOptionsCached) { const schemaOptions = JSON.parse(schemaOptionsCached); this.setState({ schemaLoading: false, schemaOptions: schemaOptions }); } else { this.fetchSchemas(this.props.queryEditor.dbId); } if(tableOptionsCached) { const tableOptions = JSON.parse(tableOptionsCached); const tableLength = tableOptions.length; const filterOptions = createFilterOptions({ options: tableOptions }); this.setState({ filterOptions, tableLoading: false, tableOptions: tableOptions, tableLength: tableLength, }); } else { this.fetchTables(this.props.queryEditor.dbId, this.props.queryEditor.schema); } } refreshDatasources() { this.fetchSchemas(this.props.queryEditor.dbId); this.fetchTables(this.props.queryEditor.dbId, this.props.queryEditor.schema); } ······ fetchTables(dbId, schema, substr) { // This can be large so it shouldn't be put in the Redux store if (dbId && schema) { this.setState({ tableLoading: true, tableOptions: [] }); const url = `/superset/tables/${dbId}/${schema}/${substr}/`; $.get(url).done((data) => { const filterOptions = createFilterOptions({ options: data.options }); this.setState({ filterOptions, tableLoading: false, tableOptions: data.options, tableLength: data.tableLength, }); // 缓存 const editorId = this.props.queryEditor.id; const storage = window.sessionStorage; storage.setItem('tableOptionsCached' + editorId, JSON.stringify(data.options)); }) .fail(() => { this.setState({ tableLoading: false, tableOptions: [], tableLength: 0 }); notify.error(t('Error while fetching table list')); }); } else { this.setState({ tableLoading: false, tableOptions: [], filterOptions: null }); } } ······ fetchSchemas(dbId) { const actualDbId = dbId || this.props.queryEditor.dbId; if (actualDbId) { this.setState({ schemaLoading: true }); const url = `/superset/schemas/${actualDbId}/`; $.get(url).done((data) => { const schemaOptions = data.schemas.map(s => ({ value: s, label: s })); this.setState({ schemaOptions, schemaLoading: false }); // 缓存 const editorId = this.props.queryEditor.id; const storage = window.sessionStorage; storage.setItem('schemaOptionsCached' + editorId, JSON.stringify(schemaOptions)); }) .fail(() => { this.setState({ schemaLoading: false, schemaOptions: [] }); notify.error(t('Error while fetching schema list')); }); } } render() { ······ <div className="m-t-5"> <Button bsSize="sm" bsStyle="primary" onClick={this.refreshDatasources.bind(this)} > {t('刷新数据源')} </Button> </div> ······
[新增]图表展示支持自定义排序
背景
Superset中的图表默认以第一个指标进行排序,而日常使用中更多的是想根据维度进行排序,但是Superset无法实现此需求。
解决方案
新增自定义排序机制:图表编辑页面中的 SQL 模块增加 ORDER BY
配置,后台接收此配置并整合到最终执行的SQL中。
效果展示
具体实现
superset/static/assets/src/explore/controls.jsx
where: { ······ }, order_by: { type: 'TextAreaControl', label: t('Custom ORDER BY clause'), default: '', language: 'sql', minLines: 2, maxLines: 10, offerEditInModal: false, description: t('The text in this box gets included in your query\'s ORDER BY ' + 'clause. You can include ' + 'complex expression, parenthesis and anything else ' + 'supported by the backend it is directed towards.' + 'Example:{"column1":"desc","column2":"asc"}' ), }, having: { ······ },
superset/static/assets/src/explore/visTypes.js
superset/connectors/sqla/models.py
def get_sqla_query( # sqla ······ template_processor = self.get_template_processor(**template_kwargs) db_engine_spec = self.database.db_engine_spec if not orderby: orderby = extras.get('orderby', []) orderby = orderby or [] ······
superset/viz.py
from collections import OrderedDict ······ class BaseViz(object): ······ def query_obj(self): ······ extras = { ······ 'orderby': self.parse_order_by() } ······ def parse_order_by(self): order_by_options = self.form_data.get('order_by', '') orderby = [] if order_by_options: try: order_dict = { 'asc':True, 'desc':False } orderbys = json.loads(order_by_options, object_pairs_hook=OrderedDict) for col, odr in orderbys.items(): orderby.append((col, order_dict.get(odr.lower(), False))) except Exception as e: logging.exception(e) return orderby def sort_df_values(self, df): orderby = self.parse_order_by() pt = df if orderby: orderby = dict(orderby) by = orderby.keys() ascending = orderby.values() pt = pt.sort_values(by, ascending=ascending) pt = pt.set_index(self.groupby) return pt class DistributionBarViz(DistributionPieViz): ······ def get_data(self, df): fd = self.form_data pt = self.sort_df_values(df) if fd.get('contribution'): pt = pt.fillna(0) pt = pt.T pt = (pt / pt.sum()).T chart_data = [] ······ class XYLineViz(DistributionBarViz): ······ def get_data(self, df): fd = self.form_data pt = self.sort_df_values(df) if fd.get('contribution'): pt = pt.fillna(0) pt = pt.T pt = (pt / pt.sum()).T chart_data = [] ······
[修复]修复数据库密码中包含特殊字符时无法连接数据库的问题
背景
配置的数据源如果密码中有特殊字符会报错无法使用。
解决方案
使用urlquote将密码转换即可。
具体实现
superset/models/core.py
from urllib import quote_plus as urlquote ······ def set_sqlalchemy_uri(self, uri): conn = sqla.engine.url.make_url(uri.strip()) if conn.password and conn.password != PASSWORD_MASK and not custom_password_store: # do not over-write the password with the password mask self.password = urlquote(conn.password) conn.password = PASSWORD_MASK if conn.password else None self.sqlalchemy_uri = str(conn) # hides the password
[改进]修复日志记录时间差8小时的问题
背景
Superset使用MySQL管理元数据时,logs表中dttm字段时间与当前时间差8个小时。
解决方案
具体实现
superset/models/core.py
class Log(Model): ······ dttm = Column(DateTime, default=datetime.utcnow)
[改进]改进每次查询新开数据库连接的问题
背景
SQL Editor中每次查询都会新开一条数据库连接,长时间使用会导致出现大量数据库连接。
解决方案
具体实现
superset/sql_lab.py
[修复]修复查询结果导出乱码问题
背景
Superset查询结果导出CSV出现乱码。
解决方案
改为使用 utf-8-sig
编码。
具体实现
superset/config.py
# CSV Options: key/value pairs that will be passed as argument to DataFrame.to_csv method # note: index option should not be overridden CSV_EXPORT = { 'encoding': 'utf-8-sig', }
[新增]查询结果导出支持自定义文件名
背景
Superset查询结果导出时不支持设置导出文件名称,默认下载的文件名称是无意义字符串,需要先下载到本地磁盘然后更改文件名称。
解决方案
导出时增加设置文件名称的步骤,默认名称是无意义字符串(与原生相同),同时支持输入自定义名称。
效果展示
具体实现
superset/static/assets/src/SqlLab/components/ExportResults.jsx
import React from 'react'; import PropTypes from 'prop-types'; import { FormControl, FormGroup, Row, Col } from 'react-bootstrap'; import Button from '../../components/Button'; import ModalTrigger from '../../components/ModalTrigger'; import { t } from '../../locales'; const propTypes = { defaultFileName: PropTypes.string, queryId: PropTypes.number, animation: PropTypes.bool, onSave: PropTypes.func, }; const defaultProps = { defaultFileName: t('Undefined'), animation: true, onSave: () => {}, }; class ExportResults extends React.PureComponent { constructor(props) { super(props); this.state = { fileName: props.defaultFileName, showSave: false, }; this.toggleSave = this.toggleSave.bind(this); this.onSave = this.onSave.bind(this); this.onCancel = this.onCancel.bind(this); this.onFileNameChange = this.onFileNameChange.bind(this); } onSave() { window.location.href = '/superset/csv/' + this.props.queryId + '/' + this.state.fileName; this.saveModal.close(); } onCancel() { this.saveModal.close(); } onFileNameChange(e) { this.setState({ fileName: e.target.value }); } toggleSave(e) { this.setState({ target: e.target, showSave: !this.state.showSave }); } renderModalBody() { return ( <FormGroup bsSize="small"> <Row> <Col md={12}> <small> <label className="control-label" htmlFor="embed-height"> {t('File name')} </label> </small> <FormControl type="text" placeholder={t('File name')} value={this.state.fileName} onChange={this.onFileNameChange} /> </Col> </Row> <br /> <Row> <Col md={12}> <Button bsStyle="primary" onClick={this.onSave} className="m-r-3" > {t('Export')} </Button> <Button onClick={this.onCancel} className="cancelExport"> {t('Cancel')} </Button> </Col> </Row> </FormGroup> ); } render() { return ( <span className="ExportResults"> <ModalTrigger ref={(ref) => { this.saveModal = ref; }} modalTitle={t('Export Results')} modalBody={this.renderModalBody()} triggerNode={ <Button bsSize="small" className="toggleSave" onClick={this.toggleSave}> <i className="fa fa-file-text-o" /> {t('Export Results')} </Button> } bsSize="small" /> </span> ); } } ExportResults.propTypes = propTypes; ExportResults.defaultProps = defaultProps; export default ExportResults;
superset/static/assets/src/SqlLab/components/ResultSet.jsx
import ExportResults from './ExportResults'; ······ if (this.props.search || this.props.visualize || this.props.csv) { let csvButton; if (this.props.csv) { csvButton = ( <ExportResults defaultFileName={this.props.tab + '_' + this.props.query.id} queryId={this.props.query.id} className="m-r-5" onSave={this.props.actions.saveQuery} /> ); } let visualizeButton; if (this.props.visualize) { ······
superset/views/core.py
@has_access @expose('/csv/<client_id>/<file_name>') @log_this def csv(self, client_id, file_name): """Download the query results as csv.""" logging.info('Exporting CSV file [{}]'.format(client_id)) query = ( db.session.query(Query) ······ response = Response(csv, mimetype='text/csv') response.headers['Content-Disposition'] = ( 'attachment; filename={}.csv'.format(parse.quote(file_name or query.name))) logging.info('Ready to return response') return response
[新增]查询结果展示数据条数
背景
Superset查询结果展示中没有数据条数信息。
解决方案
效果展示
具体实现
superset/static/assets/src/SqlLab/components/ResultSet.jsx
······ let resultCountSpan; let resultCount = 0; if(this.props.query.state === 'success') { const results = this.props.query.results; let data; if (this.props.cache && this.props.query.cached) { data = this.state.data; } else if (results && results.data) { data = results.data; } if (data && data.length > 0) { resultCount = data.length; } } resultCountSpan = ( <span className="btn-sm"> {resultCount} {t('records')} </span> ); let searchBox; if (this.props.search) { searchBox = ( <input ······ <div className="pull-left"> <ButtonGroup> {visualizeButton} {csvButton} {resultCountSpan} </ButtonGroup> </div> <div className="pull-right"> {searchBox} ······
[改进]解决图表边界遮挡坐标轴刻度值问题
背景
Superset图表中如果刻度值较大,可能存在被边界遮挡的情况。
解决方案
调整图表边界控制。
效果展示
具体实现
superset/static/assets/src/visualizations/nvd3_vis.js
······ const maxMarginPad = 50; const animationTime = 1000; const minHeightForBrush = 480; const BREAKPOINTS = { ······ if (chart.yAxis !== undefined || chart.yAxis2 !== undefined) { // Hack to adjust y axis left margin to accommodate long numbers const containerWidth = slice.container.width(); const marginPad = Math.ceil( Math.min(isExplore ? containerWidth * 0.1 : containerWidth * 0.1, maxMarginPad), ); const maxYAxisLabelWidth = chart.yAxis2 ? getMaxLabelSize(slice.container, 'nv-y1') : getMaxLabelSize(slice.container, 'nv-y'); const maxXAxisLabelHeight = getMaxLabelSize(slice.container, 'nv-x'); ······
[修复]解决SQL Lab中字段包含中文报错问题
背景
SQL Editor中查询如果字段名称包含中文,则查询报错。
解决方案
具体实现
superset/dataframe.py
import sys reload(sys) sys.setdefaultencoding("utf-8")
[新增]查询结果集支持复制
背景
一种常见的使用场景是将查询结果复制出来,用来查看或作他用。
解决方案
支持将查询结果集复制到剪贴板。
效果展示
具体实现
引入依赖 react-copy-to-clipboard 。
superset/static/assets/src/SqlLab/components/ResultSet.jsx
······ import {CopyToClipboard} from 'react-copy-to-clipboard'; ······ let resultCountSpan; let resultCount = 0; let copyButton; let copyData = ""; if(this.props.query.state === 'success') { const results = this.props.query.results; let data; if (this.props.cache && this.props.query.cached) { data = this.state.data; } else if (results && results.data) { data = results.data; } if (data && data.length > 0) { resultCount = data.length; const columns = results.columns; for(let ci = 0;ci < columns.length;ci++) { let col = columns[ci]; copyData += col.name + ' \t '; } copyData += ' \n '; for(let di = 0;di < data.length;di++) { let d = data[di]; for(let ci = 0;ci < columns.length;ci++) { let col = columns[ci]; copyData += d[col.name] + ' \t '; } copyData += ' \n '; } } console.log(results); console.log(copyData); } resultCountSpan = ( <span className="btn-sm"> {resultCount} {t('records')} </span> ); copyButton = ( <CopyToClipboard text={copyData}> <Button bsSize="small" > <i className="fa fa-copy m-l-1" /> {t('Copy to clipboard')} </Button> </CopyToClipboard> ); let searchBox; if (this.props.search) { searchBox = ( <input type="text" onChange={this.changeSearch.bind(this)} className="form-control input-sm" placeholder={t('Search Results')} /> ); } return ( <div className="ResultSetControls"> <div className="clearfix"> <div className="pull-left"> <ButtonGroup> {visualizeButton} {csvButton} {copyButton} {resultCountSpan} </ButtonGroup> </div> <div className="pull-right"> {searchBox} </div> </div> </div> ); } return <div className="noControls" />; } ······
[改进]查询强制LIMIT,防止大数据量结果集造成内存飙升
背景
Superset在日常使用过程中,偶尔会出现因查询SQL返回大量结果集,造成服务器内存飙升甚至耗尽的情况。
解决方案
拦截将要执行的SQL使用limit对其进行包裹,即可将SQL查询的最大数据集控制在指定范围内。
具体实现
superset/sql_lab.py
······ elif (query.limit and superset_query.is_select() and db_engine_spec.limit_method == LimitMethod.WRAP_SQL): executed_sql = database.wrap_sql_limit(executed_sql, query.limit) query.limit_used = True elif query.limit and superset_query.is_select(): # 强制添加LIMIT,防止大数据量结果集造成内存飙升 executed_sql = database.wrap_sql_limit(executed_sql, query.limit) query.limit_used = True # Hook to allow environment-specific mutation (usually comments) to the SQL SQL_QUERY_MUTATOR = config.get('SQL_QUERY_MUTATOR') if SQL_QUERY_MUTATOR: ······
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK