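1. File datasets
1.1. About file datasets
A file dataset is a kind of dataset that a user creates by uploading a local file. A file dataset can be uploaded to the HENGSHI engine or to a writable data connection that the user has configured.
File dataset structure
See the dataset description for the common structure. The fields below are specific to file datasets.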
Field | Type | Description |
---|---|---|
options.type | STRING | Dataset type. For a dataset backed by a data connection, type is connection |
options.connectionId | NUMBER | id of the data connection that holds the file dataset |
options.origin | STRING | Origin type of the file, for example file_excel |
options.delimiter | STRING | Delimiter defined in the file |
options.encoding | STRING | File encoding |
options.header | INTEGER | Row index of the header, starting from 0 |
options.range | OBJECT array | Selected table ranges |
options.range[].xbegin | INTEGER | First column, counted from 0, inclusive |
options.range[].xend | INTEGER | Last column, counted from 0, inclusive |
options.range[].ybegin | INTEGER | First row, counted from 0, inclusive |
options.range[].yend | INTEGER | Last row, counted from 0, inclusive |
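Because every bound in options.range is 0-based and inclusive, the size of a range is end minus begin plus one. The following is a minimal Python sketch of that arithmetic. The class name `SheetRange` and its helper properties are hypothetical and are not part of the API; only the four field names come from the table above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SheetRange:
    """One entry of options.range; all bounds are 0-based and inclusive."""
    xbegin: int
    xend: int
    ybegin: int
    yend: int

    def __post_init__(self):
        # Reject bounds that cannot describe a non-empty inclusive range.
        if self.xbegin < 0 or self.ybegin < 0:
            raise ValueError("bounds are counted from 0")
        if self.xend < self.xbegin or self.yend < self.ybegin:
            raise ValueError("end must not be smaller than begin")

    @property
    def column_count(self) -> int:
        # Inclusive bounds: columns 0..1 are two columns.
        return self.xend - self.xbegin + 1

    @property
    def row_count(self) -> int:
        return self.yend - self.ybegin + 1

    def to_dict(self) -> dict:
        # Shape expected inside the options.range array.
        return asdict(self)


example = SheetRange(xbegin=0, xend=1, ybegin=0, yend=4)
print(example.column_count, example.row_count, example.to_dict())
```

A range with xbegin 0, xend 1, ybegin 0 and yend 4 therefore covers two columns and five rows.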
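1.2. API reference
1.2.1. Upload a file
Request URL
POST /api/v1/files
Authentication required: yes
Request parameters
URL parameters
Field | Type | Description |
---|---|---|
file | BINARY | Required. The file stream to upload |
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT | Description of the file |
data.fileId | NUMBER | id of the file |
data.type | STRING | File type |
data.sheetList | OBJECT array | Sheets contained in the file |
data.sheetList[].id | NUMBER | id of the sheet |
data.sheetList[].name | STRING | Name of the sheet, formed as file name + sheet name |
Example 1: upload a file
POST /api/files
Request parameters:
file: (binary)
Response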
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"data": {
"fileId": "32",
"type": "file_excel",
"sheetList": [
{
"id": 0,
"name": "a_ivt_regions a_ivt_regions"
}
]
}
}
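The upload request could be assembled along the following lines. This is only a sketch under stated assumptions: the document says the `file` parameter is binary but does not name the encoding, so multipart/form-data is assumed here; the base URL `http://localhost:8080` and the function name `build_upload_request` are hypothetical; authentication is required by the API but its mechanism is not described in this section, so it is left out. The request is built with the Python standard library and is not sent.

```python
import uuid
import urllib.request


def build_upload_request(base_url: str, filename: str, content: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/files.

    Assumption: the file travels as multipart/form-data in a part named "file".
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/files",
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Hypothetical host and file content, for illustration only.
req = build_upload_request(
    "http://localhost:8080", "regions.csv", b"region_name,region_id\nEurope,1\n"
)
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would additionally need whatever credentials the deployment requires. The `fileId` and `sheetList[].id` values in the response are the inputs for the preview, select, save and replace calls.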
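1.2.2. Preview sheet {sheetId} of file {fileId}
Request URL
POST /api/v1/files/{fileId}/sheets/{sheetId}/preview
Authentication required: yes
Request parameters
URL parameters
Field | Type | Description |
---|---|---|
fileId | NUMBER | Required. id of the file |
sheetId | NUMBER | Required. id of the sheet |
Request body
The request body is a JSON entity.
Field | Type | Description |
---|---|---|
offset | NUMBER | Offset |
limit | NUMBER | Maximum number of rows to return |
transpose | BOOL | Whether to transpose rows and columns: false leaves the layout unchanged, true transposes it. Defaults to false |
delimiter | STRING | Delimiter |
encoding | STRING | File encoding |
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT | Preview of the file |
data.schema | OBJECT array | Each element describes the attributes of one dataset field; same as the dataset's options.schema |
data.data | OBJECT array | Each element is one row of data; the values in a row correspond one-to-one to the schema elements |
data.suggestOptions | OBJECT | File details detected by the system |
data.suggestOptions.delimiter | STRING | Delimiter defined in the file |
data.suggestOptions.encoding | STRING | File encoding |
data.suggestOptions.header | NUMBER | Row index of the file header |
data.suggestOptions.origin | STRING | Origin type of the file |
Example 1: preview one sheet of a file
POST /api/files/32/sheets/0/preview
{"delimiter":"comma","encoding":"UTF-8","transpose":false}
Response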
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"data": {
"schema": [
{
"fieldName": "_c0",
"type": "string",
"visible": true,
"label": "region_name"
},
{
"fieldName": "_c1",
"type": "string",
"visible": true,
"label": "region_id"
}
],
"data": [
[
"region_name",
"region_id"
]
],
"suggestOptions": {
"encoding": "UTF-8",
"delimiter": "comma",
"header": 0,
"padHeader": false,
"transpose": false,
"origin": "file_excel",
"offset": 0,
"limit": 1000
}
}
}
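Since each row in data.data lines up one-to-one with data.schema, a preview response can be turned into keyed records by zipping the two. The sketch below assumes the response has already been parsed into a Python dict; the function name `rows_as_dicts` is hypothetical, and the sample is a trimmed copy of the response above.

```python
def rows_as_dicts(preview: dict, key: str = "fieldName") -> list:
    """Pair every value of every row with the schema entry at the same position."""
    body = preview["data"]
    names = [field[key] for field in body["schema"]]
    records = []
    for row in body["data"]:
        if len(row) != len(names):
            # The one-to-one correspondence described above does not hold.
            raise ValueError("row length does not match schema length")
        records.append(dict(zip(names, row)))
    return records


# Trimmed copy of the preview response shown above.
sample_preview = {
    "data": {
        "schema": [
            {"fieldName": "_c0", "type": "string", "label": "region_name"},
            {"fieldName": "_c1", "type": "string", "label": "region_id"},
        ],
        "data": [["region_name", "region_id"]],
    }
}

print(rows_as_dicts(sample_preview))
print(rows_as_dicts(sample_preview, key="label"))
```

Keying by `fieldName` gives stable column identifiers such as `_c0`, while keying by `label` gives the header text read from the file.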
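1.2.3. Select sheet {sheetId} of file {fileId}
Request URL
POST /api/v1/files/{fileId}/sheets/{sheetId}/select
Authentication required: yes
Request parameters
URL parameters
Field | Type | Description |
---|---|---|
fileId | NUMBER | Required. id of the file |
sheetId | NUMBER | Required. id of the sheet |
Request body
The request body is a JSON entity.
Field | Type | Description |
---|---|---|
delimiter | STRING | Delimiter defined in the file |
encoding | STRING | File encoding |
header | INTEGER | Row index of the header, starting from 0 |
origin | STRING | Origin type of the file |
range | OBJECT array | Selected table ranges |
range[].xbegin | INTEGER | First column, counted from 0, inclusive |
range[].xend | INTEGER | Last column, counted from 0, inclusive |
range[].ybegin | INTEGER | First row, counted from 0, inclusive |
range[].yend | INTEGER | Last row, counted from 0, inclusive |
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT | See the file dataset structure |
Example 1:
POST /api/files/35/sheets/0/select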
{
"origin": "file_excel",
"header": 0,
"padHeader": false,
"delimiter": "comma",
"transpose": false,
"range": [
{
"xbegin": 0,
"xend": 1,
"ybegin": 0,
"yend": 4
}
],
"encoding": "UTF-8"
}
Response
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"data": {
"data": [
[
"Europe",
"1"
]
],
"schema": [
{
"fieldName": "_c0",
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"label": "region_name",
"config": {},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"string"
],
"detectedType": "string"
},
{
"fieldName": "_c1",
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "string",
"label": "region_id",
"config": {
"seperator": " ",
"dialectName": "PostgresqlDialect"
},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "number"
}
],
"pagable": true,
"importSwitchable": false,
"randomable": false
}
}
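The select request body in the example carries the same parsing options that the preview call returns in data.suggestOptions, plus a range. One way to derive it is sketched below. It assumes that the suggested options can be passed back unchanged, which the examples suggest but the document does not state; `padHeader` and `transpose` appear in the examples only, not in the request table. The names `SELECT_KEYS` and `build_select_body` are hypothetical.

```python
# Keys copied from suggestOptions into the select body; offset and limit are
# paging options of the preview call and are left out.
SELECT_KEYS = ("origin", "header", "padHeader", "delimiter", "transpose", "encoding")


def build_select_body(suggest_options: dict, ranges: list) -> dict:
    """Combine detected parsing options with the ranges chosen by the user."""
    body = {k: suggest_options[k] for k in SELECT_KEYS if k in suggest_options}
    body["range"] = [dict(r) for r in ranges]  # copy so callers keep their own dicts
    return body


# suggestOptions as returned by the preview example in 1.2.2.
suggest = {
    "encoding": "UTF-8",
    "delimiter": "comma",
    "header": 0,
    "padHeader": False,
    "transpose": False,
    "origin": "file_excel",
    "offset": 0,
    "limit": 1000,
}
select_body = build_select_body(suggest, [{"xbegin": 0, "xend": 1, "ybegin": 0, "yend": 4}])
print(select_body)
```

With these inputs the result matches the request body of the select example above.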
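1.2.4. Get the custom data connections in the current app that accept local file uploads
Request URL
GET /api/v1/apps/{appId}/file-writable-connection
Authentication required: yes
Request parameters
URL parameters
Field | Type | Description |
---|---|---|
appId | NUMBER | id of the app |
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT array | Data connections that can receive uploaded local files; see the data connection structure description |
Example 1: get the data connections in the current app that can receive uploaded local files
GET /api/apps/4669/file-writable-connection
Response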
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"code": 0,
"msg": "success",
"data": [
{
"id": 2058,
"options": {
"encoding": "UTF-8",
"type": "postgresql",
"maxConnNum": 10,
"config": {},
"category": "Database",
"protocol": "http",
"outputAble": true,
"fileOutputPath": [
"app"
]
},
"createdBy": 1,
"createdAt": "2020-05-18 11:46:14",
"updatedBy": 1,
"updatedAt": "2020-07-09 17:59:28",
"visible": true,
"isDelete": false,
"title": "***",
"status": 0,
"refreshStats": {},
"hsVersion": 1
}
]
}
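1.2.5. Get the HENGSHI built-in data connections that accept local file uploads
Request URL
GET /api/connections/internal
Authentication required: yes
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT array | Data connections that can receive uploaded local files; see the data connection structure description |
Example 1:
GET /api/connections/internal
Response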
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"code": 0,
"msg": "success",
"data": [
{
"id": 3,
"options": {
"type": "engine",
"maxConnNum": 10,
"config": {},
"category": "Internal",
"protocol": "http",
"outputAble": true,
"fileOutputPath": [
"hengshi_internal_engine_tmp_schema"
]
},
"createdAt": "2020-02-22 10:11:00",
"updatedAt": "2020-02-22 10:11:00",
"visible": true,
"isDelete": false,
"title": "引擎连接",
"status": 0,
"refreshStats": {},
"hsVersion": 0
}
]
}
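The responses of 1.2.4 and 1.2.5 together give the candidate values for options.connectionId. A small sketch of collecting them into one list follows. The function name `list_upload_targets` is hypothetical, the samples are trimmed copies of the two responses above, and no filtering is applied because the document does not explain the meaning of fields such as `outputAble`.

```python
def list_upload_targets(*responses: dict) -> list:
    """Flatten connection lists into id/title/type summaries, in response order."""
    targets = []
    for response in responses:
        for conn in response.get("data", []):
            targets.append(
                {
                    "id": conn["id"],
                    "title": conn.get("title"),
                    "type": conn.get("options", {}).get("type"),
                }
            )
    return targets


# Trimmed copies of the example responses of 1.2.4 and 1.2.5.
custom = {"data": [{"id": 2058, "title": "***", "options": {"type": "postgresql"}}]}
internal = {"data": [{"id": 3, "title": "引擎连接", "options": {"type": "engine"}}]}

targets = list_upload_targets(custom, internal)
print([t["id"] for t in targets])
```

Any `id` in the resulting list is a possible `options.connectionId` for the save and replace calls.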
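1.2.6. Save sheet {sheetId} of file {fileId}
Request URL
POST /api/v1/files/{fileId}/sheets/{sheetId}/apps/{appId}/save
Authentication required: yes
Request parameters
URL parameters
Field | Type | Description |
---|---|---|
fileId | NUMBER | Required. id of the file |
sheetId | NUMBER | Required. id of the sheet |
appId | NUMBER | Required. id of the app in which the file dataset is saved |
Request body
The request body is a JSON entity.
Field | Type | Description |
---|---|---|
title | STRING | Name of the dataset |
options | OBJECT | The options structure of the file dataset |
All options fields listed in the file dataset structure (section 1.1) are required. Here options.connectionId is the id of the data connection the file dataset is uploaded to. It must be one of the connections returned by the interface in 1.2.4 (custom data connections of the current app) or 1.2.5 (HENGSHI built-in data connections).
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT | See the file dataset structure |
Example 1:
POST /api/files/35/sheets/0/apps/4669/save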
{
"title": "a_ivt_regions",
"options": {
"connectionId": 3,
"schema": [
{
"fieldName": "_c0",
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"label": "region_name",
"config": {},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"string"
],
"detectedType": "string",
"distinct": false,
"alias": "region_name",
"name": "_c0"
},
{
"fieldName": "_c1",
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "string",
"label": "region_id",
"config": {
"seperator": " ",
"dialectName": "PostgresqlDialect"
},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "number",
"distinct": false,
"alias": "region_id",
"name": "_c1"
}
],
"origin": "file_excel",
"header": 0,
"padHeader": false,
"delimiter": "comma",
"encoding": "UTF-8",
"transpose": false,
"range": [
{
"xbegin": 0,
"xend": 1,
"ybegin": 0,
"yend": 4
}
],
"cache": false
}
}
Response
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"data": {
"id": 11,
"title": "a_ivt_regions",
"createdBy": 1,
"createdAt": "2020-07-09 18:25:15",
"updatedBy": 1,
"updatedAt": "2020-07-09 18:25:16",
"visible": true,
"isDelete": false,
"appId": 4669,
"options": {
"cache": false,
"type": "connection",
"totalSize": 0,
"rowCount": 0,
"connectionTitle": "hengshi_sense_internal_storage",
"refreshHours": [],
"refreshMinute": 0,
"connectionId": 1,
"origin": "file_excel",
"table": "file_tb_e146214e117eab43ec3dc90292d461f4",
"path": [],
"transpose": false,
"delimiter": ",",
"encoding": "UTF-8",
"header": 0,
"padHeader": false,
"range": [
{
"xbegin": 0,
"xend": 1,
"ybegin": 0,
"yend": 4
}
],
"storageType": "engine",
"dialectOptions": {
"dialectName": "PostgresqlDialect",
"majorVersion": 10,
"minorVersion": 4
},
"storageConnectionId": 3,
"storageConnectionTitle": "引擎连接",
"schema": [
{
"datasetId": 11,
"fieldName": "_c0",
"hsVersion": 1,
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"comment": "",
"label": "region_name",
"config": {},
"visible": true,
"nativeType": "text",
"suggestedTypes": [
"string"
],
"detectedType": "string"
},
{
"datasetId": 11,
"fieldName": "_c1",
"hsVersion": 1,
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "string",
"comment": "",
"label": "region_id",
"config": {
"seperator": " ",
"dialectName": "PostgresqlDialect"
},
"visible": true,
"nativeType": "text",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "number"
},
{
"datasetId": 11,
"fieldName": "_hs_row_id",
"hsVersion": 0,
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "integer",
"comment": "",
"config": {
"dialectName": "PostgresqlDialect"
},
"visible": false,
"nativeType": "serial",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "integer"
}
],
"metrics": []
},
"importType": 1,
"importStatus": 1,
"importOptions": {},
"status": 3,
"refreshStats": {
"refreshAt": "2020-07-09 18:25:15",
"executeRefreshAt": "2020-07-09 18:25:15",
"executeRefreshRowCountAt": 1594290316712
},
"datasetAcl": {
"level": "FULLACCESS",
"dataFilters": []
},
"hsVersion": 7,
"creator": {
"id": 1,
"name": "trial",
"email": "trial@hengshi.io"
},
"updater": {
"id": 1,
"name": "trial",
"email": "trial@hengshi.io"
},
"importSwitchable": false,
"refreshSchema": false,
"type": "connection",
"origin": "file_excel",
"emptyDataset": false,
"public": true
}
}
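The save request combines three things seen earlier: the parsing options sent to the select call, the schema returned by it, and a connection id from 1.2.4 or 1.2.5. The sketch below assembles the path and body from those parts. The function name `build_save_request` is hypothetical. The schema is passed through unchanged; the additional per-field keys in the example request (`distinct`, `alias`, `name`) and the `cache` flag are not documented in this section, so they are not generated here.

```python
def build_save_request(file_id, sheet_id, app_id, title, connection_id, select_body, schema):
    """Return (path, json_body) for the save call; nothing is sent."""
    options = dict(select_body)               # origin, header, delimiter, encoding, range, ...
    options["schema"] = list(schema)          # field definitions, e.g. from the select response
    options["connectionId"] = connection_id   # target connection for the uploaded data
    path = f"/api/v1/files/{file_id}/sheets/{sheet_id}/apps/{app_id}/save"
    return path, {"title": title, "options": options}


path, body = build_save_request(
    file_id=35,
    sheet_id=0,
    app_id=4669,
    title="a_ivt_regions",
    connection_id=3,
    select_body={
        "origin": "file_excel",
        "header": 0,
        "delimiter": "comma",
        "encoding": "UTF-8",
        "range": [{"xbegin": 0, "xend": 1, "ybegin": 0, "yend": 4}],
    },
    schema=[{"fieldName": "_c0", "type": "string", "label": "region_name"}],
)
print(path)
print(sorted(body["options"]))
```

Keeping the builder free of network code makes it easy to inspect the body before sending it with an HTTP client of choice.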
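1.2.7. Replace a dataset with sheet {sheetId} of file {fileId}
Request URL
POST /api/v1/files/{fileId}/sheets/{sheetId}/apps/{appId}/replace
Authentication required: yes
Request parameters
URL parameters
Field | Type | Description |
---|---|---|
fileId | NUMBER | Required. id of the file |
sheetId | NUMBER | Required. id of the sheet |
appId | NUMBER | Required. id of the app that contains the dataset |
Request body
The request body is a JSON entity.
Field | Type | Description |
---|---|---|
id | NUMBER | id of the dataset to replace |
options | OBJECT | The options structure of the file dataset |
All options fields listed in the file dataset structure (section 1.1) are required. Here options.connectionId is the id of the data connection the file dataset is uploaded to. It must be one of the connections returned by the interface in 1.2.4 (custom data connections of the current app) or 1.2.5 (HENGSHI built-in data connections).
The difference from creating a new file dataset lies in options.schema: if a fieldName in the new dataset differs from the fieldName in the original dataset, the new fieldName is stored in the dbFieldName field. Other dataset replace operations behave the same way.
Response format
Field | Type | Description |
---|---|---|
version | STRING | Hash of the current system version |
data | OBJECT | See the file dataset structure |
Example 1:
POST /api/files/35/sheets/0/apps/4669/replace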
{
"id": 11,
"options": {
"schema": [
{
"fieldName": "_c0",
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"label": "country_id",
"config": {},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"string"
],
"detectedType": "string",
"distinct": false,
"alias": "country_id"
},
{
"fieldName": "_c1",
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"label": "country_name",
"config": {},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"string"
],
"detectedType": "string",
"distinct": false,
"alias": "country_name"
},
{
"fieldName": "_c2",
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "string",
"label": "region_id",
"config": {
"seperator": " ",
"dialectName": "PostgresqlDialect"
},
"visible": true,
"nativeType": "varchar",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "number",
"distinct": false,
"alias": "region_id"
}
],
"origin": "file_excel",
"header": 0,
"padHeader": false,
"delimiter": "comma",
"encoding": "UTF-8",
"transpose": false,
"range": [
{
"xbegin": 0,
"xend": 2,
"ybegin": 0,
"yend": 3
}
],
"cache": false,
"connectionId": 3
}
}
Response
{
"version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d800fee",
"data": {
"id": 11,
"title": "a_ivt_regions",
"createdBy": 1,
"createdAt": "2020-07-09 18:25:15",
"updatedBy": 1,
"updatedAt": "2020-07-09 18:40:30",
"visible": true,
"isDelete": false,
"appId": 4669,
"options": {
"cache": false,
"type": "connection",
"totalSize": 0,
"rowCount": 0,
"connectionTitle": "hengshi_sense_internal_storage",
"refreshHours": [],
"refreshMinute": 0,
"connectionId": 1,
"origin": "file_excel",
"table": "file_tb_d7a4d26309bfdcd433aef19e21863175",
"path": [],
"transpose": false,
"delimiter": ",",
"encoding": "UTF-8",
"header": 0,
"padHeader": false,
"range": [
{
"xbegin": 0,
"xend": 2,
"ybegin": 0,
"yend": 3
}
],
"storageType": "engine",
"dialectOptions": {
"dialectName": "PostgresqlDialect",
"majorVersion": 10,
"minorVersion": 4
},
"storageConnectionId": 3,
"storageConnectionTitle": "引擎连接",
"schema": [
{
"datasetId": 11,
"fieldName": "_c0",
"hsVersion": 3,
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"comment": "",
"label": "country_id",
"config": {},
"visible": true,
"nativeType": "text",
"suggestedTypes": [
"string"
],
"detectedType": "string"
},
{
"datasetId": 11,
"fieldName": "_c1",
"hsVersion": 3,
"basicType": "string",
"defaultAggrType": "count",
"type": "string",
"originType": "string",
"comment": "",
"label": "country_name",
"config": {
"seperator": " ",
"dialectName": "PostgresqlDialect"
},
"visible": true,
"nativeType": "text",
"suggestedTypes": [
"string"
],
"detectedType": "string"
},
{
"datasetId": 11,
"fieldName": "_c2",
"hsVersion": 1,
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "string",
"comment": "",
"label": "region_id",
"config": {
"seperator": " ",
"dialectName": "PostgresqlDialect"
},
"visible": true,
"nativeType": "text",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "number"
},
{
"datasetId": 11,
"fieldName": "_hs_row_id",
"hsVersion": 0,
"basicType": "number",
"defaultAggrType": "sum",
"type": "number",
"originType": "integer",
"comment": "",
"config": {
"dialectName": "PostgresqlDialect"
},
"visible": false,
"nativeType": "serial",
"suggestedTypes": [
"number",
"string"
],
"detectedType": "integer"
}
],
"metrics": []
},
"importType": 1,
"importStatus": 1,
"importOptions": {},
"status": 3,
"refreshStats": {
"refreshAt": "2020-07-09 18:40:29",
"executeRefreshAt": "2020-07-09 18:40:29",
"executeRefreshRowCountAt": 1594291230528
},
"datasetAcl": {
"level": "FULLACCESS",
"dataFilters": []
},
"hsVersion": 17,
"creator": {
"id": 1,
"name": "trial",
"email": "trial@hengshi.io"
},
"updater": {
"id": 1,
"name": "trial",
"email": "trial@hengshi.io"
},
"importSwitchable": false,
"refreshSchema": false,
"type": "connection",
"origin": "file_excel",
"emptyDataset": false,
"public": true
}
}
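Taken together, replacing a dataset from an uploaded sheet involves the preview, select and replace calls in sequence. The sketch below chains them through an injected `post_json(path, body)` callable, so no HTTP client or authentication scheme is assumed. The function name `replace_dataset_from_sheet`, the callable and the canned responses are hypothetical. The sketch reuses the options suggested by the preview, which is an assumption, and passes the schema from the select response through unchanged without handling dbFieldName.

```python
from typing import Callable, List

PostJson = Callable[[str, dict], dict]


def replace_dataset_from_sheet(post_json: PostJson, file_id: int, sheet_id: int,
                               app_id: int, dataset_id: int, connection_id: int,
                               preview_body: dict, ranges: List[dict]) -> dict:
    sheet = f"/api/v1/files/{file_id}/sheets/{sheet_id}"
    # 1. Preview: let the server detect encoding, delimiter, header and origin.
    preview = post_json(f"{sheet}/preview", preview_body)
    suggest = preview["data"]["suggestOptions"]
    parse_options = {k: v for k, v in suggest.items() if k not in ("offset", "limit")}
    parse_options["range"] = ranges
    # 2. Select: obtain the schema for the chosen range.
    selected = post_json(f"{sheet}/select", parse_options)
    # 3. Replace: send the dataset id together with the complete options.
    options = dict(parse_options)
    options["schema"] = selected["data"]["schema"]
    options["connectionId"] = connection_id
    return post_json(f"{sheet}/apps/{app_id}/replace", {"id": dataset_id, "options": options})


# Stand-in transport that records calls and returns canned responses.
calls = []


def fake_post_json(path: str, body: dict) -> dict:
    calls.append((path, body))
    if path.endswith("/preview"):
        return {"data": {"suggestOptions": {"encoding": "UTF-8", "delimiter": "comma",
                                            "header": 0, "origin": "file_excel",
                                            "offset": 0, "limit": 1000}}}
    if path.endswith("/select"):
        return {"data": {"schema": [{"fieldName": "_c0", "type": "string"}]}}
    return {"data": {"id": body["id"]}}


result = replace_dataset_from_sheet(
    fake_post_json, 35, 0, 4669, 11, 3,
    {"encoding": "UTF-8"}, [{"xbegin": 0, "xend": 2, "ybegin": 0, "yend": 3}],
)
print([path for path, _ in calls])
```

In real use `post_json` would wrap an authenticated HTTP client; swapping the final path for `/save` and the `id` key for `title` would give the corresponding flow for creating a dataset.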