GPDB Administrator Notes (3): Loading and Unloading Data
date date, amount float4, category text, description text)
LOCATION ( 'http://intranet.company.com/expenses/sales/file.csv',
'http://intranet.company.com/expenses/exec/file.csv',
'http://intranet.company.com/expenses/finance/file.csv',
'http://intranet.company.com/expenses/ops/file.csv',
'http://intranet.company.com/expenses/marketing/file.csv',
'http://intranet.company.com/expenses/eng/file.csv' )
FORMAT 'CSV' ( HEADER );
SELECT * from ext_expenses where category='travel';
Or, to quickly load all of the data into a new database table:
=# CREATE TABLE expenses AS SELECT * from ext_expenses;
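The same load can also be written with an explicit distribution policy, or as an insert into a table that already exists. The following is only a sketch: the table name expenses_archive is hypothetical, it is assumed to already exist with the same columns, and DISTRIBUTED RANDOMLY is used simply because it needs no knowledge of the columns. A well-chosen distribution key is usually preferable for large tables.

```sql
-- Create and load in one step, stating the distribution policy explicitly
-- instead of letting Greenplum pick a default key.
CREATE TABLE expenses AS
SELECT * FROM ext_expenses
DISTRIBUTED RANDOMLY;

-- Alternative: load into an existing table.
-- expenses_archive is a hypothetical table with the same column layout.
INSERT INTO expenses_archive
SELECT * FROM ext_expenses;
```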
--2014-03-04 13:51:30-- http://mirrors.aliyun.com/repo/Centos-6.repo
Resolving mirrors.aliyun.com... 115.28.122.210, 112.124.140.210
Connecting to mirrors.aliyun.com|115.28.122.210|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2086 (2.0K) [application/octet-stream]
Saving to: “Centos-6.repo”
100%[==============================================================================================================================>] 2,086 --.-K/s in 0s
2014-03-04 13:51:30 (194 MB/s) - “Centos-6.repo” saved [2086/2086]
libo=# CREATE EXTERNAL WEB TABLE ext_expenses (name text)
libo-# location ('http://mirrors.aliyun.com/repo/Centos-6.repo')
libo-# FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') ;
CREATE EXTERNAL TABLE
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 'colum' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
ERROR: could not translate host name "mirrors.aliyun.com", port "80" to address: Temporary failure in name resolution (cdbutil.c:754) (seg0 slice1 sdw1:40000 pid=26261) (cdbdisp.c:1489)
libo=#
libo=#
libo=# SELECT * from ext_expenses;
ERROR: could not translate host name "mirrors.aliyun.com", port "80" to address: Temporary failure in name resolution (cdbutil.c:754) (seg0 slice1 sdw1:40000 pid=26254) (cdbdisp.c:1489)
DROP EXTERNAL TABLE
libo=# CREATE EXTERNAL WEB TABLE ext_expenses (colum text)
libo-# location ('http://115.28.122.210/repo/Centos-6.repo')
libo-# FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') ;
CREATE EXTERNAL TABLE
libo=# select * from ext_expenses;
ERROR: connection with gpfdist failed for http://115.28.122.210/repo/Centos-6.repo. effective url: http://115.28.122.210/repo/Centos-6.repo. (seg0 slice1 sdw1:40000 pid=26296)
libo=#
[1] 10321
[gpadmin@mdw data_tst]$ Serving HTTP on port 8081, directory /home/gpadmin/data_tst
--2014-03-04 14:14:01-- http://192.168.100.101:8081/aaa
Connecting to 192.168.100.101:8081... connected.
HTTP request sent, awaiting response... 200 ok
Length: unspecified [text/plain]
Saving to: “aaa”
[ <=> ] 17 --.-K/s in 0s
2014-03-04 14:14:01 (1.61 MB/s) - “aaa” saved [17]
libo-# location ('http://192.168.100.101:8081/aaa')
libo-# FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') ;
CREATE EXTERNAL TABLE
libo=# select * from ext_expenses;
colum
-------
aaaa
aaa
aa
a
(7 rows)
SELECT 10
libo-# location ('gpfdist://192.168.100.11:8081/aaa.csv')
libo-# format 'csv';
CREATE EXTERNAL TABLE
libo=# select * from t_ext;
ERROR: missing data for column "name" (seg3 slice1 sdw2:40001 pid=10243)
DETAIL: External table t_ext, line 4000 of gpfdist://192.168.100.11:8081/aaa.csv: ""
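Use in combination with the SEGMENT REJECT LIMIT clause.
The reject-limit count parameter can be given as a number of rows (the
default) or, with PERCENT, as a percentage of rows.
To keep the rejected rows for later inspection, use the LOG ERRORS INTO
clause to name an error log table.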
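As an illustration of the two clauses working together, the sketch below redefines an external table so that malformed rows are set aside rather than aborting the whole query. The table name ext_expenses_tolerant, the error table name err_expenses, the file name expenses.csv, and the limit of 10 are all made up for the example. The LOG ERRORS INTO form shown here is the Greenplum 4.x syntax that these notes are based on; later major versions changed how error logging is specified, so check the reference for the version in use.

```sql
-- Hypothetical names throughout; the gpfdist host and port are reused from
-- the session above. Each segment tolerates up to 10 bad rows before failing.
CREATE EXTERNAL TABLE ext_expenses_tolerant (
    name text, date date, amount float4, category text, description text
)
LOCATION ('gpfdist://192.168.100.11:8081/expenses.csv')
FORMAT 'CSV'
LOG ERRORS INTO err_expenses
SEGMENT REJECT LIMIT 10 ROWS;

-- The limit can instead be a percentage of the rows each segment processes:
--   SEGMENT REJECT LIMIT 5 PERCENT

-- After a query against the external table, inspect what was rejected.
SELECT relname, linenum, errmsg, rawdata
FROM err_expenses;
```

With a definition like this, a short or empty line such as the one behind the "missing data for column" error above would be written to the error table instead of stopping the SELECT, as long as the number of bad rows stays under the limit.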
libo-# ;
gp_external_enable_exec
-------------------------
on
(1 row)
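or CR followed by LF (CR+LF, carriage return plus line feed, 0x0D 0x0A) as
the line separator. LF is the standard newline on UNIX and UNIX-like
operating systems. Other operating systems (such as Windows or Mac OS 9)
may use CR or CR+LF. GPDB supports all of these newline sequences as row
separators.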
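When the newline style of a file is known in advance, it can be stated in the format options instead of leaving it to detection. A minimal sketch, assuming a file exported on Windows and served by the gpfdist instance used earlier; the table name ext_windows_export and the file name windows_export.txt are hypothetical:

```sql
-- NEWLINE accepts 'LF', 'CR', or 'CRLF'.
-- If it is omitted, the newline type is taken from the data itself.
CREATE EXTERNAL TABLE ext_windows_export (colum text)
LOCATION ('gpfdist://192.168.100.101:8081/windows_export.txt')
FORMAT 'TEXT' (DELIMITER '|' NEWLINE 'CRLF');
```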
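The default column delimiter for CSV files is a comma (0x2C). However, the
DELIMITER clause of COPY, CREATE EXTERNAL TABLE, or a gpload data-format
definition can specify any other single-character delimiter.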
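For example, the DELIMITER clause might be used as follows. This is a sketch only: the file names expenses.txt and semicolon.csv and the table ext_semicolon are hypothetical, and the COPY form reads a file on the master host, which normally requires superuser privileges.

```sql
-- COPY with a pipe-delimited text file located on the master host.
COPY expenses FROM '/home/gpadmin/data_tst/expenses.txt' WITH DELIMITER '|';

-- External table over a CSV file that uses semicolons instead of commas.
CREATE EXTERNAL TABLE ext_semicolon (colum1 text, colum2 text)
LOCATION ('gpfdist://192.168.100.101:8081/semicolon.csv')
FORMAT 'CSV' (DELIMITER ';');
```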