Here I am using 4.2.2. After downloading, unpack it and copy phoenix-4.2.2-server.jar into the lib directory of every RegionServer, i.e. /usr/lib/hbase/lib
Create a SQL file named us_population.sql with the following content:
CREATE TABLE IF NOT EXISTS us_population ( state CHAR(2) NOT NULL, city VARCHAR NOT NULL, population BIGINT CONSTRAINT my_pk PRIMARY KEY (state, city));
Then create a file us_population.csv:
NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
TX,Houston,2016582
PA,Philadelphia,1463281
AZ,Phoenix,1461575
TX,San Antonio,1256509
CA,San Diego,1255540
TX,Dallas,1213825
CA,San Jose,912332
Then create a file us_population_queries.sql:
SELECT state as "State",count(city) as "City Count",sum(population) as "Population Sum" FROM us_population GROUP BY state ORDER BY sum(population) DESC;
Now run all three together:
phoenix-4.2.2-bin/bin/psql.py host1,host2:2181 us_population.sql us_population.csv us_population_queries.sql
Remember to replace host1 and host2 with your own ZooKeeper addresses.
This single command does three things at once: it creates a table, inserts the data, and queries the result.
[root@host1 ~]# phoenix-4.2.2-bin/bin/psql.py host1,host2:2181 us_population.sql us_population.csv us_population_queries.sql
15/03/04 17:14:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/04 17:14:24 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-phoenix.properties,hadoop-metrics2.properties
no rows upserted
Time: 0.726 sec(s)
csv columns from database.
CSV Upsert complete. 10 rows upserted
Time: 0.103 sec(s)
St City Count Population Sum
-- ---------------------------------------- ----------------------------------------
NY 1 8143197
CA 3 6012701
TX 3 4486916
IL 1 2842518
PA 1 1463281
AZ 1 1461575
Time: 0.048 sec(s)
If you look with the hbase shell, you will find a new US_POPULATION table; use the scan command to inspect its data:
hbase(main):002:0> scan 'US_POPULATION'
ROW COLUMN+CELL
AZPhoenix column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x16MG
AZPhoenix column=0:_0, timestamp=1425460467206, value=
CALos Angeles column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00:\xAA\xDD
CALos Angeles column=0:_0, timestamp=1425460467206, value=
CASan Diego column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x13(t
CASan Diego column=0:_0, timestamp=1425460467206, value=
CASan Jose column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x0D\xEB\xCC
CASan Jose column=0:_0, timestamp=1425460467206, value=
ILChicago column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00+_\x96
ILChicago column=0:_0, timestamp=1425460467206, value=
NYNew York column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00|A]
NYNew York column=0:_0, timestamp=1425460467206, value=
PAPhiladelphia column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x16S\xF1
PAPhiladelphia column=0:_0, timestamp=1425460467206, value=
TXDallas column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x12\x85\x81
TXDallas column=0:_0, timestamp=1425460467206, value=
TXHouston column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x1E\xC5F
TXHouston column=0:_0, timestamp=1425460467206, value=
TXSan Antonio column=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x13,=
TXSan Antonio column=0:_0, timestamp=1425460467206, value=
10 row(s) in 0.2220 seconds
You will notice that:
- The PRIMARY KEY we defined is (state, city), so Phoenix concatenates the state and city values you entered to form the rowkey
- The other fields are stored under their own column names; the default column family is 0
- There is also a 0:_0 column with no value; Phoenix adds this column for performance reasons, and you can ignore it
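To see how those scan values come about: Phoenix builds the rowkey by concatenating the primary-key columns, and serializes a BIGINT as 8 big-endian two's-complement bytes with the sign bit flipped so that byte-wise sorting matches numeric order. A minimal sketch of my own (not Phoenix code) that reproduces the NY row above:

```python
import struct

def rowkey(state: str, city: str) -> bytes:
    """Concatenate the PRIMARY KEY columns, as Phoenix does for the rowkey."""
    return state.encode() + city.encode()

def decode_phoenix_bigint(data: bytes) -> int:
    """Decode Phoenix's BIGINT encoding: flip the sign bit back,
    then read the 8 bytes as a big-endian signed integer."""
    flipped = bytes([data[0] ^ 0x80]) + data[1:]
    (value,) = struct.unpack(">q", flipped)
    return value

# The NY row from the scan: '|' is 0x7C, 'A' is 0x41, ']' is 0x5D
print(rowkey("NY", "New York"))                           # b'NYNew York'
print(decode_phoenix_bigint(b"\x80\x00\x00\x00\x00|A]"))  # 8143197
```

Decoding the AZ value (`\x80\x00\x00\x00\x00\x16MG`) the same way gives back 1461575, matching the CSV.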
Command-line mode
Run sqlline.py:
$ ./sqlline.py localhost
This drops you into an interactive shell:
0: jdbc:phoenix:localhost>
To exit the shell, run !quit:
0: jdbc:phoenix:localhost>!quit
Every command starts with an exclamation mark; help prints the full list of commands:
0: jdbc:phoenix:localhost> help
!all Execute the specified SQL against all the current connections
!autocommit Set autocommit mode on or off
!batch Start or execute a batch of statements
!brief Set verbose mode off
!call Execute a callable statement
!close Close the current connection to the database
!closeall Close all current open connections
!columns List all the columns for the specified table
!commit Commit the current transaction (if autocommit is off)
!connect Open a new connection to the database.
!dbinfo Give metadata information about the database
!describe Describe a table
!dropall Drop all tables in the current database
!exportedkeys List all the exported keys for the specified table
!go Select the current connection
!help Print a summary of command usage
!history Display the command history
!importedkeys List all the imported keys for the specified table
!indexes List all the indexes for the specified table
!isolation Set the transaction isolation for this connection
!list List the current connections
!manual Display the SQLLine manual
!metadata Obtain metadata information
!nativesql Show the native SQL for the specified statement
!outputformat Set the output format for displaying results
(table,vertical,csv,tsv,xmlattrs,xmlelements)
!primarykeys List all the primary keys for the specified table
!procedures List all the procedures
!properties Connect to the database specified in the properties file(s)
!quit Exits the program
!reconnect Reconnect to the database
!record Record all output to the specified file
!rehash Fetch table and column names for command completion
!rollback Roll back the current transaction (if autocommit is off)
!run Run a script from the specified file
!save Save the current variabes and aliases
!scan Scan for installed JDBC drivers
!script Start saving a script to a file
!set Set a sqlline variable
!sql Execute a SQL command
!tables List all the tables in the database
!typeinfo Display the type map for the current connection
!verbose Set verbose mode on
Creating a mapping table for employee
Preparing the data
Now let's create a mapping table for an HBase table I created earlier, called employee:
hbase(main):003:0> describe 'employee'
DESCRIPTION ENABLED
'employee', {NAME => 'company', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => true
'1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', I
N_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'family', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLIC
ATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => '
false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.1120 seconds
You can see that employee has two column families, company and family.
hbase(main):016:0> scan 'employee'
ROW COLUMN+CELL
row1 column=company:name, timestamp=1425537923391, value=ted
row1 column=company:position, timestamp=1425537950471, value=worker
row1 column=family:tel, timestamp=1425537956413, value=13600912345
row2 column=family:tel, timestamp=1425537994087, value=18942245698
row2 column=family:name, timestamp=1425537975610, value=michael
row2 column=family:position, timestamp=1425537985594, value=manager
2 row(s) in 0.0340 seconds
There are two rows of data. If you don't have this data, you can create it with the following commands:
create 'employee','company','family'
put 'employee','row1','company:name','ted'
put 'employee','row1','company:position','worker'
put 'employee','row1','family:tel','13600912345'
put 'employee','row2','company:name','michael'
put 'employee','row2','company:position','manager'
put 'employee','row2','family:tel','1894225698'
scan 'employee'
About mapping tables
Before creating the mapping table, note that Phoenix is case-sensitive and folds identifiers to uppercase by default: if you don't wrap a table name in double quotes, it will be created in uppercase no matter what case you typed. If you need a table or column name that contains lowercase (or mixed-case) characters, wrap it in double quotes.
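The quoting rule can be sketched as follows (a toy normalizer of my own, just to illustrate the behavior, not Phoenix's actual parser):

```python
def normalize_identifier(name: str) -> str:
    """Mimic Phoenix's rule: unquoted identifiers are folded to
    uppercase; double-quoted identifiers keep their exact case."""
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        return name[1:-1]  # quoted: strip the quotes, keep the case
    return name.upper()    # unquoted: uppercase

print(normalize_identifier("employee"))    # EMPLOYEE
print(normalize_identifier('"employee"'))  # employee
```

This is why the mapping table below quotes "employee" and every column name: the underlying HBase table and qualifiers are lowercase.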
You can create either a read-write table or a read-only table; the difference is:
- Read-write table: if a column family you define doesn't exist, it is created automatically and given empty values
- Read-only table: the column families you define must already exist
Creating the mapping
0: jdbc:phoenix:localhost> CREATE TABLE IF NOT EXISTS "employee" ("no" CHAR(4) NOT NULL PRIMARY KEY, "company"."name" VARCHAR(30),"company"."position" VARCHAR(20), "family"."tel" CHAR(11), "family"."age" INTEGER);
2 rows affected (1.745 seconds)
A few things to note about this statement:
- IF NOT EXISTS guarantees that if the table already exists, its configuration will not be overwritten
- The rowkey field is marked with PRIMARY KEY
- Columns in a family are written as columnFamily.columnName
- family.age is a new field. I didn't create it when preparing the test data because the hbase shell has no way to write a numeric value directly; in a moment, when I insert data with the UPSERT command, you will see how a real numeric value is displayed in HBase
Once it's created, query the data:
0: jdbc:phoenix:localhost> SELECT * FROM "employee";
+------+--------------------------------+----------------------+-------------+------------------------------------------+
| no | name | position | tel | age |
+------+--------------------------------+----------------------+-------------+------------------------------------------+
| row1 | ted | worker | 13600912345 | null |
| row2 | michael | manager | 1894225698 | null |
+------+--------------------------------+----------------------+-------------+------------------------------------------+
Inserting/updating data
In Phoenix, inserting and updating are a single command called UPSERT, meaning update + insert.
Let's try inserting a row:
UPSERT INTO "employee" VALUES ('row3','billy','worker','16974681345',33);
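The update + insert semantics can be sketched with a dict keyed by rowkey (my own illustration, with hypothetical column values; not how Phoenix is implemented):

```python
# rowkey -> row dict, standing in for the mapped HBase table
table = {}

def upsert(no, name, position, tel, age=None):
    """Insert the row if the key 'no' is new; otherwise overwrite it.
    There is no separate INSERT vs UPDATE, exactly like Phoenix's UPSERT."""
    table[no] = {"name": name, "position": position, "tel": tel, "age": age}

upsert("row3", "billy", "worker", "16974681345", 33)   # key is new: insert
upsert("row3", "billy", "manager", "16974681345", 33)  # same key: update
print(len(table))                 # 1
print(table["row3"]["position"])  # manager
```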
Query the data again:
0: jdbc:phoenix:localhost> SELECT * FROM "employee";
+------+--------------------------------+----------------------+-------------+------------------------------------------+
| no | name | position | tel | age |
+------+--------------------------------+----------------------+-------------+------------------------------------------+
| row1 | ted | worker | 13600912345 | null |
| row2 | michael | manager | 1894225698 | null |
| row3 | billy | worker | 16974681345 | 33 |
+------+--------------------------------+----------------------+-------------+------------------------------------------+
3 rows selected (0.195 seconds)
Now look at the data from the HBase side:
hbase(main):054:0> scan 'employee'
ROW COLUMN+CELL
row1 column=company:_0, timestamp=1425543735420, value=
row1 column=company:name, timestamp=1425543735274, value=ted
row1 column=company:position, timestamp=1425543735323, value=worker
row1 column=family:tel, timestamp=1425543735420, value=13600912345
row2 column=company:_0, timestamp=1425543735767, value=
row2 column=company:name, timestamp=1425543735608, value=michael
row2 column=company:position, timestamp=1425543735720, value=manager
row2 column=family:tel, timestamp=1425543735767, value=1894225698
row3 column=company:_0, timestamp=1425543857594, value=
row3 column=company:name, timestamp=1425543857594, value=billy
row3 column=company:position, timestamp=1425543857594, value=worker
row3 column=family:age, timestamp=1425543857594, value=\x80\x00\x00!
row3 column=family:tel, timestamp=1425543857594, value=16974681345
3 row(s) in 0.0650 seconds
That trailing \x80\x00\x00! is the serialized byte form that a number takes when stored in HBase.
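You can check that encoding yourself: a Phoenix INTEGER is 4 big-endian two's-complement bytes with the sign bit flipped, and 33 is 0x21, which is the ASCII code for '!' (again a sketch of the encoding of my own, not Phoenix code):

```python
import struct

def encode_phoenix_int(value: int) -> bytes:
    """Encode an INTEGER the way Phoenix stores it: 4 big-endian
    two's-complement bytes with the sign bit flipped."""
    raw = struct.pack(">i", value)
    return bytes([raw[0] ^ 0x80]) + raw[1:]

print(encode_phoenix_int(33))  # b'\x80\x00\x00!'  (0x21 displays as '!')
```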
The GUI installation method is described at http://phoenix.apache.org/installation.html. I won't cover it here, since I never got it working myself, and the interface is honestly so ugly that I couldn't bring myself to use it.