测试Hive集成Sentry

2015.04.30 | Comments

本文在安装和配置Sentry基础之上测试Hive集成Sentry。注意:这里Hive中并没有配置Kerberos认证

关于配置了Kerberos的Hive集群如何集成Sentry,请参考配置安全的Hive集群集成Sentry

1. 配置Sentry

安装和配置Sentry

2. 配置Hive

Hive Metastore集成Sentry

需要在 /etc/hive/conf/hive-site.xml中添加:

    <property>
      <name>hive.metastore.pre.event.listeners</name>
      <value>org.apache.sentry.binding.metastore.MetastoreAuthzBinding</value>
    </property>
    <property>
      <name>hive.metastore.event.listeners</name>
      <value>org.apache.sentry.binding.metastore.SentryMetastorePostEventListener</value>
    </property>

Hive-server2集成Sentry

修改 /etc/hive/conf/hive-site.xml,添加以下内容:

    <property>
      <name>hive.server2.enable.impersonation</name>
      <value>true</value>
    </property>
    <property>
        <name>hive.security.authorization.task.factory</name>
        <value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
    </property>
    <property>
        <name>hive.server2.session.hook</name>
        <value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
    </property>
    <property>
        <name>hive.sentry.conf.url</name>
        <value>file:///etc/hive/conf/sentry-site.xml</value>
    </property>

参考模板sentry-site.xml.hive-client.template在 /etc/hive/conf/ 目录创建 sentry-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
       <name>sentry.service.client.server.rpc-port</name>
       <value>8038</value>
    </property>
    <property>
       <name>sentry.service.client.server.rpc-address</name>
       <value>cdh1</value>
    </property>
    <property>
       <name>sentry.service.client.server.rpc-connection-timeout</name>
       <value>200000</value>
    </property>

    <!--以下是客户端配置-->
    <property>
        <name>sentry.provider</name>
        <value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
    </property>
    <property>
        <name>sentry.hive.provider.backend</name>
        <value>org.apache.sentry.provider.db.SimpleDBProviderBackend</value>
    </property>
    <property>
        <name>sentry.metastore.service.users</name>
        <value>hive</value><!--queries made by hive user (beeline) skip meta store check-->
    </property>
      <property>
        <name>sentry.hive.server</name>
        <value>server1</value>
      </property>
     <property>
        <name>sentry.hive.testing.mode</name>
        <value>true</value>
     </property>
</configuration>

3. 重启Hive

在cdh1上启动或重启hiveserver2:

$ /etc/init.d/hive-server2 restart

4. 准备测试数据

参考 Securing Impala for analysts,准备测试数据:

$ cat /tmp/events.csv
10.1.2.3,US,android,createNote
10.200.88.99,FR,windows,updateNote
10.1.2.3,US,android,updateNote
10.200.88.77,FR,ios,createNote
10.1.4.5,US,windows,updateTag

然后,在hive中运行下面 sql 语句:

create database sensitive;

create table sensitive.events (
    ip STRING, country STRING, client STRING, action STRING
  ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

load data local inpath '/tmp/events.csv' overwrite into table sensitive.events;
create database filtered;
create view filtered.events as select country, client, action from sensitive.events;
create view filtered.events_usonly as select * from filtered.events where country = 'US';

在 cdh1上通过 beeline 连接 hiveserver2,运行下面命令创建角色和组:

# 注意:我设置了hiveserver2的jdbc端口为10001
$  beeline -u "jdbc:hive2://cdh1:10001/" -n hive -p hive -d org.apache.hive.jdbc.HiveDriver

执行下面的 sql 语句创建 role、group等:

create role admin_role;
GRANT ALL ON SERVER server1 TO ROLE admin_role;
GRANT ROLE admin_role TO GROUP admin;
GRANT ROLE admin_role TO GROUP hive;

create role test_role;
GRANT ALL ON DATABASE filtered TO ROLE test_role;
GRANT ROLE test_role TO GROUP test;

上面创建了两个角色:

  • admin_role,具有管理员权限,可以读写所有数据库,并授权给 admin 和 hive 组(对应操作系统上的组)
  • test_role,只能读写 filtered 数据库,并授权给 test 组。

因为系统上没有test用户和组,所以需要手动创建:

$ useradd test

5. 测试

测试admin_role角色

使用hive用户访问beeline:

$ beeline -u "jdbc:hive2://cdh1:10000/" -n hive -p hive -d org.apache.hive.jdbc.HiveDriver

查看当前系统用户是谁:

0: jdbc:hive2://cdh1:10000/> set system:user.name;
+------------------------+--+
|          set           |
+------------------------+--+
| system:user.name=hive  |
+------------------------+--+
1 row selected (0.188 seconds)

hive属于admin_role组,具有管理员权限,可以查看所有角色:

0: jdbc:hive2://cdh1:10000/> show roles;
+-------------+--+
|    role     |
+-------------+--+
| test_role   |
| admin_role  |
+-------------+--+
2 rows selected (0.199 seconds)

查看所有权限:

0: jdbc:hive2://cdh1:10000/> SHOW GRANT ROLE test_role;
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  | grant_time | 
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| filtered   |        |            |         | test_role       | ROLE            | *          | false       | 1430293474047000  | 
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+

0: jdbc:hive2://cdh1:10000/> SHOW GRANT ROLE admin_role;
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  | grant_time  | 
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| *         |        |            |         | admin_role      | ROLE            | *          | false       | 1430293473308000  
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
1 row selected (0.16 seconds)

hive用户可以查看所有数据库、访问所有表:

0: jdbc:hive2://cdh1:10000/> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| filtered       |
| sensitive      |
+----------------+--+
3 rows selected (0.391 seconds)
0: jdbc:hive2://cdh1:10000/> use filtered;
No rows affected (0.101 seconds)
0: jdbc:hive2://cdh1:10000/> select * from filtered.events;
+-----------------+----------------+----------------+--+
| events.country  | events.client  | events.action  |
+-----------------+----------------+----------------+--+
| US              | android        | createNote     |
| FR              | windows        | updateNote     |
| US              | android        | updateNote     |
| FR              | ios            | createNote     |
| US              | windows        | updateTag      |
+-----------------+----------------+----------------+--+
5 rows selected (0.431 seconds)
0: jdbc:hive2://cdh1:10000/> select * from sensitive.events;
+---------------+-----------------+----------------+----------------+--+
|   events.ip   | events.country  | events.client  | events.action  |
+---------------+-----------------+----------------+----------------+--+
| 10.1.2.3      | US              | android        | createNote     |
| 10.200.88.99  | FR              | windows        | updateNote     |
| 10.1.2.3      | US              | android        | updateNote     |
| 10.200.88.77  | FR              | ios            | createNote     |
| 10.1.4.5      | US              | windows        | updateTag      |
+---------------+-----------------+----------------+----------------+--+
5 rows selected (0.247 seconds)

测试test_role角色

使用test用户访问beeline:

$ beeline -u "jdbc:hive2://cdh1:10000/" -n test -p test -d org.apache.hive.jdbc.HiveDriver

查看当前系统用户是谁:

0: jdbc:hive2://cdh1:10000/> set system:user.name;
+------------------------+--+
|          set           |
+------------------------+--+
| system:user.name=hive  |
+------------------------+--+
1 row selected (0.188 seconds)

test用户不是管理员,是不能查看所有角色的:

0: jdbc:hive2://cdh1:10000/> show roles;
ERROR : Error processing Sentry command: Access denied to test. Server Stacktrace: org.apache.sentry.provider.db.SentryAccessDeniedException: Access denied to test

test用户可以列出所有数据库:

0: jdbc:hive2://cdh1:10000/> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| filtered       |
| sensitive      |
+----------------+--+
3 rows selected (0.079 seconds)

test用户可以filtered库:

0: jdbc:hive2://cdh1:10000/> use filtered;
No rows affected (0.206 seconds)
0: jdbc:hive2://cdh1:10000/> select * from events;
+-----------------+----------------+----------------+--+
| events.country  | events.client  | events.action  |
+-----------------+----------------+----------------+--+
| US              | android        | createNote     |
| FR              | windows        | updateNote     |
| US              | android        | updateNote     |
| FR              | ios            | createNote     |
| US              | windows        | updateTag      |
+-----------------+----------------+----------------+--+
5 rows selected (0.361 seconds)

但是,test用户没有权限访问sensitive库:

0: jdbc:hive2://cdh1:10000/> use sensitive;
Error: Error while compiling statement: FAILED: SemanticException No valid privileges
 Required privileges for this query: Server=server1->Db=sensitive->Table=*->action=insert;Server=server1->Db=sensitive->Table=*->action=select; (state=42000,code=40000)

6. 排错

在CDH5的高版本中,hive cli 不建议使用,在hive集成sentry之后,再运行hive cli 会提示找不到sentry的类的遗产,解决办法是,将sentry相关的jar包链接到hive的home目录下的lib目录下:

ln -s /usr/lib/sentry/lib/sentry-binding-hive.jar /usr/lib/hive/lib/
ln -s /usr/lib/sentry/lib/sentry-core-common.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-core-common-db.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-policy-common.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-policy-db.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-policy-cache.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-common.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-db.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-cache.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-file.jar /usr/lib/hive/lib

7. 参考文章


原创文章,转载请注明: 转载自JavaChen Blog,作者:JavaChen
本文链接地址:http://blog.javachen.com/2015/04/30/test-hive-with-sentry.html
本文基于署名2.5中国大陆许可协议发布,欢迎转载、演绎或用于商业目的,但是必须保留本文署名和文章链接。 如您有任何疑问或者授权方面的协商,请邮件联系我。