Administration Manual

21 documents

Apache Hive : AdminManual

Dec 12, 2024

Hive Administrator’s Manual: Installing Hive; Configuring Hive; Setting up Metastore; Setting up Hive Server (JDBC, ODBC, Thrift, etc.); Hive on Amazon Web Services

Apache Hive : AdminManual Configuration

Dec 12, 2024

Contents: Configuring Hive; hive-site.xml and hive-default.xml.template; Temporary Folders; Log Files; Derby Server Mode; Configuration Variables; Removing Hive Metastore Password from Hive Configuration; Configuring HCatalog and WebHCat; HCatalog; WebHCat

Configuring Hive: A number of configuration variables in Hive can be used by the administrator to change the behavior for their installations and user sessions.

Apache Hive : AdminManual Installation

Dec 12, 2024

Contents: Installing Hive; Installing from a Tarball; Installing from Source Code (Hive 1.2.0 and Later); Installing from Source Code (Hive 0.13.0 and Later); Installing from Source Code (Hive 0.12.0 and Earlier); Next Steps; Hive CLI and Beeline CLI; Hive Metastore; HCatalog and WebHCat; HCatalog; WebHCat (Templeton)

Installing Hive: You can install a stable release of Hive by downloading and unpacking a tarball, or you can download the source code and build Hive using Maven (release 0.

Apache Hive : AdminManual Metastore 3.0 Administration

Dec 12, 2024

Contents: Version Note; Introduction; Changes From Hive 2 to Hive 3; General Configuration; RDBMS; Option 1: Embedding Derby; Option 2: External RDBMS; Installing and Upgrading the Metastore Schema; Running the Metastore; Embedded Mode; Metastore Server; Running the Metastore Without Hive; Performance Optimizations; CachedStore; Less Commonly Changed Configuration Parameters

Version Note: This document applies only to the Metastore in Hive 3.

Apache Hive : AdminManual Metastore Administration

Dec 12, 2024

This page documents only the Metastore in Hive 2.x and earlier. For 3.x and later releases, please see AdminManual Metastore 3.0 Administration.

Contents: Introduction; Local/Embedded Metastore Database (Derby); Remote Metastore Database; Local/Embedded Metastore Server; Remote Metastore Server; Supported Backend Databases for Metastore; Metastore Schema Consistency and Upgrades

Introduction: All the metadata for Hive tables and partitions is accessed through the Hive Metastore.

Apache Hive : AdminManual SettingUpHiveServer

Dec 12, 2024

Setting Up Hive Server: Setting Up HiveServer2; Setting Up Thrift Hive Server; Setting Up Hive JDBC Server; Setting Up Hive ODBC Server

Apache Hive : Copy of Hive Schema Tool - [TODO: move it under a 4.0 admin manual page, find a proper name]

Dec 12, 2024

Contents: About; Metastore Schema Verification; The Hive Schema Tool; The schematool Command; Usage Examples

About: The schema tool helps to initialise and upgrade the metastore database and the Hive sys schema.

Apache Hive : Hive MetaTool

Dec 12, 2024

Introduced in Hive 0.10.0. See HIVE-3056 and HIVE-3443.

The Hive MetaTool enables administrators to do bulk updates on the location fields in database, table, and partition records in the metastore. It provides the following functionality: Ability to search and replace the HDFS NN (NameNode) location in metastore records that reference the NN. One use is to transition a Hive deployment to HDFS HA NN (HDFS High Availability NameNode).
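The search-and-replace of the NameNode location described above could be driven by a small helper like the one below. This is a minimal sketch, not the tool's documented workflow: the function name `metatool_update_cmd` and the example URIs are hypothetical, and the helper only prints a `hive --service metatool -updateLocation` command (optionally with `-dryRun`) so that it can be reviewed before anything touches the metastore.

```shell
#!/bin/sh
# Sketch only: metatool_update_cmd is an illustrative helper, not part of Hive.
# It prints a MetaTool command line instead of running it.

# Usage: metatool_update_cmd NEW_LOCATION OLD_LOCATION [dry]
#   NEW_LOCATION  URI that should replace the old filesystem root
#   OLD_LOCATION  URI currently stored in the metastore records
#   dry           optional; appends -dryRun so changes are shown, not persisted
metatool_update_cmd() {
  cmd="hive --service metatool -updateLocation $1 $2"
  if [ "${3:-}" = "dry" ]; then
    cmd="$cmd -dryRun"
  fi
  printf '%s\n' "$cmd"
}
```

For example, `metatool_update_cmd hdfs://nameservice1 hdfs://old-nn.example.com:8020 dry` prints a dry-run command for moving records from a single NameNode URI to an HA nameservice; both URIs here are placeholders.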

Apache Hive : Hive on Spark: Getting Started

Dec 12, 2024

Hive on Spark provides Hive with the ability to utilize Apache Spark as its execution engine.

    set hive.execution.engine=spark;

Hive on Spark was added in HIVE-7292.

Contents: Version Compatibility; Spark Installation; Configuring YARN; Configuring Hive; Configuration property details; Configuring Spark; Tuning Details; Common Issues (Green are resolved, will be removed from this list); Recommended Configuration; Design documents

Version Compatibility: Hive on Spark is only tested with a specific version of Spark, so a given version of Hive is only guaranteed to work with a specific version of Spark.

Apache Hive : Hive Schema Tool

Dec 12, 2024

Contents: Metastore Schema Verification; The Hive Schema Tool; The schematool Command; Usage Examples

Metastore Schema Verification (Version: introduced in Hive 0.12.0. See HIVE-3764.): Hive now records the schema version in the metastore database and verifies that the metastore schema version is compatible with the Hive binaries that are going to access the metastore.

Apache Hive : HiveAmazonElasticMapReduce

Dec 12, 2024

Amazon Elastic MapReduce and Hive: Amazon Elastic MapReduce is a web service that makes it easy to launch managed, resizable Hadoop clusters on the web-scale infrastructure of Amazon Web Services (AWS). Elastic MapReduce makes it easy for you to launch a Hive and Hadoop cluster, gives you the flexibility to choose different cluster sizes, and allows you to tear them down automatically when processing has completed.

Apache Hive : HiveAws

Dec 12, 2024

Hive and Amazon Web Services

Background: This document explores the different ways of leveraging Hive on Amazon Web Services, namely S3, EC2 and Elastic Map-Reduce. Hadoop already has a long tradition of being run on EC2 and S3. These are well documented in the links below, which are a must-read: Hadoop and S3; Amazon and EC2. The second document also has pointers on how to get started using EC2 and S3.

Apache Hive : HiveDerbyServerMode

Dec 12, 2024

Apache Hive : Using Derby in Server Mode

Hive in embedded mode has a limitation of one active user at a time. You may want to run Derby as a Network Server so that multiple users can access it simultaneously from different systems. See Metadata Store and Embedded Metastore for more information.

Contents: Download Derby; Set Environment; Starting Derby; Configure Hive to Use Network Derby; Copy Derby Jar Files; Start Up Hive; The Result

Download Derby: It is suggested you download the version of Derby that ships with Hive.

Apache Hive : HiveJDBCInterface

Dec 12, 2024

Apache Hive : JDBC Driver

The current JDBC interface for Hive only supports running queries and fetching results. Only a small subset of the metadata calls are supported. To see how the JDBC interface can be used, see sample code.

Contents: Integration with Pentaho; Integration with SQuirrel SQL Client

Integration with Pentaho: Download Pentaho Report Designer from the Pentaho website.

Apache Hive : HiveODBC

Dec 12, 2024

Apache Hive : ODBC Driver

These instructions are for the Hive ODBC driver available in Hive for HiveServer1. There is no ODBC driver available for HiveServer2 as part of Apache Hive. There are third-party ODBC drivers available from different vendors, and most of them seem to be free.

Contents: Introduction; Suggested Reading; Software Requirements; Driver Architecture; Building and Setting Up ODBC Components; Hive Client Build/Setup; unixODBC API Wrapper Build/Setup; Connecting the Driver to a Driver Manager; Testing with ISQL; Build libodbchive.

Apache Hive : HiveServer

Dec 12, 2024

Thrift Hive Server: HiveServer is an optional service that allows a remote client to submit requests to Hive, using a variety of programming languages, and retrieve results. HiveServer is built on Apache Thrift™ (http://thrift.apache.org/), so it is sometimes called the Thrift server, although this can lead to confusion because a newer service named HiveServer2 is also built on Thrift. Since the introduction of HiveServer2, HiveServer has also been called HiveServer1.

Apache Hive : Manual Installation

Dec 12, 2024

Contents: Installing, configuring and running Hive; Prerequisites; Install the prerequisites; Java 8; Maven; Protobuf; Hadoop; Tez; Extra hadoop configurations to make everything working; Installing Hive from a Tarball; Installing from Source Code; Installing with old version hadoop (greater than or equal 3.1.0); Next Steps; Beeline CLI; Hive Metastore; HCatalog and WebHCat; HCatalog; WebHCat (Templeton)

Installing, configuring and running Hive: You can install a stable release of Hive by downloading and unpacking a tarball, or you can download the source code and build Hive using Maven (release 3.

Apache Hive : Replication

Dec 12, 2024

Contents: Overview; Potential Uses; Prerequisites; Limitations; Configuration; Typical Mode of Operation; Replication to AWS/EMR/S3

Overview: Hive Replication builds on the metastore event and ExIm features to provide a framework for replicating Hive metadata and data changes between clusters. There is no requirement for the source cluster and replica to run the same Hadoop distribution, Hive version, or metastore RDBMS.

Apache Hive : Setting Up Hive with Docker

Dec 12, 2024

Introduction: Run Apache Hive inside a Docker container in pseudo-distributed mode.

STEP 1: Pull the image. Pull the 4.0.0 image from Hive DockerHub:

    docker pull apache/hive:4.0.0

STEP 2: Export the Hive version:

    export HIVE_VERSION=4.0.0

STEP 3: Launch HiveServer2 with an embedded Metastore. This is lightweight and suited to a quick setup; it uses Derby as the metastore database:

    docker run -d -p 10000:10000 -p 10002:10002 --env SERVICE_NAME=hiveserver2 --name hive4 apache/hive:${HIVE_VERSION}

STEP 4: Connect to beeline. The container started in STEP 3 is named hive4, so the exec command targets that container and connects to HiveServer2 on localhost inside it:

    docker exec -it hive4 beeline -u 'jdbc:hive2://localhost:10000/'

Note: Launch Standalone Metastore To use standalone Metastore with Derby,
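The launch and connect steps above can be tied together in a short script. The following is a sketch under stated assumptions rather than anything shipped with Hive: `HIVE_CONTAINER`, `hs2_jdbc_url`, `run`, `start_hs2`, `connect_beeline` and the `DRY_RUN` switch are illustrative names. The container name passed to `--name` is reused for `docker exec`, and beeline is pointed at `localhost` because it runs inside the HiveServer2 container itself.

```shell
#!/bin/sh
# Sketch only: these helpers wrap the docker commands from the steps above.
# None of the names below come from Hive or from the apache/hive image.

HIVE_VERSION="${HIVE_VERSION:-4.0.0}"
HIVE_CONTAINER="${HIVE_CONTAINER:-hive4}"

# Print the JDBC URL for a HiveServer2 endpoint (defaults: localhost, 10000).
hs2_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/\n' "${1:-localhost}" "${2:-10000}"
}

# Run a command, or only print it when DRY_RUN=1 (useful for review).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Start HiveServer2 with the embedded, Derby-backed metastore.
start_hs2() {
  run docker run -d -p 10000:10000 -p 10002:10002 \
    --env SERVICE_NAME=hiveserver2 \
    --name "$HIVE_CONTAINER" "apache/hive:${HIVE_VERSION}"
}

# Open beeline inside the same container that start_hs2 created.
connect_beeline() {
  run docker exec -it "$HIVE_CONTAINER" beeline -u "$(hs2_jdbc_url)"
}
```

After sourcing the script, calling `start_hs2` and then `connect_beeline` runs the two steps; setting `DRY_RUN=1` first prints the docker commands without executing them.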

Apache Hive : Setting Up HiveServer2

Dec 12, 2024

HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results (a more detailed intro here). The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC. The Thrift interface definition language (IDL) for HiveServer2 is available at https://github.

Apache Hive : User and Group Filter Support with LDAP Atn Provider in HiveServer2

Dec 12, 2024

Contents: User and Group Filter Support with LDAP; Group Membership; User Search List; Custom Query String; Order of Precedence

User and Group Filter Support with LDAP: Starting in Hive 1.3.0, HIVE-7193 adds support in HiveServer2 for