Language Manual
Apache Hive : Apache Hive SQL Conformance
Dec 12, 2024
Apache Hive : Apache Hive SQL Conformance This page documents which parts of the SQL standard are supported by Apache Hive. The information here is not a full statement of conformance but provides users detail sufficient to generally understand Hive’s SQL conformance.
This information is versioned by Hive release version, allowing a user to quickly identify features available to them.
The formal name of the current SQL standard is ISO/IEC 9075 “Database Language SQL”.
Apache Hive : CAST…FORMAT with SQL:2016 datetime formats Usage CAST(<timestamp/date> AS <varchar/char/string> [FORMAT <template>]) CAST(<varchar/char/string> AS <timestamp/date> [FORMAT <template>]) Example select cast(dt as string format 'DD-MM-YYYY') select cast('01-05-2017' as date format 'DD-MM-YYYY') Template elements, a.k.a. Tokens, a.k.a Patterns a.k.a SQL:2016 Datetime Formats Notes For all tokens:
Patterns are case-insensitive, except AM/PM and T/Z. See these sections for more details. For string to datetime conversion, no duplicate format tokens are allowed, including tokens
Apache Hive : Common Table Expression
Dec 12, 2024
Apache Hive : Common Table Expression A Common Table Expression (CTE) is a temporary result set derived from a simple query specified in a WITH clause, which immediately precedes a SELECT or INSERT keyword. The CTE is defined only within the execution scope of a single statement. One or more CTEs can be used in a Hive SELECT, INSERT, CREATE TABLE AS SELECT, or CREATE VIEW AS SELECT statement.
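As a minimal sketch of the WITH clause described above, the following assumes a hypothetical table src(key STRING, value STRING); the table and CTE names are illustrative only.

```sql
-- CTE feeding a plain SELECT; q1 exists only for this one statement.
WITH q1 AS (
  SELECT key, value FROM src WHERE key = '5'
)
SELECT * FROM q1;

-- The same idea inside CREATE TABLE AS SELECT (s2 is a hypothetical target).
CREATE TABLE s2 AS
WITH q1 AS (
  SELECT key FROM src WHERE key = '4'
)
SELECT * FROM q1;
```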
Version
Apache Hive : Compaction pooling
Dec 12, 2024
Apache Hive : Compaction pooling Concept: Compaction requests and workers can be assigned to pools. A worker assigned to a specific pool will only process compaction requests in that pool. Workers and compaction requests without a pool assignment implicitly belong to the default pool. The pooling concept allows fine-tuning of how compaction requests are processed. For example, it is possible to create a pool named ‘high priority compaction’, assign some frequently modified tables to it, and dedicate a set of workers to this pool.
Apache Hive : Datasketches Integration
Dec 12, 2024
Apache Hive : Datasketches Integration Apache DataSketches (https://datasketches.apache.org/) is integrated into Hive via HIVE-22939.
This enables various kinds of sketch operations through regular SQL statements.
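A hedged sketch of what such a statement can look like, assuming the HLL functions are registered as ds_hll_sketch and ds_hll_estimate (check SHOW FUNCTIONS on your installation) and a hypothetical table sketch_input(id INT, category STRING):

```sql
-- Approximate distinct count per category using an HLL sketch.
-- ds_hll_sketch builds the sketch; ds_hll_estimate reads the estimate from it.
SELECT category,
       ds_hll_estimate(ds_hll_sketch(id)) AS approx_distinct_ids
FROM sketch_input
GROUP BY category;
```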
Apache Hive : Datasketches Integration Sketch functions Naming convention sketchType functionName List declared sketch functions Integration with materialized views BI mode Rewrite COUNT(DISTINCT(X)) Rewrite percentile_disc(p) within group(order by x) Rewrite cume_dist() over (order by id) Rewrite NTILE Rewrite RANK Examples Simple distinct counting examples using HLL Use HLL to compute distinct values using an intermediate table Use HLL to compute distinct values without intermediate table Use HLL to compute distinct values transparently thru BI mode Use HLL to compute distinct values transparently thru BI mode - while utilizing a Materialized View to store the intermediate sketches.
Apache Hive : Enhanced Aggregation, Cube, Grouping and Rollup This document describes enhanced aggregation features for the GROUP BY clause of SELECT statements.
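For illustration, here is a small sketch of these GROUP BY extensions, assuming a hypothetical table tab1(a, b, c):

```sql
-- One pass that aggregates at four levels: (a, b), (a), (b), and the grand total.
SELECT a, b, SUM(c)
FROM tab1
GROUP BY a, b GROUPING SETS ((a, b), a, b, ());

-- WITH CUBE produces the same set of groupings for two columns.
SELECT a, b, SUM(c)
FROM tab1
GROUP BY a, b WITH CUBE;
```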
Apache Hive : Enhanced Aggregation, Cube, Grouping and Rollup GROUPING SETS clause Grouping__ID function Grouping function Cubes and Rollups hive.new.job.grouping.set.cardinality Grouping__ID function (before Hive 2.3.0) Comments: Version
Grouping sets, CUBE and ROLLUP operators, and the GROUPING__ID function were added in Hive 0.
Apache Hive : Exchange Partition
Dec 12, 2024
Apache Hive : Exchange Partition The EXCHANGE PARTITION command will move a partition from a source table to target table and alter each table’s metadata. The Exchange Partition feature is implemented as part of HIVE-4095. Exchanging multiple partitions is supported in Hive versions 1.2.2, 1.3.0, and 2.0.0+ as part of HIVE-11745.
When the command is executed, the source table’s partition folder in HDFS will be renamed to move it to the destination table’s partition folder.
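A minimal sketch of the command, using two hypothetical tables T1 (source) and T2 (destination) with identical schemas and partition keys:

```sql
CREATE TABLE T1 (a STRING, b STRING) PARTITIONED BY (ds STRING);
CREATE TABLE T2 (a STRING, b STRING) PARTITIONED BY (ds STRING);
ALTER TABLE T1 ADD PARTITION (ds = '1');

-- Move partition ds='1' from T1 into T2.
-- The destination table is named first, the source table after WITH TABLE.
ALTER TABLE T2 EXCHANGE PARTITION (ds = '1') WITH TABLE T1;
```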
Apache Hive : GenericUDAFCaseStudy
Dec 12, 2024
Apache Hive : Tutorial to write a GenericUDAF User-Defined Aggregation Functions (UDAFs) are an excellent way to integrate advanced data-processing into Hive. Hive allows two varieties of UDAFs: simple and generic. Simple UDAFs, as the name implies, are rather simple to write, but incur performance penalties because of the use of Java Reflection, and do not allow features such as variable-length argument lists. Generic UDAFs allow all these features, but are perhaps not quite as intuitive to write as Simple UDAFs.
Apache Hive : Hive Operators
Dec 12, 2024
Apache Hive : Hive Operators Operators Precedences Example Operators Description A[B] , A.identifier bracket_op([]), dot(.) element selector, dot -A unary(+), unary(-), unary(~) unary prefix operators A IS [NOT] (NULL TRUE FALSE) A ^ B bitwise xor(^) bitwise xor A * B star(*), divide(/), mod(%), div(DIV) multiplicative operators A + B plus(+), minus(-) additive operators A B A & B bitwise and(&) bitwise and A B bitwise or( Relational Operators The following operators compare the passed operands and generate a TRUE or FALSE value depending on whether the comparison between the operands holds.
Apache Hive : Hive UDFs
Dec 12, 2024
Apache Hive : Hive UDFs Hive User-Defined Functions (UDFs) are custom functions developed in Java and seamlessly integrated with Apache Hive. UDFs are routines designed to accept parameters, execute a specific action, and return the resulting value. The return value can either be a single scalar row or a complete result set, depending on the UDF’s code and the implemented interface. UDFs represent a powerful capability that enhances classical SQL functionality by allowing the integration of custom code, providing Hive users with a versatile toolset.
Apache Hive : HivePlugins
Dec 12, 2024
Apache Hive : Plugins Apache Hive : Plugins Creating Custom UDFs Deploying Jars for User Defined Functions and User Defined SerDes Creating Custom UDFs First, you need to create a new class that extends UDF, with one or more methods named evaluate.
package com.example.hive.udf; import org.apache.hadoop.hive.ql.exec.UDF; import org.apache.hadoop.io.Text; public final class Lower extends UDF { public Text evaluate(final Text s) { if (s == null) { return null; } return new Text(s.
Apache Hive : HiveQL
Dec 12, 2024
Apache Hive : HiveQL This page is deprecated
Please see the HiveQL Language Manual
Apache Hive : LanguageManual
Dec 12, 2024
Apache Hive : LanguageManual This is the Hive Language Manual. For other Hive documentation, see the Hive wiki’s Home page.
Commands and CLIs
Commands Hive CLI (old) Beeline CLI (new) Variable Substitution HCatalog CLI File Formats
Avro Files ORC Files Parquet Compressed Data Storage LZO Compression Data Types
Data Definition Statements
DDL Statements Bucketed Tables Statistics (Analyze and Describe) Indexes Archiving Data Manipulation Statements
Apache Hive : LanguageManual Archiving
Dec 12, 2024
Apache Hive : LanguageManual Archiving Archiving for File Count Reduction.
Apache Hive : LanguageManual Archiving Overview Settings Usage Archive Unarchive Cautions and Limitations Under the Hood Overview Due to the design of HDFS, the number of files in the filesystem directly affects the memory consumption in the namenode. While normally not a problem for small clusters, memory usage may hit the limits of accessible memory on a single machine when there are >50-100 million files.
Apache Hive : LanguageManual Authorization
Dec 12, 2024
Apache Hive : LanguageManual Authorization Apache Hive : LanguageManual Authorization Introduction Hive Authorization Options Use Cases Overview of Authorization Modes Addressing Authorization Needs of Multiple Use Cases Explain Authorization More Information Introduction Note that this documentation covers Authorization (verifying whether a user has permission to perform a certain action), not Authentication (verifying the identity of the user).
Apache Hive : LanguageManual Cli
Dec 12, 2024
Apache Hive : LanguageManual Hive CLI $HIVE_HOME/bin/hive is a shell utility which can be used to run Hive queries in either interactive or batch mode.
Apache Hive : LanguageManual Hive CLI Deprecation in favor of Beeline CLI Hive Command Line Options Examples The hiverc File Logging Tool to Clear Dangling Scratch Directories Hive Batch Mode Commands Hive Interactive Shell Commands Hive Resources HCatalog CLI Deprecation in favor of Beeline CLI HiveServer2 (introduced in Hive 0.
Apache Hive : LanguageManual Commands
Dec 12, 2024
Apache Hive : LanguageManual Commands Commands are non-SQL statements such as setting a property or adding a resource. They can be used in HiveQL scripts or directly in the CLI or Beeline.
Command Description quit exit Use quit or exit to leave the interactive shell. reset Resets the configuration to the default values (as of Hive 0.10: see HIVE-3202). Any configuration parameters that were set using the set command or -hiveconf parameter in hive commandline will get reset to default value.
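A short sketch of these commands in a script or interactive session; the property hive.exec.parallel is used only as an example setting:

```sql
-- Override a configuration property for the current session.
SET hive.exec.parallel=true;

-- Print the current value of that property.
SET hive.exec.parallel;

-- Return the configuration to its default values.
RESET;
```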
Apache Hive : LanguageManual DDL
Dec 12, 2024
Apache Hive : LanguageManual DDL Apache Hive : LanguageManual DDL Overview Keywords, Non-reserved Keywords and Reserved Keywords Create/Drop/Alter/Use Database Create Database Drop Database Alter Database Use Database Create/Drop/Alter Connector Create Connector Drop Connector Alter Connector Create/Drop/Truncate Table Create Table Drop Table Truncate Table Alter Table/Partition/Column Alter Table Alter Partition Alter Either Table or Partition Alter Column Create/Drop/Alter View Create View Drop View Alter View Properties Alter View As Select Create/Drop/Alter Materialized View Create Materialized View Drop Materialized View Alter Materialized View Create/Drop/Alter Index Create Index Drop Index Alter Index Create/Drop Macro Create Temporary Macro Drop Temporary Macro Create/Drop/Reload Function Temporary Functions Permanent Functions Create/Drop/Grant/Revoke Roles and Privileges Show Show Databases Show Connectors Show Tables/Views/Materialized Views/Partitions/Indexes Show Columns Show Functions Show Granted Roles and Privileges Show Locks Show Conf Show Transactions Show Compactions Describe Describe Database Describe Dataconnector Describe Table/View/Materialized View/Column Describe Partition Hive 2.
Apache Hive : LanguageManual DDL BucketedTables
Dec 12, 2024
Apache Hive : LanguageManual DDL BucketedTables This is a brief example on creating and populating bucketed tables. (For another example, see Bucketed Sorted Tables.)
Bucketed tables are fantastic in that they allow much more efficient sampling than do non-bucketed tables, and they may later allow for time saving operations such as mapside joins. However, the bucketing specified at table creation is not enforced when the table is written to, and so it is possible for the table’s metadata to advertise properties which are not upheld by the table’s actual layout.
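The following is a sketch of creating and populating a bucketed table. The source table user_info and all column names are hypothetical. The hive.enforce.bucketing setting applies to older releases (Hive 0.x and 1.x); in Hive 2.x and later bucketing is always enforced and the property is not needed.

```sql
CREATE TABLE user_info_bucketed (
  user_id   BIGINT,
  firstname STRING,
  lastname  STRING
)
COMMENT 'A bucketed copy of user_info'
PARTITIONED BY (ds STRING)
CLUSTERED BY (user_id) INTO 256 BUCKETS;

-- Older releases only: make Hive write one file per bucket.
SET hive.enforce.bucketing = true;

FROM user_info
INSERT OVERWRITE TABLE user_info_bucketed PARTITION (ds = '2009-02-25')
SELECT user_id, firstname, lastname
WHERE ds = '2009-02-25';
```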
Apache Hive : LanguageManual DML
Dec 12, 2024
Apache Hive : LanguageManual DML Hive Data Manipulation Language Apache Hive : LanguageManual DML Hive Data Manipulation Language Loading files into tables Inserting data into Hive Tables from queries Writing data into the filesystem from queries Inserting values into tables from SQL Update Delete Merge Loading files into tables Hive does not do any transformation while loading data into tables.
Apache Hive : LanguageManual Explain
Dec 12, 2024
Apache Hive : LanguageManual Explain Apache Hive : LanguageManual Explain EXPLAIN Syntax Example The CBO Clause The AST Clause The DEPENDENCY Clause The AUTHORIZATION Clause The LOCKS Clause The VECTORIZATION Clause The ANALYZE Clause User-level Explain Output EXPLAIN Syntax Hive provides an EXPLAIN command that shows the execution plan for a query. The syntax for this statement is as follows:
Apache Hive : LanguageManual GroupBy
Dec 12, 2024
Apache Hive : LanguageManual GroupBy Apache Hive : LanguageManual GroupBy Group By Syntax Simple Examples Select statement and group by clause Advanced Features Multi-Group-By Inserts Map-side Aggregation for Group By Grouping Sets, Cubes, Rollups, and the GROUPING__ID Function Group By Syntax groupByClause: GROUP BY groupByExpression (, groupByExpression)* groupByExpression: expression groupByQuery: SELECT expression (, expression)* FROM src groupByClause? In groupByExpression columns are specified by name, not by position number.
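As a small illustration of the grammar above, assuming a hypothetical table pv_users(userid, gender), the grouping column is referenced by name:

```sql
-- Count distinct users per gender; the GROUP BY column is named explicitly.
SELECT pv_users.gender, count(DISTINCT pv_users.userid)
FROM pv_users
GROUP BY pv_users.gender;
```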
Apache Hive : LanguageManual ImportExport
Dec 12, 2024
Apache Hive : LanguageManual Import/Export Apache Hive : LanguageManual Import/Export Version information Overview Export Syntax Import Syntax Replication usage Examples Version information The EXPORT and IMPORT commands were added in Hive 0.8.0 (see HIVE-1918).
Replication extensions to the EXPORT and IMPORT commands were added in Hive 1.2.0 (see HIVE-7973 and Hive Replication Development).
Overview The EXPORT command exports the data of a table or partition, along with the metadata, into a specified output location.
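A minimal sketch of a round trip, assuming a hypothetical table department and an illustrative export path:

```sql
-- Write the table's data and metadata to the export location.
EXPORT TABLE department TO 'hdfs_exports_location/department';

-- Recreate the table from that location under its original name.
IMPORT FROM 'hdfs_exports_location/department';

-- Or import it under a different name.
IMPORT TABLE imported_dept FROM 'hdfs_exports_location/department';
```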
Apache Hive : LanguageManual Indexing
Dec 12, 2024
Apache Hive : LanguageManual Indexing Apache Hive : LanguageManual Indexing Indexing Is Removed since 3.0 Overview of Hive Indexes Indexing Resources Configuration Parameters for Hive Indexes Simple Examples Indexing Is Removed since 3.0 There are alternate options which might work similarly to indexing:
Materialized views with automatic rewriting can result in very similar results. Hive 2.3.0 adds support for materialized views.
Apache Hive : LanguageManual JoinOptimization
Dec 12, 2024
Apache Hive : LanguageManual Join Optimization Apache Hive : LanguageManual Join Optimization Improvements to the Hive Optimizer Star Join Optimization Star Schema Example Prior Support for MAPJOIN Enhancements for Star Joins Improvements to the Hive Optimizer Version
The join optimizations described here were added in Hive version 0.11.0. See HIVE-3784 and related JIRAs.
This document describes optimizations of Hive’s query execution planning to improve the efficiency of joins and reduce the need for user hints.
Apache Hive : LanguageManual Joins
Dec 12, 2024
Apache Hive : LanguageManual Joins Apache Hive : LanguageManual Joins Join Syntax Examples MapJoin Restrictions Join Optimization Predicate Pushdown in Outer Joins Enhancements in Hive Version 0.11 Join Syntax Hive supports the following syntax for joining tables:
join_table: table_reference [INNER] JOIN table_factor [join_condition] | table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference join_condition | table_reference LEFT SEMI JOIN table_reference join_condition | table_reference CROSS JOIN table_reference [join_condition] (as of Hive 0.
Apache Hive : LanguageManual LateralView
Dec 12, 2024
Apache Hive : LanguageManual LateralView Apache Hive : LanguageManual LateralView Lateral View Syntax Description Example Multiple Lateral Views Outer Lateral Views Lateral View Syntax lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)* fromClause: FROM baseTable (lateralView)* Description Lateral view is used in conjunction with user-defined table generating functions such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row.
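To make the syntax concrete, here is a sketch assuming a hypothetical table pageAds(pageid STRING, adid_list ARRAY<INT>):

```sql
-- explode() emits one row per array element; LATERAL VIEW joins each
-- generated row back to the input row it came from.
SELECT pageid, adid
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid;
```

Here adTable is the tableAlias and adid is the columnAlias from the grammar above.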
Apache Hive : LanguageManual LZO
Dec 12, 2024
Apache Hive : LanguageManual LZO Compression Apache Hive : LanguageManual LZO Compression General LZO Concepts Prerequisites Lzo/Lzop Installations core-site.xml Table Definition Hive Queries Option 1: Directly Create LZO Files Option 2: Write Custom Java to Create LZO Files General LZO Concepts LZO is a lossless data compression library that favors speed over compression ratio. See http://www.oberhumer.com/opensource/lzo and http://www.
Apache Hive : LanguageManual ORC
Dec 12, 2024
Apache Hive : LanguageManual ORC Apache Hive : LanguageManual ORC ORC Files ORC File Format File Structure Stripe Structure HiveQL Syntax Serialization and Compression Integer Column Serialization String Column Serialization Compression ORC File Dump Utility ORC Configuration Parameters ORC Format Specification Attachments: ORC Files ORC File Format Version
Introduced in Hive version 0.11.0.
The Optimized Row Columnar (ORC) file format provides a highly efficient way to store Hive data.
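A minimal sketch of declaring an ORC-backed table; the table name and columns are hypothetical, and the compression property is optional:

```sql
CREATE TABLE events_orc (
  id      BIGINT,
  payload STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "ZLIB");
```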
Apache Hive : LanguageManual Sampling
Dec 12, 2024
Apache Hive : LanguageManual Sampling Apache Hive : LanguageManual Sampling Sampling Syntax Sampling Bucketized Table Block Sampling Sampling Syntax Sampling Bucketized Table table_sample: TABLESAMPLE (BUCKET x OUT OF y [ON colname]) The TABLESAMPLE clause allows the users to write queries for samples of the data instead of the whole table. The TABLESAMPLE clause can be added to any table in the FROM clause.
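For example, assuming a hypothetical table named source, the clause can be used like this:

```sql
-- Take the 3rd of 32 buckets, bucketing rows on a random value
-- rather than on a table column.
SELECT *
FROM source TABLESAMPLE (BUCKET 3 OUT OF 32 ON rand()) s;
```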
Apache Hive : LanguageManual Select
Dec 12, 2024
Apache Hive : LanguageManual Select Apache Hive : LanguageManual Select Select Syntax WHERE Clause ALL and DISTINCT Clauses Partition Based Queries HAVING Clause LIMIT Clause REGEX Column Specification More Select Syntax Select Syntax [WITH CommonTableExpression (, CommonTableExpression)*] (Note: Only available starting with Hive 0.13.0) SELECT [ALL | DISTINCT] select_expr, select_expr, ... FROM table_reference [WHERE where_condition] [GROUP BY col_list] [ORDER BY col_list] [CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list] ] [LIMIT [offset,] rows] A SELECT statement can be part of a union query or a subquery of another query.
Apache Hive : LanguageManual SortBy
Dec 12, 2024
Apache Hive : LanguageManual SortBy Apache Hive : LanguageManual SortBy Order, Sort, Cluster, and Distribute By Syntax of Order By Syntax of Sort By Difference between Sort By and Order By Setting Types for Sort By Syntax of Cluster By and Distribute By Order, Sort, Cluster, and Distribute By This describes the syntax of SELECT clauses ORDER BY, SORT BY, CLUSTER BY, and DISTRIBUTE BY.
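The four clauses can be contrasted with a short sketch, assuming a hypothetical table src(key, value):

```sql
-- Total ordering of the whole result.
SELECT key, value FROM src ORDER BY key;

-- Rows are ordered within each reducer only; no global order is guaranteed.
SELECT key, value FROM src SORT BY key;

-- Rows with the same key go to the same reducer, then are sorted there.
SELECT key, value FROM src DISTRIBUTE BY key SORT BY key;

-- Shorthand for DISTRIBUTE BY key SORT BY key.
SELECT key, value FROM src CLUSTER BY key;
```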
Apache Hive : LanguageManual SubQueries
Dec 12, 2024
Apache Hive : LanguageManual SubQueries Apache Hive : LanguageManual SubQueries Subqueries in the FROM Clause Subqueries in the WHERE Clause Subqueries in the FROM Clause SELECT ... FROM (subquery) name ... SELECT ... FROM (subquery) AS name ... (Note: Only valid starting with Hive 0.13.0) Hive supports subqueries only in the FROM clause (through Hive 0.12). The subquery has to be given a name because every table in a FROM clause must have a name.
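A minimal sketch of a named FROM-clause subquery, assuming a hypothetical table t1(a, b):

```sql
-- The subquery must be given a name (t2 here) before it can be queried.
SELECT col
FROM (
  SELECT a + b AS col
  FROM t1
) t2;
```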
Apache Hive : LanguageManual Transform
Dec 12, 2024
Apache Hive : LanguageManual Transform Apache Hive : LanguageManual Transform Transform/Map-Reduce Syntax Schema-less Map-reduce Scripts Typing the output of TRANSFORM Transform/Map-Reduce Syntax Users can also plug in their own custom mappers and reducers in the data stream by using features natively supported in the Hive language. e.g. in order to run a custom mapper script - map_script - and a custom reducer script - reduce_script - the user can issue the following command which uses the TRANSFORM clause to embed the mapper and the reducer scripts.
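As a rough sketch of such a statement: map_script and reduce_script are the script names used in the paragraph above, while the tables pv_users and pv_users_reduced and all column names are hypothetical.

```sql
FROM (
  FROM pv_users
  SELECT TRANSFORM (pv_users.userid, pv_users.visit_date)
  USING 'map_script'
  AS dt, uid
  CLUSTER BY dt
) map_output
INSERT OVERWRITE TABLE pv_users_reduced
  SELECT TRANSFORM (map_output.dt, map_output.uid)
  USING 'reduce_script'
  AS dt, cnt;
```

The inner query streams rows through the mapper script and clusters its output by dt, so that all rows for one dt value reach the same reducer script instance.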
Apache Hive : LanguageManual Types
Dec 12, 2024
Apache Hive : LanguageManual Data Types Apache Hive : LanguageManual Data Types Overview Numeric Types Date/Time Types String Types Misc Types Complex Types Column Types Integral Types (TINYINT, SMALLINT, INT/INTEGER, BIGINT) Strings Varchar Char Timestamps Intervals Decimals Union Types Literals Floating Point Types Handling of NULL Values Change Types Allowed Implicit Conversions Overview This lists all supported data types in Hive.
Apache Hive : LanguageManual UDF
Dec 12, 2024
Apache Hive : LanguageManual Operators and User-Defined Functions Apache Hive : LanguageManual Operators and User-Defined Functions Overview Built-in Operators Operators Precedences Relational Operators Arithmetic Operators Logical Operators String Operators Complex Type Constructors Operators on Complex Types Built-in Functions Mathematical Functions Collection Functions Type Conversion Functions Date Functions Conditional Functions String Functions Data Masking Functions Misc. Functions Built-in Aggregate Functions (UDAF) Built-in Table-Generating Functions (UDTF) Usage Examples explode posexplode json_tuple parse_url_tuple GROUPing and SORTing on f(column) Utility Functions UDF internals Creating Custom UDFs Attachments: Overview All Hive keywords are case-insensitive, including the names of Hive operators and functions.
Apache Hive : LanguageManual Union
Dec 12, 2024
Apache Hive : LanguageManual Union Apache Hive : LanguageManual Union Union Syntax Union Syntax select_statement UNION [ALL | DISTINCT] select_statement UNION [ALL | DISTINCT] select_statement ... UNION is used to combine the result from multiple SELECT statements into a single result set.
Hive versions prior to 1.2.0 only support UNION ALL (bag union), in which duplicate rows are not eliminated.
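A short sketch of both forms, assuming hypothetical tables t1 and t2 that each have a key column:

```sql
-- Bag union: duplicate rows are kept.
SELECT key FROM t1
UNION ALL
SELECT key FROM t2;

-- Set union (Hive 1.2.0 and later): duplicate rows are removed.
SELECT key FROM t1
UNION DISTINCT
SELECT key FROM t2;
```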
Apache Hive : LanguageManual VariableSubstitution
Dec 12, 2024
Apache Hive : LanguageManual VariableSubstitution Apache Hive : LanguageManual VariableSubstitution Introduction Using Variables Substitution During Query Construction Disabling Variable Substitution Introduction Hive is used for batch and interactive queries. Variable Substitution allows for tasks such as separating environment-specific configuration variables from code.
The Hive variable substitution mechanism was designed to avoid some of the code that was getting baked into the scripting language on top of Hive.
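A minimal sketch of the mechanism; the variable name target_date and the table page_views are hypothetical:

```sql
-- Define a variable in the hivevar namespace.
SET hivevar:target_date=2024-12-12;

-- The reference is replaced with the variable's value before the query runs.
SELECT *
FROM page_views
WHERE ds = '${hivevar:target_date}';
```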
Apache Hive : LanguageManual VirtualColumns
Dec 12, 2024
Apache Hive : LanguageManual VirtualColumns Apache Hive : LanguageManual VirtualColumns Virtual Columns Simple Examples Virtual Columns Hive 0.8.0 provides support for two virtual columns:
One is INPUT__FILE__NAME, which is the input file’s name for a mapper task.
The other is BLOCK__OFFSET__INSIDE__FILE, which is the current global file position.
For a block-compressed file, it is the current block’s file offset, that is, the file offset of the current block’s first byte.
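Both virtual columns can be selected like ordinary columns. A small sketch, assuming a hypothetical table src(key, value):

```sql
-- Show which file and offset each row was read from.
SELECT INPUT__FILE__NAME, key, BLOCK__OFFSET__INSIDE__FILE
FROM src;

-- Virtual columns can also be used inside aggregates.
SELECT key, count(INPUT__FILE__NAME)
FROM src
GROUP BY key
ORDER BY key;
```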
Apache Hive : LanguageManual WindowingAndAnalytics
Dec 12, 2024
Apache Hive : LanguageManual WindowingAndAnalytics Apache Hive : LanguageManual WindowingAndAnalytics Enhancements to Hive QL Examples Enhancements to Hive QL Introduced in Hive version 0.11.
This section introduces the Hive QL enhancements for windowing and analytics functions. See “Windowing Specifications in HQL” (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.
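A brief sketch of the windowing syntax, assuming a hypothetical table T(a, b, c, d):

```sql
-- Running total of b within each partition of c, ordered by d.
SELECT a,
       SUM(b) OVER (PARTITION BY c ORDER BY d
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
FROM T;

-- Rank rows within each partition of c.
SELECT a,
       RANK() OVER (PARTITION BY c ORDER BY d) AS rnk
FROM T;
```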
Apache Hive : LanguageManual XPathUDF
Dec 12, 2024
Apache Hive : LanguageManual XPathUDF Documentation for Built-In User-Defined Functions Related To XPath
UDFs xpath, xpath_short, xpath_int, xpath_long, xpath_float, xpath_double, xpath_number, xpath_string Functions for parsing XML data using XPath expressions. Since version: 0.6.0 Overview The xpath family of UDFs are wrappers around the Java XPath library javax.xml.xpath provided by the JDK. The library is based on the XPath 1.0 specification. Please refer to http://java.sun.com/javase/6/docs/api/javax/xml/xpath/package-summary.html for detailed information on the Java XPath library.
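Two illustrative calls on inline XML strings; the table src is hypothetical and only supplies a row to evaluate against:

```sql
-- xpath returns an array of strings for the matching text nodes.
SELECT xpath('<a><b>b1</b><b>b2</b></a>', 'a/b/text()')
FROM src LIMIT 1;

-- xpath_int evaluates the expression and returns the result as an integer.
SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)')
FROM src LIMIT 1;
```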
Apache Hive : Literals
Dec 12, 2024
Apache Hive : Literals Literals Integral types Integral literals are assumed to be INT by default, unless the number exceeds the range of INT in which case it is interpreted as a BIGINT, or if one of the following postfixes is present on the number.
Type Postfix Example TINYINT Y 100Y SMALLINT S 100S BIGINT L 100L String types String literals can be expressed with either single quotes (') or double quotes (").
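The postfixes and quoting styles can be sketched in one statement; the column aliases are illustrative:

```sql
SELECT 100Y     AS tiny_val,    -- TINYINT
       100S     AS small_val,   -- SMALLINT
       100L     AS big_val,     -- BIGINT
       100      AS int_val,     -- INT by default
       'single' AS s1,          -- string literal in single quotes
       "double" AS s2;          -- string literal in double quotes
```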
Apache Hive : Managed vs. External Tables
Dec 12, 2024
Apache Hive : Managed vs. External Tables Hive fundamentally knows two different types of tables:
Managed (Internal) External Introduction This document lists some of the differences between the two, but the fundamental difference is that Hive assumes that it owns the data for managed tables. That means that the data, its properties and data layout can only be changed via Hive commands. The data still lives in a normal file system and nothing is stopping you from changing it without telling Hive about it.
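A minimal sketch of declaring each kind of table; the table names and the location path are hypothetical:

```sql
-- Managed table: Hive owns the data and stores it under its warehouse directory.
CREATE TABLE managed_logs (line STRING);

-- External table: Hive only records metadata that points at existing files.
CREATE EXTERNAL TABLE external_logs (line STRING)
LOCATION '/data/logs';

-- By default, dropping a managed table also removes its data,
-- while dropping an external table removes only the metadata.
DROP TABLE managed_logs;
DROP TABLE external_logs;
```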
Apache Hive : Materialized views
Dec 12, 2024
Apache Hive : Materialized views This page documents the work done to support materialized views in Apache Hive.
Apache Hive : Materialized views Version information Objectives Management of materialized views in Hive Materialized views creation Other operations for materialized view management Materialized view-based query rewriting Example 1 Example 2 Example 3 Materialized view maintenance Materialized view lifecycle Open issues (JIRA) Version information Materialized views support is introduced in Hive 3.
Apache Hive : OperatorsAndFunctions
Dec 12, 2024
Apache Hive : OperatorsAndFunctions Hive Operators and Functions Hive Plug-in Interfaces - User-Defined Functions and SerDes
Guide to Hive Operators and Functions
Reflect UDF Generic UDAF Case Study Functions for Statistics and Data Mining
Apache Hive : Partition Filter Syntax
Dec 12, 2024
Apache Hive : Partition Filter Syntax Example: for a table having partition keys country and state, one could construct the following filter:
country = "USA" AND (state = "CA" OR state = "AZ")
In particular notice that it is possible to nest sub-expressions within parentheses.
The following operators are supported when constructing filters for partition columns (derived from HIVE-1862):
= < <= > >= <> AND OR LIKE (on keys of type string only, supports literal string template with ‘.
Apache Hive : ReflectUDF
Dec 12, 2024
Apache Hive : ReflectUDF Reflect (Generic) UDF A Java class and method often exist to handle the exact function a user would like to use in Hive. Rather than having to write a wrapper UDF to call this method, the majority of these methods can be called using the reflect UDF. Reflect uses Java reflection to instantiate and call methods of objects; it can also call static functions. The method must return a primitive type or a type that Hive knows how to serialize.
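For instance, static methods from the Java standard library can be called directly. The table src is hypothetical and only supplies a row to evaluate against:

```sql
-- Arguments are: class name, method name, then the method's own arguments.
SELECT reflect("java.lang.String", "valueOf", 1),
       reflect("java.lang.Math", "max", 2, 3)
FROM src LIMIT 1;
```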
Apache Hive : Scheduled Queries
Dec 12, 2024
Apache Hive : Scheduled Queries Apache Hive : Scheduled Queries Introduction Maintaining scheduled queries Create Scheduled query syntax Alter Scheduled query syntax Drop syntax scheduleSpecification syntax CRON based schedule syntax EVERY based schedule syntax ExecutedAs syntax enableSpecification syntax Defined AS syntax executeSpec syntax System tables/views information_schema.scheduled_queries information_schema.scheduled_executions Execution states Configuration Hive metastore related configuration HiveServer2 related configuration Examples Example 1 – basic example of using schedules Example 2 – analyze external table periodically Example 3 – materialized view rebuild Example 4 – Ingestion Introduction Executing statements periodically can be useful in
Apache Hive : SQL Standard Based Hive Authorization
Dec 12, 2024
Apache Hive : SQL Standard Based Hive Authorization Apache Hive : SQL Standard Based Hive Authorization Status of Hive Authorization before Hive 0.13 SQL Standards Based Hive Authorization (New in Hive 0.13) Restrictions on Hive Commands and Statements Privileges Objects Object Ownership Users and Roles Names of Users and Roles Role Management Commands Managing Object Privileges Object Privilege Commands Examples of Managing Object Privileges Privileges Required for Hive Operations Configuration For Hive 0.
Apache Hive : StatisticsAndDataMining
Dec 12, 2024
Apache Hive : Statistics and Data Mining This page is the secondary documentation for the slightly more advanced statistical and data mining functions that are being integrated into Hive, and especially the functions that warrant more than one-line descriptions.
Apache Hive : Statistics and Data Mining ngrams() and context_ngrams(): N-gram frequency estimation Use Cases Usage Example histogram_numeric(): Estimating frequency distributions Use Cases Usage Example ngrams() and context_ngrams(): N-gram frequency estimation N-grams are subsequences of length N drawn from a longer sequence.
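A hedged sketch of an n-gram query, assuming a hypothetical table twitter with a string column tweet:

```sql
-- sentences() tokenizes text into an array of word arrays;
-- ngrams() then estimates the 100 most frequent 2-grams (bigrams).
SELECT ngrams(sentences(lower(tweet)), 2, 100)
FROM twitter;
```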
Apache Hive : Supported Features: Apache Hive 3.1
Dec 12, 2024
Apache Hive : Supported Features: Apache Hive 3.1 This table covers all mandatory features from SQL:2016 as well as optional features that Hive implements.
Feature ID Feature Name Implemented Mandatory Comments E011 Numeric data types Yes Mandatory E011-01 INTEGER and SMALLINT data types (including all spellings) Yes Mandatory E011-02 REAL, DOUBLE PRECISION, and FLOAT data types Yes Mandatory E011-03 DECIMAL and NUMERIC data types Yes Mandatory E011-04 Arithmetic operators Yes Mandatory E011-05 Numeric comparison Yes Mandatory E011-06 Implicit casting among the numeric data types Yes Mandatory E021 Character string types Yes Mandatory E021-01 CHARACTER data type (including all its spellings) Partial Mandatory Only support CHAR, not CHARACTER E021-02 CHARACTER VARYING data type (including all its spellings) Partial Mandatory Only support VARCHAR, not CHARACTER VARYING or CHAR VARYING E021-03 Character literals Yes Mandatory E021-04 CHARACTER_LENGTH function Yes Mandatory E021-05 OCTET_LENGTH function Yes Mandatory E021-06 SUBSTRING function Partial Mandatory Standard: SUBSTRING(val FROM startpos [FOR len]).
Apache Hive : Supported Features: Apache Hive 2.1
Dec 12, 2024
Apache Hive : Supported Features: Apache Hive 2.1 Identifier Description Hive 2.1 Comment E011 Numeric data types Yes E011-01 INTEGER and SMALLINT data types (including all spellings) Yes Int instead of Integer E011-02 REAL, DOUBLE PRECISION, and FLOAT data types Yes Double instead of Double Precision E011-03 DECIMAL and NUMERIC data types Yes E011-04 Arithmetic operators Yes E011-05 Numeric comparison Yes E011-06 Implicit casting among the numeric data types Yes E021 Character data types Yes E021-01 CHARACTER data type Yes Char instead of Character E021-02 CHARACTER VARYING data type Yes Varchar instead of Character Varying E021-03 Character literals Yes E021-04 CHARACTER_LENGTH function Partial length UDF provided E021-06 SUBSTRING function Yes E021-07 Character concatenation Yes concat UDF instead of standard E021-08 UPPER and LOWER functions Yes E021-09 TRIM function Partial leading / trailing / both from not supported E021-10 Implicit casting among the fixed-length and variable-length character string types Yes E021-12 Character comparison Yes E031 Identifiers Yes E031-01 Delimited identifiers Partial Backtick (`) used instead of (").
Apache Hive : Supported Features: Apache Hive 2.3
Dec 12, 2024
Apache Hive : Supported Features: Apache Hive 2.3 Identifier Description Hive 2.3 Comment E011 Numeric data types Yes E011-01 INTEGER and SMALLINT data types (including all spellings) Yes E011-02 REAL, DOUBLE PRECISION, and FLOAT data types Yes E011-03 DECIMAL and NUMERIC data types Yes E011-04 Arithmetic operators Yes E011-05 Numeric comparison Yes E011-06 Implicit casting among the numeric data types Yes E021 Character data types Yes E021-01 CHARACTER data type Yes Char instead of Character E021-02 CHARACTER VARYING data type Yes Varchar instead of Character Varying E021-03 Character literals Yes E021-04 CHARACTER_LENGTH function Yes E021-05 OCTET_LENGTH function Yes E021-06 SUBSTRING function Yes E021-07 Character concatenation Yes E021-08 UPPER and LOWER functions Yes E021-09 TRIM function Partial leading / trailing / both from not supported E021-10 Implicit casting among the fixed-length and variable-length character string types Yes E021-12 Character comparison Yes E031 Identifiers Yes E031-01 Delimited identifiers Yes E031-03 Trailing underscore Yes E051 Basic query specification Yes E051-01 SELECT DISTINCT Yes E051-02 GROUP BY clause Partial Empty grouping sets not supported E051-04 GROUP BY can contain columns not in Yes E051-05 Select list items can be renamed Yes E051-06 HAVING clause Yes E051-07 Qualified * in select list Yes E051-08 Correlation names in the FROM clause Yes E061 Basic predicates and search conditions Yes E061-01 Comparison predicate Yes E061-02 BETWEEN predicate Yes E061-03 IN predicate with list of values Yes E061-04 LIKE predicate Yes E061-06 NULL predicate Yes E061-08 EXISTS predicate Yes E061-09 Subqueries in comparison predicate Yes E061-11 Subqueries in IN predicate Yes E061-13 Correlated subqueries Yes E071 Basic query expressions Yes E071-01 UNION DISTINCT table operator Yes E071-02 UNION ALL table operator Yes E071-03 EXCEPT DISTINCT table operator Yes E071-05 Columns combined via table operators need not have exactly the same data type.