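Implementation status

This page summarizes the features supported by different Parquet implementations.

Note: If you find out-of-date information, please help us improve the accuracy of this page by opening an issue or submitting a pull request.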

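Legend

The value in each cell means:

  • ✅: supported. A footnote is added when support is partial. Where data is available, a link to the release notes of the implementing version is provided.
  • ❌: not supported
  • (R): read support only
  • (W): write support only
  • (blank): no data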

Implementations

Physical types

Physical types are defined by the enum Type in parquet.thrift.

Data Type | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb
BOOLEAN
INT32
INT64
INT96 (1)  (R)  (R)
FLOAT
DOUBLE
BYTE_ARRAY
FIXED_LEN_BYTE_ARRAY
Notes:
(1) This type is deprecated, but as of 2024 it is still common in newly produced Parquet files.
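
As a quick reference, the physical types listed above can be mirrored in a small enum. This is a minimal sketch: the numeric values follow the Type enum in parquet.thrift, while the class name `PhysicalType` and the helper `is_deprecated` are illustrative names, not part of any implementation in the table.

```python
from enum import IntEnum


class PhysicalType(IntEnum):
    """Illustrative mirror of the Type enum in parquet.thrift."""
    BOOLEAN = 0
    INT32 = 1
    INT64 = 2
    INT96 = 3  # deprecated, see note (1)
    FLOAT = 4
    DOUBLE = 5
    BYTE_ARRAY = 6
    FIXED_LEN_BYTE_ARRAY = 7


# Hypothetical helper set: the only physical type flagged as deprecated above.
DEPRECATED_TYPES = frozenset({PhysicalType.INT96})


def is_deprecated(physical_type: PhysicalType) -> bool:
    """Return True if the physical type is marked deprecated on this page."""
    return physical_type in DEPRECATED_TYPES
```

A reader could call `is_deprecated(PhysicalType.INT96)` to decide whether to warn when such a column is encountered.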

Logical types

Logical types are defined by the union LogicalType in parquet.thrift and described in LogicalTypes.md.

Data Type | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb
STRING
ENUM  (1)
UUID  (1)
8, 16, 32, 64 bit signed and unsigned INT
DECIMAL (INT32)
DECIMAL (INT64)
DECIMAL (BYTE_ARRAY)  (R)
DECIMAL (FIXED_LEN_BYTE_ARRAY)
FLOAT16 (2023)  (1)
DATE
TIME (INT32)
TIME (INT64)
TIMESTAMP (INT64)
INTERVAL  (1)
JSON  (1)  (1)
BSON  (1)  (1)
VARIANT (2025)
GEOMETRY (2025)
GEOGRAPHY (2025)
LIST  (R)
MAP  (R)
UNKNOWN (always null)
Notes:
(1) Supported only through its annotated physical type

Encodings

Encodings are defined by the enum Encoding in parquet.thrift and described in Encodings.md.

Encoding | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb
PLAIN
PLAIN_DICTIONARY  (R)
RLE_DICTIONARY
RLE
BIT_PACKED (deprecated)  (1)  (R)  (R)
DELTA_BINARY_PACKED  (R)
DELTA_LENGTH_BYTE_ARRAY  (R)
DELTA_BYTE_ARRAY  (R)
BYTE_STREAM_SPLIT (2020)  (R)
BYTE_STREAM_SPLIT (Additional Types) (2024)
Notes:
(1) Partial read support, but only in the case of level data with a bitwidth of 0
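
To make one of the listed encodings concrete, the following is a minimal sketch of BYTE_STREAM_SPLIT for fixed-width values: the k-th byte of every value goes to the k-th stream, and the streams are concatenated in order. The function names and the use of `struct` format strings are choices made for this illustration; the sketch ignores page framing and is not taken from any implementation in the table.

```python
import struct


def byte_stream_split_encode(values, fmt="<f"):
    """Encode fixed-width values by scattering their bytes into streams."""
    width = struct.calcsize(fmt)
    # PLAIN layout first: little-endian values back to back.
    plain = b"".join(struct.pack(fmt, v) for v in values)
    # Stream k holds byte k of every value; streams are concatenated in order.
    return b"".join(plain[k::width] for k in range(width))


def byte_stream_split_decode(data, fmt="<f"):
    """Inverse of byte_stream_split_encode."""
    width = struct.calcsize(fmt)
    if len(data) % width != 0:
        raise ValueError("data length is not a multiple of the value width")
    count = len(data) // width
    streams = [data[k * count:(k + 1) * count] for k in range(width)]
    # Re-interleave: take byte i from each stream to rebuild value i.
    plain = bytes(streams[k][i] for i in range(count) for k in range(width))
    return [struct.unpack_from(fmt, plain, i * width)[0] for i in range(count)]
```

Passing `fmt="<d"` handles DOUBLE values, and integer formats such as `"<i"` correspond to the additional types added in 2024.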

Compression Codecs

Compression codecs are defined by the enum CompressionCodec in parquet.thrift and described in Compression.md.

Compression | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb
UNCOMPRESSED
BROTLI  (R)  (R)
GZIP  (R)  (R)
LZ4 (deprecated)  (R)
LZ4_RAW  (R)
LZO
SNAPPY
ZSTD  (R)

Other format level features

Feature | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb
xxHash-based bloom filters (2019)  (R)  (R)
Bloom filter length (1)  (R)  (R)
Statistics min_value, max_value
Page index (2018)  (R)  (R)
Page CRC32 checksum  (R)
Modular encryption (2019)  R  (2)
Size statistics (2023) (3)  (R)  (R)
Data Page V2 (4)
Notes:
(1) In parquet.thrift: ColumnMetaData->bloom_filter_length
(2) Partial support
(3) In parquet.thrift: ColumnMetaData->size_statistics
(4) In parquet.thrift: DataPageHeaderV2

High level data APIs for Parquet feature usage

Feature | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb
External column data (1)  (W)
Row group "Sorting column" metadata (2)  (W)  (R)
Row group pruning using statistics  (3)
Row group pruning using bloom filter  (3)
Reading select columns only
Page pruning using statistics  (3)
Notes:
(1) In parquet.thrift: ColumnChunk->file_path
(2) In parquet.thrift: RowGroup->sorting_columns
(3) Partial support
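
The idea behind "Row group pruning using statistics" can be sketched in a few lines: a reader compares a query range against each row group's min_value and max_value and skips groups that cannot match. The `RowGroupStats` class and `prune_row_groups` function below are hypothetical names for illustration only and do not correspond to the API of any implementation listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group statistics for a single integer column."""
    index: int
    min_value: Optional[int]
    max_value: Optional[int]


def prune_row_groups(stats: List[RowGroupStats], lo: int, hi: int) -> List[int]:
    """Return indices of row groups that may contain a value in [lo, hi]."""
    keep = []
    for s in stats:
        if s.min_value is None or s.max_value is None:
            # Missing statistics: the group cannot be ruled out, so read it.
            keep.append(s.index)
        elif s.max_value >= lo and s.min_value <= hi:
            # The group's value range overlaps the query range.
            keep.append(s.index)
    return keep
```

Pruning is conservative by design: a kept row group may still contain no matching rows, but a skipped one is guaranteed to contain none.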

Minimum Version for Read Support by Year

This table shows the minimum engine version required to read Parquet files that use features introduced in each year. It covers only compression codecs, encodings, physical types, and logical types. Features without a specified format version are assumed to have been added before 2023.

Note: This data was originally collected in December 2025, and not all of it was backfilled. Older releases of each engine likely support reading all features from 2023 and before. Volunteers are invited to add more granular release details as time permits. Versions are generally expected to be accurate for 2025 and later.

Note: The following features are excluded from this table: ENUM, UUID, INTERVAL, JSON, BSON, BIT_PACKED (deprecated), LZ4 (deprecated), LZO.

Engine | ≤2023 Features | 2024 Features | 2025 Features
Apache Arrow C++  18.0.0 (2024-10-28)  18.0.0 (2024-10-28)
Parquet Java  1.14.1 (2024-07-16)  1.14.1 (2024-07-16)  1.16.0 (2025-09-03)
Apache Arrow Go  18.4.0 (2025-07-21)  18.4.0 (2025-07-21)
Apache Arrow Rust  52.2.0 (2024-07-28)  53.0.0 (2024-08-31)  57.0.0 (2025-10-19)
cuDF  25.12.00 (2025-12-10)
Hyparquet  1.23.0 (2025-12-10)
DuckDB  1.4.0 (2025-09-16)  1.4.0 (2025-09-16)  1.4.0 (2025-09-16)