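Implementation status
This page summarizes the features supported by different Parquet implementations.
Note: If you find out-of-date information, please help us improve the accuracy of this page by opening an issue or submitting a pull request.
Legend
The value in each cell means:
- ✅: supported. A footnote is added for partial support. Where data is available, a link to the release notes of the implementing version is provided.
- ❌: not supported
- (R): only reading is supported
- (W): only writing is supported
- (blank): no data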
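For readers who want to process the tables on this page programmatically, the legend above can be turned into a small parser. The following is a minimal sketch, not part of any Parquet tooling: the `Support` class and `parse_cell` function are hypothetical names, and the sketch assumes that footnotes appear as a number in parentheses directly after the symbol, as in `✅(1)`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Support:
    """Hypothetical container for one table cell; None means "no data"."""
    read: Optional[bool]
    write: Optional[bool]
    footnote: Optional[int] = None


def parse_cell(cell: str) -> Support:
    """Interpret a single cell according to the legend."""
    text = cell.strip()
    if not text:
        return Support(None, None)      # blank: no data
    if text == "(R)":
        return Support(True, False)     # only reading is supported
    if text == "(W)":
        return Support(False, True)     # only writing is supported
    # A check mark or cross, optionally followed by a footnote such as "(1)".
    match = re.fullmatch(r"(✅|❌)\s*(?:\((\d+)\))?", text)
    if match is None:
        raise ValueError(f"unrecognized cell value: {cell!r}")
    supported = match.group(1) == "✅"
    note = int(match.group(2)) if match.group(2) else None
    return Support(supported, supported, note)
```

Cells that do not match the legend raise `ValueError` instead of being guessed at, so unexpected values surface early.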
Implementations
- arrow (C++)
- parquet-java (Java)
- arrow-go (Go)
- arrow-rs (Rust)
- cudf (cuDF C++)
- hyparquet (JavaScript)
- duckdb (C++)
Physical types
Physical types are defined by the enum Type in parquet.thrift
| Data Type | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
|---|---|---|---|---|---|---|---|
| BOOLEAN | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| INT32 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| INT64 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| INT96 (1) | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | (R) |
| FLOAT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DOUBLE | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BYTE_ARRAY | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| FIXED_LEN_BYTE_ARRAY | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Logical types
Logical types are defined by the union LogicalType in parquet.thrift and described in LogicalTypes.md
| Data Type | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
|---|---|---|---|---|---|---|---|
| STRING | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ENUM | ❌ | ✅ | ✅ | ✅(1) | ❌ | ✅ | ✅ |
| UUID | ❌ | ✅ | ✅ | ✅(1) | ❌ | ✅ | ✅ |
| 8, 16, 32, 64 bit signed and unsigned INT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DECIMAL (INT32) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DECIMAL (INT64) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DECIMAL (BYTE_ARRAY) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | (R) |
| DECIMAL (FIXED_LEN_BYTE_ARRAY) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| FLOAT16 (2023) | ✅ | ✅(1) | ✅ | ✅ | ✅ | ✅ | ✅ |
| DATE | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| TIME (INT32) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| TIME (INT64) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| TIMESTAMP (INT64) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| INTERVAL | ✅ | ✅(1) | ✅ | ✅ | ❌ | ✅ | ✅ |
| JSON | ✅ | ✅(1) | ✅ | ✅(1) | ❌ | ✅ | ✅ |
| BSON | ❌ | ✅(1) | ✅ | ✅(1) | ❌ | ❌ | ❌ |
| VARIANT (2025) | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | |
| GEOMETRY (2025) | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ |
| GEOGRAPHY (2025) | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ |
| LIST | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| MAP | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| UNKNOWN (always null) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Encodings
Encodings are defined by the enum Encoding in parquet.thrift and described in Encodings.md
| Encoding | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
|---|---|---|---|---|---|---|---|
| PLAIN | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| PLAIN_DICTIONARY | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | (R) |
| RLE_DICTIONARY | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RLE | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BIT_PACKED (deprecated) | ✅ | ✅ | ✅ | ❌(1) | (R) | (R) | ❌ |
| DELTA_BINARY_PACKED | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| DELTA_LENGTH_BYTE_ARRAY | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| DELTA_BYTE_ARRAY | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| BYTE_STREAM_SPLIT (2020) | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| BYTE_STREAM_SPLIT (Additional Types) (2024) | ✅ | ✅ | ✅ | ✅ | ✅ | | |
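As an illustration of one of the encodings listed above, the sketch below shows the idea behind BYTE_STREAM_SPLIT in pure Python: byte k of every value is collected into stream k, and the streams are concatenated. This is a simplified sketch under stated assumptions, not the code of any implementation in the table. The function names are hypothetical, values are assumed to be fixed-width and little-endian, and the authoritative description is in Encodings.md.

```python
import struct


def byte_stream_split_encode(values, fmt="<f"):
    """Scatter the bytes of fixed-width values into `width` streams and concatenate them."""
    width = struct.calcsize(fmt)
    raw = b"".join(struct.pack(fmt, v) for v in values)
    # Stream k holds byte k of every value, in value order.
    return b"".join(raw[k::width] for k in range(width))


def byte_stream_split_decode(data, fmt="<f"):
    """Inverse of the encoder: interleave the streams back into whole values."""
    width = struct.calcsize(fmt)
    count = len(data) // width
    streams = [data[k * count:(k + 1) * count] for k in range(width)]
    # Take one byte from each stream in turn to rebuild each value.
    raw = bytes(b for group in zip(*streams) for b in group)
    return [struct.unpack_from(fmt, raw, i * width)[0] for i in range(count)]
```

Passing a different `struct` format, for example `"<d"` for DOUBLE or `"<i"` for INT32, applies the same transformation to other fixed-width types.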
Compression Codecs
Compressions are defined by the enum CompressionCodec in parquet.thrift and described in Compression.md
| Compression | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
|---|---|---|---|---|---|---|---|
| UNCOMPRESSED | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BROTLI | ✅ | ✅ | ✅ | ✅ | (R) | (R) | ✅ |
| GZIP | ✅ | ✅ | ✅ | ✅ | (R) | (R) | ✅ |
| LZ4 (deprecated) | ✅ | ❌ | ❌ | ✅ | ❌ | (R) | ❌ |
| LZ4_RAW | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
| LZO | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| SNAPPY | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ZSTD | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | ✅ |
Other format level features
| Feature | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
|---|---|---|---|---|---|---|---|
| xxHash-based bloom filters (2019) | (R) | ✅ | ✅ | ✅ | (R) | ✅ | |
| Bloom filter length (1) | (R) | ✅ | ✅ | ✅ | (R) | ✅ | |
| Statistics min_value, max_value | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Page index (2018) | ✅ | ✅ | ✅ | ✅ | ✅ | (R) | (R) |
| Page CRC32 checksum | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | (R) |
| Modular encryption (2019) | ✅ | ✅ | ✅ | ✅R | ❌ | ❌ | ✅(2) |
| Size statistics (2023) (3) | ✅ | ✅ | (R) | ✅ | ✅ | (R) | |
| Data Page V2 (4) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
High level data APIs for Parquet feature usage
| Feature | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
|---|---|---|---|---|---|---|---|
| External column data (1) | ✅ | ✅ | ❌ | ❌ | (W) | ✅ | ❌ |
| Row group "Sorting column" metadata (2) | ✅ | ❌ | ✅ | ✅ | (W) | ❌ | (R) |
| Row group pruning using statistics | ❌ | ✅ | ✅(3) | ✅ | ✅ | ❌ | ✅ |
| Row group pruning using bloom filter | ❌ | ✅ | ✅(3) | ✅ | ✅ | ❌ | ✅ |
| Reading select columns only | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Page pruning using statistics | ❌ | ✅ | ✅(3) | ✅ | ❌ | ❌ | ❌ |
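To make "row group pruning using statistics" concrete, the sketch below shows the general idea for a range filter on a single integer column: a row group can be skipped when its `min_value`/`max_value` range cannot overlap the requested range. This is a conceptual sketch only. `RowGroupStats`, `may_contain`, and `prune` are hypothetical names and do not correspond to the API of any implementation listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group column statistics; None means statistics are absent."""
    min_value: Optional[int]
    max_value: Optional[int]


def may_contain(stats: RowGroupStats, lo: int, hi: int) -> bool:
    """Return False only when the row group provably has no value in [lo, hi]."""
    if stats.min_value is None or stats.max_value is None:
        return True  # without statistics the row group must be read
    return stats.max_value >= lo and stats.min_value <= hi


def prune(groups: List[RowGroupStats], lo: int, hi: int) -> List[int]:
    """Indices of the row groups that still need to be read for the filter."""
    return [i for i, stats in enumerate(groups) if may_contain(stats, lo, hi)]
```

For example, with row groups covering 0 to 9, 10 to 19, one without statistics, and 20 to 29, a filter for values between 12 and 15 keeps only the second and third row groups. Pruning is conservative: a kept row group may still contain no matching rows.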
Minimum Version for Read Support by Year
This table shows the minimum engine version required to read Parquet files that use features introduced in each year. It covers only compression codecs, encodings, physical types, and logical types. Features without a specified format version are assumed to have been added prior to 2023.
Note: This data was originally collected in December 2025, and not all data was backfilled. It is likely that older releases of each engine support reading all features from 2023 and before. Volunteers are invited to add more granular release details as time permits. Generally, versions are expected to be accurate for 2025 and later.
Note: The following features are excluded from this table: ENUM, UUID, INTERVAL, JSON, BSON, BIT_PACKED (deprecated), LZ4 (deprecated), LZO.
| Engine | ≤2023 Features | 2024 Features | 2025 Features |
|---|---|---|---|
| Apache Arrow C++ | 18.0.0 (2024-10-28) | 18.0.0 (2024-10-28) | ❌ |
| Parquet Java | 1.14.1 (2024-07-16) | 1.14.1 (2024-07-16) | 1.16.0 (2025-09-03) |
| Apache Arrow Go | 18.4.0 (2025-07-21) | 18.4.0 (2025-07-21) | ❌ |
| Apache Arrow Rust | 52.2.0 (2024-07-28) | 53.0.0 (2024-08-31) | 57.0.0 (2025-10-19) |
| cuDF | 25.12.00 (2025-12-10) | ❌ | ❌ |
| Hyparquet | 1.23.0 (2025-12-10) | ❌ | ❌ |
| DuckDB | 1.4.0 (2025-09-16) | 1.4.0 (2025-09-16) | 1.4.0 (2025-09-16) |
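The table above can be used as a lookup when deciding which engines are able to read a file that uses features from a given year. The sketch below copies the version numbers from the table into a dictionary. The names `MIN_READ_VERSION` and `engines_that_can_read` are hypothetical, the key `2023` stands for the "≤2023" column, and `None` stands for a ❌ cell.

```python
from typing import Dict, Optional

# Minimum versions copied from the table above; None represents a ❌ cell.
MIN_READ_VERSION: Dict[str, Dict[int, Optional[str]]] = {
    "Apache Arrow C++": {2023: "18.0.0", 2024: "18.0.0", 2025: None},
    "Parquet Java": {2023: "1.14.1", 2024: "1.14.1", 2025: "1.16.0"},
    "Apache Arrow Go": {2023: "18.4.0", 2024: "18.4.0", 2025: None},
    "Apache Arrow Rust": {2023: "52.2.0", 2024: "53.0.0", 2025: "57.0.0"},
    "cuDF": {2023: "25.12.00", 2024: None, 2025: None},
    "Hyparquet": {2023: "1.23.0", 2024: None, 2025: None},
    "DuckDB": {2023: "1.4.0", 2024: "1.4.0", 2025: "1.4.0"},
}


def engines_that_can_read(year: int) -> Dict[str, str]:
    """Map each engine that has a listed version for `year` to that minimum version."""
    return {
        engine: versions[year]
        for engine, versions in MIN_READ_VERSION.items()
        if versions.get(year) is not None
    }
```

For instance, `engines_that_can_read(2025)` returns only the engines with a listed version in the 2025 column. The caveats in the notes above still apply, in particular that older releases may also work for features from 2023 and before.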