You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
### Rationale for this change
Closes: apache#39456
### What changes are included in this PR?
Update physical and logical type mapping from Arrow to Parquet for DATE64 type
### Are these changes tested?
Yes,
- Update expected schema mapping in existing test
- Tests asserting new behavior
- Arrow DATE64 will roundtrip -> Parquet -> Arrow as DATE32
- Arrow DATE64 _not aligned_ to exact date boundary will truncate to milliseconds at boundary of greatest full day on Parquet roundtrip
### Are there any user-facing changes?
Yes, users of `pqarrow.FileWriter` will produce Parquet files containing `DATE` logical type instead of `TIMESTAMP[ms]` when writing Arrow data containing DATE64 field(s). The proposed implementation truncates `int64` values to be divisible by 86400000 rather than validating that this is already the case, as some implementations do. I am happy to add this validation if it would be preferred, but the truncating behavior will likely break fewer existing users.
I'm not sure whether this is technically considered a breaking change to a public API and if/how it should be communicated. Any direction regarding this would be appreciated.
* Closes: apache#39456
Authored-by: Joel Lubinitsky <joel@cherre.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
0 commit comments