Commit a170014

authored Jan 26, 2024
Use smallest data type for partition column in P2P (#8479)
1 parent e30a3e4 commit a170014

1 file changed, 5 additions and 1 deletion

distributed/shuffle/_arrow.py

@@ -102,7 +102,11 @@ def _create_input_partition_id_array(
     table: pa.Table, input_partition_id: int
 ) -> pa.ChunkedArray:
     arrays = (
-        np.full((batch.num_rows,), input_partition_id)
+        np.full(
+            (batch.num_rows,),
+            input_partition_id,
+            dtype=np.uint32(),
+        )
         for batch in table.to_batches()
     )
     return pa.chunked_array(arrays)
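
A minimal sketch of the effect of this change, using NumPy only. The row count and partition ID are hypothetical values chosen for illustration, and the sketch passes the type `np.uint32` rather than an instance. Without an explicit `dtype`, `np.full` infers the element type from the fill value, which for a Python `int` is usually a 64-bit integer (platform and NumPy version dependent); with `uint32` each element takes 4 bytes.

```python
import numpy as np

num_rows = 1_000_000      # hypothetical batch size, not taken from the commit
input_partition_id = 7    # hypothetical partition ID

# Before the change: dtype is inferred from the Python int fill value
# (typically int64, so 8 bytes per row).
default_ids = np.full((num_rows,), input_partition_id)

# After the change: an explicit unsigned 32-bit dtype (4 bytes per row).
compact_ids = np.full((num_rows,), input_partition_id, dtype=np.uint32)

print(default_ids.dtype, default_ids.nbytes)
print(compact_ids.dtype, compact_ids.nbytes)

# uint32 can only hold partition IDs in the range 0 .. 2**32 - 1.
assert np.iinfo(np.uint32).max == 2**32 - 1
```

The trade-off is the value range: a partition ID must fit in an unsigned 32-bit integer for the smaller column to be safe.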
