Add support for PostgreSQL :: style casts #25259

Open
dain wants to merge 1 commit into master

Conversation

dain
Member

@dain dain commented Mar 9, 2025

Description

This PR proposes adding PostgreSQL-style double-colon :: casts to Trino. I find the standard SQL cast, CAST(something AS BIGINT), quite wordy and difficult to write compared to ::, especially when it is necessary to drop casts into existing queries (due to the strict nature of the SQL type system). I find being able to drop a simple ::type onto a symbol reference or function call to be super easy in PostgreSQL.

Fixes #23795

Release notes

(X) Release notes are required, with the following suggested text:

## Section
* Add support for PostgreSQL style `::` casts. ({issue}`25259`)

@dain dain requested a review from martint March 9, 2025 02:52
@cla-bot cla-bot bot added the cla-signed label Mar 9, 2025
@martint
Member

martint commented Mar 9, 2025

See #23795 (comment)

@dain
Member Author

dain commented Mar 9, 2025

> See #23795 (comment)

@martint that sounds great. It means that we have no concern that something useful will be used for this. I don't see us ever implementing that part of the spec, so we can use this operator for something awesome.

@martint
Member

martint commented Mar 12, 2025

> I don't see us ever implementing that part of the spec, so we can use this operator for something awesome.

On the contrary. It's something I can envision supporting. E.g., you could do integer::to_hex(...), which, among other things, could help to avoid some tricky function overloading scenarios.

There are a few options for handling ambiguities:

  • Fail the query when both a column::type and a type::method are valid candidates for a given invocation
  • Use a different operator instead of ::
  • Use a slightly different syntax (e.g., 123.as(varchar) instead of 123::varchar) and treat it like a method invocation on values, similar to https://www.markhneedham.com/blog/2024/08/25/duckdb-chaining-functions/

@wendigo
Contributor

wendigo commented Mar 12, 2025

What about the syntax 11 AS BIGINT or 11 AS TYPE BIGINT?

@electrum
Member

The ambiguity issue seems unlikely in practice, as you wouldn't declare a function with the same name as a type, except in the case where the function is an alias for a cast. For example, we support date(x) as a cast to date, so x::date would be the same, no matter which one is invoked.

The PostgreSQL cast syntax is well-understood by users and is quite convenient, so I vote to support it.

@electrum
Member

electrum commented Mar 12, 2025

We can resolve the ambiguity by allowing it when the function is also a cast (as in the case of date), and failing the query otherwise. It looks like date is the only such function that exists today.

FunctionMetadata can have a new cast flag to indicate that the function is an alias for a cast, and ScalarFromAnnotationsParser can set the flag when the function is also a cast operator, probably in ScalarHeader.fromAnnotatedElement().
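
A minimal Python sketch of the resolution rule described above, not Trino's actual code. The names FunctionInfo, is_cast_alias, and resolve_double_colon are hypothetical stand-ins for the proposed cast flag on FunctionMetadata. The sketch treats expr::name as a cast when name is a type and any function of the same name is flagged as a cast alias, and fails otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionInfo:
    """Hypothetical stand-in for FunctionMetadata with the proposed cast flag."""
    name: str
    is_cast_alias: bool = False


class AmbiguousCastError(Exception):
    """Raised when `expr::name` could be a cast or an unrelated function."""


def resolve_double_colon(name, type_names, functions):
    """Decide whether `expr::name` may be treated as a cast to `name`.

    type_names: set of known type names (lowercase)
    functions: dict mapping lowercase function name -> FunctionInfo
    """
    key = name.lower()
    if key not in type_names:
        raise LookupError(f"unknown type: {name}")
    function = functions.get(key)
    # No clash, or the function is just an alias for the cast: safe to cast.
    if function is None or function.is_cast_alias:
        return "cast"
    raise AmbiguousCastError(f"'{name}' is both a type and a non-cast function")


types = {"date", "varchar", "bigint"}
funcs = {"date": FunctionInfo("date", is_cast_alias=True)}
print(resolve_double_colon("DATE", types, funcs))     # date(x) aliases the cast
print(resolve_double_colon("varchar", types, funcs))  # no function clash
```

A function that shares a type's name without being a cast alias raises AmbiguousCastError, which corresponds to failing the query.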

@ebyhr
Member

ebyhr commented Mar 14, 2025

The current behavior with literals is ambiguous to me. For instance, at a glance I expected the following statement to return a timestamp type, but it returns varchar.

SELECT TIMESTAMP '2007-08-09 9:10:11 Europe/Berlin'::VARCHAR;

@dain
Member Author

dain commented Mar 18, 2025

> The current behavior with literals is ambiguous to me. For instance, at a glance I expected the following statement to return a timestamp type, but it returns varchar.
>
> SELECT TIMESTAMP '2007-08-09 9:10:11 Europe/Berlin'::VARCHAR;

I don't think that is ambiguous. The first part of the expression, TIMESTAMP '2007-08-09 9:10:11 Europe/Berlin', is a type constructor in the grammar:

    | identifier string                                                                   #typeConstructor

Type constructors do not support an expression after the type name, and only support a literal string. Then the second part casts the timestamp to a varchar.

@electrum
Member

It is the same as if you wrote it with parentheses:

SELECT (TIMESTAMP '2007-08-09 9:10:11 Europe/Berlin')::VARCHAR;

The syntax for type constructors is special because they only support literals, not expressions. But I agree that in this instance it appears confusing, as the cast operator goes left-to-right instead of the typical inside-out for function calls. Visually, it's also confusing because the type constructor has a space inside of it, but there's no separation between the string literal and the cast operator.

It's good to call out this edge case in the SQL syntax, but I don't have a good solution, other than mentioning it in the documentation. PostgreSQL should have the same issue.

@dain
Member Author

dain commented Mar 18, 2025

I expect people won't mix the syntax, and instead would do this:

'2007-08-09 9:10:11 Europe/Berlin'::TIMESTAMP

So the original expression would be:

'2007-08-09 9:10:11 Europe/Berlin'::TIMESTAMP::VARCHAR
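
To make the left-to-right reading concrete, here is a small Python sketch that rewrites a chain of :: casts into nested standard CAST calls. It assumes each :: is equivalent to a standard CAST, as the PR description implies; the helper name nest_casts is made up for illustration and is not part of Trino.

```python
def nest_casts(operand, type_names):
    """Wrap `operand` in one CAST per type, applying the types left to right."""
    expression = operand
    for type_name in type_names:
        expression = f"CAST({expression} AS {type_name})"
    return expression


# '2007-08-09 9:10:11 Europe/Berlin'::TIMESTAMP::VARCHAR
sql = nest_casts("'2007-08-09 9:10:11 Europe/Berlin'", ["TIMESTAMP", "VARCHAR"])
print(sql)
# CAST(CAST('2007-08-09 9:10:11 Europe/Berlin' AS TIMESTAMP) AS VARCHAR)
```

The leftmost type ends up innermost, which is the reverse of how nested function calls are usually read.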

Development

Successfully merging this pull request may close these issues.

Cast operator ::
5 participants