IF NOT EXISTS clause is not returning all columns, If the row already exists

[Question posted by a user on YugabyteDB Community Slack ]

If I use IF NOT EXISTS clause then YB is not returning all columns, if the row already exists. This is not the case with cassandra.

# YCQL
$ create table test.sample (ip inet, hr int, source text, source_id text, g map<int, int>, b map<int, int>, r map<int, int>, c map<int, blob>, primary key(ip, hr, source, source_id));
$ insert into sample(ip, hr, source, source_id, g, b, r,c) values('1.2.3.4', 456, 'alpha', 'a', {0:1}, {1:2},{2:1},{0: 0x12});
$ insert into sample(ip, hr, source, source_id, g, b, r,c) values('1.2.3.4', 456, 'alpha', 'a', {0:1}, {1:2},{2:1},{0: 0x12}) if not exists;
 [applied] | ip      | hr  | source | source_id
-----------+---------+-----+--------+-----------
     False | 1.2.3.4 | 456 |  alpha |         a
# cassandra
$ create table test.sample (ip inet, hr int, source text, source_id text, g map<int, int>, b map<int, int>, r map<int, int>, c map<int, blob>, primary key(ip, hr, source, source_id));
$ insert into sample(ip, hr, source, source_id, g, b, r,c) values('1.2.3.4', 456, 'alpha', 'a', {0:1}, {1:2},{2:1},{0: 0x12}) ;
$ insert into sample(ip, hr, source, source_id, g, b, r,c) values('1.2.3.4', 456, 'alpha', 'a', {0:1}, {1:2},{2:1},{0: 0x12}) if not exists;
 [applied] | ip      | hr  | source | source_id | b      | c         | g      | r
-----------+---------+-----+--------+-----------+--------+-----------+--------+--------
     False | 1.2.3.4 | 456 |  alpha |         a | {1: 2} | {0: 0x12} | {0: 1} | {2: 1}

It would have helpful to return all columns for IF NOT EXISTS clause, otherwise I need to read row again.

Currently we only return the primary key in this case as that fully identifies the row.
As a workaround, it is possible to use RETURN STATUS AS ROW clause.
This clause catches query errors and returns them as a status similar to the conditional case result set. It’s useful in particular for batches because it simply returns the status of each statement and allows the user to handle errors for one statement themselves (rather than abort the entire batch). But, since it returns the values for all columns, it might be useful here. For example:

ycqlsh:test> insert into sample(ip, hr, source, source_id, g, b, r,c) values('1.2.3.4', 456, 'alpha', 'a', {0:1}, {1:2},{2:1},{0: 0x12}) if not exists;
 [applied] | ip      | hr  | source | source_id
-----------+---------+-----+--------+-----------
     False | 1.2.3.4 | 456 |  alpha |         a
ycqlsh:test> insert into sample(ip, hr, source, source_id, g, b, r,c) values('1.2.3.4', 456, 'alpha', 'a', {0:1}, {1:2},{2:1},{0: 0x12}) if not exists RETURNS STATUS AS ROW;
 [applied] | [message] | ip      | hr  | source | source_id | g      | b      | r      | c
-----------+-----------+---------+-----+--------+-----------+--------+--------+--------+-----------
     False |      null | 1.2.3.4 | 456 |  alpha |         a | {0: 1} | {1: 2} | {2: 1} | {0: 0x12}

^ Above, with RETURNS STATUS AS ROW clause, you see that all columns are now returned.

There is an additional column message — if query had no error this will be null. If query had an error this will have the error message, applied will be false, and the rest of the columns will be null. The app layer would need to handle that case just in case the query fails with an error message.
See this closed issue and our docs for more info on this clause.

[credit @mihnea ]

1 Like