Optimizing YugabyteDB COPY For Large Data Loads
YugabyteDB COPY performs intermediate commits every 20000 rows by default. If the statement fails before the end, the batches already committed stay in the table, leaving the data partially loaded.
The commit interval comes from the yb_default_copy_from_rows_per_transaction parameter, shown here along with yb_disable_transactional_writes:
yugabyte=# show yb_default_copy_from_rows_per_transaction;
yb_default_copy_from_rows_per_transaction
-------------------------------------------
20000
(1 row)
yugabyte=# show yb_disable_transactional_writes;
yb_disable_transactional_writes
---------------------------------
on
(1 row)
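As a minimal sketch of how the commit interval can be changed, the statements below adjust it for the session or for a single COPY. The file path /tmp/loaded_data.csv is hypothetical, and the meaning of 0 (no intermediate commits, so the whole COPY runs as one transaction) is my understanding of the setting; check it against the documentation for your YugabyteDB version.

```sql
-- Session level: change the number of rows per intermediate commit.
-- Assumption: 0 disables intermediate commits (one transaction for the whole COPY).
set yb_default_copy_from_rows_per_transaction = 0;

-- Statement level: the same interval can be given as a COPY option.
-- The file path is hypothetical; the target table is expected to exist already.
copy loaded_data (data) from '/tmp/loaded_data.csv'
  with (format csv, rows_per_transaction 0);

-- Return to the default interval for the rest of the session.
reset yb_default_copy_from_rows_per_transaction;
```

A single large transaction gives all-or-nothing behavior but holds more uncommitted work in flight, which is the trade-off the default batching is meant to avoid.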
Let's take an example with the following table:
yugabyte=# create table loaded_data ( id bigserial, data text );
CREATE TABLE
I set statement_timeout to 5 seconds to simulate a failure...