What are the Slack Archives?

It’s a history of our time together in the Slack Community! There’s a ton of knowledge in here, so feel free to search through the archives for a possible answer to your question.

Because this space is not active, you won’t be able to create a new post or comment here. If you have a question or want to start a discussion about something, head over to our categories and pick one to post in! You can always refer back to a post from Slack Archives if needed; just copy the link to use it as a reference..

Hello! In spryker package `spy_price_product_store` there is an index of `spy_price_product_store-i

2

Comments

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    Hmm, this would not be an option for our case, as we import all the time quite a lot of products.

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    So it would take forever to reindex that table.

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    And then P&S would bite you…

  • how does you bulk insert look like?

  • Alberto Reyer
    Alberto Reyer Posts: 690 πŸͺ - Explorer

    If you already use Postgres, have a look at the SQL based importers and use upsert statements.
    Especially Postgres is pretty good at rescheduling indices.

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    This is the query for one product

  • ah ok, you already use the CTE contruct

  • let me test this against my database

  • Unknown
    edited October 2019

    could take some time

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    Just test it when you have few millions of rows…

  • giovanni.piemontese
    giovanni.piemontese Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet

    we import 2,5Mio Prices with the same CTE in ca 20min...

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    With updates as well or only inserts?

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    We have both…

  • giovanni.piemontese
    giovanni.piemontese Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet

    i don't think that the problem is the index, because we have the same index and postgres

  • giovanni.piemontese
    giovanni.piemontese Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet

    insert and update.. both... price in our project is always fully import

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    Hmm, interesting πŸ˜•

  • Alberto Reyer
    Alberto Reyer Posts: 690 πŸͺ - Explorer

    Better use


    Upserts (the ON CONFLICT clause) should be faster then building two changesets, that was at least a performance boost I found for our CTE imports.

  • giovanni.piemontese
    giovanni.piemontese Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet

    I share with u our query:

        public function createPriceProductStoreSql(string $tableName, string $foreignKey, string $idProduct): string
        {
            $sql = sprintf("WITH records AS (
           SELECT
           input.id_store,
           input.id_currency,
           input.%3\$s,
           input.id_price_product,
           input.gross_price,
           input.net_price,
           spy_price_product_store.id_price_product_store as idPriceProductStore,
           input.price_data,
           input.price_data_checksum
        FROM (
               SELECT
                 unnest(?::INTEGER[]) AS id_store,
                 unnest(?::INTEGER[]) AS id_currency,
                 unnest(?::INTEGER[]) AS %3\$s,
                 unnest(?::INTEGER[]) AS id_price_product,
                 unnest(?::INTEGER[]) AS gross_price,
                 unnest(?::INTEGER[]) AS net_price,
                 json_array_elements_text(?) AS price_data,
                 unnest(?::VARCHAR[]) AS price_data_checksum
             ) input
                LEFT JOIN spy_price_product_store ON (
                    spy_price_product_store.fk_price_product = input.id_price_product AND
                    spy_price_product_store.fk_currency = id_currency AND
                    spy_price_product_store.fk_store = id_store
                )
        ),
          updated AS (
            UPDATE spy_price_product_store
            SET
              gross_price = records.gross_price,
              net_price = records.net_price,
              price_data = records.price_data,
              price_data_checksum = records.price_data_checksum
            FROM records
            WHERE spy_price_product_store.fk_price_product = records.id_price_product AND
                  spy_price_product_store.fk_store = records.id_store AND 
                  spy_price_product_store.fk_currency = records.id_currency
            RETURNING idPriceProductStore as id_price_product_store      
          ),
            inserted AS (
                INSERT INTO spy_price_product_store (
                  id_price_product_store,
                  fk_store,
                  fk_currency,
                  fk_price_product,
                  net_price,
                  gross_price,
                  price_data,
                  price_data_checksum
                ) (
                  SELECT
                    nextval('spy_price_product_store_pk_seq'),
                    id_store,
                    id_currency,
                    id_price_product,
                    net_price,
                    gross_price,
                    price_data,
                    price_data_checksum
                  FROM records
                  WHERE records.id_price_product IS NOT NULL AND records.idPriceProductStore IS NULL
          ) RETURNING id_price_product_store
      )
    SELECT updated.id_price_product_store FROM updated UNION ALL SELECT inserted.id_price_product_store FROM inserted;", $tableName, $foreignKey, $idProduct);
    
            return $sql;
        }
    

    smth is others in where condition in update query

  • please keep in mind: the upsert might bloat the sequence

  • Alberto Reyer
    Alberto Reyer Posts: 690 πŸͺ - Explorer

    please keep in mind: the upsert might bloat the sequence

    Not if you do an LEFT JOIN to get the already existing entry (if there is one) and use COALESCE(original_table.id_xyz, nextval('spy_original_table_pk_seq')) AS id_xyz

  • interesting approach

  • do you have some numbers on how this impacted performance?

  • Alberto Reyer
    Alberto Reyer Posts: 690 πŸͺ - Explorer

    @UJN2JRU4F This very much depends on what importer we had, but for one importer we achieved around 40% of performance increase (comparing the Spryer CTE approach and my approach above). But I do not have crunched numbers as we use the CTE approach only for a new data source which is in size and complexity not really comparable to the previous importers.

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet
    edited October 2019

    So, for the scale of data we have:




  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    We have 2 price types

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet
    edited October 2019

    And 8 currencies

    By doing basic calculations, we should have approx. (301591 + 47612) * 21 * 2 * 8= ~16.8 mil. rows. And at this size, it is very slow.

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    And that’s the worst case scenario.

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    Does anyone have same size and it runs fine?

  • Alberto Reyer
    Alberto Reyer Posts: 690 πŸͺ - Explorer

    If we would have 21 stores we would already have put them onto different databases

  • UNBSW8S8K
    UNBSW8S8K Posts: 128 πŸ§‘πŸ»β€πŸš€ - Cadet

    Well we didn’t cause Spryker proposed to not to do that…