What are the Slack Archives?

It’s a history of our time together in the Slack Community! There’s a ton of knowledge in here, so feel free to search through the archives for a possible answer to your question.

Because this space is not active, you won’t be able to create a new post or comment here. If you have a question or want to start a discussion about something, head over to our categories and pick one to post in! You can always refer back to a post from Slack Archives if needed; just copy the link to use it as a reference..

Hey at all! In it is recomm

Posts: 51 πŸ§‘πŸ»β€πŸš€ - Cadet

Hey at all! In https://documentation.spryker.com/docs/dataimporter-speed-optimization it is recommended to use common table expressions (CTE) for inserting or updating big amounts of data via data importer and there is a hint to check examples in the demo shops. However, i can not find anything in the demo shops regarding this.
Is there some more documentation for using cte in importers?

Welcome!

It looks like you're new here. Sign in or register to get started.

Comments

  • Lead Spryker Solution Architect / Technical Director Posts: 690 πŸͺ - Explorer

    Here is an example: https://github.com/spryker-shop/suite/blob/master/src/Pyz/Zed/DataImport/Business/Model/ProductPrice/Writer/Sql/ProductPriceSql.php

    But they are not really optimized, instead of having a subset for update and one for insert and upsert (on conflict (...) do update) in combination with COALESECE(table.id_field, nextval(id_sequence)) will give your double the speed in most cases were your CTE execeds a few KB (depends on postgres settings), because if the CTE is bigger than the configured amount it will be persisted to disc in between which slows down the whole process).

  • Posts: 51 πŸ§‘πŸ»β€πŸš€ - Cadet

    Perfect! Thank you very much!

  • Posts: 51 πŸ§‘πŸ»β€πŸš€ - Cadet
  • Lead Spryker Solution Architect / Technical Director Posts: 690 πŸͺ - Explorer

    @UPPB2H525 Feel free to write me a PN (or even better ask here), I did a lot of work in this area in our project to speed up the import process (~300k articles, 1200 categories, 4 category trees, including all data are imported in <4h since we optimized the importers).

    @UJN2JRU4F Might be able to provide you the video capture of one of the user group meetups last year were I gave a few insides how we improved our importers.

  • need to check whether we already cleaned it up

  • Posts: 51 πŸ§‘πŸ»β€πŸš€ - Cadet

    @UL6DGRULR wow, thank you i appreciate this very much! I will get back to you if I face any problems!

  • Posts: 111 πŸ§‘πŸ»β€πŸš€ - Cadet

    waiting for this video for quite some time now πŸ˜‰

  • Technical Lead @ LΓΆffelhardt Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet

    Hi, we have import for ~2,8Mio Product Abstract, ~2,8Mio Product Concrete, ~2,8Mio SalesUnit, ~4Mio Product Image, ~2,8Mio Product Quantity, ~1000 Categories, ~3Mio Product Stock etc etc... and with CTE (PostgreSql) we get a whole import in DB in circa. 6Std (Insert, Update in less than 30 mins πŸ˜‰ ).

    There is a lot of things to optimize to get this time.
    And for all regarding the events that importer have to trigger (only if necessary, otherwise the shop is always on P+S that has no sense).

    If u need some tips or help just write/ask here (not PN -> some information should be shared with everyone). I will try to provide u any help that u need.

  • Technical Lead @ LΓΆffelhardt Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet
    because if the CTE is bigger than the configured amount it will be persisted to disc in between which slows down the whole process).
    

    @UL6DGRULR how did u calculated this value? or better how can i check this case? Thank u.

  • Lead Spryker Solution Architect / Technical Director Posts: 690 πŸͺ - Explorer

    We didn’t calculated it, in a previous job one of our BI team told me that big CTE’s comes with that cost, so I just tried with this simple trick if it will speed up the importer and it did.
    https://hakibenita.com/be-careful-with-cte-in-postgre-sql

    But it seems that this has changed with Postgres >= 11 (this is also new to me)

  • Technical Lead @ LΓΆffelhardt Spryker Solution Partner Posts: 871 πŸ§‘πŸ»β€πŸš€ - Cadet

    Thank you for the reply. Ok we have psql 12 and as seen in >11 they have smth optimised

  • Posts: 51 πŸ§‘πŸ»β€πŸš€ - Cadet

    Hey @UL6DGRULR! Do you have any experience regarding CTE with MariaDB 10.5 in Spryker?

  • Lead Spryker Solution Architect / Technical Director Posts: 690 πŸͺ - Explorer

    Sorry I missed your question as I was on vacation. But sadly I haven’t had any contact with CTE’s on MariaDB so far as we use Postgres in our project.

  • Posts: 51 πŸ§‘πŸ»β€πŸš€ - Cadet

    Thank you

Welcome!

It looks like you're new here. Sign in or register to get started.