What are the Slack Archives?

It's a history of our time together in the Slack Community! There's a ton of knowledge in here, so feel free to search through the archives for a possible answer to your question.

Because this space is not active, you won't be able to create a new post or comment here. If you have a question or want to start a discussion about something, head over to our categories and pick one to post in! You can always refer back to a post from Slack Archives if needed; just copy the link to use it as a reference.

Hello Spryker-Developers, we are hitting a problem with the Publish&Sync mechanism regarding updating the Elasticsearch index

U01CRKGV3GW · Posts: 8 · 🧑🏻‍🚀 - Cadet

Hello Spryker-Developers,

we are hitting a problem with the Publish & Sync mechanism regarding updating the Elasticsearch index.
To be specific, the ProductAbstractPagePublisher is our concern. We have customized the query a little, but
even in the vendor code a big SQL query is constructed that joins across a lot of tables, which takes time and memory.
Propel seems to be the bottleneck for reading from and writing into the shadow tables.
Updating all our products in the index takes about 10 hours. With the (now deprecated) Collector it
was way faster back then.

Do you have any recommendations to tackle this problem? What are the biggest levers to speed this up?
I have seen somewhere that you use Postgres CTEs for writing into the DB in the Demo Shop/Suite, afaik.
What else can we do? 🙂

Thank you in advance!

Comments

  • Andriy Netseplyayev (Domain Lead Solution Architecture, Sprykee) · Posts: 519 · 🧑🏻‍🚀 - Cadet
    edited November 2021

    Hey Ben,
    Have you tried any APM tools (e.g. New Relic) to analyse the issue? Curious how you concluded that Propel (not the DB / the query) is the bottleneck, and what exactly “eats” most of the time.
    Also, do you have any hardware resource monitoring that could indicate a lack of memory/processor power etc.?

  • Andriy Netseplyayev (Domain Lead Solution Architecture, Sprykee) · Posts: 519 · 🧑🏻‍🚀 - Cadet

    The solutions you've mentioned (CTEs for bulk select/insert operations, getting rid of Propel in favour of native queries) could all be good solutions, as long as you are sure you actually need them.
    How many products are you processing?

  • U01CRKGV3GW · Posts: 8 · 🧑🏻‍🚀 - Cadet

    Hello Andriy, I found out via local Xdebug profiling and logging on production that not only the query building but also Propel's array formatting takes time. I fired the raw query Propel produces and it was about 10x faster.
    In our case, optimizing DB reads is a bigger lever than optimizing writes (reading takes more time). I optimized the query by using the raw SQL (which Propel builds) together with a custom formatter that rebuilds Propel's array structure in place of the generic ArrayFormatter, because the vendor code relies on that structure. Since we know how the result should be formatted, the custom solution is faster. This saved about 80% of the time there (i.e. 1s instead of 5s).
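
    Roughly, the pattern looks like this (a minimal sketch, not our actual code; the query, table names, and the target array shape are simplified assumptions):

    ```php
    <?php
    // Minimal sketch: run the SQL that Propel would build directly on the
    // connection and re-shape the flat rows ourselves instead of going
    // through Propel's generic ArrayFormatter. Names are simplified.

    use Propel\Runtime\Propel;

    /** @var int[] $productAbstractIds ids of the current message chunk */
    $placeholders = rtrim(str_repeat('?,', count($productAbstractIds)), ',');

    $sql = sprintf(
        'SELECT pa.id_product_abstract, pal.fk_locale, pal.name
         FROM spy_product_abstract pa
         JOIN spy_product_abstract_localized_attributes pal
           ON pal.fk_product_abstract = pa.id_product_abstract
         WHERE pa.id_product_abstract IN (%s)',
        $placeholders
    );

    $connection = Propel::getConnection();
    $statement = $connection->prepare($sql);
    $statement->execute($productAbstractIds);

    // Custom "formatter": rebuild only the nested structure the publisher
    // actually consumes, which is cheaper than the generic ArrayFormatter.
    $formatted = [];
    foreach ($statement->fetchAll(\PDO::FETCH_ASSOC) as $row) {
        $formatted[$row['id_product_abstract']]['localizedAttributes'][$row['fk_locale']] = [
            'name' => $row['name'],
        ];
    }
    ```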
    But this was not bringing the speed that I hoped for... so I found out that on every CRON run, or let's say per message chunk (i.e. 500 messages), the category tree data we need in the index (parent IDs, parent category names) has to be reloaded from the DB and is only cached for that chunk.
    This leads to nearly a one-minute break after each chunk. I have not checked yet why it takes that long, but I built a Storage/Redis cache instead, where we can hopefully get the category data much faster. The downside is that cache data can become stale and has to be invalidated in the right places.
    I will now check if that gives us a better throughput in message acks.
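
    The cache idea in a nutshell (a sketch using phpredis directly; in the project it would sit behind the Storage client, and the key name and TTL are placeholders):

    ```php
    <?php
    // Sketch of a chunk-spanning category-tree cache. Key and TTL are
    // placeholders; the real version must also be invalidated whenever
    // category data changes, as mentioned above.

    const CATEGORY_TREE_KEY = 'pyz:category-tree'; // hypothetical key

    function getCategoryTree(\Redis $redis, callable $loadFromDb): array
    {
        $cached = $redis->get(CATEGORY_TREE_KEY);
        if ($cached !== false) {
            return json_decode($cached, true); // cache hit: skip the expensive query
        }

        $categoryTree = $loadFromDb(); // the slow per-chunk DB load from before
        $redis->setex(CATEGORY_TREE_KEY, 3600, json_encode($categoryTree));

        return $categoryTree;
    }
    ```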
    Next thing could be to use CTEs for Insert/Update.

    Why do you use CTEs there in Demo/Suite?

    If we trigger all our abstract products, we have around 100,000 messages in the queue which have to be processed (= number of products).

  • Andriy Netseplyayev (Domain Lead Solution Architecture, Sprykee) · Posts: 519 · 🧑🏻‍🚀 - Cadet

    very good research though 👍

    Why do you use CTEs there in Demo/Suite?

    I think it should be explained here: https://docs.spryker.com/docs/scos/dev/data-import/202108.0/data-importer-speed-optimization.html#pamps-and-cte

    btw, have you seen that page? You can find some more improvement ideas there as well
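
    In short, the CTE trick batches a whole chunk into one statement: update the rows that exist, insert the rest, all in one round trip. A minimal sketch (table and column names are illustrative, not the real Spryker schema):

    ```php
    <?php
    // Minimal sketch of the Postgres CTE bulk-upsert pattern described in
    // the linked docs: one statement updates existing rows and inserts the
    // missing ones. Table/column names are illustrative only.

    use Propel\Runtime\Propel;

    $sql = <<<'SQL'
    WITH records (sku, data) AS (
        VALUES ('sku-1', '{"name": "Product 1"}'),
               ('sku-2', '{"name": "Product 2"}')
    ),
    updated AS (
        UPDATE spy_product_abstract_storage s
        SET data = r.data
        FROM records r
        WHERE s.sku = r.sku
        RETURNING s.sku
    )
    INSERT INTO spy_product_abstract_storage (sku, data)
    SELECT r.sku, r.data
    FROM records r
    WHERE r.sku NOT IN (SELECT sku FROM updated)
    SQL;

    Propel::getConnection()->exec($sql);
    ```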

  • U01CRKGV3GW · Posts: 8 · 🧑🏻‍🚀 - Cadet

    The Redis-cached solution improved our message rate from about 4,000/10 min to around 16,000/10 min. This is pretty good, because processing all messages in 2.5 hours sounds way better than 10 hours 😃

    I'll check your links later. Thanks!