What are the Slack Archives?

It's a history of our time together in the Slack Community! There's a ton of knowledge in here, so feel free to search through the archives for a possible answer to your question.

Because this space is not active, you won't be able to create a new post or comment here. If you have a question or want to start a discussion about something, head over to our categories and pick one to post in! You can always refer back to a post from the Slack Archives if needed; just copy the link to use it as a reference.

Hello Spryker-Developers, we are hitting a problem with the Publish&Sync mechanism regarding updating the Elasticsearch index

U01CRKGV3GW Posts: 8 🧑🏻‍🚀 - Cadet

Hello Spryker-Developers,

we are hitting a problem with the Publish&Sync mechanism when updating the Elasticsearch index.
To be specific, the ProductAbstractPagePublisher is our concern. We have customized the query a little, but even in vendor code a big SQL query is constructed that joins over a lot of tables, which costs time and memory.
Propel seems to be the bottleneck for reading from and writing into the shadow tables.
Updating all our products in the index takes about 10 hours. With the Collector (the deprecated solution) it was way faster back then.

Do you have any recommendations to tackle this problem? What are the biggest levers to speed this up?
I have seen somewhere that you use Postgres CTEs for writing into the DB in the Demo-Shop/Suite, afaik.
What else can we do? 🙂

Thank you in advance!

Comments

  • Andriy Netseplyayev (Sprykee) Posts: 519 🧑🏻‍🚀 - Cadet
    edited November 2021

    Hey Ben,
    Have you tried any APM tools (e.g. New Relic) to analyse the issue? I'm curious how you concluded that Propel (not the DB / the query itself) is the bottleneck, and what exactly "eats" most of the time.
    Also, do you have any hardware resource monitoring that could indicate a lack of memory, processor power, etc.?

  • Andriy Netseplyayev (Sprykee) Posts: 519 🧑🏻‍🚀 - Cadet

    The solutions you've mentioned - CTEs for bulk select/insert operations, or getting rid of Propel in favour of native queries - could all be good solutions, as long as you are sure you actually need them.
    How many products are you processing?

  • U01CRKGV3GW Posts: 8 🧑🏻‍🚀 - Cadet

    Hello Andriy, I found out via local Xdebug profiling and logging in production that not only the query building but also Propel's array formatting takes time. I fired the raw query Propel produces directly, and it was about 10x faster.
    In our case, optimizing DB reading is a bigger lever than writing (it takes more time). I optimized the query by using the raw SQL (which Propel builds) together with a custom formatter that rebuilds the array structure of Propel's generic ArrayFormatter, because the vendor code relies on that structure. Since we know exactly how the result should be formatted, the custom solution is faster; this saved about 80% of the time there (i.e. 1s instead of 5s). A rough sketch follows after this comment.
    But this did not bring the speed I hoped for... so I found out that on every cron run, or rather every message chunk (e.g. 500 messages), the category tree data we need in the index (parent IDs, parent category names) has to be reloaded from the DB and is only cached for that one chunk.
    That leads to a break of nearly 1 minute after each chunk. I have not checked yet why it takes that long, but I built a Storage/Redis cache instead (second sketch after this comment), where we can hopefully get the category data much faster. The downside is that cached data can become stale and has to be invalidated in the right places.
    I will now check if that gives us a better throughput in message acks.
    Next thing could be to use CTEs for Insert/Update.

    Why do you use CTEs there in Demo/Suite?

    If we trigger all our abstract products, we have around 100,000 messages in the queue that have to be processed (one message per product).
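
    A minimal sketch of the raw-SQL-plus-custom-formatter idea from this comment, assuming a Spryker/Propel setup. The query, table aliases, and array keys are illustrative assumptions, not the actual vendor query or structure:

        <?php

        use Propel\Runtime\Propel;

        $connection = Propel::getConnection();

        // Fire the SQL directly instead of letting Propel build and hydrate it.
        // (Toy query; the real publisher query joins many more tables.)
        $statement = $connection->prepare(
            'SELECT pa.id_product_abstract, pa.sku, pal.name
             FROM spy_product_abstract pa
             JOIN spy_product_abstract_localized_attributes pal
               ON pal.fk_product_abstract = pa.id_product_abstract
             WHERE pa.id_product_abstract IN (1, 2, 3)'
        );
        $statement->execute();

        // Custom "formatter": rebuild only the nested array shape the vendor
        // code relies on. The key names here are assumptions for illustration.
        $formatted = [];
        foreach ($statement->fetchAll(\PDO::FETCH_ASSOC) as $row) {
            $formatted[(int)$row['id_product_abstract']] = [
                'IdProductAbstract' => (int)$row['id_product_abstract'],
                'Sku' => $row['sku'],
                'LocalizedAttributes' => [['Name' => $row['name']]],
            ];
        }

    Because the target shape is known up front, the loop above skips the generic per-row hydration work that Propel's ArrayFormatter would otherwise do.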
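
    And a rough sketch of the Storage/Redis cache for the category tree data, assuming Spryker's storage client is available; the key name, the load callback, and the invalidation hook are hypothetical:

        <?php

        use Spryker\Client\Storage\StorageClientInterface;

        class CategoryTreeCache
        {
            private const STORAGE_KEY = 'publisher:category-tree';

            public function __construct(private StorageClientInterface $storageClient)
            {
            }

            // Returns parent ids / parent category names per category, loading
            // from the DB only on a cache miss instead of once per message chunk.
            public function getCategoryTree(callable $loadFromDb): array
            {
                $tree = $this->storageClient->get(self::STORAGE_KEY);
                if ($tree !== null) {
                    return $tree;
                }

                $tree = $loadFromDb();
                $this->storageClient->set(self::STORAGE_KEY, $tree);

                return $tree;
            }

            // Must run whenever categories change; otherwise the publisher
            // indexes stale parent ids/names (the downside mentioned above).
            public function invalidate(): void
            {
                $this->storageClient->delete(self::STORAGE_KEY);
            }
        }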

  • Andriy Netseplyayev (Sprykee) Posts: 519 🧑🏻‍🚀 - Cadet

    very good research though 👍

    Why do you use CTEs there in Demo/Suite?

    I think it should be explained here: https://docs.spryker.com/docs/scos/dev/data-import/202108.0/data-importer-speed-optimization.html#pamps-and-cte

    btw, have you seen that page? You can find some more improvement ideas there as well. (A rough CTE sketch follows below.)
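
    For reference, a hedged sketch of what such a CTE-based bulk upsert can look like on Postgres, in the spirit of the PAMPS/CTE section in the linked docs; table and column names are illustrative, not Spryker's actual schema:

        <?php

        use Propel\Runtime\Propel;

        // One round trip: update existing rows, insert the missing ones.
        // Requires Postgres 10+, where parallel unnest() calls are zipped row-wise.
        $sql = <<<'SQL'
        WITH records AS (
            SELECT
                unnest(CAST(? AS integer[])) AS fk_product_abstract,
                unnest(CAST(? AS text[]))    AS data
        ),
        updated AS (
            UPDATE spy_product_abstract_page_search s
            SET data = r.data
            FROM records r
            WHERE s.fk_product_abstract = r.fk_product_abstract
            RETURNING s.fk_product_abstract
        )
        INSERT INTO spy_product_abstract_page_search (fk_product_abstract, data)
        SELECT r.fk_product_abstract, r.data
        FROM records r
        WHERE r.fk_product_abstract NOT IN (SELECT fk_product_abstract FROM updated)
        SQL;

        $statement = Propel::getConnection()->prepare($sql);
        $statement->execute([
            '{1,2,3}',           // integer[] literal: product ids
            '{"d1","d2","d3"}',  // text[] literal: serialized documents
        ]);

    Passing whole arrays per statement keeps the row count per round trip high, which is the main win over per-row Propel writes.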

  • U01CRKGV3GW Posts: 8 🧑🏻‍🚀 - Cadet

    The Redis-cached solution improved our message rate from about 4,000/10 min to around 16,000/10 min. That is pretty good, because all messages in 2.5 hours sounds way better than 10 hours 😃

    I'll check your links later. Thanks!