Hi, does Spryker have any kind of functionality/module that would be responsible for generating site

U01UN6E2VM4
U01UN6E2VM4 Posts: 5 🧑🏻‍🚀 - Cadet

Hi, does Spryker have any kind of functionality/module that would be responsible for generating sitemaps? Couldn't find anything so far 🤔

Comments

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet

    Nope, they don't. Just built our own solution on project level yesterday.
    Poor enough that this is missing from the core.

  • giovanni.piemontese
    giovanni.piemontese Technical Lead @ Löffelhardt Spryker Solution Partner Posts: 871 🧑🏻‍🚀 - Cadet

    It would be cool if you could share your solution or just describe it 🙂 That's also the main principle of a community, right?! 😉

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet

    Sure.
    The idea is to read all URLs from spy_url that are not redirects. Then use a package like https://packagist.org/packages/thepixeldeveloper/sitemap to create your XML out of it. All of this is done in Zed, of course.
    To get it into the frontend, you'll need to create a Yves controller and a client that reads the generated XML from Zed. Finally, the Yves controller will just return the XML response.
    Does that help to get you going?

    I'd love to provide my solution to Spryker, but I'm not having the time to build a module and clarify whether my client allows me to do so, unfortunately. You know how it is in everyday business...
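
The approach described above (take every non-redirect URL and build the sitemap XML in memory) can be sketched roughly as follows. The project solution in this thread lives in PHP inside Zed; this is only a language-neutral illustration in Python. The row shape (`url`, `is_redirect`) and the `base_url` parameter are assumptions made for the example, not the actual spy_url schema.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(rows, base_url):
    """Build a sitemap document in memory and return it as UTF-8 bytes.

    rows: iterable of dicts with 'url' and 'is_redirect' keys
          (hypothetical shape, standing in for rows read from spy_url).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for row in rows:
        # Redirects are skipped so only canonical URLs end up in the sitemap.
        if row["is_redirect"]:
            continue
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = base_url.rstrip("/") + row["url"]
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Because the function returns bytes instead of writing to a path, the caller decides where the result goes (a file, a key-value store, or straight into an HTTP response).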

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet

    If you don't have too much data, it should be fine to generate the sitemap on the fly any time somebody asks for it.
    Otherwise you might want to add some sort of custom caching to it.
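
One simple form of "custom caching" is a time-based cache in front of the generator: regenerate only when the stored copy is older than some TTL. This is a minimal sketch of that idea, not the caching used in the project described here. `CachedSitemap`, `generate`, and `ttl_seconds` are hypothetical names.

```python
import time


class CachedSitemap:
    """Serve a generated sitemap from memory until it is older than the TTL."""

    def __init__(self, generate, ttl_seconds, clock=time.monotonic):
        self._generate = generate      # callable returning the sitemap bytes
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the cache can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        # Regenerate on first use or once the cached copy has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._generate()
            self._expires_at = now + self._ttl
        return self._value
```

For a large catalog, regenerating inside a request is usually too slow, in which case a scheduled job that rewrites the stored file is the more common choice.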

  • U01AZ2MRQ9J
    U01AZ2MRQ9J Posts: 1 🧑🏻‍🚀 - Cadet

    @UPWG9AYH2 @U01TXCBF5SS I guess we can use this for our project as well 🙂

  • UKEP86J66
    UKEP86J66 Posts: 208 🧑🏻‍🚀 - Cadet

    This solution sounds like a great approach and I also find this is an important missing feature in core. Rather than developing something for Spryker you can always suggest it as an idea or support similar suggestions, eg https://spryker.ideas.aha.io/ideas/search?utf8=%E2%9C%93&query=sitemap

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet

    I'll try to clarify with my client whether we can contribute it somehow.
    FYI: Our solution (incl. some clever caching and renewal via Jenkins job) takes around 40 files. Would be happy to share and save others that bulk of work.

  • UKEP86J66
    UKEP86J66 Posts: 208 🧑🏻‍🚀 - Cadet

    👍 We also solved this on the project level and I think we are using the same library too 🙂
    Ours is fairly project specific but something more generic and shared would be useful for others.

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet
    edited June 2021

    There are other packages doing more or less the same thing, but some of them are forcing you to provide a target file path as a string. It would work, too, but I'd rather like to create the xml in memory and decide myself how to store it. For consistency I'm using the Spryker filesystem service for reading and writing the sitemap file.

  • U01UN6E2VM4
    U01UN6E2VM4 Posts: 5 🧑🏻‍🚀 - Cadet

    Thank you, @U01LKKBK97T 🙂

  • giovanni.piemontese
    giovanni.piemontese Technical Lead @ Löffelhardt Spryker Solution Partner Posts: 871 🧑🏻‍🚀 - Cadet

    @U01LKKBK97T does it also work for ca. 3 million products and ca. 1200 categories? Does it generate in bulk?

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet

    Remember that spy_url does not contain URLs for concrete products, since those are just abstract product URLs with query params. Hence we're not printing those attributed abstract URLs to the sitemap, to avoid duplicate content issues.

    If you really have 3 million abstract URLs, it might need some extra effort to read the URLs in chunks from the DB and stream them to the output file to avoid out-of-memory issues.

    Since Google has a limitation of max 50k items per sitemap file, you'd need to spread your urls across several files anyway. So 50k items should be doable without caring about memory.
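
A rough sketch of the chunking idea: consume the URLs lazily (for example from a database cursor) and start a new file every 50,000 entries, so only one chunk is held in memory at a time. `write_sitemap_files` and `path_template` are names made up for this example, and the plain `open()` call stands in for whatever storage layer the project uses.

```python
from itertools import islice
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file limit mentioned above


def chunked(urls, size=MAX_URLS_PER_FILE):
    """Yield lists of at most `size` items from any iterable, lazily."""
    it = iter(urls)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def write_sitemap_files(urls, path_template, size=MAX_URLS_PER_FILE):
    """Write one sitemap file per chunk and return the list of paths written.

    path_template: e.g. "sitemap-{}.xml" (hypothetical naming scheme).
    """
    paths = []
    for index, chunk in enumerate(chunked(urls, size), start=1):
        path = path_template.format(index)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            fh.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for url in chunk:
                # escape() protects against '&', '<' and '>' in URLs.
                fh.write(f"  <url><loc>{escape(url)}</loc></url>\n")
            fh.write("</urlset>\n")
        paths.append(path)
    return paths
```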

  • giovanni.piemontese
    giovanni.piemontese Technical Lead @ Löffelhardt Spryker Solution Partner Posts: 871 🧑🏻‍🚀 - Cadet

    Yes, I have ca. 3 million abstract + min. 3 million concrete... Your solution generates a sitemap XML file with references to the bulk XML files, right? Where did you place the configuration (I hope different for every entity type), such as frequency etc.?

  • U01LKKBK97T
    U01LKKBK97T Posts: 287 🧑🏻‍🚀 - Cadet
    edited June 2021

    In your case I'd suggest creating several XML files with max. 50k items each and an additional index file referencing all of them: https://developers.google.com/search/docs/advanced/sitemaps/large-sitemaps

    In our case, I've set priority to 0.5 and changefreq to daily for any url. According to Google, these values aren't used anymore, so I'm just using the same defaults for anything, just in case that other search engines will use it.
    https://developers.google.com/search/docs/advanced/sitemaps/build-sitemap?hl=en
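
To illustrate the last two points, here is a small sketch of an index file that references the individual sitemap files, plus a helper that writes a URL entry with the defaults mentioned above (changefreq daily, priority 0.5). The function names and example URLs are assumptions for illustration. The element names follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (UTF-8 bytes) listing each sitemap file URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


def add_url(urlset, loc, changefreq="daily", priority=0.5):
    """Append a <url> entry using the same defaults for every URL."""
    url_el = ET.SubElement(urlset, "url")
    ET.SubElement(url_el, "loc").text = loc
    ET.SubElement(url_el, "changefreq").text = changefreq
    ET.SubElement(url_el, "priority").text = f"{priority:.1f}"
    return url_el
```

If per-entity settings are wanted, the two keyword arguments could be looked up from a small mapping keyed by entity type before calling `add_url`.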