Understanding how store URL indexing works
To be accessible to users, every website page needs its own address, called a URL.
However, a page can have more than one address format, which lets users reach it from different URLs, such as www.storetheme.com.br and storetheme.com.br.
The two URLs in the example above are distinct and are therefore indexed separately by search engines, even though they lead to the same page.
To solve this issue, each page needs a canonical URL. This is the official URL, not necessarily identical to the one the user accessed, that is given to search engines and indexed, which improves the site's overall content findability.
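As an illustration of the idea, the sketch below maps both address formats from the example to a single canonical URL. It assumes the store treats the www host as its official address; the `CANONICAL_HOST` and `ALIASES` names and the `canonical_url` helper are hypothetical and are not part of VTEX IO.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the "www" host is the store's official address.
CANONICAL_HOST = "www.storetheme.com.br"
# Hosts that serve the same pages and should collapse into the canonical one.
ALIASES = {"storetheme.com.br", "www.storetheme.com.br"}


def canonical_url(url: str) -> str:
    """Return the official form of a URL that may use an alias host."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in ALIASES:
        host = CANONICAL_HOST
    # Keep scheme, path and query; drop the fragment, which search engines ignore.
    return urlunsplit((parts.scheme, host, parts.path or "/", parts.query, ""))


print(canonical_url("https://storetheme.com.br/shoes"))
print(canonical_url("https://www.storetheme.com.br/shoes"))
```

Both calls print the same address, so a search engine that is given only canonical URLs sees one page instead of two.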
URL indexing requirements
In VTEX IO, all store canonical URLs (product, category, subcategory, brand and custom page URLs) are automatically stored in the Sitemap, where search engines index them.
As the name suggests, the Sitemap is a map of all canonical URLs, and it exposes your site's architecture to search bots and user browsers.
To ensure that every relevant URL from your site is indexed, VTEX IO also identifies the most visited search URLs and converts them into canonical URLs, which are then added to the Sitemap.
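VTEX IO generates the Sitemap automatically, so nothing needs to be written by hand. Purely to show what such a file contains, the sketch below builds a minimal document in the public sitemaps.org XML format from a list of canonical URLs. The `build_sitemap` helper and the sample URLs are hypothetical, and the output is not claimed to match what the platform actually emits.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the public sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Serialize canonical URLs as a minimal <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical canonical URLs for a product page and a category page.
xml_text = build_sitemap([
    "https://www.storetheme.com.br/blue-shirt/p",
    "https://www.storetheme.com.br/clothing",
])
print(xml_text)
```

Each `<url>` entry carries one canonical address in its `<loc>` element, which is what search bots read when they crawl the file.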
Most visited search URLs
To define the most visited search URLs, the platform counts the visits to each of these site addresses for as long as the site is online.
By default, a URL needs to have been accessed at least once to enter the list of most visited site addresses.
The algorithm that counts the accesses to each URL, and decides whether it joins the list of most visited addresses, follows these criteria:
- Any type of access is valid, including those stemming from workspaces or from the myvtex.com.br domain;
- URLs that are accessed from results returned by the store's search bar are not taken into account.
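The two criteria above can be sketched as a simple counting routine. This is only an illustration under stated assumptions: the `Access` record, its `from_search_bar` flag, and the workspace host name are hypothetical, and the platform's real algorithm is not documented here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Access:
    url: str
    host: str
    # Hypothetical flag: True when the visit came from a search-bar result.
    from_search_bar: bool = False


def count_accesses(accesses):
    """Count visits per URL, skipping those that came from the search bar."""
    counts = Counter()
    for access in accesses:
        if access.from_search_bar:
            continue  # criterion 2: search-bar results are not counted
        # criterion 1: any host counts, including workspaces and myvtex.com.br
        counts[access.url] += 1
    return counts


accesses = [
    Access("/shoes", "www.storetheme.com.br"),
    Access("/shoes", "dev--storetheme.myvtex.com.br"),  # workspace visit, still valid
    Access("/shoes", "www.storetheme.com.br", from_search_bar=True),  # ignored
    Access("/hats", "www.storetheme.com.br"),
]
counts = count_accesses(accesses)
print(counts["/shoes"], counts["/hats"])  # prints: 2 1
```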
Once on the most visited list, URLs get their own canonical URL and are then added to Sitemap, where they're indexed by search engines.
A URL can be removed from the Sitemap listing and no longer be indexed if:
- It returns a 404 status code;
- It has lost relevance (is no longer accessed) and, consequently, has fallen off the list of most visited site URLs.
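The two removal conditions above can be expressed as a filter over the search URLs currently listed. The sketch below is a hedged illustration: `prune_search_urls`, the `status_by_url` mapping, and the sample data are hypothetical, and how the platform actually checks status codes is not described in this article.

```python
def prune_search_urls(listed_urls, status_by_url, most_visited):
    """Keep only search URLs that still qualify for the Sitemap."""
    kept = []
    for url in listed_urls:
        if status_by_url.get(url) == 404:
            continue  # the page no longer exists
        if url not in most_visited:
            continue  # the URL fell off the most visited list
        kept.append(url)
    return kept


listed = ["/shoes", "/hats", "/scarves"]
statuses = {"/shoes": 200, "/hats": 404, "/scarves": 200}
still_popular = {"/shoes", "/hats"}
print(prune_search_urls(listed, statuses, still_popular))  # prints: ['/shoes']
```

Here `/hats` is dropped because it returns 404, and `/scarves` is dropped because it is no longer among the most visited URLs.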
Managing the most visited search URLs list
It's not possible to control which search URLs will enter the most visited list, since this is based on user activity.
What you can do is control the maximum number of most visited URLs that your site will index, by following the instructions below:
- Log in to the admin of the desired VTEX account;
- Access the Apps section, located in the admin sidebar;
- Select the Store Indexer;
- In the Setup section, fill in the Number of search URLs to be indexed field with the desired value;
- Save the configurations.
The Store Indexer app, which is responsible for indexing your site's main URLs, builds the list of most visited site URLs according to how frequently those pages are accessed and the maximum number of URLs you configured in the steps above.
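Conceptually, building this list amounts to ranking URLs by access count and keeping the top entries up to the configured limit. The sketch below assumes that behavior; the `most_visited` helper and its parameters are hypothetical stand-ins, with `max_urls` playing the role of the Number of search URLs to be indexed field and `min_accesses` reflecting the default of at least one access.

```python
from collections import Counter


def most_visited(access_counts, max_urls, min_accesses=1):
    """Return up to max_urls URLs, ordered from most to least accessed."""
    eligible = Counter(
        {url: n for url, n in access_counts.items() if n >= min_accesses}
    )
    return [url for url, _ in eligible.most_common(max_urls)]


counts = {"/shoes": 5, "/hats": 2, "/shirts": 9, "/socks": 0}
print(most_visited(counts, max_urls=2))  # prints: ['/shirts', '/shoes']
```

Lowering `max_urls` shortens the list from the bottom, which is why a rarely accessed URL is the first to fall off when the limit is reduced.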
The app automatically corrects and standardizes the listed URLs to create canonical URLs, removing coded values such as map=c,c and /example?map=specificationFilter_X from them.
As mentioned above, the Store Indexer then adds these URLs to the Sitemap, where web search engines index them.
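To make the cleanup concrete, the sketch below removes the `map` query parameter from a URL while leaving the rest untouched. It assumes that dropping `map` is the only transformation needed; the real Store Indexer may apply further corrections, and `strip_map_param` is a hypothetical helper, not the app's API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_map_param(url: str) -> str:
    """Remove the coded `map` query parameter and keep everything else."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "map"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )


print(strip_map_param("/example?map=specificationFilter_X"))  # prints: /example
print(strip_map_param("https://www.storetheme.com.br/clothing/shirts?map=c,c"))
```

The second call prints https://www.storetheme.com.br/clothing/shirts, the address without its coded value.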