SharePoint Online, Viva , Teams, Power Platform, Azure, Identity

What is a crawl and how does it work in SharePoint 2013 – On-Premises Environment

In broad terms, SharePoint Search consists of three main functional components:

Crawling (Gathering): Collecting content to be processed
Indexing: Organizing the processed content into a structured/searchable index
Query Processing: Retrieving a relevant result set relative to a given user query
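
To make the three stages concrete, here is a minimal toy sketch of gathering, indexing, and querying. It is not how SharePoint implements search; the function names (`crawl`, `build_index`, `query`) and the sample documents are hypothetical and only illustrate how the stages hand data to one another.

```python
from collections import defaultdict


def crawl(content_source):
    """Crawling (gathering): yield (doc_id, text) pairs to be processed."""
    for doc_id, text in content_source.items():
        yield doc_id, text


def build_index(documents):
    """Indexing: map each lowercase word to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents:
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def query(index, user_query):
    """Query processing: return ids of documents containing every query word."""
    words = user_query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample content, used only for illustration.
content_source = {
    "doc1": "SharePoint search crawl",
    "doc2": "SharePoint index",
}
index = build_index(crawl(content_source))
print(sorted(query(index, "sharepoint crawl")))  # ['doc1']
```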

Types of Crawl in SharePoint 2013

Full Crawl
Incremental Crawl
Continuous Crawl

	Advantages	Disadvantages
Continuous Crawl	Runs in parallel mode and keeps the index as current as possible.	Works only for SharePoint objects; it does not work with non-SharePoint objects.
Incremental / Full Crawl	Works for both SharePoint and non-SharePoint objects.	Runs in sequential mode: a second crawl cannot start until the first cycle has completed, so it waits for the first one to end.
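
The difference between sequential and parallel mode can be illustrated with a small timeline calculation. This is a simplified model, not SharePoint's scheduler: the function names are hypothetical, and the 15-minute interval and the crawl durations are assumed values chosen for illustration.

```python
def sequential_starts(interval, durations):
    """Incremental/full model: a crawl waits until the previous one has ended."""
    starts = []
    previous_end = 0
    for i, duration in enumerate(durations):
        scheduled = i * interval
        start = max(scheduled, previous_end)  # blocked while the previous crawl runs
        starts.append(start)
        previous_end = start + duration
    return starts


def parallel_starts(interval, durations):
    """Continuous model: a new crawl starts on every interval, regardless."""
    return [i * interval for i in range(len(durations))]


# Assumed durations in minutes; the first crawl is unusually long.
durations = [40, 10, 10]
print(sequential_starts(15, durations))  # [0, 40, 50]
print(parallel_starts(15, durations))    # [0, 15, 30]
```

In the sequential model the long first crawl delays every later crawl, while in the parallel model later crawls still start on time and can pick up new changes.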

Full Crawl:

A full crawl crawls the entire content under a content source, whether that content consists of SharePoint objects or non-SharePoint objects.

Incremental Crawl:

An incremental crawl crawls only the content that has been added or modified since the last successful crawl.
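
The two definitions can be contrasted with a short sketch. It assumes, purely for illustration, that each item carries a last-modified timestamp and that changes are detected by comparing it with the time of the last successful crawl; the real crawler's change detection is not described here, and all names and dates below are hypothetical.

```python
from datetime import datetime


def full_crawl(items):
    """Full crawl: process every item under the content source."""
    return sorted(items)


def incremental_crawl(items, last_successful_crawl):
    """Incremental crawl: process only items changed after the last successful crawl."""
    return sorted(
        name for name, modified in items.items()
        if modified > last_successful_crawl
    )


# Hypothetical items mapped to their last-modified time.
items = {
    "a.docx": datetime(2013, 1, 1),
    "b.docx": datetime(2013, 3, 1),
    "c.docx": datetime(2013, 3, 5),
}
last_successful_crawl = datetime(2013, 2, 1)

print(full_crawl(items))                                # ['a.docx', 'b.docx', 'c.docx']
print(incremental_crawl(items, last_successful_crawl))  # ['b.docx', 'c.docx']
```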

Comparison between Full and Incremental Crawl

Compared with incremental crawls, full crawls consume more memory and CPU cycles on the index server.
Full crawls consume more memory and CPU cycles on the Web Front End servers when crawling content in your farm.
Full crawls use more network bandwidth than incremental crawls.

In some scenarios an incremental crawl is not sufficient and you need to run a full crawl.

Why do we need a Full Crawl?

Software updates or service packs have been installed on servers in the farm.
A Search service application administrator (an SSP administrator in Office SharePoint Server 2007) has added a new managed property.
Crawl rules have been added, deleted, or modified.
A corrupted index needs to be repaired. In this case, the system may attempt a full crawl on its own, depending on the severity of the corruption.
A full crawl of the site has never been done.
Security changes made on file shares after the last full crawl of the file share need to be detected.
Incremental crawls are failing consecutively. In rare cases, if an incremental crawl fails one hundred consecutive times at any level in a repository, the index server removes the affected content from the index.
ASPX pages on Windows SharePoint Services 3.0 or Office SharePoint Server 2007 sites need to be reindexed. The crawler cannot detect when ASPX pages on these sites have changed. Because of this, incremental crawls do not reindex views or home pages when individual list items are deleted.
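
The conditions above can be read as a checklist. The sketch below encodes a subset of them as flags on a hypothetical `ContentSourceState` record and returns the reasons that apply; none of these names correspond to a SharePoint API, and in practice an administrator evaluates these conditions manually.

```python
from dataclasses import dataclass


@dataclass
class ContentSourceState:
    """Hypothetical flags describing what has happened since the last full crawl."""
    updates_installed: bool = False
    managed_property_added: bool = False
    crawl_rules_changed: bool = False
    index_corrupted: bool = False
    never_fully_crawled: bool = False
    file_share_security_changed: bool = False
    incremental_crawls_failing: bool = False


# Map each flag to a human-readable reason taken from the list above.
REASONS = {
    "updates_installed": "software updates or service packs installed",
    "managed_property_added": "new managed property added",
    "crawl_rules_changed": "crawl rules added, deleted, or modified",
    "index_corrupted": "corrupted index needs repair",
    "never_fully_crawled": "full crawl has never been done",
    "file_share_security_changed": "security changed on file shares",
    "incremental_crawls_failing": "incremental crawls failing consecutively",
}


def full_crawl_reasons(state):
    """Return every reason that currently calls for a full crawl."""
    return [label for field, label in REASONS.items() if getattr(state, field)]


state = ContentSourceState(crawl_rules_changed=True, index_corrupted=True)
for reason in full_crawl_reasons(state):
    print("Full crawl needed:", reason)
```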

	Full Crawl	Incremental Crawl	Continuous Crawl
What it crawls	All items	Content modified since the last crawl	Keeps the index as current as possible
Scheduling	Can be scheduled	Can be scheduled	Cannot be scheduled
Stop / pause	Can be stopped and paused	Can be stopped and paused	Cannot be stopped or paused (once started, a continuous crawl can only be disabled)
When required	The content access account has changed; new managed properties have been added; content enrichment web service code has been changed or modified; a new IFilter has been added	To crawl the last modified content	Content changes frequently (multiple instances can run in parallel); SharePoint content sources only; e-commerce sites in cross-site publishing mode

Note: Do not pause content source crawls too often, and avoid pausing several content source crawls at the same time, because every paused crawl consumes memory on the index server.

Incremental Crawl Cycle

Continuous Crawl Cycle

Ref: http://blogs.technet.com/b/tothesharepoint/archive/2012/09/14/how-can-i-achieve-the-best-freshness-of-search-results-introducing-continuous-crawls-for-sharepoint.aspx
