
Utilizing The Power of SharePoint Search – Part 1 (Centralizing Search Results From Multiple Sources)

Posted by C/D/H Consultant on Jul 19, 2014 9:06:00 AM

Search is a powerful tool, a very welcome and useful feature on any site. SharePoint farms can be especially large, and finding what you need quickly is extremely important. The Search Service Application in SharePoint provides this key feature to its users. This feature can extend beyond just your local SharePoint farm, however.

The SharePoint Search Service Application allows you to crawl:

  • SharePoint Sites, of course
  • External Web Sites
  • File Shares
  • Exchange Public Folders
  • Lines of Business Data, which are external data connections via the BDC
  • Custom Repositories, which require a registered Custom Connector

These can all be indexed into your search results as well. This allows you to better leverage SharePoint as a centralized collaboration portal for the whole organization. Organizations typically contain many custom sites, file shares, public folders, etc., and having SharePoint Search aware of them all can help users find what they need quickly, from one centralized spot.

This walkthrough assumes you already have a SharePoint farm with a fully configured Search Service Application up and running. The first step is to start indexing the external content. In the Central Administration site, navigate to your Search Service Application and create a new Content Source for crawling an external website.

spsearch1.jpg

Here I’ve set up a content source to crawl the Blue Sphere public website by setting the Content Source Type to “Web Sites” and providing the root URL of the site. I’ve also set up a fairly conservative crawl schedule, crawling the site only once a week, since its content changes about once a week. It’s a good idea to set up a crawl schedule appropriate for the site being crawled. If the site is not updated all that often, crawl less often. This will save those precious resources for the many other things SharePoint is constantly doing.

I also set up a new Content Source for a File Share I’d like to have indexed. The setup is the same as for an external website, only this time you provide the \\server\folder or file://server/folder location(s) you want crawled.
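The two address forms above describe the same location. As a small illustration (not something SharePoint requires you to do), here is a Python sketch that rewrites a UNC path into the file:// form. The function name and the sample server and folder names are made up for the example, and it does no percent-encoding of special characters.

```python
def unc_to_file_url(unc_path: str) -> str:
    """Rewrite a UNC path (two leading backslashes) as a file:// address."""
    if not unc_path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {unc_path!r}")
    # Drop the two leading backslashes and split on the remaining separators,
    # ignoring empty pieces left by a trailing backslash.
    parts = [p for p in unc_path[2:].split("\\") if p]
    if len(parts) < 2:
        raise ValueError("a UNC path needs both a server and a share name")
    return "file://" + "/".join(parts)


if __name__ == "__main__":
    # Hypothetical share, used only for illustration.
    print(unc_to_file_url(r"\\server\folder"))
```

Either form should work as a start address, so a helper like this is mostly useful when you keep a list of shares in one notation and want to report them in the other.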

spsearch2.jpg

In addition to this, you may need to set up a Crawl Rule to tell SharePoint how URLs from your sources are to be crawled. For an anonymously accessible public site, such as the one in my example, you will want to set up the rule to crawl the site anonymously. I’ve used a wildcard in the URL to be crawled, and chose “Include” for anything that falls under that hostname. This is also where you could provide the credentials for a specific content access account, if your site requires one. If your site requires authentication, it is best to use a service account that has full read permissions to the site, preferably one created specifically for this task. I didn’t create a Crawl Rule for my file share, as it is integrated with AD and is accessible to all users. SharePoint will automatically try to use the search crawler account to access the share, and that account can read it just fine.
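To make the wildcard idea concrete, here is a rough Python sketch of how include and exclude rules with a `*` wildcard could be evaluated. It is only a model for reasoning about rules, not SharePoint's actual matching code. It assumes rules are checked in list order with the first match winning, that matching is case-insensitive, and that the site is served over http. The `/private/` exclude rule is hypothetical and is there only to show why order matters.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Optional


class CrawlRule(NamedTuple):
    pattern: str   # URL path, with * standing for any run of characters
    include: bool  # True = include matching URLs, False = exclude them


def first_matching_rule(url: str, rules: List[CrawlRule]) -> Optional[CrawlRule]:
    """Return the first rule whose pattern matches the URL, or None."""
    for rule in rules:
        # Lower-case both sides to model case-insensitive matching (an assumption).
        if fnmatchcase(url.lower(), rule.pattern.lower()):
            return rule
    return None


def should_crawl(url: str, rules: List[CrawlRule]) -> bool:
    """Apply the first matching rule; with no match, fall back to crawling."""
    rule = first_matching_rule(url, rules)
    return True if rule is None else rule.include


# The exclude rule is hypothetical; the include rule mirrors the wildcard
# rule described above (the http scheme is assumed).
RULES = [
    CrawlRule("http://www.bluesphereinc.com/private/*", include=False),
    CrawlRule("http://www.bluesphereinc.com/*", include=True),
]

if __name__ == "__main__":
    for url in ("http://www.bluesphereinc.com/about/team.html",
                "http://www.bluesphereinc.com/private/notes.html"):
        print(url, should_crawl(url, RULES))
```

If the two rules were swapped, the broad include rule would match first and the exclude rule would never be reached, which is why the more specific rule goes on top in this model.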

spsearch3.jpg

If you want to give users the option of limiting search results to this new content source, such as through the Scopes dropdown or by limiting the scope of a search results web part directly, you can set up a Result Source for it. Result Sources in SharePoint 2013 took the place of Search Scopes from previous versions of SharePoint. I’ve set up a custom Result Source that simply limits the scope to that of the www.bluesphereinc.com Content Source I created earlier, with the Query “{searchTerms} ContentType:www.bluesphereinc.com”. Using the Query Builder available here, though, you can get a lot more involved than that, depending on what you’re looking for.
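The {searchTerms} token in that Query is a placeholder for whatever the user typed. SharePoint does the substitution itself on the server. The Python sketch below only imitates that one step, so you can see what the final query text would look like. The function name is made up, and real query templates support other variables that this ignores.

```python
# The template text is copied from the Result Source described above.
QUERY_TEMPLATE = "{searchTerms} ContentType:www.bluesphereinc.com"


def expand_query(template: str, search_terms: str) -> str:
    """Substitute the user's search terms for the {searchTerms} token."""
    return template.replace("{searchTerms}", search_terms.strip())


if __name__ == "__main__":
    print(expand_query(QUERY_TEMPLATE, "annual report"))
```

The restriction after the token is simply appended to every search, which is how the Result Source narrows results without the user having to type anything extra.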

spsearch4.jpg

Now that we have our Content Sources and Crawl Rules set up, we need to crawl the sites. The first crawl performed after any changes to a Content Source will need to be a Full Crawl. After that, Incremental Crawls will do for the most part. They require far fewer resources and run a lot quicker, as they index only the additions and changes to the environment since the last run. I like to schedule frequent incremental crawls, and also weekly full crawls to ensure database consistency. Again, this depends on the size of the sites being crawled. Our public-facing site is small enough, with infrequent enough updates, that I just do the weekly full crawls. Performing incrementals more frequently than that, when I know there aren’t any changes taking place, would simply be a waste of system resources. I do 30-minute incrementals and weekly full crawls on our SharePoint intranet sites. The file share I set up for crawling is pretty large, though, and changes very rarely, so I’ve set that one up to crawl incrementally once a day and fully once a month.
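To get a feel for how much crawl activity those three schedules add up to, here is a back-of-the-envelope Python sketch. The class and field names are made up, "once a month" is approximated as every 30 days, and it simply counts scheduled start times without modelling what happens when an incremental and a full crawl fall at the same moment.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class CrawlSchedule:
    name: str
    incremental_every_min: Optional[int]  # None = no incremental crawls scheduled
    full_every_days: int

    def crawls_in(self, days: int) -> Tuple[int, int]:
        """Return (incremental, full) crawl counts scheduled over `days` days."""
        if self.incremental_every_min is None:
            incremental = 0
        else:
            incremental = days * MINUTES_PER_DAY // self.incremental_every_min
        return incremental, days // self.full_every_days


# Intervals taken from the schedules described above; a month is assumed to be 30 days.
SCHEDULES = [
    CrawlSchedule("SharePoint intranet", incremental_every_min=30, full_every_days=7),
    CrawlSchedule("Public website", incremental_every_min=None, full_every_days=7),
    CrawlSchedule("File share", incremental_every_min=MINUTES_PER_DAY, full_every_days=30),
]

if __name__ == "__main__":
    for schedule in SCHEDULES:
        incremental, full = schedule.crawls_in(30)
        print(f"{schedule.name}: {incremental} incremental, {full} full in 30 days")
```

Laying the numbers out like this makes the trade-off visible: the intranet gets well over a thousand cheap incremental crawls a month, while the large, rarely changing file share gets one expensive full crawl.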

After getting some results indexed, I was able to hop over to our Search Center site and run my first search returning mixed results from the different sources.

spsearch5.jpg

It doesn’t stop here, though. SharePoint Search doesn’t have to be a SharePoint-only thing. The SharePoint Search Service Application can provide this powerful feature to more than just users working inside SharePoint. Using the SharePoint Client Object Model, you can incorporate SharePoint Search as the backend search provider for your custom websites outside of SharePoint as well. I go into greater detail on this in the second part of this article, Utilizing The Power Of SharePoint Search – Part 2 (Using SharePoint As A Backend Search Provider).
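As a rough taste of what querying from outside SharePoint can look like, here is a Python sketch that builds a query URL for the search REST endpoint (/_api/search/query) that SharePoint 2013 exposes. To be clear, this is a different route from the Client Object Model approach covered in Part 2, and it is not taken from that article. The site URL is a hypothetical placeholder, and the sketch only builds the URL. It does not send a request or deal with authentication.

```python
from urllib.parse import quote

SEARCH_ENDPOINT = "/_api/search/query"


def build_search_url(site_url: str, search_terms: str, row_limit: int = 10) -> str:
    """Build a search REST query URL for the given site and search terms."""
    # The query text is passed as a quoted string literal, so a single quote
    # inside the terms is escaped by doubling it before URL-encoding.
    escaped = search_terms.replace("'", "''")
    query_text = quote(escaped, safe="")
    base = site_url.rstrip("/")
    return f"{base}{SEARCH_ENDPOINT}?querytext='{query_text}'&rowlimit={row_limit}"


if __name__ == "__main__":
    # Hypothetical intranet address, used only for illustration.
    print(build_search_url("https://intranet.example.com", "crawl schedule", 5))
```

A custom site would send a GET request to a URL like this with suitable credentials and read the results from the response.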

Topics: SharePoint