I needed to write a script to validate that the SharePoint crawler was picking up all items in a library.
One of my document libraries had over 100,000 documents, and I needed to make sure all of them were being indexed.
A document library can support millions of documents, IF you use a good folder structure.
function GetSearchIndex ($site, $file)
{
    # Query the search index for the file name; any relevant result means the item is indexed.
    $kq = New-Object Microsoft.Office.Server.Search.Query.KeywordQuery($site)
    $kq.ResultTypes = [Microsoft.Office.Server.Search.Query.ResultType]::RelevantResults
    $kq.QueryText = $file
    $kq.HiddenConstraints = 'scope:"All Sites"'
    $kq.RowLimit = 10
    $res = $kq.Execute()

    # Load the relevant results into a DataTable so we can count the rows.
    $table = New-Object System.Data.DataTable
    $table.Load($res[$kq.ResultTypes], [System.Data.LoadOption]::OverwriteChanges)

    if ($table.Rows.Count -eq 0)
    {
        $fOut = "Failed"
    }
    else
    {
        $fOut = "Passed"
    }
    return $fOut
}
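Before wiring the function into the big loop, it can be sanity-checked against a single document that you know is in the index. This is just an illustrative sketch: the site URL and document name below are placeholders, and the snippet assumes it runs in a SharePoint Management Shell (or with the SharePoint snap-in loaded) on a server in the farm.

```
# Hypothetical sanity check -- substitute your own site collection URL
# and the name of a document you know has been crawled.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$result = GetSearchIndex "http://test.sharepointed.com" "KnownDocument.docx"
Write-Host "Index check: $result"   # "Passed" if the crawl is current
```

If this reports Failed for a document you can find through the search UI, check the scope name in HiddenConstraints before blaming the crawler.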
# Log file for items missing from the index.
$file = "c:\indexCheck.txt"

$cfcSite = Get-SPWeb "http://SOMEsite.sharepointed.com/sites/test"
$nv = $cfcSite.Lists["bigLibrary"]

# Flat (recursive) query that pages through the library 2,000 items at a time, ordered by ID.
$spQuery = New-Object Microsoft.SharePoint.SPQuery
$spQuery.ViewAttributes = "Scope='Recursive'"
$spQuery.RowLimit = 2000
$caml = '<OrderBy Override="TRUE"><FieldRef Name="ID"/></OrderBy>'
$spQuery.Query = $caml
do
{
    $listItems = $nv.GetItems($spQuery)
    $spQuery.ListItemCollectionPosition = $listItems.ListItemCollectionPosition

    foreach ($item in $listItems)
    {
        $sResult = GetSearchIndex "http://test.sharepointed.com" $item.Name
        if ($sResult -eq "Failed")
        {
            $item.Name | Out-File $file -Append
        }
    }
}
while ($spQuery.ListItemCollectionPosition -ne $null)
To summarize what the script does:
Query the list/library in batches of 2,000, using ListItemCollectionPosition to pick up where each batch left off.
Loop through the returned items.
Call the function to see if each item is in the index (query SharePoint search).
The function returns a value of Passed or Failed.
If Failed, log the item name to a text file on the C:\ drive.