Introduction
Hi friends, in this post we will see how we can use PnP.Core SDK to query large lists without hitting the throttling issue in SharePoint Online. In my previous post How to use PnP.Core SDK to secure communication with SharePoint Online, I have shown you the step-by-step approach to the initial setup and configuration. From this post and for future posts, I will be using the same code repo and will add additional methods based on the post.
Why we need to be cautious for large lists?
When we try to access the list items programmatically, by default only 100 items are returned, and the maximum we can get is 5000 items in a single call because 5000 is the threshold limit. We cannot go beyond the threshold limit, and we don’t have control over the threshold limit for SharePoint online. In order to fetch items more than the threshold limit, we have to use paging or batch queries. So we will see how we can use the paging mechanism to fetch more than 5K items without hitting the threshold limit.
Focus on the Code
Note: I will use the same repo from the previous article so the configuration and other settings we made on the Azure Portal remain the same. The sample list that we use here has around 10K items.
Enable logs on the application
We have to make a minor change to the appsettings.json file to enable PnPCore SDK to print the transaction info to learn how many calls are made to SharePoint online.

Comment the line shown below in the Auth.cs file.

We can load the list items by using LoadItemsByCamlQuery & ListDataAsStream. For both the approach we will use CAML query with minor changes. Below is the method that will load the list items using the caml query without RowLimit mentioned in the query.
async Task<List<IListItem>> GetListDataWithCamlQueryWithoutPaging(PnPContext ctx, string listName, string camlQuery)
{
List<IListItem> retItems = null;
try
{
var myList = ctx.Web.Lists.GetByTitle(listName, p => p.Title,
p => p.Fields.QueryProperties(p => p.InternalName,
p => p.FieldTypeKind,
p => p.TypeAsString,
p => p.Title));
// Load all the needed data using paged requests
var output = await myList.LoadListDataAsStreamAsync(new RenderListDataOptions()
{
ViewXml = camlQuery,
RenderOptions = RenderListDataOptionsFlags.ListData,
DatesInUtc = true
}).ConfigureAwait(false);
// Iterate over the retrieved list items
retItems = myList.Items.AsRequested().ToList();
}
catch (PnPException ex)
{
Console.WriteLine(ex.ToString());
}
return retItems;
}
Below is the CAML query used in the above method
public static string qry_AllService = @"
<View>
<Query>
<OrderBy>
<FieldRef Name='ID' Ascending='TRUE'/>
</OrderBy>
</Query>
<ViewFields>
<FieldRef Name='ID'/>
</ViewFields>
</View>
";
When we use the above query and call the method defined above, we should receive the error, since we didn’t mention any row limit in the query and when our application tries to load all the 10K items, it will throw the below error.

So, is it not possible to get any items from the list without paging if the list has exceeded the threshold? Yes, we can for that we need to do some minor change to the query that we used. Add the RowLimit attribute and seit it to 5000 and you should be able to get 5000 items from the list which has 10K items.


The last line shows the item count as 5000 in the above result screenshot, and also you can see some other url and api that was hit by the PnP for us to check the URL’s that are called to load the items. So when there is a row limit specified, it will load only the items mentioned in the row limit and the max it can load around 5K items.
When we tried to use the same query with paging, we will be able to retrieve the entire list items without hitting the throttling limit
async Task<List<IListItem>> GetListDataWithPaging(PnPContext ctx, string listName, string camlQuery, bool useMemory)
{
List<IListItem> retItems = null;
try
{
var myList = ctx.Web.Lists.GetByTitle(listName, p => p.Title,
p => p.Fields.QueryProperties(p => p.InternalName,
p => p.FieldTypeKind,
p => p.TypeAsString,
p => p.Title));
// Load all the needed data using paged requests
bool paging = true;
string nextPage = null;
int pages = 0;
int pageSize = 5000;
int totalItemsLoaded = 0;
while (paging)
{
if (!useMemory) myList.Items.Clear();
await myList.LoadItemsByCamlQueryAsync(new CamlQueryOptions()
{
ViewXml = camlQuery,
PagingInfo = nextPage ?? null,
DatesInUtc = true
});
pages++;
if (useMemory) totalItemsLoaded = myList.Items.Length;
else totalItemsLoaded = totalItemsLoaded + myList.Items.Length;
if (totalItemsLoaded == pages * pageSize)
{
nextPage = $"Paged=TRUE&p_ID={myList.Items.AsRequested().Last().Id}";
}
else
{
paging = false;
}
if (!useMemory)
{
// Iterate over the retrieved list items
if (retItems != null) retItems = retItems.Union(myList.Items.AsRequested().ToList()).ToList();
else retItems = myList.Items.AsRequested().ToList();
}
}
if (useMemory) retItems = myList.Items.AsRequested().ToList();
}
catch (PnPException ex)
{
Console.WriteLine(ex.ToString());
}
return retItems;
}
In the above method, we define the page size and we will iterate through the pages and then will return once all the pages are iterated. The additional parameter of using the inbuilt memory is to make use of caching the items in the memory or not.

When we look at the result, it returned all the 10367 items and there are 3 api calls made since we gave the page size as 5000.
Below is the same approach but using ListDataAsStream method which will give us more control on how the field data to be rendered.
async Task<List<IListItem>> GetListDataAsStreamWithPaging(PnPContext ctx, string listName, string camlQuery, bool useMemory)
{
List<IListItem> retItems = null;
try
{
var myList = ctx.Web.Lists.GetByTitle(listName, p => p.Title,
p => p.Fields.QueryProperties(p => p.InternalName,
p => p.FieldTypeKind,
p => p.TypeAsString,
p => p.Title));
// Load all the needed data using paged requests
bool paging = true;
string nextPage = null;
while (paging)
{
if (!useMemory) myList.Items.Clear();
var output = await myList.LoadListDataAsStreamAsync(new RenderListDataOptions()
{
ViewXml = camlQuery,
RenderOptions = RenderListDataOptionsFlags.ListData,
Paging = nextPage ?? null,
DatesInUtc = true
}).ConfigureAwait(false);
if (output.ContainsKey("NextHref"))
{
nextPage = output["NextHref"].ToString().Substring(1);
}
else
{
paging = false;
}
if (!useMemory)
{
// Iterate over the retrieved list items
if (retItems != null) retItems = retItems.Union(myList.Items.AsRequested().ToList()).ToList();
else retItems = myList.Items.AsRequested().ToList();
}
}
if (useMemory) retItems = myList.Items.AsRequested().ToList();
}
catch (PnPException ex)
{
Console.WriteLine(ex.ToString());
}
return retItems;
}

If you see the result, the ListDataAsStream method use batch method to load the items in paging.
Conclussion
I hope you had learned something new on PnP.Core SDK. In my future blog post, I am planning to cover the methods related to the files and how to perform batch operations.
I welcome your suggestions and feedback which will help me in contributing the information that help others to complete their tasks or achieve the business process.
https://github.com/sudharsank/PnPCoreDemo
Happy Coding…