Use Python to upload a LARGE file to SharePoint

In this post, I will quickly show how to use the Office365-REST-Python-Client library to upload a large file to a SharePoint library.

For this to work, you will need a certificate, Azure App registration, and access to the target SharePoint site. I outlined all the necessary parts in this post: Modernizing Authentication in SharePoint Online Note: the linked post will output a .PFX cert, and the script below will need a .PEM cert. You can use this Python command to convert the cert:

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography import x509
from cryptography.hazmat.primitives.serialization import pkcs12

# Load the PFX file
pfx_file = open('C:\\path_to_cert\\EXAMPLE.pfx', 'rb').read()  # replace with your pfx file path
(private_key, certificate, additional_certificates) = pkcs12.load_key_and_certificates(pfx_file, None, default_backend())

with open('NewCERT.pem', 'wb') as f:

# install this library if needed
# pip install cryptography

Ok, with that out of the way, you can use this script to upload to a SharePoint library. In the script, I’ve commented out the line that would be used to upload to a folder within a library.

import os
from office365.sharepoint.client_context import ClientContext

cert_credentials = {
    "tenant": "abcxyz-1234-4567-8910-0e3d638792fb",
    "client_id": "abcddd-4444-4444-cccc-123456789111",
    "thumbprint": "7D8D8DF7D8D2F4DF8DF45D4FD8FD48DF5D8D",
    "cert_path": "RestClient\\NewCERT.pem"
ctx = ClientContext("").with_client_certificate(**cert_credentials)
current_web = ctx.web.get().execute_query()

filename = "LargeExcel.xlsx"
folder_path = "C:\\code\py"

def print_upload_progress(offset):
    # type: (int) -> None
    file_size = os.path.getsize(local_path)
        "Uploaded '{0}' bytes from '{1}'...[{2}%]".format(
            offset, file_size, round(offset / file_size * 100, 2)

#upload to a folder
#target_url = "Shared Documents/folderA/folderB"

target_url = "Shared Documents"
target_folder = ctx.web.get_folder_by_server_relative_url(target_url)
size_chunk = 1000000
local_path = os.path.join(folder_path, filename)
with open(local_path, "rb") as f:
    uploaded_file = target_folder.files.create_upload_session(
        f, size_chunk, print_upload_progress, filename

print("File {0} has been uploaded successfully".format(uploaded_file.serverRelativeUrl))

If you receive an error stating you don’t have access, double-check that you’ve added the App Registration to the target SharePoint site permissions. Again, this is noted in the blog post linked at the being of this post.

Consider this a workaround until MS Graph is out of its latest beta and there’s more support for easily uploading to SharePoint.

What if you need to upload a file and set a column value? When working with SharePoint via the API, you must be mindful of the column names. The column name in the UI might not be the same as the internal name, so I will use the script above as my starting point and add the following script to the end. In this example, I’m setting two fields: ReportName and ReportDate.

#get the file that was just uploaded
file_item = uploaded_file.listItemAllFields

# Define a dictionary of field names and their new values
fields_to_update = {
    "ReportName": "My TPS Report",
    # Add more fields here as needed

# Iterate over the dictionary and update each field
for field_name, new_value in fields_to_update.items():
    file_item.set_property(field_name, new_value)

# Commit the changes

print("Report fields were updated")

How do you get a list of all the columns in a list or library? The script below will output all the column’s internal and display names.

from office365.sharepoint.client_context import ClientContext

cert_credentials = {
    "tenant": "abcxyz-1234-4567-8910-0e3d638792fb",
    "client_id": "abcddd-4444-4444-cccc-123456789111",
    "thumbprint": "7D8D8DF7D8D2F4DF8DF45D4FD8FD48DF5D8D",
    "cert_path": "RestClient\\NewCERT.pem"

ctx = ClientContext("").with_client_certificate(**cert_credentials)
current_web = ctx.web.get().execute_query()

# Get the target list or library
list_or_library = ctx.web.lists.get_by_title('TPS-Reports')

# Load the fields
fields = list_or_library.fields.get().execute_query()

# Print the field names
for field in fields:
    print("Field internal name: {0}, Field display name: {1}".format(field.internal_name, field.title))

SharePoint Audit Using Purview

Today, I had a customer ask the following question:
“How can I pull a report showing all of the files in a library that have not been viewed?”
Typically, users request a report showing all the items or files accessed, so this request was unique.

You can run an audit search on most everything in a cloud tenant using Microsoft Purview. Every file downloaded, viewed, list item opened, edited, deleted, page views, Onedrive actions, Teams actions, and the list goes on and on.

In Purview, click on the Audit link in the left nav, and it will open the audit search page.
Select the time range you want to target
Activities: Accessed file
Record types: SharePointFileOperation
Search name: this can be anything you want, i.e. SP Library Search
File, folder, or site:*
Workload: SharePoint
The key items to note are the record type and file options. You can use a wildcard * to return results for everything in the target library. This will return much information, so filtering after the report is downloaded is needed. Once you’ve populated the fields, click Search, then wait a bit for it to complete. The amount of data in your tenant and current workloads will determine how long the search will take.

The completed search will look like this:

Clicking on the report name will open a detailed view of the audit search. From the results page, click the Export button and wait a few minutes for the file to be generated. If the page gets stuck at 0%, refresh your browser, and it should trigger the download.

Next, I needed to get all the files in the SharePoint library. To do this, I used PowerShell to connect to the target site and then downloaded the file info to a CSV.

# Connect to the SharePoint site interactively
Connect-PnPOnline -Url "" -Interactive

# Specify the target library
$libraryName = "TheLibrary"

# Get all files from the target library
$files = Get-PnPListItem -List $libraryName -Fields "FileLeafRef", "ID", "FileRef", "Created", "Modified", "UniqueId", "GUID"

# Create an empty array to store the file metadata
$fileMetadata = @()

# Loop through each file and extract the relevant metadata
foreach ($file in $files) {
    $fileMetadata += [PSCustomObject]@{
        FileName = $file["FileLeafRef"]
        ID = $file["ID"]
        GUID = $file["GUID"]
        UniqueId = $file["UniqueId"]
        URL = $file["FileRef"]
        Created = $file["Created"]
        Modified = $file["Modified"]

# Export the file metadata to a CSV file
$fileMetadata | Export-Csv -Path "C:\code\library_audit.csv" -NoTypeInformation

If you take anything away from this post, please take note of this: Purview uses a field named ListItemUniqueId to identify a SharePoint file or list item. My first thought was to use the GUID from the SharePoint library to match up to the Purview data. This is 100% incorrect! From SharePoint, UniqueId is the field that matches the Purview field ListItemUniqueId.

Basic logic:
SELECT SharePoint.*
FROM SharePoint
ON SharePoint.UniqueId = Purview.ListItemUniqueId

I used Power BI to format and mash the exported Purview data with the SharePoint data. Power BI is unnecessary; you can easily use Power Query in Excel to do the same thing. Below, I’m including my M code statement that parses the JSON from the Purview file, and counts how many times the files were accessed and the last time they were accessed.

    Source = Csv.Document(File.Contents("C:\ian240329\Purview_Audit.csv"),[Delimiter=",", Columns=8, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"RecordId", type text}, {"CreationDate", type datetime}, {"RecordType", Int64.Type}, {"Operation", type text}, {"UserId", type text}, {"AuditData", type text}, {"AssociatedAdminUnits", type text}, {"AssociatedAdminUnitsNames", type text}}),
    #"Parsed JSON" = Table.TransformColumns(#"Changed Type",{{"AuditData", Json.Document}}),
    #"Expanded AuditData" = Table.ExpandRecordColumn(#"Parsed JSON", "AuditData", {"ListItemUniqueId", "SourceFileExtension", "SourceFileName", "ObjectId"}, {"ListItemUniqueId", "SourceFileExtension", "SourceFileName", "ObjectId"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded AuditData",{"AssociatedAdminUnits", "AssociatedAdminUnitsNames", "RecordId"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"ObjectId", "File URL"}, {"SourceFileName", "File Name"}, {"SourceFileExtension", "File Extension"}}),
    #"Filtered Rows" = Table.SelectRows(#"Renamed Columns", each ([File Extension] <> "aspx")),
    #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each true),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Operation", "RecordType"}),
    #"Grouped Rows" = Table.Group(#"Removed Columns1", {"ListItemUniqueId"}, {{"View Count", each Table.RowCount(_), Int64.Type}, {"Last Viewed", each List.Max([CreationDate]), type nullable datetime}})
    #"Grouped Rows"

Still working in Power Query, I created a new query to show what SharePoint files had not been accessed. My Purview license is limited to 6 months‘ worth of data, so this is one hindrance to painting a full picture of what has/has not been accessed.

    Source = Table.NestedJoin(SharePoint_Audit, {"UniqueId"}, Purview_Audit, {"ListItemUniqueId"}, "Purview_Audit", JoinKind.LeftAnti),
    #"Expanded Purview_Audit" = Table.ExpandTableColumn(Source, "Purview_Audit", {"File Name"}, {"Purview_Audit.File Name"}),
    #"Sorted Rows1" = Table.Sort(#"Expanded Purview_Audit",{{"Purview_Audit.P File Name", Order.Descending}}),
    #"Filtered Rows1" = Table.SelectRows(#"Sorted Rows1", each true)
    #"Filtered Rows1"

With this data, I can now report to the user what files have not been accessed in the past 6 months.

Effortlessly Trigger a Flow from a Power App: A Simple Step-by-Step Example

In this post, I want to show how easy it is to call a Flow from a Power App. The goal of the Power App is to pass values to the Flow, have it add them together, and return a result.

Starting from the Power Apps portal
click Create –> Blank app, Black canvas app, name the app, for the format option, select tablet, then click Create button.

Power App overview:

Field TypeField NameOption
Text InputTextInputOneFormat: Number
Text InputTextInputTwoFormat: Number

Flow overview:
The Flow can be created directly in the Power App designer or the Power Platform portal. For this example, I’m going to use the portal.

Click on New flow and select Automated cloud flow

Click the Skip button at the bottom of the window (this will make sense in a min.)

With the Flow designer open, click PowerApps or search for it, then click on PowerApps (V2)

In this step, add two number inputs to the action

I named my number inputs as follow: inputNumberOne and inputNumberTwo

The Flow will respond to the app using the Repost to a PowerApp or flow action. For the output, again select number, and I named mine outputNumber .

the formula should be: add(triggerBody()[‘number’],triggerBody()[‘number_1’])

Name the Flow as Flow do Math, and save it. You can test the Flow simply by clicking the Test button and supplying two input values. The Flow can be named something different, but this name aligns with the below example.

Back in the PowerApp, click the Power Automate icon.

With the Power Automate window open, click on Add flow and select the newly created Flow, or search for it and select it.

On the app design surface, select the button and update its OnSelect property to:
Set(varNumber, FlowDoMath.Run(TextInputOne.Text,TextInputTwo.Text).outputnumber)

Select the LabelMathResult field and set its Text value to varNumber

Run the app, input values in the text fields, then click the button.

What just happened?

The values of the two text input fields were passed to the Flow, it added them together and returned the value in the outputnumber field; that value was then set to the varNumber variable.

In future posts, I will dive deeper into more complex examples.

How do you find ALL the Flows that reference a SharePoint site or list?

I asked this question when I first started down the path of learning about Flow:
How do you find all the Flows running on or referencing a SharePoint list?

Before you start on this, please ensure that your account or the account you are using to run the script has sufficient permissions to the target environment(s).

$oneFlow = Get-AdminFlow -FlowName "00000-ae95-4cab-96d8-0000000" -EnvironmentName "222222-4943-4068-8a2d-11111111"

$refResources = $
Write-Host $refResources

If you run that command and look at the returned properties and see an error, that means you do not have the correct permissions to move forward. You can check your permissions in the Power Platform admin center:

/end of update

Think about it: someone in your company creates a Flow that runs when a SharePoint item is updated. Fast forward a year or so, and that coworker has moved on, and the Flow needs to be updated. If you work for a small company or one that hasn’t fallen in love with Power Platform and Flow, you’re likely in luck, and finding the Flow will take a few minutes. In my case, there are currently 2,712 Flows in my tenant that span several environments.

The PowerShell script I’ve created will query a tenant using the Get-AdminFlow command, return all Flows, and then loop through them. The script can be adjusted to target a single environment using the EnvironmentName parameter. Note: running the script using the Get-Flow action will return all the Flows your AD account can access.

#Install-Module AzureAD
#Install-Module -Name Microsoft.PowerApps.Administration.PowerShell  
#Install-Module -Name Microsoft.PowerApps.PowerShell -AllowClobber 


function Get-UserFromId($id) {
    try {
        $usr = Get-AzureADUser -ObjectId $id
        return $usr.displayName
    catch {
        return $null

#get all flows in the tenant
$adminFlows = Get-AdminFlow 

#set path for output
$Path = "$([Environment]::GetFolderPath('Desktop'))\Flow_Search_for_SharePoint_$(Get-Date -Format "yyyyMMdd_HHmmss").csv"

#set target site
$targetSPSite = ""
$targetSPList = "4f4604d2-fa8f-4bae-850f-4908b4708b07"
$targetSites = @()

foreach ($gFlow in $adminFlows) {

    #check if the flow references the target site
    $refResources = $ | Where-Object { $ -eq $targetSPSite }

    #check if the flow references the target list
    #$refResources = $ | Where-Object { $_.resource.list -eq $targetSPList }

    if ($refResources -ne $null) {

        #optional - get the user who created the Flow
        $createdBy = Get-UserFromId($

        $row = @{}
        $row.Add("EnvironmentName", $gFlow.EnvironmentName)
        $row.Add("Name", $gFlow.DisplayName)
        $row.Add("FlowEnabled", $gFlow.Enabled)
        $row.Add("FlowGUID", $gFlow.FlowName)
        $row.Add("CreatedByUser", $createdBy)
        $row.Add("CreatedDate", $gFlow.CreatedTime)
        $row.Add("LastModifiedDate", $gFlow.lastModifiedTime)
        $targetSites += $(new-object psobject -Property $row)

#output to csv
$targetSites | Export-Csv -Path $Path -NoTypeInformation

If you don’t want to get the display name of the user who created the Flow, comment out the part of the script that calls the Get-UserFromId function, and you won’t need to connect to Azure.

And to answer my original question: How do you find all the Flows running on or referencing a SharePoint list?
In the script, comment out the part of the script that references $targetSPSite and un-comment $targetSPList. You can get the GUID of the list by navigating to list settings and looking at the URL. Another option is to open the list, view the Page Source, then look for the “listId” property.

In a future post(s), I will outline how to search for all Flows that use different connectors, Dynamics 365 tables (dataverse), triggered from Power Apps, or other objects. All of the info is in the properties of the Flow; getting to it can be a little fun.

Power App and SharePoint List Form Hide Field on New Item Form

How do you hide a field on a PowerApp when opening a new form? The approach below uses a single screen form instead of multiple screens for the various forms.

I started by creating a new SharePoint list and added two text fields:
Not on New Form
On New Form
Using the customize form option, I entered the Power App designer.

When the PowerApp designer opens, it will look like this:

To help see what’s going on with the form mode, add a text label to the form and set its Text property to: "Form Mode: " & Text(SharePointForm1.Mode)

Select the field (Data Card) that should not appear on the new item form, then select the Visible property. For the Visible property, enter the following: If(SharePointForm1.Mode = 1, false, true) . If your SharePointForm1 is named something else, use it instead of the value I presented.

Breaking down the formula a little: If the SharePoint form mode is equal to 1, visible should be false, else true.

Save and publish the app, then check if it’s functional as planned.

New item form with Form Mode: 1

Display item form with Form Mode: 2

Edit item form with Form Mode: 0

How Do You Get a Power Page Attachment That’s Stored in Blob Storage

My Power App Portal (Power Pages) environments are configured to use Azure blob storage for form attachments. One of the primary reasons for doing this is to avoid filling up expensive dataverse storage with endless attachments submitted by enduers.

This article outlines how to set up Azure storage: link

What I’m going to demo is how to get ONE attachment that’s uploaded to a form. If your form allows multiple attachments, you’d simply loop through them.

In the example, I’m using the soon-to-be-obsolete dataverse connector, but the same basic flow design applies to the normal connector.

When a row is added to my table, the flow is triggered.
The flow then queries the Note (annotation) table using the ID from the source table.
filter query: (_objetid_value eq souce_table_id)

The list rows notes query will result in an array being returned, but I’m only dealing with one attachment, so there’s no need to loop through it. To avoid an unnecessary loop, a function can be used to target a single object from the array: first(body(‘List_rows_Notes’)?[‘value’])?[‘annotationid’]

From the Get row note action, annotationid and filename will be needed to help form the path to the blob. Using the concat function I’m combing the container name, annotationid, and filename. Also, note the transformation on annotationid, the hyphens need to be removed, and the string needs to be lowercase. The last part of the transformation is to remove .azure.txt from the filename.

concat('/blobcontainer/',toLower(replace(outputs('Get_row_Note')?['body/annotationid'], '-', '')),'/', split(outputs('Get_row_Note')?['body/filename'], '.azure.txt')[0])

The end result of the transformation will be:
/blobcontainer/annotationid/filename /blobcontainer/cf03e4cf7f72ad118561002248881923/example.pdf

With the path to the blob formed, the get blob content action can retrieve the file.

It’s that simple.

A couple of notes:
It would be wise to leverage a virus-scanning tool like Cloudmersive.
If you haven’t already noticed, when a user uploads a file that contains special characters in the name…it’s saved to the Note table without the special characters, but when it’s moved to blob storage, the characters will be in the name. Yes, that’s a bug Microsoft has yet to fix. You can avoid this by adding Javascript to the upload page to block files that fall into this category. OR. Write another flow to clean file names before the form is processed.
Uploaded filename: my report 1:2:3.pdf
Note table: my report 123.pdf
Blob: my report 1:2:3.pdf

Azure Runbook Job Name error: Token request failed..Exception

When you move from a SharePoint on-prem environment to SharePoint Online, you lose the server-side environment you’d normally use to run PowerShell scripts or tasks to interact with SharePoint. In my opinion, and please correct me if I’m wrong, the closest thing to a server-side environment in a cloud environment is Azure Runbooks or Azure Function Apps. I went with Azure Runbooks due to its ability to handle long-running tasks.

The error I recently encountered in my runbook was: runbook name error: Token request failed..Exception . At first, I thought there might be something wrong with the way I was connecting to Keyvault, but that wasn’t it. Next was my connection to SharePoint, this is handled using a SharePoint-generated client ID and secret. Oddly enough, I had just updated this a few months back, so it wasn’t an obvious candidate for a failure point.

I went to my target SharePoint site, created a new set of credentials using siteName/_layouts/15/AppRegNew.aspx and siteName/_layouts/15/appinv.aspx. After creating the credentials, I went back to the runbook and plugged them in, and it worked!

Long story short, if you get this error: Token request failed..Exception try creating a new client ID and secret and see if it helps clear things up.

You can also use this script to test your client id and secret. Connect-PnPOnline | PnP PowerShell

$siteUrl = "https://taco.sharepointonline/sites/burrito"
$testConn = Connect-PnPOnline -Url $siteUrl -AppId "1111-2222-3333-4444-555555555555" -AppSecret "X3tssvCebdl/c/gvXsTACOajvBurrito=" -ReturnConnection
$list = Get-PnPList "Tacos"
Write-Output $list

Create Approvals That NEVER Expire

If you are reading this, you likely ran into an issue where you created an approval flow, but it expired before the recipient had time to approve or reject it. The timeout for an approval or any flow is thirty days; then, it stops running. Yes, there are some clever workaround to alert if the flow times out, but who wants to mess with that?

The approach I took to solve this was to leverage some of the existing tooling, then add to them. When you create an approval, a row is created in the dataverse Approval table. As we all know, a flow is trigger-based, so why not create one that simply monitors the Approval table, then handles things from there?

At a high level, here is the basic approach.

Start by creating a simple flow that initiates an approval, then run it. In my example, note the value in the Item Link field; this will come into play later.

Next, navigate to, expand the Dataverse section, and click on Tables. After the page loads, click the All link under Tables, then search for approval. If you search for approval and do not get a result, make sure you click the All link.

Open the Approval table; in it, you will see your approval, possibly more depending on how old your environment is or if many people in your company are using approvals. When looking at the data, the takeaway is what is stored in the table and what can be used in the flow that handles the outcome of the approval. In my case, using the Item Link field is key to handling the approval response. With it, I can filter the value and know if I need to take action on the item or not.

When creating the flow that responds to the approval, you can filter it at the design level or in the trigger settings. I went with the trigger setting due to the number of approvals that could be firing across my organization in our default tenant. Why do you need to filter it? Just assume other approvals might be writing to the same dataverse table.

Trigger Conditions


@not(equals(triggerBody()?['msdyn_flow_approval_result'], null))

The above conditions filter the value I passed in the create approval flow (Item Link) and if the item has been approved or rejected.

Here is an overview of the flow that handles the outcome of the approval. I mixed dataverse connector types due to an issue with the trigger condition not working with the green dataverse connector. In the Expand Query field, I used the Fetch XML builder to query over to the Approval Response table to get the comment field; not used in the example, but nonetheless, it’s there. From the Get a row by ID action, the response of the approval is available to use to handle the outcome (Result) of the approval.

To my knowledge, there is no reason why you can’t create an approval that is active for months, if not years.

1) You can access and review the approval records using PowerBI, Flow, Access, ___
2) You can bulks update the records using PowerShell, Flow, Access (be real careful), __
3) You can pass items in the Details field, then parse them out when handling the approval. Here is one simple example where I’m passing a SharePoint item ID from the approval and parsing it in the response flow:

Response flow compose statement that parses the Details field.

Expression: last(split(triggerBody()?['msdyn_flow_approval_details'],'**SPItemID:** '))

YES, this is a lot, but the general idea is simple; create an approval and handle the response.

Create Dynamic Hyperlinks And Send An Email Action

Over the years, some updates to Flow have been better than others, and others, not so much. If memory serves, the send an email action would use dynamic hyperlinks without much work, but something went sideways with one of the updates causing dynamic hyperlinks not to work as you’d want.

Here is a basic example of including a hyperlink in an outgoing email; further down the page, I’ll provide a more realistitc example.

In this example, I setup the Flow to trigger when an item is added to a SharePoint library. The key thing to note in this example is the value of the varHyerplink variable; note the double quotes around the link to item.

Use Power Automate to Update a SharePoint Person Field

Using the SharePoint HTTP flow action to update a person or group field, I kept getting this error:

A 'PrimitiveValue' node with non-null value was found when trying to read the value of a navigation property; however, a 'StartArray' node, a 'StartObject' node, or a 'PrimitiveValue' node with null value was expected.

The field I was attempting to update is named Submitted By, with an internal name of Submitted_x0020_By. Each time I tried to update the field I was seeing the error noted above. It wasn’t until I looked at one of my previous flow runs did I notice what the issue was. It turns out, that the field name I should be using is Submitted_x0020_ById.

Update flow:

How do you update a Person field if the field allows for multiple selections? The example below will update the field with two different user values, but clearly, this could be extended to be more dynamic.


    "__metadata": {
    "SubmittedByIDsId": {
         "results": [