How to upload a large file to SharePoint using the Microsoft Graph API

What started as a simple question from a co-worker turned into a rabbit hole exploration session that lasted a bit longer than anticipated. ‘Hey, I need to upload a report to SharePoint using Python.’

In the past, I’ve used SharePoint Add-in permissions to create credentials allowing an external service, app, or script to write to a site, library, list, or all of the above. However, the environment I’m currently working in does not allow Add-in permissions, and Microsoft has been slowly deprecating the service for a long time.

As of today (March 18, 2024), this is the only way I could find to upload a large file to SharePoint. Using the MS Graph SDK, you can upload files smaller than 4 MB, but that limit is too small for most real-world files.

For the script below, the following items are needed:
Azure App Registration:
Microsoft Graph application permissions:
Files.ReadWrite.All
Sites.ReadWrite.All
SharePoint site
SharePoint library (aka drive)
File to test with

import requests
import msal
import atexit
import os.path
import urllib.parse
import os

TENANT_ID = '19a6096e-3456-7890-abcd-19taco8cdedd'
CLIENT_ID = '0cd0453d-cdef-xyz1-1234-532burrito98'
CLIENT_SECRET  = '.i.need.tacos-and.queso'
SHAREPOINT_HOST_NAME = 'tacoranch.sharepoint.com'
SITE_NAME = 'python'
TARGET_LIBRARY = 'reports'
UPLOAD_FILE = 'C:\\code\\test files\\LargeExcel.xlsx'
UPLOAD_FILE_NAME = 'LargeExcel.xlsx'
UPLOAD_FILE_DESCRIPTION = 'A large excel file' #not required

AUTHORITY = 'https://login.microsoftonline.com/' + TENANT_ID
ENDPOINT = 'https://graph.microsoft.com/v1.0'

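# Permissions are granted on the app registration (Files.ReadWrite.All, Sites.ReadWrite.All).
# The client credentials flow requests them through the .default scope defined below.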

cache = msal.SerializableTokenCache()

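# load the token cache from disk if it exists
if os.path.exists('token_cache.bin'):
    with open('token_cache.bin', 'r') as cache_file:
        cache.deserialize(cache_file.read())

def save_cache():
    # write the cache back only when MSAL reports a change
    if cache.has_state_changed:
        with open('token_cache.bin', 'w') as cache_file:
            cache_file.write(cache.serialize())

atexit.register(save_cache)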

SCOPES = ["https://graph.microsoft.com/.default"]

app = msal.ConfidentialClientApplication(CLIENT_ID, authority=AUTHORITY, client_credential=CLIENT_SECRET, token_cache=cache)

result = None
result = app.acquire_token_silent(SCOPES, account=None)

drive_id = None

if result is None:
    result = app.acquire_token_for_client(SCOPES)

if 'access_token' in result:
    print('Token acquired')
else:
    print(result.get('error'))
    print(result.get('error_description'))
    print(result.get('correlation_id')) 

if 'access_token' in result:
    access_token = result['access_token']
    headers={'Authorization': 'Bearer ' + access_token}

    # get the site id
    result = requests.get(f'{ENDPOINT}/sites/{SHAREPOINT_HOST_NAME}:/sites/{SITE_NAME}', headers=headers)
    result.raise_for_status()
    site_info = result.json()
    site_id = site_info['id']

    # get the drive / library id
    result = requests.get(f'{ENDPOINT}/sites/{site_id}/drives', headers=headers)
    result.raise_for_status()
    drives_info = result.json()
    
    for drive in drives_info['value']:
        if drive['name'] == TARGET_LIBRARY:
            drive_id = drive['id']
            break

    # stop here; otherwise the upload URL below would be built with 'None'
    if drive_id is None:
        raise Exception(f'No drive named "{TARGET_LIBRARY}" found')

    # create an upload session for the large file
    file_url = urllib.parse.quote(UPLOAD_FILE_NAME)
    result = requests.post(
        f'{ENDPOINT}/drives/{drive_id}/root:/{file_url}:/createUploadSession',
        headers=headers,
        json={
            '@microsoft.graph.conflictBehavior': 'replace',
            'description': UPLOAD_FILE_DESCRIPTION,
            'fileSystemInfo': {'@odata.type': 'microsoft.graph.fileSystemInfo'},
            'name': UPLOAD_FILE_NAME
        }
    )

    result.raise_for_status()
    upload_session = result.json()
    upload_url = upload_session['uploadUrl']

    st = os.stat(UPLOAD_FILE)
    size = st.st_size
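    CHUNK_SIZE = 10485760  # 10 MiB, a multiple of 320 KiB
    # round up so a partial final chunk is counted, and a file that is an
    # exact multiple of CHUNK_SIZE is not skipped
    chunks = (size + CHUNK_SIZE - 1) // CHUNK_SIZE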
    with open(UPLOAD_FILE, 'rb') as fd:
        start = 0
        for chunk_num in range(chunks):
            chunk = fd.read(CHUNK_SIZE)
            bytes_read = len(chunk)
            upload_range = f'bytes {start}-{start + bytes_read - 1}/{size}'
            print(f'chunk: {chunk_num} bytes read: {bytes_read} upload range: {upload_range}')
            result = requests.put(
                upload_url,
                headers={
                    'Content-Length': str(bytes_read),
                    'Content-Range': upload_range
                },
                data=chunk
            )
            result.raise_for_status()
            start += bytes_read

else:
    raise Exception('no access token')
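
The chunk arithmetic is the part of an upload loop that is easiest to get wrong, especially when the file size is an exact multiple of the chunk size. Below is a minimal sketch of just that arithmetic, with no network calls. The helper name chunk_ranges is hypothetical and not part of the script above; the default chunk size matches the 10485760 bytes used there.

```python
def chunk_ranges(size, chunk_size=10 * 1024 * 1024):
    """Yield (start, end, content_range) for each chunk of a file of `size` bytes."""
    if size <= 0:
        return
    # Integer ceiling division: a partial final chunk still counts as a chunk.
    total_chunks = (size + chunk_size - 1) // chunk_size
    for index in range(total_chunks):
        start = index * chunk_size
        # The end position in a Content-Range header is inclusive.
        end = min(start + chunk_size, size) - 1
        yield start, end, f'bytes {start}-{end}/{size}'


# Example: list the headers for a 25 MiB file.
for start, end, content_range in chunk_ranges(25 * 1024 * 1024):
    print(content_range)
```

Microsoft's documentation for upload sessions asks for chunk sizes that are a multiple of 320 KiB (327,680 bytes); 10 MiB satisfies that, so it is a reasonable default to keep.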

In the script, I’m uploading the LargeExcel file to a library named reports in the python site. It is important to note that the words drive and library are used interchangeably when working with MS Graph. If you see a script example that does not specify a target library but only uses root, it will write the files to the default Documents / Shared Documents library.
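
To make the drive versus root distinction concrete, here is a small sketch that builds the createUploadSession URL both ways: against a specific drive (library) ID, or against a site's default drive. The function name upload_session_url and the sample IDs are made up for illustration; the two URL shapes follow the Graph v1.0 endpoint used in the script.

```python
from urllib.parse import quote

GRAPH_ENDPOINT = 'https://graph.microsoft.com/v1.0'


def upload_session_url(file_name, drive_id=None, site_id=None):
    """Build a createUploadSession URL for a named drive or a site's default drive."""
    file_path = quote(file_name)  # encode spaces and other unsafe characters
    if drive_id is not None:
        # Targets one specific library, e.g. the "reports" library.
        return f'{GRAPH_ENDPOINT}/drives/{drive_id}/root:/{file_path}:/createUploadSession'
    if site_id is not None:
        # No drive given: /drive is the site's default document library.
        return f'{GRAPH_ENDPOINT}/sites/{site_id}/drive/root:/{file_path}:/createUploadSession'
    raise ValueError('Provide either drive_id or site_id')


# Hypothetical IDs, for illustration only.
print(upload_session_url('Large Excel.xlsx', drive_id='b!abc123'))
print(upload_session_url('Large Excel.xlsx', site_id='site123'))
```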

Big thank you to Keath Milligan for providing the foundation of the script.
https://gist.github.com/keathmilligan/590a981cc629a8ea9b7c3bb64bfcb417

How to Find Your Microsoft Forms Data: Locating the Linked Excel File

This started as a simple question: where is the backend Excel file for my group Forms form stored?

By default, a group form will save responses to an Excel file in the SharePoint site associated with the group. Within that site, the file is stored in the Documents, aka Shared Documents library.

Here is a quick way to track down the file:

With the form open, click on Responses.

Click on Open in Excel.

Depending on how your SharePoint library is configured, the file will either download to your computer or open in the browser. Open the file and click on the name or click the down arrow next to it.

When the window opens, it will show exactly where the file is stored.

In this example, the file is stored in the Shared Documents library on the Testing site. Again, this example shows Shared Documents, but on the site, it’s actually named Documents.

STOP, I don’t see the window noted in the above screenshot! This more than likely means you are working with a personal form.

Where is the Excel file stored for personal forms? Not where you’d guess and not anywhere worthwhile. The file is more or less saved with the form and is inaccessible other than downloading it.

What if I copy my personal form to a group? What will happen to the Excel file?
Don’t do this; just recreate the form from scratch. The copied form will retain the behavior of storing the file with the form, not in SharePoint.

How can I save form responses to a SharePoint list or Dataverse table? You would need to create a Flow to intercept the form response and then save it to the destination.

Will creating a Flow that saves form responses to another destination impact the form saving to Excel? No, the form will always use the backend Excel file as its data storage.

If I download a copy of the backend Excel file, will the downloaded copy be updated with new form submissions? No, the copy is disconnected from the source.

How do you find ALL the Flows that reference a SharePoint site or list?

I asked this question when I first started down the path of learning about Flow:
How do you find all the Flows running on or referencing a SharePoint list?

UPDATE / EDIT – READ THIS Part
Before you start on this, please ensure that your account or the account you are using to run the script has sufficient permissions to the target environment(s).

$oneFlow = Get-AdminFlow -FlowName "00000-ae95-4cab-96d8-0000000" -EnvironmentName "222222-4943-4068-8a2d-11111111"

$refResources = $oneFlow.Internal.properties.referencedResources
Write-Host $refResources



If you run that command and see an error in the returned properties, you do not have the correct permissions to move forward. You can check your permissions in the Power Platform admin center: https://admin.powerplatform.microsoft.com/

/end of update

Think about it: someone in your company creates a Flow that runs when a SharePoint item is updated. Fast forward a year or so, and that coworker has moved on, and the Flow needs to be updated. If you work for a small company or one that hasn’t fallen in love with Power Platform and Flow, you’re likely in luck, and finding the Flow will take a few minutes. In my case, there are currently 2,712 Flows in my tenant that span several environments.

The PowerShell script I’ve created will query a tenant using the Get-AdminFlow cmdlet, return all Flows, and then loop through them. The script can be adjusted to target a single environment using the EnvironmentName parameter. Note: running the script with the Get-Flow cmdlet instead will return only the Flows your AD account can access.

#Install-Module AzureAD
#Install-Module -Name Microsoft.PowerApps.Administration.PowerShell  
#Install-Module -Name Microsoft.PowerApps.PowerShell -AllowClobber 

#connect-AzureAD

function Get-UserFromId($id) {
    try {
        $usr = Get-AzureADUser -ObjectId $id
        return $usr.displayName
    }
    catch {
        return $null
    }
}

#get all flows in the tenant
$adminFlows = Get-AdminFlow 

#set path for output
$Path = "$([Environment]::GetFolderPath('Desktop'))\Flow_Search_for_SharePoint_$(Get-Date -Format "yyyyMMdd_HHmmss").csv"

#set target site
$targetSPSite = "https://yourTenant.sharepoint.com/sites/yourSITE"
$targetSPList = "4f4604d2-fa8f-4bae-850f-4908b4708b07"
$targetSites = @()

foreach ($gFlow in $adminFlows) {

    #check if the flow references the target site
    $refResources = $gFlow.Internal.properties.referencedResources | Where-Object { $_.resource.site -eq $targetSPSite }

    #check if the flow references the target list
    #$refResources = $gFlow.Internal.properties.referencedResources | Where-Object { $_.resource.list -eq $targetSPList }

    #put $null on the left so the comparison behaves the same for single values and arrays
    if ($null -ne $refResources) {

        #optional - get the user who created the Flow
        $createdBy = Get-UserFromId($gFlow.internal.properties.creator.userId)

        $row = @{}
        $row.Add("EnvironmentName", $gFlow.EnvironmentName)
        $row.Add("Name", $gFlow.DisplayName)
        $row.Add("FlowEnabled", $gFlow.Enabled)
        $row.Add("FlowGUID", $gFlow.FlowName)
        $row.Add("CreatedByUser", $createdBy)
        $row.Add("CreatedDate", $gFlow.CreatedTime)
        $row.Add("LastModifiedDate", $gFlow.lastModifiedTime)
        
        $targetSites += $(new-object psobject -Property $row)
    }
}

#output to csv
$targetSites | Export-Csv -Path $Path -NoTypeInformation

If you don’t want to get the display name of the user who created the Flow, comment out the part of the script that calls the Get-UserFromId function, and you won’t need to connect to Azure.

And to answer my original question: How do you find all the Flows running on or referencing a SharePoint list?
In the script, comment out the part of the script that references $targetSPSite and un-comment $targetSPList. You can get the GUID of the list by navigating to list settings and looking at the URL. Another option is to open the list, view the Page Source, then look for the “listId” property.
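
If you copy the list settings URL, the GUID is usually wrapped in URL-encoded braces (%7B and %7D). The sketch below pulls it out in Python. It assumes the URL carries the GUID in a List query parameter, which is what list settings pages typically look like; the function name and sample URL are for illustration only.

```python
from urllib.parse import urlparse, parse_qs


def list_id_from_settings_url(url):
    """Return the list GUID from a list settings URL, or None if it is absent."""
    query = parse_qs(urlparse(url).query)  # parse_qs also decodes %7B / %7D
    for key, values in query.items():
        if key.lower() == 'list' and values:
            # Strip the surrounding braces and normalize the casing.
            return values[0].strip('{}').lower()
    return None


sample = ('https://yourTenant.sharepoint.com/sites/yourSITE/_layouts/15/listedit.aspx'
          '?List=%7B4F4604D2-FA8F-4BAE-850F-4908B4708B07%7D')
print(list_id_from_settings_url(sample))
```

The value it returns is in the same format as $targetSPList in the script.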

In future posts, I will outline how to search for all Flows that use different connectors, Dynamics 365 tables (Dataverse), triggers from Power Apps, or other objects. All of the info is in the properties of the Flow; getting to it can be a little fun.

Power App and SharePoint List Form Hide Field on New Item Form

How do you hide a field on a PowerApp when opening a new form? The approach below uses a single screen form instead of multiple screens for the various forms.

I started by creating a new SharePoint list and added two text fields:
Not on New Form
On New Form
Using the customize form option, I entered the Power App designer.

When the PowerApp designer opens, it will look like this:

To help see what’s going on with the form mode, add a text label to the form and set its Text property to: "Form Mode: " & Text(SharePointForm1.Mode)

Select the field (Data Card) that should not appear on the new item form, then select the Visible property. For the Visible property, enter the following: If(SharePointForm1.Mode = 1, false, true). If your form is named something other than SharePointForm1, use that name instead.

Breaking down the formula a little: if the SharePoint form mode is equal to 1 (the new item form), Visible is false; otherwise it is true.

Save and publish the app, then check that it works as planned.

New item form with Form Mode: 1

Display item form with Form Mode: 2

Edit item form with Form Mode: 0

Azure Runbook Job Name error: Token request failed..Exception

When you move from a SharePoint on-prem environment to SharePoint Online, you lose the server-side environment you’d normally use to run PowerShell scripts or tasks to interact with SharePoint. In my opinion, and please correct me if I’m wrong, the closest thing to a server-side environment in a cloud environment is Azure Runbooks or Azure Function Apps. I went with Azure Runbooks due to its ability to handle long-running tasks.

The error I recently encountered in my runbook was: runbook name error: Token request failed..Exception. At first, I thought there might be something wrong with the way I was connecting to Key Vault, but that wasn’t it. Next was my connection to SharePoint, which is handled using a SharePoint-generated client ID and secret. Oddly enough, I had just updated this a few months back, so it wasn’t an obvious candidate for a failure point.

I went to my target SharePoint site, created a new set of credentials using siteName/_layouts/15/AppRegNew.aspx and siteName/_layouts/15/appinv.aspx. After creating the credentials, I went back to the runbook and plugged them in, and it worked!

Long story short, if you get this error: Token request failed..Exception try creating a new client ID and secret and see if it helps clear things up.

You can also use this script to test your client ID and secret. See the Connect-PnPOnline documentation in PnP PowerShell for details.

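$siteUrl = "https://taco.sharepoint.com/sites/burrito"
$testConn = Connect-PnPOnline -Url $siteUrl -AppId "1111-2222-3333-4444-555555555555" -AppSecret "X3tssvCebdl/c/gvXsTACOajvBurrito=" -ReturnConnection
#pass the returned connection explicitly so the test does not depend on a current connection
$list = Get-PnPList -Identity "Tacos" -Connection $testConn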
Write-Output $list

Use Python to Query a LARGE SharePoint list.

When querying a SharePoint list that has more than 5,000 items, you’ll likely receive an error like this:

This view cannot be displayed because it exceeds the list view threshold (5000 items) enforced by the administrator. 


Microsoft.SharePoint.SPQueryThrottledException', 'The attempted operation is prohibited because it exceeds the list view threshold.', "500 Server Error: Internal Server Error for url

Or, your query will only return the default 100 items. To get around this, pagination can be used to query the list and return ALL of the items.
Example:
all_items = list_to_export.items.paged(1000).get().execute_query()

Full script using VS Code:

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext


app_settings = {
    'url': 'https://taco.sharepoint.com/sites/queso/',
    'client_id': 'ID here',
    'client_secret': 'shhhh its a secret',
}

context_auth = AuthenticationContext(url=app_settings['url'])
context_auth.acquire_token_for_app(client_id=app_settings['client_id'], client_secret=app_settings['client_secret'])

#connect to the site
ctx = ClientContext(app_settings['url'], context_auth)
ctx.execute_query()

#get the target list
list_title = "List of Tacos"
list_to_export = ctx.web.lists.get_by_title(list_title)

#get all of the list items
all_items = list_to_export.items.paged(1000).get().execute_query()
list_items = [item for item in all_items]

print("Item count: {0}".format(len(list_items)))

The example above connects to a SharePoint site using a client ID and secret, then queries the list. Again, the key here is using pagination (paged). You can adjust the page size to better fit your needs, but be sure to leave it under 5,000, or you will be back to square one.
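
If it helps to picture what paged() is doing for you, here is a library-free sketch of the general idea: keep asking for one page at a time until a page comes back short. This is only a conceptual imitation. SharePoint itself pages with a continuation link rather than a numeric offset, and fetch_all and fake_fetch are made-up names standing in for real requests.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect every item by requesting pages until a short or empty page arrives."""
    items = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        items.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to read.
            break
        offset += len(page)
    return items


# Stand-in for a list with 12,345 items; a real call would hit SharePoint.
fake_list = list(range(12345))


def fake_fetch(offset, limit):
    return fake_list[offset:offset + limit]


all_items = fetch_all(fake_fetch)
print("Item count: {0}".format(len(all_items)))
```

Each individual request stays well under the 5,000 item threshold, which is why the full list can be read without tripping the throttle.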

Create Approvals That NEVER Expire

If you are reading this, you likely ran into an issue where you created an approval flow, but it expired before the recipient had time to approve or reject it. The timeout for an approval or any flow is thirty days; then, it stops running. Yes, there are some clever workarounds to alert if the flow times out, but who wants to mess with that?

The approach I took to solve this was to leverage some of the existing tooling, then add to it. When you create an approval, a row is created in the Dataverse Approval table. As we all know, a flow is trigger-based, so why not create one that simply monitors the Approval table, then handles things from there?

At a high level, here is the basic approach.

Start by creating a simple flow that initiates an approval, then run it. In my example, note the value in the Item Link field; this will come into play later.

Next, navigate to make.powerapps.com, expand the Dataverse section, and click on Tables. After the page loads, click the All link under Tables, then search for approval. If you search for approval and do not get a result, make sure you click the All link.



Open the Approval table; in it, you will see your approval, possibly more depending on how old your environment is or if many people in your company are using approvals. When looking at the data, the takeaway is what is stored in the table and what can be used in the flow that handles the outcome of the approval. In my case, using the Item Link field is key to handling the approval response. With it, I can filter the value and know if I need to take action on the item or not.

When creating the flow that responds to the approval, you can filter it at the design level or in the trigger settings. I went with the trigger setting due to the number of approvals that could be firing across my organization in our default tenant. Why do you need to filter it? Just assume other approvals might be writing to the same dataverse table.

Trigger Conditions

@contains(triggerBody()?['msdyn_flow_approval_itemlink'],'https://www.sharepointed.com/stuff/')

@not(equals(triggerBody()?['msdyn_flow_approval_result'], null))

The above conditions filter the value I passed in the create approval flow (Item Link) and if the item has been approved or rejected.
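
For anyone who reads code more easily than trigger expressions, here is a rough Python imitation of the two conditions. Power Automate only fires the trigger when every trigger condition evaluates to true, so the two checks are combined with and. The function name should_run is made up; the column names and the link prefix are the ones used in the conditions above.

```python
def should_run(trigger_body, link_prefix='https://www.sharepointed.com/stuff/'):
    """Mimic the two trigger conditions: link matches AND a result is present."""
    # Treat a missing Item Link as an empty string for this sketch.
    item_link = trigger_body.get('msdyn_flow_approval_itemlink') or ''
    result = trigger_body.get('msdyn_flow_approval_result')
    return link_prefix in item_link and result is not None


pending = {'msdyn_flow_approval_itemlink': 'https://www.sharepointed.com/stuff/item/7',
           'msdyn_flow_approval_result': None}
answered = {'msdyn_flow_approval_itemlink': 'https://www.sharepointed.com/stuff/item/7',
            'msdyn_flow_approval_result': 'Approve'}

print(should_run(pending))
print(should_run(answered))
```

The first row is skipped because nobody has responded yet; the second passes both checks.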

Here is an overview of the flow that handles the outcome of the approval. I mixed Dataverse connector types due to an issue with the trigger condition not working with the green Dataverse connector. In the Expand Query field, I used the Fetch XML builder to query over to the Approval Response table to get the comment field; it is not used in the example, but it’s there if you need it. From the Get a row by ID action, the response of the approval is available to use to handle the outcome (Result) of the approval.

To my knowledge, there is no reason why you can’t create an approval that is active for months, if not years.

Notes:
1) You can access and review the approval records using PowerBI, Flow, Access, ___
2) You can bulk update the records using PowerShell, Flow, Access (be real careful), __
3) You can pass items in the Details field, then parse them out when handling the approval. Here is one simple example where I’m passing a SharePoint item ID from the approval and parsing it in the response flow:



Response flow compose statement that parses the Details field.

Expression: last(split(triggerBody()?['msdyn_flow_approval_details'],'**SPItemID:** '))
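
The same split-and-take-the-last-piece logic looks like this in Python. The function name and the sample Details text are made up for illustration. One difference is labeled in the comments: this sketch returns None when the marker is missing, whereas the flow expression would hand back the whole Details string.

```python
def sp_item_id_from_details(details, marker='**SPItemID:** '):
    """Return the text after the last marker, like last(split(details, marker))."""
    if marker not in details:
        # The flow expression would return the entire string here instead.
        return None
    # Extra step not in the flow expression: trim surrounding whitespace.
    return details.split(marker)[-1].strip()


sample_details = 'Please review the request.\n**SPItemID:** 42'
print(sp_item_id_from_details(sample_details))
```

This only works cleanly if the item ID is the last thing in the Details field, so it is worth keeping the marker at the very end when you build the approval.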




YES, this is a lot, but the general idea is simple; create an approval and handle the response.

Use Power Automate to Update a SharePoint Person Field

Using the SharePoint HTTP flow action to update a person or group field, I kept getting this error:

A 'PrimitiveValue' node with non-null value was found when trying to read the value of a navigation property; however, a 'StartArray' node, a 'StartObject' node, or a 'PrimitiveValue' node with null value was expected.

The field I was attempting to update is named Submitted By, with an internal name of Submitted_x0020_By. Each time I tried to update the field, I saw the error noted above. It wasn’t until I looked at one of my previous flow runs that I noticed what the issue was. It turns out that the field name I should be using is Submitted_x0020_ById.



Update flow:

How do you update a Person field if the field allows for multiple selections? The example below will update the field with two different user values, but clearly, this could be extended to be more dynamic.

body('Send_an_HTTP_request_to_SharePoint_User_1')?['d']?['Id']
body('Send_an_HTTP_request_to_SharePoint_User_2')?['d']?['Id']

concat('[',outputs('Compose_1'),',',outputs('Compose_2'),']')
{
    "__metadata": {
        "type":"SP.Data.AssignedToListListItem"
    },
    "SubmittedByIDsId": {
         "results": [
                 6,
                 54
          ]
    }
}
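
To show how this could be made more dynamic, here is a small Python sketch that builds the same request body from any number of user IDs. The function name is made up; the field name SubmittedByIDsId and the type SP.Data.AssignedToListListItem are simply the values from the example above and will differ for your list.

```python
import json


def person_field_payload(user_ids, field='SubmittedByIDsId',
                         item_type='SP.Data.AssignedToListListItem'):
    """Build the body for updating a multi-select person field."""
    return {
        '__metadata': {'type': item_type},
        # Multi-value person fields take the numeric user IDs in a "results" array.
        field: {'results': [int(user_id) for user_id in user_ids]},
    }


print(json.dumps(person_field_payload([6, 54]), indent=4))
```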

Download a File From SharePoint Online Using Python

How do you download a file from a SharePoint Online library using Python?

Update – If you scroll to the bottom, I’ve outlined another approach that uses a username and password to connect via the SharePlum library.

Items needed to run the script in this example:
Office365 Rest Python Client library:
https://pypi.org/project/Office365-REST-Python-Client/
SharePoint App Only Client Id and Secret:
Microsoft documentation:
https://docs.microsoft.com/en-us/sharepoint/dev/solution-guidance/security-apponly-azureacs
You can create an app principal that is limited to a single site, list, library, or a combination of them:
https://piyushksingh.com/2018/12/26/register-app-in-sharepoint/

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File

app_settings = {
    'url': 'https://YOURtenant.sharepoint.com/sites/somesite/',
    'client_id': '12344-abcd-efgh-1234-1a2d12a21a2121a',
    'client_secret': 'Oamytacohungry234343224534543=',
}

context_auth = AuthenticationContext(url=app_settings['url'])
context_auth.acquire_token_for_app(client_id=app_settings['client_id'], client_secret=app_settings['client_secret'])

ctx = ClientContext(app_settings['url'], context_auth)
web = ctx.web
ctx.load(web)
ctx.execute_query()

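# the path is server-relative, so it includes the site path from app_settings['url']
response = File.open_binary(ctx, "/sites/somesite/Shared Documents/Invoice.pdf")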
with open("./Invoice.pdf", "wb") as local_file:
    local_file.write(response.content)
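
File.open_binary takes a server-relative URL, meaning the path starts at the host root and includes the site path. A common stumbling block is passing only the library and file name. Here is a minimal sketch of deriving the server-relative path from the site URL; the helper name server_relative_url is made up for illustration.

```python
from urllib.parse import urlparse


def server_relative_url(site_url, path_in_site):
    """Join the site's path with a library/file path inside that site."""
    site_path = urlparse(site_url).path.rstrip('/')  # e.g. /sites/somesite
    return f"{site_path}/{path_in_site.lstrip('/')}"


print(server_relative_url('https://YOURtenant.sharepoint.com/sites/somesite/',
                          'Shared Documents/Invoice.pdf'))
```

The result can be passed as the second argument to File.open_binary.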

If the above script does not work, step back and ensure you are connected to the site. The following script connects to a site and outputs its title. This is useful to validate that a site connection can be made.

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File

app_settings = {
    'url': 'https://YOURtenant.sharepoint.com/sites/somesite/',
    'client_id': '12344-abcd-efgh-1234-1a2d12a21a2121a',
    'client_secret': 'Oamytacohungry234343224534543=',
}

context_auth = AuthenticationContext(url=app_settings['url'])
context_auth.acquire_token_for_app(client_id=app_settings['client_id'], client_secret=app_settings['client_secret'])

ctx = ClientContext(app_settings['url'], context_auth)
web = ctx.web
ctx.load(web)
ctx.execute_query()

print("Site title: {0}".format(web.properties['Title']))

SharePlum connection example using a username and password to connect to SharePoint Online. More details about SharePlum can be found here: https://github.com/jasonrollins/shareplum

from shareplum import Site
from shareplum import Office365

sharepoint_url = 'https://YOURtenant.sharepoint.com/sites/spdev'
username = 'You@YourDomain.com'
password = 'Password'

authcookie = Office365('https://YOURtenant.sharepoint.com',
                       username=username,
                       password=password).GetCookies()
site = Site(sharepoint_url,
            authcookie=authcookie)
sp_list = site.List('Your List')
data = sp_list.GetListItems('All Items', row_limit=200)

If you get this error, you won’t be able to connect with a username and password, and you’ll need to use an App Password.

File "C:\Python311\Lib\site-packages\shareplum\office365.py", line 80, in get_security_token
raise Exception('Error authenticating against Office 365. Error from Office 365:', message[0].text)
Exception: ('Error authenticating against Office 365. Error from Office 365:', "AADSTS50076: Due to a configuration change made by your administrator, or because you moved to a new location, you must use multi-factor authentication to access ''.")

I’ve created this post to outline how to upload a large file to a SharePoint library:
https://www.sharepointed.com/2024/03/how-to-upload-a-large-file-to-sharepoint-using-the-microsoft-graph-api/

Use Flow to Get Files Created or Modified Today in a SharePoint Library

Scenario:
Each day, I have a couple of Azure Runbooks that export SharePoint list items and upload them to a SharePoint library. If one of the Runbooks fails, I need to send an email alert that something went wrong.

Basic logic:
If files created today in SharePoint <> X, send an email.

The easy solution would have been to loop through the files, check their created date, increment a variable, then make a condition statement.

More-better way:
Run flow daily at 6:00 PM
Send an HTTP request to SharePoint to get files
Parse the response
Condition statement
— if true, send an email

Uri text from the HTTP call:

_api/search/query?querytext='Path%3Ahttps%3A%2F%2Fsharepointed.sharepoint.com%2Fsites%2Fsitename%2Fsubsite%2Fexports%2F*%20LastModifiedTime%3Dtoday'

Parse JSON schema

{
    "type": "object",
    "properties": {
        "odata.metadata": {
            "type": "string"
        },
        "ElapsedTime": {
            "type": "integer"
        },
        "PrimaryQueryResult": {
            "type": "object",
            "properties": {
                "RelevantResults": {
                    "type": "object",
                    "properties": {
                        "TotalRows": {
                            "type": "integer"
                        },
                        "TotalRowsIncludingDuplicates": {
                            "type": "integer"
                        }
                    }
                }
            }
        }
    }
}
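
For reference, the condition step boils down to reading TotalRows out of the parsed response and comparing it to the number of files you expect. Here is a hedged Python sketch of that check. The function names, the sample response, and the expected count of 2 are made up; the property names come from the schema above.

```python
def files_found(search_response):
    """Read PrimaryQueryResult.RelevantResults.TotalRows, defaulting to 0."""
    primary = search_response.get('PrimaryQueryResult') or {}
    relevant = primary.get('RelevantResults') or {}
    return relevant.get('TotalRows', 0)


def should_alert(search_response, expected_count):
    """True when the number of files found differs from what was expected."""
    return files_found(search_response) != expected_count


sample_response = {
    'ElapsedTime': 31,
    'PrimaryQueryResult': {
        'RelevantResults': {'TotalRows': 1, 'TotalRowsIncludingDuplicates': 1}
    },
}

# Two runbooks should have produced two files; only one was found.
print(should_alert(sample_response, expected_count=2))
```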

Edit –
If you want to search for files created before or after a date, you can adjust the query like this:
and created %3E 2021-12-12T19:07:51.0000000Z
This will fetch any files created after Dec 12th, 2021.
In URL encoding, greater than (>) is %3E and less than (<) is %3C.
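
Hand-encoding the query text is error-prone, so here is a small Python sketch that produces the same Uri from a readable folder URL. The function name search_uri is made up. The asterisk is left unencoded to match the Uri shown earlier in this post.

```python
from urllib.parse import quote


def search_uri(folder_url, extra_filter='LastModifiedTime=today'):
    """Build the search Uri for files under folder_url that match extra_filter."""
    query_text = f'Path:{folder_url}/* {extra_filter}'
    # Encode everything except the wildcard, e.g. ':' -> %3A and '/' -> %2F.
    return "_api/search/query?querytext='" + quote(query_text, safe='*') + "'"


print(search_uri('https://sharepointed.sharepoint.com/sites/sitename/subsite/exports'))
```

The same quote function can be used to look up the encoded form of any single character if you need a different comparison operator.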