By: Nai Biao Zhou | Updated: 2021-04-01 | Comments (5) | Related: > SharePoint
Problem
According to Microsoft, over 200,000 organizations and 190 million people have SharePoint for intranets, team sites, and content management (Microsoft, 2021). Nowadays, multiple versions of SharePoint are in use. Several open-source Python libraries can interact with SharePoint. People may come across compatibility issues when running Python programs to handle files in SharePoint document libraries. These Python programs usually upload, download, delete, and search for documents. IT professionals must spend much time on a trial and error learning process to solve the compatibility issues. They want to look at some sample programs and execute them in a free trial environment to speed up learning.
Solution
We first introduce the "Microsoft 365 Developer Program" that allows IT professionals to access a SharePoint Online environment (Graham, 2019). People can join the program without a credit card and any payment (Microsoft 365, 2021). At the time of this writing, all participants can get a free Microsoft 365 E5 developer subscription. We then work on a fictitious project for a digital publishing platform, like MSSQLTips.com, and provide Python scripts to handle files in a SharePoint document library. We use the Python library "Office365-REST-Python-Client" (Gremyachev, 2020) to connect SharePoint document libraries. The official site of this Python library provides many useful sample scripts. We can practice these Python scripts in the trial environment as well. The article comprises two parts, and readers who have an Office 365 environment can skip the first part:
In the first part, we go through the steps to create an office 365 account and access a SharePoint Online environment. To simulate a small corporate environment, we install sample data packs. The "Users" sample data pack creates 16 fictitious users, and the SharePoint sample data pack can add several SharePoint site templates. To allow Python programs to access SharePoint document libraries, we disable security defaults. However, at work, we should consult administrators to configure the security for Python program access.
The second part explores steps to establish a connection with a SharePoint document library using the Python library "Office365-REST-Python-Client 2.3.1." We then cover the process to upload a document to a SharePoint document library and add the document’s metadata. Next, we show steps to download and delete files from the library. The document library may have folders and sub-folders. We show a way of searching for files recursively. We also provide sample scripts to handle documents in sub-folders.
The author tested all the programs used in this article with Microsoft Visual Studio Community 2019 and Python 3.7 (64-bit) on Windows 10 Home 10.0 <X64>. During testing, all the Python programs were connected to the SharePoint Online environment provided by the "Microsoft 365 Developer Program." The web browser "Opera" was used to access the SharePoint document library.
1 – Creating a SharePoint Online Environment
An effective strategy to learn computer skills is to learn by doing. During practicing, we may want to separate our studying environments from working environments. This section creates a new environment (a sandbox) using no personal information except a cell phone number. We need a cell phone to receive a verification code when setting up our Microsoft 365 developer subscription. The Office 365 account registration process needs an email account. We sign up for a personalized email at https://www.mail.com. We also download the web browser "Opera" for accessing the information on the Internet.
1.1 Creating a Microsoft Account
At the time of this writing, we can access the landing page of the Microsoft 365 Developer Program at https://developer.microsoft.com/en-us/microsoft-365/dev-program. Figure 1 illustrates the landing page.
Figure 1 The Landing page of the Microsoft 365 Developer Program
After clicking on the button "Join Now" on the landing page, we are asked to sign in to microsoftonline.com, as illustrated in Figure 2.
Figure 2 The Microsoft account sign-in dialog
To create a new Microsoft account, we click on the link button "Create one!" The dialog for creating a Microsoft account appears, as shown in Figure 3. We enter an email address, click on the "Next" button, and then follow the instructions on the screens to complete the registration process. At the end of this step, we should have a Microsoft account.
Figure 3 The dialog to create a Microsoft account
1.2 Participating in the Microsoft 365 Developer Program
We go back to the Microsoft account sign-in page, illustrated in Figure 2. Following the instructions on the screens, we sign into the Microsoft account. During the signing-in process, the program asks us a few questions, as shown in Figure 4. Answer these questions and complete the form.
Figure 4 Join the Microsoft 365 developer program
We click on the button "Next" and follow instructions on the screens to complete the signing-in process. Then, we arrive at the profile page, as shown in Figure 5.
Figure 5 The profile page of the Microsoft Developer Program
1.3Setting up an Office 365 Developer Subscription
As illustrated in Figure 5, the Microsoft 365 developer program membership entitles us to a free Microsoft 365 E5 subscription with all Office 365 apps, including SharePoint, OneDrive, Outlook/Exchange, Teams, Planner, Word, Excel, PowerPoint, and more (Microsoft 365, 2021). We use this subscription to build a SharePoint Online environment and run Python programs. We click on the "SET UP E5 SUBSCRIPTION" button on the web page to start the subscription process. A dialog appears, and it should look like Figure 6.
Figure 6 Set up Microsoft 365 E5 developer subscription
We must select a country (or region), a username, a domain, and a password to complete the form. When the button "Continue" becomes enabled, we click on the button. The "Add phone number for security" dialog appears, as shown in Figure 7. We should use a valid cell phone number to receive a code. After we enter the code, the "Set up" button becomes enabled. We click on the button to continue the sign-up process.
Figure 7 Add phone number for security
We follow instructions on the screens to complete the setup process. Then, we land on a dashboard page, as shown in Figure 8. The username and domain selected in Figure 6 form the administrator account. In the "Sample data packs" section, we can choose to install sample data packs. The Users sample data pack creates 16 fictitious users on the subscription, and the SharePoint sample data pack contains several site templates (Microsoft 365, 2021). We install the User data packs and a SharePoint team site template. The installation process may take a few minutes.
Figure 8 A dashboard in the Microsoft 365 dev center
1.4 Turning off Security Defaults
By default, Microsoft automatically enforces a set of basic identity security mechanisms. To enable Python programs to access SharePoint sites, we disable the security defaults. However, at work, we should consult Office 365 administrators for any changes in security configurations. All configurations in this article are for demonstration. The goal is to show how to use Python scripts to handle files in a SharePoint document library. We present a quick and straightforward way to create a document library for Python applications to access. We should not apply this method in production.
We use the following steps to turn off the security defaults. First, let us sign into the site https://www.office.com/ using the administrator account. We then find the "App launcher" icon on the top left, as shown in Figure 9. Clicking on the icon, we see a list of Apps in a pop-up pane. The "Admin" app icon appears in the pane. Alternatively, we can find the "Admin" app icon in the vertical toolbar on the left.
Figure 9 Click on the App launcher icon to show some frequently used Microsoft Apps
Clicking on the "Admin" icon, we land on the "Microsoft 365 admin center" page, as shown in Figure 10. We can expand the User menu to view the 16 fictitious users. We expect to find the "Azure Active Directory" menu item, but it is not visible. Clicking on the "Show all" button, we see a complete list of menu items.
Figure 10 The landing page of the Microsoft 365 admin center
After clicking on the "Show all" button, we expand the left pane and see the "Azure Active Directo…" menu item, as shown in Figure 11.
Figure 11 Expand the admin centers
When we click on the "Azure Active Directo…" menu item, a pop-up dialog shows up, as shown in Figure 12.
Figure 12 Use administrator account to log in Microsoft Azure
We select the administrator account to access the "Azure Active Directory admin center," and the landing page should look like Figure 13. On the left pane, there is an "Azure Active Directory" link button.
Figure 13 Azure Active Directory admin center
We click on the link button "Azure Active Directory," then choose the "Properties" menu item. The "Manage Security defaults" link button appears, as shown in Figure 14. Clicking on the button, we can disable the security defaults on the right pane. Finally, we click on the "Save" button to save the change. We want to emphasize again that we should consult Office 365 administrators for any security configuration changes. This article introduces a quick way, but not a very practical one, to give access to Python applications.
Figure 14 The process to disable security defaults
1.5 Creating a SharePoint Document Library
SharePoint is a website-based collaboration system that empowers business teams to work together and implement business process automation (Satapathi, 2020). A fictitious company ABC provides a digital publishing platform for writers. The company wants to use a SharePoint document library to manage articles and some corporate documents. After writers submit their articles through a web form, A Python program moves the articles to the SharePoint document library. Then, ABC editors review these articles. If articles do not fit the platform, the editors tag the articles with the "Rejected" label. Otherwise, the editors put the "Approved" label on the articles. Then, another Python program carries these articles to a file server for publication. The program places corporate documents in a sub-folder of the document library.
To create a SharePoint document library for the company ABC, we first need to build a SharePoint site. If we sign out of the site https://www.office.com/, we need to sign in again with the administrator’s account. We click on the "App launcher" icon. On the pop-up pane shown in Figure 9, we select the "SharePoint" app. We then land on a page shown in Figure 15. The "Create site" button on this page allows us to create a SharePoint site using the installed site templates. Microsoft provides SharePoint Online training tutorial: Create a site in SharePoint. Following their instructions, we create a SharePoint site, "ABC," and add a "Publications" document library.
Figure 15 Create a SharePoint Site
We add two columns to the library: Category and Status. The Category column is a free form text field. A writer can determine a category for his/her article during uploading process. Editors can recommend a category for an article. The Status column is a choice column with four options: Submitted, In Process, Rejected, and Approved. The default value of the choice column is "Submitted." After completing the setup, we upload a document to test the library. The document library should look like Figure 16. In the next section, we use Python scripts to upload, download, and delete documents. That section also covers steps to add values to the Category column, create a folder and handle files in the folder.
Figure 16 The SharePoint library with one document
2 – Handling Files in the SharePoint Document Library
At the ABC company, we use a Python program to move the uploaded files from a webserver to the SharePoint document library. After editors review articles and determine to publish them, we use another Python program to move the latest versions of the files from the document library to a file server. We also use Python code to upload some corporate documents to a sub-folder of the library. During development, we created a Python application project using Visual 2019 (Zhou, 2020). We create a virtual environment for this project using this requirement.txt:
certifi==2020.12.5 cffi==1.14.5 chardet==4.0.0 cryptography==3.4.6 idna==2.10 msal==1.10.0 Office365-REST-Python-Client==2.3.1 pip==21.0.1 pycparser==2.20 PyJWT==2.0.1 requests==2.25.1 setuptools==54.1.1 urllib3==1.26.3
The Python library "Office365-REST-Python-Client" comes with many examples, and we can study these examples at https://github.com/vgrem/Office365-REST-Python-Client/tree/master/examples/sharepoint. These examples cover most business requirements we may encounter at work. If we cannot find detailed information about a class or a method, we can view the source code at https://github.com/vgrem/Office365-REST-Python-Client/tree/master/office365/sharepoint. These Python scripts are also useful resources for us to study Python programming. In situation where some operations, for example, upload and download documents are used repeatedly in a project, it is good idea to create helper methods for them.
2.1 Connecting to the SharePoint Library
A document library is a special type of SharePoint list that contains documents and metadata (Katzer & Crawford, 2013). We can use the Python library "Office365-REST-Python-Client" to construct a file object or a list item object. The file object has a property "listItemAllFields" representing the corresponding list item. Likewise, the list item has a property "file" that allows us to access the associated file object. Since we created the document library and uploaded a file to the library already, we now use the following Python code to connect the SharePoint library, list all files, and access metadata through their associated list item properties:
fromoffice365.sharepoint.client_contextimportClientContext SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' USERNAME ='[email protected]' PASSWORD ='******' # 1. Create a ClientContext object and use the user’s credentials for authentication ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Read file entities from the SharePoint document library files = ctx.web.lists.get_by_title(SP_DOC_LIBRARY).root_folder.files ctx.load(files) ctx.execute_query() # 3. loop through file entities for filein files: # 4. Access the file object properties print(file.properties['Name'], file.properties['UniqueId']) # 5. Access list item object through the file object item = file.listItemAllFields ctx.load(item) ctx.execute_query() print('Access metadata - Category: {0}, Status: {1}'.format(item.properties['Category'], item.properties['Status'])) # 4. The Output: # File Handling in SharePoint Document Library Using Python.docx 77819f08-5fbe-450f-9f9b-d3ae2862cbb5 # Access metadata - Category: Python, Status: Submitted
As aforementioned, a document library is a special type of SharePoint list. We also can find all list items in the library and then access the associated file objects. Like the preceding Python code block, we create a "ClientContext" object to represent the target SharePoint resources and use the user credentials for authentication. Then, the Python code retrieves a SharePoint list from the document library. A list item has an associated file object; therefore, we can access all documents in a library by accessing all the list items.
fromurllibimportparse fromoffice365.sharepoint.client_contextimportClientContext SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' USERNAME ='[email protected]' PASSWORD ='******' # 1. Create a ClientContext object and use the user’s credentials for authentication ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Retrieve a SharePoint list from the document library oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) items = oList.items ctx.load(items) ctx.execute_query() # 3. Look through list items for itemin items: # 4. Access list item properties. Parse file unique Id from the ServerRedirectedEmbedUri property of the list item dic_query_string =parse.parse_qs(parse.urlsplit(item.properties['ServerRedirectedEmbedUri']).query) unique_id = dic_query_string['sourcedoc'][0][1:-1] print('Unique Id:{0}, Category: {1}, Status: {2}'.format(unique_id, item.properties['Category'], item.properties['Status'])) # 5. Retrieve the file object from the list item file = item.file ctx.load(file) ctx.execute_query() print('Unique Id:{0}, Name: {1}'.format(file.properties['UniqueId'], file.properties['Name'])) # 6. The output: #Unique Id:77819f08-5fbe-450f-9f9b-d3ae2862cbb5, Category: Python, Status: Submitted #Unique Id:77819f08-5fbe-450f-9f9b-d3ae2862cbb5, Name: File Handling in SharePoint Document Library Using Python.docx
We usually place all settings in a standalone configure file. Gremyachev gives an example of a setting file. We can also use Python ConfigParser to read configuration data from a file with a structure like Microsoft Windows INI files (Bodnar, 2020). Another method is to place the configuration data in a JSON file (TutsWiki, 2020). However, the code we presented here did not use any of these methods but put all Python statements in a single file. This coding style is for demonstration, and we should not apply this style in actual development.
2.2 Uploading Files to the Document Library
When a writer uploads his/her article to ABC company, the writer chooses the category "R Language" for the article. The web application places the article on the webserver. A Python program should transport this file to the SharePoint library and tag this article with the category "R Language." The program shown in the following code block implements these steps.
importos fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' USERNAME ='[email protected]' PASSWORD ='******' # Assuming the file path on the webserver file_path ="C:/temp/articles/Index Numbers Explained with Examples in R.docx" file_name =os.path.basename(file_path) # 1. Create a ClientContext object and use the user’s credentials for authentication ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Read the content of the file with open(file_path,'rb')as file_obj: file_content = file_obj.read() # 3. Upload the content to the SharePoint document library sp_folder = ctx.web.lists.get_by_title(SP_DOC_LIBRARY).root_folder sp_file = sp_folder.upload_file(file_name, file_content) ctx.execute_query() # 4. Add metadata to the document by assigning a value to the Category property item = sp_file.listItemAllFields item.set_property("Category","R Language") item.update() ctx.execute_batch()
We execute the code through Studio 2019 and then check the document library using the "Opera." The document "Index Numbers Explained with Examples in R.docx" should be added to the library. The Status column shows the default value "Submitted," and the Category column of the document should have a value of "R Language," as shown in Figure 17.
Figure 17 Update Metadata of a file in the SharePoint library
2.3 Downloading Files from the Document Library
At the company ABC, the editors review articles and correct spelling, punctuation, and grammatical errors. When they work on an article in the document library, they change the article status from "Submitted" to "In Process." Then, they may change the status to "Approved" or "Rejected" when they complete reviewing. Next, a Python program moves the approved article to a file server for publication. The program first searches all articles with status "Approved" status. Then, the program downloads all these articles to the file server. The following Python code implements these steps.
importos fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery fromoffice365.sharepoint.files.fileimportFile SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' USERNAME ='[email protected]' PASSWORD ='******' FILE_SERVER_PATH ="C:/temp/publications/" # 1. Create a ClientContext object to connect the document library ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Build a CAML query to find all list items that associate the approved articles qry_text ='''<Where> <Eq> <FieldRef Name='Status'/> <Value Type='Text'>Approved</Value> </Eq> </Where>''' caml_query =CamlQuery.parse(qry_text) caml_query.FolderServerRelativeUrl = SP_DOC_LIBRARY # 3. Retrieve list items based on the CAML query oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) items = oList.get_items(caml_query) ctx.execute_query() # 4. Loop through all list items for itemin items: # 5. Query the SharePoint document library to find the associated file of a list item file = item.file ctx.load(file) ctx.execute_query() # 6. Construct the file path on the file server download_file_path =os.path.join(FILE_SERVER_PATH, file.properties['Name']) # 7. Download file from the document library to the file server with open(download_file_path,'wb')as download_file: download_file.write(file.read()) print('File downloaded :{0}'.format(download_file_path))
The code block also shows the CAML syntax. The Collaborative Application Markup Language (CAML), an XML-based language, can query a SharePoint list (McCarthy, 2017). In the preceding code block, the simple CAML query definition defines a filter where the column field equals a value of "Approved" using the "Eq" element. We also can use CAML to customize SharePoint sites (Juvonen & Caputo, 2014).
2.4 Deleting Files from the Document Library
The ABC company does not want to maintain the rejected articles. The company uses a Python program to remove these files from the document library. Similar to the downloading process, the deleting process searches for all articles with the status "Rejected." Then, the program deletes all these articles.
fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery fromoffice365.sharepoint.files.fileimportFile SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' USERNAME ='[email protected]' PASSWORD ='******' # 1. Create a ClientContext object to connect the document library ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2 Build a CAML query to find all list items that associate the rejected articles qry_text ='''<Where> <Eq> <FieldRef Name='Status'/> <Value Type='Text'>Rejected</Value> </Eq> </Where>''' caml_query =CamlQuery.parse(qry_text) caml_query.FolderServerRelativeUrl = SP_DOC_LIBRARY # 3 Retrieve list items based on the CAML query oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) items = oList.get_items(caml_query) ctx.execute_query() # 4 Loop through all list items for itemin items: # 5 Query the SharePoint document library to find the associated file file = item.file ctx.load(file) ctx.execute_query() # 6 Invoke the delete_object method to delete the file file.delete_object() ctx.execute_query() print('File deleted :{0}'.format(file.properties['Name']))
2.5 Handling Files in SharePoint Document Library Folders
A document library may contain folders, which are similar to folders in the Windows File System. To organize corporate documents, the ABC company wants to group various documents in different folders. In preceding sections, when we uploaded files to the Publication library, the files went to the root folder. We can create folders inside the root folder. These folders may have independent permissions.
2.5.1 Uploading Files to the Document Library Folder
The following Python code accesses the folder "Administrator" in the SharePoint library. If the folder does not exist, the program is going to create it. The Python library "Office365-REST-Python-Client" provides a folder class to represent a SharePoint folder. We use the folder object’s upload file method to upload file content.
importos fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' SP_DOC_LIBRARY_FOLDER ='Administrator' # The folder in the document library USERNAME ='[email protected]' PASSWORD ='******' # 1. Assuming the location of the uploading file file_path ="C:/temp/administration/Request Form2.docx" file_name =os.path.basename(file_path) # 2. Create a ClientContext object to access the SharePoint library ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 3. Access the SharePoint folder or create the folder if it doesn’t exist sp_folder_path ="/{0}/{1}".format(SP_DOC_LIBRARY,SP_DOC_LIBRARY_FOLDER) sp_folder = ctx.web.ensure_folder_path(sp_folder_path).execute_query() # 4. Read the content of the file with open(file_path,'rb')as file_obj: file_content = file_obj.read() # 5. Upload the file to the folder in the SharePoint library sp_file = sp_folder.upload_file(file_name, file_content) ctx.execute_query() # 6. Add metadata to the uploaded document item = sp_file.listItemAllFields item.set_property("Category","Form") item.update() ctx.execute_batch()
2.5.2 Downloading Files from the Document Library Folder
We want to download all forms (files marked with Category "Form") in the Administrator folder. We first create a folder object to represent the document library folder. The folder object tells us the folder's relative URL in SharePoint Online. A CAML query then uses this URL to find all files in the folder. Next, we download the file to the file server.
importos fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery fromoffice365.sharepoint.files.fileimportFile SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' SP_DOC_LIBRARY_FOLDER ='Administrator' # The folder in the document library USERNAME ='[email protected]' PASSWORD ='******' FILE_SERVER_PATH ="C:/temp/forms/" # 1. Create a ClientContext object to connect the document library ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Connect to the SharePoint folder and construct a SharePoint folder object sp_folder_path ="/{0}/{1}".format(SP_DOC_LIBRARY,SP_DOC_LIBRARY_FOLDER) sp_folder = ctx.web.ensure_folder_path(sp_folder_path).execute_query() # 3. Build a CAML query to find all list items that are in the "Form" category qry_text ='''<Where> <Eq> <FieldRef Name='Category'/> <Value Type='Text'>Form</Value> </Eq> </Where>''' caml_query =CamlQuery.parse(qry_text) caml_query.FolderServerRelativeUrl = sp_folder.properties['ServerRelativeUrl'] # 4. Retrieve all list items based on the CAML query oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) items = oList.get_items(caml_query) ctx.execute_query() # 5. Loop through all list items for itemin items: # 6. Query the SharePoint document library to find the associated file file = item.file ctx.load(file) ctx.execute_query() # 7. Construct the file path on the file server download_file_path =os.path.join(FILE_SERVER_PATH, file.properties['Name']) # 8. Download the file from the document library folder with open(download_file_path,'wb')as download_file: download_file.write(file.read()) print('File downloaded :{0}'.format(file.properties['Name']))
2.5.3 Deleting Files from the Document Library Folder
To delete a file from the document library folder, we should first find the list item that contains the file. From the list item, we can construct a file object that provides a delete object method. Next, we invoke the delete method to remove the file from the folder. The following code deletes all forms from the library.
importos fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery fromoffice365.sharepoint.files.fileimportFile SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' SP_DOC_LIBRARY_FOLDER ='Administrator' USERNAME ='[email protected]' PASSWORD ='******' # 1. Create a ClientContext object to connect the document library ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Construct a SharePoint folder object sp_folder_path ="/{0}/{1}".format(SP_DOC_LIBRARY,SP_DOC_LIBRARY_FOLDER) sp_folder = ctx.web.ensure_folder_path(sp_folder_path).execute_query() # 3. Build a CAML query to find all list items that associate to the form qry_text ='''<Where> <Eq> <FieldRef Name='Category'/> <Value Type='Text'>Form</Value> </Eq> </Where>''' caml_query =CamlQuery.parse(qry_text) caml_query.FolderServerRelativeUrl = sp_folder.properties['ServerRelativeUrl'] # 4. Retrieve all list items based on the CAML query oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) items = oList.get_items(caml_query) ctx.execute_query() # 5. Loop through all list items for itemin items: # 6. Query the SharePoint document library to construct a file object file = item.file ctx.load(file) ctx.execute_query() # 7. Invoke the delete_object method to delete the file file.delete_object() ctx.execute_query() print('File deleted :{0}'.format(file.properties["ServerRelativeUrl"]))
2.5.4 Searching for Files in the Document Library Recursively
The ABC company may need to find all files being in the category "Approved." Since a document library has folders, and the folders may have sub-folders, the program should search for files recursively. When constructing a CAML query, we can specify the scope for returning list items and list folders. When we pass the scope "ViewScope.Recursive" to the CamlQuery.parse() function, the function searches for files that satisfy the query in all folders.
fromoffice365.sharepoint.client_contextimportClientContext fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery,ViewScope SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/' SP_DOC_LIBRARY ='Publications' USERNAME ='[email protected]' PASSWORD ='******' # 1. Create a ClientContext object to connect the document library ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) # 2. Build a CAML query to find all list items being in the category "Approved" qry_text =''' <Where> <Eq> <FieldRef Name='Status'/> <Value Type='Text'>Approved</Value> </Eq> </Where>''' # 3. Define the scope "Recursive" that search for files in all the folders. caml_query =CamlQuery.parse(qry_text,ViewScope.Recursive) caml_query.FolderServerRelativeUrl = SP_DOC_LIBRARY # 4. Retrieve list items based on the CAML query oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) items = oList.get_items(caml_query) ctx.execute_query() # 5. Loop through all list items for itemin items: # 6. Query the SharePoint document library to find the associated file file = item.file ctx.load(file) ctx.execute_query() # 7. Find the Server Relative URL of the file file_relative_url =file.properties["ServerRelativeUrl"] # 8. Print the Relative URL of the file print('Relative File Path: {0}'.format(file_relative_url))
Summary
The article demonstrated how to set up a SharePoint Online environment and create a SharePoint document library. We then used Python scripts to handle files in the document library. These scripts also showed that the document library is a SharePoint list. To show how to use these scripts at work, we assumed a fictitious project that enabled business process automation for a digital publishing platform. We provided a proof-of-concept solution to the project.
We covered steps to create a Microsoft account, participate in the Microsoft 365 Developer Program, set up an Office 365 Developer subscription, and create a SharePoint site. For the sake of demonstration and simplicity, we turned off security so that Python programs could access the SharePoint site. We also installed sample data packs so that users could practice Python scripts in a small corporate environment. The article explored a few steps to add a SharePoint document library to the SharePoint site.
Next, the article showed how to connect to the SharePoint document library, upload files to the library, and download and delete files from the library. The Python programs provided a way of adding metadata to an uploaded document. We also demonstrated techniques to access SharePoint document library folders and handle files in the folders. The last piece of the Python code could search for files in the document library recursively.
Reference
Bodnar, J. (2020). Python ConfigParser tutorial. https://zetcode.com/python/configparser/.
Graham, L. (2019). Microsoft 365 Developer Program FAQ. https://docs.microsoft.com/en-us/office/developer-program/microsoft-365-developer-program-faq.
Juvonen, V. & Caputo, L. (2014). Introduction to Collaborative Application Markup Language (CAML). https://docs.microsoft.com/en-us/sharepoint/dev/schema/introduction-to-collaborative-application-markup-language-caml.
Katzer, M. & Crawford, D. (2013). Office 365: Migrating and Managing Your Business in the Cloud. Apress
McCarthy, C. (2017). Learn to Use CAML Query – Part 1. https://www.brightwork.com/blog/caml-joins-caml-syntax-brightwork.
Microsoft (2021). SharePoint. https://www.microsoft.com/en-ca/microsoft-365/sharepoint/collaboration/.
Microsoft 365 (2021). Join the Microsoft 365 Developer Program today! https://developer.microsoft.com/en-us/microsoft-365/dev-program.
Gremyachev, V. (2020). Office365-REST-Python-Client 2.3.1. https://pypi.org/project/Office365-REST-Python-Client/
Zhou, N. (2020). Learning Python in Visual Studio 2019. /sqlservertip/6442/learning-python-in-visual-studio-2019/.
Chan, E. (2013). Scopes in a CAML Query. http://www.codegrimoire.com/2013/08/scopes-in-caml-query.html.
Satapathi, S. (2020). Performing A CRUD Operation On A SharePoint List Using Python. https://www.c-sharpcorner.com/article/performing-crud-operation-on-sharepoint-list-using-python/.
TutsWiki (2020). WRITING AND READING JSON CONFIG FILE IN PYTHON. https://tutswiki.com/read-write-json-config-file-in-python/
Next Steps
- In this article, the author disabled the security defaults so that Python programs could access the SharePoint library. We should consult Office 365 administrators to work out a security solution for the workplaces. The coding style in this article is for demonstration. We should not use this style at work. Lu provides a solution to upload files to Microsoft 365 SharePoint. Even though Lu used another open-source Python library, "SharePlum," his sample code is well-structured. He put all configurations into a JSON file, and he also adopted the object-oriented programming approach. We can find his YouTube video at https://youtu.be/BV6SLhOF3zU.
- Check out these related tips:
- Create a Python Django Website with a SQL Server Database
- CRUD Operations in SQL Server using Python
- Learning Python in Visual Studio 2019
- Python Programming Tutorial with Top-Down Approach
- Recursive Programming Techniques using Python
- Reading SharePoint Lists with Integration Services 2017
- What is Microsoft Power Apps
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2021-04-01