By: Rob Fisch | Updated: 2007-09-07 | Comments (6) | Related: > SharePoint
Problem
Sharepoint is a great collaborative tool, but its internal reporting capability is limited. Wouldn't it be so powerful if you could report against one or more Sharepoint lists from Microsoft's Reporting Services? Or how about just having the ability to integrate the data into some other SQL application and/or data warehouse? There is very little documentation on the internal Sharepoint databases, so how can you gather data from Sharepoint and make meaningful reports using Reporting Services?
Disclaimer This procedure is not for the faint of heart. When reporting directly against a production server, there could be performance implications. I highly recommend doing all testing in a non-production environment. I also recommend you check with your Sharepoint administrator and/or SQL DBA before doing any of this. (Hopefully both of those are you ;-). Furthermore, this procedure will defeat any internal Sharepoint security configurations, so you should be careful to consider what data your are letting out of the bottle. Check with business owner's before going live with anything. This procedure works with Sharepoint services (WSS 2.0 or 3.0). I have no first hand knowledge about any of the full Portal Server products, but I imagine it is similar since the Portal product is based upon WSS. Enough on the disclaimer. Let's proceed... |
Solution
Scope:
The scope of this solution is limited to list data. This would include event/calendars, announcements, custom lists, issues, links, contacts. Document and picture libraries, discussions, wikis, and the like will not be discussed, but it's possible the techniques discussed may help with those as well.
Additionally, as mentioned in the disclaimer, we're not going to be able to get at the security information relating to the list, though we will be able to display the "Created By" and "Modified By" fields.
Materials needed:
- Read only access to the Sharepoint database. This can be WSS2 or WSS3.
- Another SQL database as a working database for reporting.
- Plenty of coffee...and lot's of pizza.
Let's start with getting the data into a place where we can comfortably work on it. Create a new database on a test server.
Export the following tables and/or views from your production Sharepoint database to your new test database.
- Webs
- UserInfo
- Lists *
- UserData *
*IMPORTANT: In the case of WSS 3.0, these (with an asterisk *) are Views, not Tables.
In any event, get all these objects into your test (or working) database as tables.
These four tables contain all the information you need to report on your Sharepoint content. Below is a brief description of these tables/views.
Webs - This is simply a table listing all your Sharepoint "sites" and "sub-sites". (Not site collections.) The important fields are:
- Id
- FullURL
- Title.
UserInfo - What you would expect. Pertinent fields are:
- tp_ID
- tp_Login
- tp_Title
- tp_email.
Lists - This is a table of all the Sharepoint lists in your Sharepoint database. All of them! Needed fields are:
- tp_WebId
- tp_ID
- tp_Title
- tp_Created
- tp_Modified
- tp_Fields
UserData - This is where all the content is stored. All of it! However...not so fast, buster. This is where the hard work begins. Because the content is stored in generically named fields.
What do I mean? I mean, let's say you have a list. Let's say it's a Sharepoint contact list. You will have fields like, FirstName, LastName, Business Phone, Cell Phone, Address, etc. However, if you peak, you will see that the actual database field names are more like nvarchar1, nvarchar2, nvarchar3, nvarchar4, ntext1, ntext2, datetime1, datetime2, int1, int2, int3, etc.
Now your are exploring the limits of Sharepoint. You will see exactly how many datetime fields you can have in a single list, just by looking at the number of datetime fields. (WSS3 extends the capability a bit with a concept called site columns...but that's outside the scope of this discussion as well.)
Mapping the field names
So how to map the named list fields with the generic database fields? Well as it turns out, all this information is stored in a field called tp_fields in the Lists table. But it's an ugly map.
Below is the contents of the tp_fields field for a list I created to illustrate this example. (Don't even try to read this yet. Skip down below it.
<FieldRef Name="ContentTypeId"/><FieldRef Name="Title" ColName="nvarchar1"/><FieldRef Name="_ModerationComments" ColName="ntext1"/><FieldRef Name="File_x0020_Type" ColName="nvarchar2"/><Field Type="Text" DisplayName="Single_line_of_text" Required="FALSE" MaxLength="255" ID="{61337d5c-35eb-4af3-9593-7611a617290c}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text" Name="Single_line_of_text" ColName="nvarchar3" RowOrdinal="0"/><Field Type="Text" DisplayName="Single_line_of_text_two" Required="FALSE" MaxLength="255" ID="{a7da379f-7daf-4fba-b99d-8d9b2f17ade3}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text_2" Name="Single_line_of_text_2" ColName="nvarchar4" RowOrdinal="0" Version="1"><Default/></Field><Field Type="Note" DisplayName="Multiple_lines_of_text_PlainText" Required="FALSE" NumLines="6" RichText="FALSE" Sortable="FALSE" ID="{5450f7ac-53cd-4760-b440-b8539750bdd2}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_PlainText" Name="Multiple_lines_of_text_PlainText" ColName="ntext2" RowOrdinal="0"/><Field Type="Note" DisplayName="Multiple_lines_of_text_RichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="Compatible" Sortable="FALSE" ID="{cdf42c72-2b38-4e45-bc03-811432cb3f6a}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_RichText" Name="Multiple_lines_of_text_RichText" ColName="ntext3" RowOrdinal="0"/><Field Type="Note" DisplayName="Multiple_lines_of_text_EnhanceRichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="FullHtml" IsolateStyles="TRUE" Sortable="FALSE" ID="{875cfe1f-df40-4741-8a70-b0df287a30e7}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_EnhanceRi" Name="Multiple_lines_of_text_EnhanceRi" ColName="ntext4" RowOrdinal="0" Version="1"/><Field Type="Choice" DisplayName="Choice_dropdown" Required="FALSE" Format="Dropdown" FillInChoice="FALSE" ID="{59f15fae-20ee-45bc-aede-fb560e44a9cd}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Choice_dropdown" Name="Choice_dropdown" ColName="nvarchar5" RowOrdinal="0"><Default>Red</Default><CHOICES><CHOICE>Red</CHOICE><CHOICE>Ye llow</CHOICE><CHOICE>Blue</CHOICE></CHOICES></Field><Field Type="Number" DisplayName="Number_column" Required="FALSE" ID="{97935b88-1752-4f87-a571-34d42fe08b5e}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Number_column" Name="Number_column" ColName="float1" RowOrdinal="0"/><Field Type="Currency" DisplayName="Currency_column" Required="FALSE" LCID="1033" ID="{a2803d64-a1b5-4986-a651-fb1fdbe37532}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Currency_column" Name="Currency_column" ColName="float2" RowOrdinal="0"/><Field Type="DateTime" DisplayName="Date_and_Time" Required="FALSE" Format="DateOnly" ID="{3c394c06-51cb-48a6-8eaf-0c373e1c320f}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Date_and_Time" Name="Date_and_Time" ColName="datetime1" RowOrdinal="0"/><Field Type="Boolean" DisplayName="Yes_No" ID="{ffc5b6c2-d3ee-4a6d-9b4a-db938a1f8e7d}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Yes_No" Name="Yes_No" ColName="bit1" RowOrdinal="0"><Default>1</Default></Field><Field Type="User" DisplayName="Person_or_Group" List="UserInfo" Required="FALSE" ShowField="ImnName" UserSelectionMode="PeopleAndGroups" UserSelectionScope="0" ID="{feae87e2-dfb0-45ba-85d6-4d782d064987}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Person_or_Group" Name="Person_or_Group" ColName="int1" RowOrdinal="0"/><Field Type="URL" DisplayName="Hyperlink" Required="FALSE" Format="Hyperlink" ID="{5db05486-d8ff-42bc-82a8-fc6bc3228d88}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Hyperlink" Name="Hyperlink" ColName="nvarchar6" RowOrdinal="0" ColName2="nvarchar7" RowOrdinal2="0"/>
Whatever this stuff is, if you are good with Perl you might be able to parse it. But that's also outside the scope of this article. We're going to do it the dirty way.
The first thing to do is make it somewhat readable. I look for statements that begin with "<FieldRef Name=" or "<Field Type=" and put a line break in order to make the long spaghetti code readable. These mark the beginning of each field. Then I will mark in bold the mappings and make notations where I wanted to make comments. One more thing. My Sharepoint column names are things like "Single_line_of_text", "Multiple_lines_of_text" and "date_and_time", because I wanted to see the mapping of what Sharepoint calls 'field types' to actual database field types. These names are incidental to this discussion and was only done as a matter of interest to me. Hopefully, your Sharepoint field names will be a bit more meaningful.
Here comes the mapping...
<FieldRef Name="ContentTypeId"/><FieldRef Name="Title" ColName="nvarchar1"/>
<FieldRef Name="_ModerationComments" ColName="ntext1"/>
I don't really care about this one, but it's here for the sake of completion.
<FieldRef Name="File_x0020_Type" ColName="nvarchar2"/>
Don't care about this either.
<Field Type="Text" DisplayName="Single_line_of_text" Required="FALSE" MaxLength="255" ID="{61337d5c-35eb-4af3-9593-7611a617290c}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text" Name="Single_line_of_text" ColName="nvarchar3" RowOrdinal="0"/>
<Field Type="Text" DisplayName="Single_line_of_text_two" Required="FALSE" MaxLength="255" ID="{a7da379f-7daf-4fba-b99d-8d9b2f17ade3}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text_2" Name="Single_line_of_text_2" ColName="nvarchar4" RowOrdinal="0" Version="1"><Default/></Field>
Notice that the DisplayName no longer maps to StaticName or Name. That's because I changed the field name in Sharepoint, but Sharepoint keeps a history of the original name. A watchout is to look for DisplayName...not just Name.
<Field Type="Note" DisplayName="Multiple_lines_of_text_PlainText" Required="FALSE" NumLines="6" RichText="FALSE" Sortable="FALSE" ID="{5450f7ac-53cd-4760-b440-b8539750bdd2}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_PlainText" Name="Multiple_lines_of_text_PlainText" ColName="ntext2" RowOrdinal="0"/>
<Field Type="Note" DisplayName="Multiple_lines_of_text_RichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="Compatible" Sortable="FALSE" ID="{cdf42c72-2b38-4e45-bc03-811432cb3f6a}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_RichText" Name="Multiple_lines_of_text_RichText" ColName="ntext3" RowOrdinal="0"/>
<Field Type="Note" DisplayName="Multiple_lines_of_text_EnhanceRichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="FullHtml" IsolateStyles="TRUE" Sortable="FALSE" ID="{875cfe1f-df40-4741-8a70-b0df287a30e7}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_EnhanceRi" Name="Multiple_lines_of_text_EnhanceRi" ColName="ntext4" RowOrdinal="0" Version="1"/>
Note that all types of text that is multple lines map to the ntext database fields.
<Field Type="Choice" DisplayName="Choice_dropdown" Required="FALSE" Format="Dropdown" FillInChoice="FALSE" ID="{59f15fae-20ee-45bc-aede-fb560e44a9cd}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Choice_dropdown" Name="Choice_dropdown" ColName="nvarchar5" RowOrdinal="0"><Default>Red</Default><CHOICES><CHOICE>Red</CHOICE><CHOICE>Ye llow</CHOICE><CHOICE>Blue</CHOICE></CHOICES></Field>
Note the choices for the dropdown are stored here.
<Field Type="Number" DisplayName="Number_column" Required="FALSE" ID="{97935b88-1752-4f87-a571-34d42fe08b5e}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Number_column" Name="Number_column" ColName="float1" RowOrdinal="0"/>
<Field Type="Currency" DisplayName="Currency_column" Required="FALSE" LCID="1033" ID="{a2803d64-a1b5-4986-a651-fb1fdbe37532}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Currency_column" Name="Currency_column" ColName="float2" RowOrdinal="0"/>
<Field Type="DateTime" DisplayName="Date_and_Time" Required="FALSE" Format="DateOnly" ID="{3c394c06-51cb-48a6-8eaf-0c373e1c320f}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Date_and_Time" Name="Date_and_Time" ColName="datetime1" RowOrdinal="0"/>
<Field Type="Boolean" DisplayName="Yes_No" ID="{ffc5b6c2-d3ee-4a6d-9b4a-db938a1f8e7d}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Yes_No" Name="Yes_No" ColName="bit1" RowOrdinal="0"><Default>1</Default></Field>
<Field Type="User" DisplayName="Person_or_Group" List="UserInfo" Required="FALSE" ShowField="ImnName" UserSelectionMode="PeopleAndGroups" UserSelectionScope="0" ID="{feae87e2-dfb0-45ba-85d6-4d782d064987}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Person_or_Group" Name="Person_or_Group" ColName="int1" RowOrdinal="0"/>
<Field Type="URL" DisplayName="Hyperlink" Required="FALSE" Format="Hyperlink" ID="{5db05486-d8ff-42bc-82a8-fc6bc3228d88}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Hyperlink" Name="Hyperlink" ColName="nvarchar6" RowOrdinal="0" ColName2="nvarchar7" RowOrdinal2="0"/>
So much for the mapping. You will use your map to create field aliases in your select statement.
Bringing it home
No all we have to do is link some tables and we're home-free. You can create a view and base your reports on the view of your Sharepoint data. Here's how the tables link up:
- To get the title of your list, link UserData.tp_ListId to Lists.tp_ID
- To get the site information (FullURL & Title) link Lists.tp_WebId to Webs.Id
- To get user information link UserData.tp_Author and UserDate.tp_Editor to UserInfo.tp_Id. You will have to do this with twin aliased versions of UserInfo.
You might end up with something that looks like this:
SELECT USERDATA.NVARCHAR1 AS TITLE,
USERDATA.TP_AUTHOR AS AUTHORID,
USERINFO_EDITORS.TP_TITLE AS EDITORNAME,
USERINFO_AUTHORS.TP_EMAIL AS AUTHOREMAIL,
USERDATA.TP_EDITOR AS EDITORID,
USERINFO_AUTHORS.TP_TITLE AS AUTHORNAME,
USERDATA.TP_MODIFIED,
USERDATA.TP_CREATED,
USERDATA.NVARCHAR2 AS ADDRESS,
USERDATA.NVARCHAR3 AS PHONE,
USERDATA.NTEXT1 AS COMMENTS,
USERDATA.NTEXT2 AS NOTES,
USERDATA.NTEXT3 AS [STATUS REPORT],
LISTS.TP_TITLE AS LIST_TITLE,
USERINFO_EDITORS.TP_LOGIN AS USERLOGIN,
WEBS.FULLURL AS SITEURL,
WEBS.TITLE AS SITETITLE
FROM USERDATA
INNER JOIN LISTS
ON USERDATA.TP_LISTID = LISTS.TP_ID
INNER JOIN USERINFO AS USERINFO_AUTHORS
ON USERDATA.TP_AUTHOR = USERINFO_AUTHORS.TP_ID
INNER JOIN USERINFO AS USERINFO_EDITORS
ON USERDATA.TP_EDITOR = USERINFO_EDITORS.TP_ID
INNER JOIN WEBS
ON LISTS.TP_WEBID = WEBS.ID
WHERE (LISTS.TP_ID = 'the-id-of-your-list')
To find the id of your list for the WHERE part of the statement above, use the following query:
SELECT LISTS.TP_TITLE AS LIST_TITLE,
WEBS.FULLURL AS SITEURL,
WEBS.TITLE AS SITETITLE,
LISTS.TP_ID
FROM LISTS
INNER JOIN WEBS
ON LISTS.TP_WEBID = WEBS.ID
WHERE (LISTS.TP_TITLE = 'name-of-your-list')
Check the FullUrl in case there are duplicates.
Next Steps
- Sharepoint is a great collaborative tool. Now you can make it even better using the robust reporting options in Microsoft SQL Reporting Services. Just think of what you can create!
- Take a look at these other Reporting Services tips
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2007-09-07