Metadata is information about information. For example, the title and author of a book are metadata. SharePoint lets you manage metadata in several ways: adding metadata columns to lists and libraries, creating metadata term sets and terms, and translating terms. Adding metadata is beneficial in several ways.

Some of the benefits of using metadata:

1. Term sets and terms can be controlled, so the metadata values users enter stay consistent throughout.

2. Creation of term sets and terms can be restricted to authorized users.

3. Content can be searched easily by its metadata classification.

4. Metadata navigation can be set up for lists and libraries, so that the tree view follows the metadata term hierarchy.

5. New terms can be added to a term set dynamically as new items are added, which keeps the metadata collection up to date.

 

Scenario and Approach

We implemented managed metadata for one of our customers. This post gives an overview of the scenario and the approach we took to implement the solution.

The customer had a large number of documents and wanted to migrate them to SharePoint so that they would be accessible over the web. Because the collection was large, several GB in total, we felt the documents should also be classified appropriately and made easily searchable. To achieve this, we used a document library with managed metadata columns. Our solution consisted of the following major steps:

  • Analysis of document metadata structure and creation of document library
  • Creation of managed metadata application and metadata term sets
  • Add documents to library and update metadata columns
  • Create Search application and crawling of the site

Each of the steps is explained further below.

Analysis of document metadata structure and creation of document library

1. Browse through the list of documents to identify the possible metadata term groups and terms.
2. Create the necessary metadata term groups, and the terms for each term group.
3. Design the folder structure of the document library according to the customer's needs, taking the metadata term hierarchy into account.
4. Create the metadata columns in the document library and associate each column with its term set.
5. Create a SharePoint solution in Visual Studio and add a new item using the List Definition template.
6. Add the required non-metadata columns to the document library.
7. When adding the metadata columns, create a corresponding note field for each of them. For example, we had Month and Year term sets, so we modified schema.xml to add two note fields associated with the Month and Year columns:

   <ContentTypes>
     <ContentType ID="0x0100290a06a9bb86497fa9961ab70b361c50" Name="DocLibContentType">
       <FieldRefs>
         <FieldRef ID="{bc8ae74a-7cf5-4690-b10d-2884449cd2f5}" Name="Month" />
         <FieldRef ID="{A4560F54-3BCB-42BC-8D35-17607AE20527}" Name="MonthHTField0" />
         <FieldRef ID="{39e1d8a5-17bc-4197-b5d9-f766f07fa4a2}" Name="Year" />
         <FieldRef ID="{246536E8-4468-4FB1-B80F-855F1DEAE4C9}" Name="YearHTField0" />
         …
       </FieldRefs>
     </ContentType>
     <ContentTypeRef ID="0x0100290a06a9bb86497fa9961ab70b361c50" />
   </ContentTypes>
   <Fields>
     <Field Type="TaxonomyFieldType" DisplayName="Month" ShowField="Term1033" Required="FALSE" ID="{bc8ae74a-7cf5-4690-b10d-2884449cd2f5}" StaticName="Month" Name="Month" Mult="FALSE" Group="DMSGRP"></Field>
     <Field Type="Note" DisplayName="Month_0" StaticName="MonthHTField0" Name="MonthHTField0" ID="{A4560F54-3BCB-42BC-8D35-17607AE20527}" Mult="FALSE" Hidden="TRUE" DisplaceOnUpgrade="TRUE" Group="DMSGRP" Required="FALSE" />
     <Field Type="TaxonomyFieldType" DisplayName="Year" ShowField="Term1033" Required="FALSE" ID="{39e1d8a5-17bc-4197-b5d9-f766f07fa4a2}" StaticName="Year" Name="Year" Mult="FALSE" Group="DMSGRP"></Field>
     <Field Type="Note" DisplayName="Year_0" StaticName="YearHTField0" Name="YearHTField0" ID="{246536E8-4468-4FB1-B80F-855F1DEAE4C9}" Mult="FALSE" Hidden="TRUE" DisplaceOnUpgrade="TRUE" Group="DMSGRP" Required="FALSE" />
   </Fields>

Make sure that the GUIDs of Month, MonthHTField0, etc. in the <FieldRef> nodes are the same as the GUIDs of the corresponding Month, MonthHTField0, etc. <Field> nodes.
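A GUID mismatch is easy to miss by eye, so it can help to compare the IDs mechanically before deploying. The sketch below is not part of the original solution: the class name SchemaGuidCheck, the method FindUnmatchedFieldRefs and the sample XML are hypothetical, and it assumes every <Field> and <FieldRef> element carries an ID attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SchemaGuidCheck
{
    // Returns the Name of every <FieldRef> whose ID has no matching <Field> ID.
    // Assumption: every Field and FieldRef element has an ID attribute.
    public static List<string> FindUnmatchedFieldRefs(string schemaXml)
    {
        XDocument doc = XDocument.Parse(schemaXml);

        // Parsing the IDs as Guid makes the comparison ignore letter case.
        // LocalName is used so the check works with or without an XML namespace.
        HashSet<Guid> fieldIds = new HashSet<Guid>(
            doc.Descendants()
               .Where(e => e.Name.LocalName == "Field")
               .Select(e => Guid.Parse((string)e.Attribute("ID"))));

        return doc.Descendants()
                  .Where(e => e.Name.LocalName == "FieldRef")
                  .Where(e => !fieldIds.Contains(Guid.Parse((string)e.Attribute("ID"))))
                  .Select(e => (string)e.Attribute("Name"))
                  .ToList();
    }

    public static void Main()
    {
        // Trimmed-down stand-in for schema.xml; the second FieldRef ID is deliberately wrong.
        string sample = @"
<List>
  <ContentTypes>
    <ContentType ID='0x01' Name='Demo'>
      <FieldRefs>
        <FieldRef ID='{bc8ae74a-7cf5-4690-b10d-2884449cd2f5}' Name='Month' />
        <FieldRef ID='{11111111-1111-1111-1111-111111111111}' Name='MonthHTField0' />
      </FieldRefs>
    </ContentType>
  </ContentTypes>
  <Fields>
    <Field Type='TaxonomyFieldType' ID='{BC8AE74A-7CF5-4690-B10D-2884449CD2F5}' Name='Month' />
    <Field Type='Note' ID='{A4560F54-3BCB-42BC-8D35-17607AE20527}' Name='MonthHTField0' />
  </Fields>
</List>";

        foreach (string name in FindUnmatchedFieldRefs(sample))
        {
            Console.WriteLine(name); // prints "MonthHTField0"
        }
    }
}
```

In a real schema.xml, FieldRefs that point at built-in columns which are not redefined in <Fields> would also be reported, so the result is a list of candidates to review rather than a list of definite errors.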

8. Connect the note fields to the metadata fields inside the FeatureActivated event

using System;
using System.Linq;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Taxonomy;

// Both methods belong in the feature receiver class of the document library feature.
public override void FeatureActivated(SPFeatureReceiverProperties properties)
{
    // Field IDs; these must match the GUIDs used in schema.xml
    const string MONTHFIELDID = "{bc8ae74a-7cf5-4690-b10d-2884449cd2f5}";
    const string MONTHNOTEFIELDID = "{A4560F54-3BCB-42BC-8D35-17607AE20527}";
    const string YEARFIELDID = "{39e1d8a5-17bc-4197-b5d9-f766f07fa4a2}";
    const string YEARNOTEFIELDID = "{246536E8-4468-4FB1-B80F-855F1DEAE4C9}";

    // The cast yields null unless the feature is scoped to the site collection
    SPSite site = properties.Feature.Parent as SPSite;

    ConnectNoteField(site, MONTHFIELDID, MONTHNOTEFIELDID, "Month");
    ConnectNoteField(site, YEARFIELDID, YEARNOTEFIELDID, "Year");
}

private void ConnectNoteField(SPSite site, string fieldid, string notefieldid, string termsetname)
{
    if (site == null)
        return;

    SPList list = site.RootWeb.Lists["DMSLibrary"];

    // Get the taxonomy field from the document library
    TaxonomyField field = list.Fields[new Guid(fieldid)] as TaxonomyField;
    if (field == null)
        return;

    // Attach the note field to the metadata field
    field.TextField = new Guid(notefieldid);

    // Point the field at our term store
    TaxonomySession session = new TaxonomySession(site);
    if (session.TermStores.Count > 0)
    {
        TermStore ts = session.TermStores[0];
        bool createdNew = false;

        // Create the term group if it does not exist yet
        Group group = ts.Groups.FirstOrDefault(x => x.Name == "DMSGRP");
        if (group == null)
        {
            group = ts.CreateGroup("DMSGRP");
            createdNew = true;
        }

        // Create the term set in the group if it does not exist yet
        // (terms can be checked and created in the same way)
        TermSet termSet = group.TermSets.FirstOrDefault(s => s.Name == termsetname);
        if (termSet == null)
        {
            termSet = group.CreateTermSet(termsetname);
            createdNew = true;
        }

        if (createdNew)
            ts.CommitAll();

        // Bind the field to the term store and term set
        field.SspId = ts.Id;
        field.TermSetId = termSet.Id;
    }

    // Save the changes to the field
    field.Update();
}

Note that it is important to create a note field for every metadata column, and that the GUIDs of the FieldRefs and Fields must match. The note fields must also be connected to the metadata fields in the FeatureActivated event of the document library feature, as in the code sample above. Only then can the metadata values of the library's metadata columns be updated programmatically.

9. Deploy the document library with the above structure into the SharePoint site.

When a library is created from this definition, it comes with the necessary fields and metadata columns. Next we need to add the metadata term sets and terms, and set the metadata values for each of the library items.

Creation of managed metadata application and metadata term sets

1. Create a Managed Metadata Service Application using Central Administration.

2. Open the application and create a term group named “DMSGRP”, the group name specified in the XML schema above.

3. Create the necessary term sets, and the terms for each term set, through the metadata application in Central Administration.

Our basic solution, with the document library and the metadata term group and term sets, is now in place. Next we need to add the client's documents to the document library and devise a way to update the metadata columns.

Add documents to library and update metadata columns

1. Add the required documents to the document library, keeping the necessary folder structure in place.

2. At this stage the metadata columns are empty; they need to be filled in for the added library items.

3. We can devise logic that parses the document name or the folder structure to determine the metadata values for an item. For example, if the document is named Invoice_Apr2013, the Year metadata value is 2013 and the Month metadata value is Apr or April. Note that the month values should be consistent throughout: if one item uses Apr, every item should use Apr, and likewise for April. Terms enforce this consistency, which is one of the advantages of managed metadata.

4. We should also keep the same naming convention going forward, so that the necessary metadata values can be derived from each document as it is uploaded.

5. A tool can be created to go through the document library items and fill in the values of the metadata columns. If any metadata terms are new, the tool should also add them to the term store.
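The parsing logic from step 3 can be sketched as follows. This is a minimal illustration rather than the code used in the project: the class FileNameMetadata and the method TryGetMonthAndYear are hypothetical names, and the sketch assumes English month names followed by a four-digit year (19xx or 20xx), optionally separated by an underscore, hyphen or space. It normalizes every month to its three-letter abbreviation so that Apr and April map to the same term.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameMetadata
{
    // Canonical month values; "April" and "APR" are both normalized to "Apr".
    private static readonly string[] Months =
        { "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };

    // A word starting with a month abbreviation, an optional separator, then a four-digit year.
    // The lookbehind stops the month from matching in the middle of a longer word.
    private static readonly Regex MonthYear = new Regex(
        @"(?<![A-Za-z])(?<month>" + string.Join("|", Months) + @")[a-z]*[_\- ]?(?<year>(?:19|20)\d{2})(?!\d)",
        RegexOptions.IgnoreCase);

    // Returns false when the file name contains no month/year pair.
    public static bool TryGetMonthAndYear(string fileName, out string month, out string year)
    {
        month = null;
        year = null;

        Match match = MonthYear.Match(fileName ?? string.Empty);
        if (!match.Success)
            return false;

        string found = match.Groups["month"].Value;
        month = Array.Find(Months, m => m.Equals(found, StringComparison.OrdinalIgnoreCase));
        year = match.Groups["year"].Value;
        return true;
    }

    public static void Main()
    {
        string month, year;
        if (TryGetMonthAndYear("Invoice_Apr2013", out month, out year))
        {
            Console.WriteLine(month + " " + year); // prints "Apr 2013"
        }
    }
}
```

The pattern is deliberately loose: any word that begins with a month abbreviation counts as a month, so a stricter list of accepted names may be needed if the file names contain such words directly before a year.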

The following code snippet sets the Month metadata value for a library item:

// oList, oListItemAvailableTemp, fileName and cSite come from the surrounding tool code
SPListItem oListItemAvailable = oList.GetItemById(Convert.ToInt32(oListItemAvailableTemp["ID"]));

// Use some logic to get the month from the document name
string month = GetMonth(fileName);

// Similarly retrieve the other metadata values

TaxonomyField taxFieldMonth = oListItemAvailable.Fields["Month"] as TaxonomyField;
Term termMonth;

// Only set the value when the term could be found or created in the term store
if (taxFieldMonth != null && !string.IsNullOrEmpty(month) && AddTerm(cSite, "Month", month, out termMonth))
{
    TaxonomyFieldValue taxFieldValueMonth = new TaxonomyFieldValue(taxFieldMonth);
    taxFieldValueMonth.TermGuid = termMonth.Id.ToString();
    taxFieldValueMonth.Label = termMonth.Name;
    oListItemAvailable["Month"] = taxFieldValueMonth;
}

// Update the other metadata column values

// Finally update the list item and list
oListItemAvailable.Update();
oList.Update();

// Adds the term to the term store if it does not exist yet; otherwise retrieves the existing term.
// Returns false only when no term store is available.
private bool AddTerm(SPSite site, string termSetName, string termValue, out Term createdTerm)
{
    createdTerm = null;

    TaxonomySession session = new TaxonomySession(site);
    if (session.TermStores.Count == 0)
        return false;

    // Get term store values
    TermStore ts = session.TermStores[0];

    Group group = ts.Groups.FirstOrDefault(x => x.Name == "DMSGRP");
    if (group == null)
        throw new Exception("Group was not found in the termstore");

    // Look up the term set by name, ignoring case and surrounding whitespace
    TermSet termSet = group.TermSets.FirstOrDefault(
        s => string.Equals(s.Name.Trim(), termSetName.Trim(), StringComparison.OrdinalIgnoreCase));
    if (termSet == null)
        termSet = group.CreateTermSet(termSetName);

    // Look up the term the same way; create and commit it if it is missing
    Term term = termSet.Terms.FirstOrDefault(
        s => string.Equals(s.Name.Trim(), termValue.Trim(), StringComparison.OrdinalIgnoreCase));
    if (term == null)
    {
        term = termSet.CreateTerm(termValue, 1033);
        ts.CommitAll();
    }

    createdTerm = term;
    return true;
}

The above code can be used in a standalone tool that fills in the metadata columns of all document library items at once. It can also be placed in an event receiver (for example ItemAdded or ItemUpdated), so that the metadata values are updated automatically whenever a document is added or a library item changes.

Once the tool has filled in the existing items and the event receivers are in place, the metadata values of any newly added documents are kept up to date going forward.

The images below show the document library and metadata columns after creation.

This image shows the email correspondence document library structure with a Company Name and Year folder hierarchy.

 

Document Library Folder Structure


This image shows the link between the SharePoint column and Metadata term value.

Metadata Column and Linkage


All that remains is to enable search and crawl the site collection.

Create Search application and crawling of the site

1. Create the Search service application in Central Administration.

2. Create a content source and run a full crawl of the entire site, so that the document library contents are scanned and indexed for search.

3. Set up a daily crawl schedule so that changed documents are crawled automatically.

4. The document library can now be searched by any of the metadata values, or by the values of any other fields of the library items.

Example searches: searching for the Month metadata value ‘Apr’ or the Year metadata value ‘2013’ should retrieve all documents whose Month metadata is ‘Apr’ or whose Year metadata is ‘2013’.

This post described the requirements of one of our projects and the approach we followed to implement the Taxonomy/Managed Metadata functionality of SharePoint 2013 for a document library. In this way, documents from a Windows file server can be migrated to a SharePoint library, assigned metadata, and made searchable by users.