Metadata extraction with Microsoft Syntex

Maciej Wasienczak

The world of AI

In recent years, a major goal for many companies, including Microsoft, has been to make working life easier by letting users harness the power of AI and spend less time on mundane activities.

Microsoft Syntex is a user-friendly, low-code solution that offers simplicity in both comprehension and setup. It leverages artificial intelligence (AI) to provide content understanding, processing, and compliance services. By employing intelligent document processing and advanced machine learning, it seamlessly identifies, organizes, and categorizes documents within your SharePoint libraries.

In this blog, we will look at how to use Microsoft Syntex to create a simple yet effective model that extracts key information from an email and uses it as metadata to enrich our files.

Case for an Email extractor

A big part of what we do at Infotechtion revolves around discovering our clients’ needs and finding the right solution to cover them, be it our own products or tools provided by Microsoft. Our clients archive many different types of documents, and correspondence is one of them. During workshops with our clients, we noticed a need for automation around correspondence, such as .msg files from Outlook that are stored in SharePoint. Storing an email from Outlook is simple: drag it into a SharePoint library, and the file is uploaded and enriched with all corporate metadata values. The challenge begins when we want to apply context metadata to the file (learn more about corporate and context metadata here). In this case, context metadata is information such as the email sender, the receiver, or the date sent or received. Microsoft Syntex is just the right tool for this task.
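To make the idea of context metadata concrete, here is a minimal Python sketch that reads the sender, receiver and sent date from an email. It is only an illustration and not how Syntex works internally. It assumes the message is available as RFC 822 text (Outlook .msg files use a different binary format), and the sample message and the `extract_context_metadata` helper are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical sample message in RFC 822 text form.
# This is a stand-in for illustration; a real Outlook .msg file is a binary format.
SAMPLE_EMAIL = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Date: Mon, 04 Mar 2024 09:30:00 +0000
Subject: Contract renewal

Hi Bob, please find the renewal terms attached.
"""


def extract_context_metadata(raw_email: str) -> dict:
    """Return the sender, receivers and sent date found in the email headers."""
    msg = message_from_string(raw_email)
    sender = getaddresses([msg.get("From", "")])
    receivers = getaddresses(msg.get_all("To", []))
    date_header = msg.get("Date")
    return {
        # getaddresses returns (display name, address) pairs; keep the address only.
        "sender": sender[0][1] if sender and sender[0][1] else None,
        "receivers": [address for _, address in receivers if address],
        "date_sent": parsedate_to_datetime(date_header) if date_header else None,
    }


metadata = extract_context_metadata(SAMPLE_EMAIL)
print(metadata)
```

These three values are the ones we want to end up as columns on the file in SharePoint.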

Model creation

Model creation starts on the Syntex Content Centre site, which comes automatically with the Syntex license.

When we create a model, we have six model types to choose from: three custom models and three models prebuilt for us by Microsoft.

In this blog we won’t go in depth on how the models work, but you can read about them here.

The model that fits our case best is the one called “Teaching method”. It is an “Unstructured document model”, also known as a “Document understanding model”. Select it, then choose the content type you created earlier.

A tip from us

Before we go any further, here’s a good tip. Consider whether you want to connect your model to an existing content type or create a new one. This is essential if you want your emails to carry the corporate metadata values used on the site where they will be stored. Our suggestion is to create a new content type for the model that includes all corporate metadata together with the context metadata the model needs, such as sender, receiver and received date.

The model will inherit the name of the content type, which is why the “Model name” field is not editable.

Add your training documents. Remember that the more documents you train on, the more precise the model will be.

Creating extractors

For this model we will need three different extractors: one for the email sender, one for the receiver and one for the date sent. Remember the context metadata we added to our content type?

You can connect those to your extractors here. Go ahead and label your documents and add explanations. Which explanations you use will depend on what you labelled in your training documents and which values you want to pull out of the email. Read more about extractors and explanations here.
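As a rough mental model, an explanation is a hint, such as a cue phrase or pattern, that tends to appear next to the value you labelled. The Python sketch below mimics that idea with regular expressions. It is not how Syntex evaluates explanations, and the sample text, the `EXPLANATIONS` mapping and the `apply_explanations` helper are all hypothetical names made up for illustration.

```python
import re

# Hypothetical plain-text rendering of an email header block.
EMAIL_TEXT = """\
From: Alice Example <alice@example.com>
Sent: Monday, March 4, 2024 9:30 AM
To: Bob Example <bob@example.com>
Subject: Contract renewal
"""

# Each hypothetical "explanation" pairs a cue phrase at the start of a line
# with a capture group for the value that follows it on the same line.
EXPLANATIONS = {
    "sender": re.compile(r"^From:\s*(.+)$", re.MULTILINE),
    "receiver": re.compile(r"^To:\s*(.+)$", re.MULTILINE),
    "date_sent": re.compile(r"^Sent:\s*(.+)$", re.MULTILINE),
}


def apply_explanations(text: str) -> dict:
    """Return the first value found for each field, or None when the cue is missing."""
    results = {}
    for field, pattern in EXPLANATIONS.items():
        match = pattern.search(text)
        results[field] = match.group(1).strip() if match else None
    return results


extracted = apply_explanations(EMAIL_TEXT)
print(extracted)
```

If a cue phrase varies between emails (for example “Sent:” versus “Date:”), the hint misses, which is the same reason explanations in the model may need adjusting after testing.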

Adjust your explanations as needed and aim for at least 85% accuracy.
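Syntex reports the accuracy score for you in the training view, so there is nothing to compute by hand. Purely to show what an 85% target means in arithmetic terms, here is a small Python sketch that assumes accuracy is the share of training documents where the extracted value matches the label. The sample values and the `extractor_accuracy` helper are hypothetical.

```python
TARGET_ACCURACY = 0.85  # the threshold suggested above


def extractor_accuracy(labelled: list, predicted: list) -> float:
    """Fraction of documents where the predicted value equals the labelled value."""
    if len(labelled) != len(predicted):
        raise ValueError("labelled and predicted must have the same length")
    if not labelled:
        return 0.0
    correct = sum(1 for expected, actual in zip(labelled, predicted) if expected == actual)
    return correct / len(labelled)


# Hypothetical sender labels for seven training emails, and what an extractor returned.
labelled_senders = ["alice", "bob", "carol", "dave", "erin", "frank", "grace"]
predicted_senders = ["alice", "bob", "carol", "dave", "erin", "frank", None]

accuracy = extractor_accuracy(labelled_senders, predicted_senders)
meets_target = accuracy >= TARGET_ACCURACY
print(f"accuracy={accuracy:.2%}, meets target: {meets_target}")
```

Here six of seven documents match, which is just above the target; one more miss would drop the extractor below it.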

Publishing the model

When you have finished training your extractors and are satisfied with how your model works, you can deploy it to the document libraries where correspondence files will be stored. You can read a deeper explanation of how to apply your models to libraries here.

We hope this blog has given you some insight into how Microsoft Syntex can be utilized. Contact us and our experts can help implement Microsoft Syntex in your company!

 © 2024 Infotechtion. All rights reserved 
