An effective metadata approach is a must for enterprises executing data management for thousands of documents every day. Explore what document metadata is, how it works, and why it’s fundamental to every modern organization.
Templafy’s Director of Product Management, Oskar Konstantyner, has been developing our solution and its metadata capabilities for over eight years. Because he is one of the most qualified Templafyers to explore the subject, we asked him to guide us through the most commonly asked metadata questions.
How would you describe metadata in documents?
Metadata isn’t just about the findability of data on web pages or social media for search engine optimization (SEO) purposes. In documents, metadata is everything you don’t see in the document. It’s the structure around what the document is and what it’s about.
There are many different types of metadata. The most common forms of metadata include structural metadata, administrative metadata, and descriptive metadata. They can cover a range of things, such as:
- Who created the document
- Which department the document author works for
- When the document was created
- What the document content is about
When dealing with metadata elements for documents, we’re usually talking about administrative metadata, which includes subtypes such as technical metadata, preservation metadata, and rights metadata.
Metadata can also indicate a document’s classification. For example, it can tell you if the document is confidential, for internal use only, or if the file can be made publicly available.
Interestingly, metadata is often independent of what type of document it is – whether it’s a letter or a pitch deck.
The important thing about metadata is that it provides the vital information needed across different kinds of documents so we can collate, find and control them later on.
What is the difference between metadata, schema, document properties, custom document properties, and custom XMLs?
Basically, document properties, custom document properties, and custom XMLs are all types of metadata that can be found in a document. Metadata is simply an umbrella term for everything that isn't visible in the document.
Document properties are the most basic. They are built into every document you create. The most common examples include:
- Who the author of the document is
- When the document was made
- When the document was last updated
- Which department the document belongs to
If you right-click on a document in Windows and open its properties, you’ll be able to see them, because they are widely recognized across systems and easily accessed.
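To make this concrete, here is a minimal sketch of reading those built-in properties programmatically. It relies on the fact that Office Open XML files such as .docx are zip packages that keep their core properties in a part named docProps/core.xml. The sample values (author name, dates) and the helper names build_sample_package and read_core_properties are made up for illustration, and the package built here contains only that one part, so it is not a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative core-properties part; the values are invented sample data.
CORE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:creator>Jane Doe</dc:creator>
  <dcterms:created>2023-01-15T09:00:00Z</dcterms:created>
  <dcterms:modified>2023-02-01T14:30:00Z</dcterms:modified>
</cp:coreProperties>
"""


def build_sample_package() -> bytes:
    """Create an in-memory zip holding only the core-properties part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("docProps/core.xml", CORE_XML)
    return buf.getvalue()


def read_core_properties(package: bytes) -> dict:
    """Return author, created and modified values from docProps/core.xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))
    fields = {
        "author": "dc:creator",
        "created": "dcterms:created",
        "modified": "dcterms:modified",
    }
    # findtext returns None when a property is absent from the part.
    return {name: root.findtext(path, namespaces=NS) for name, path in fields.items()}


if __name__ == "__main__":
    print(read_core_properties(build_sample_package()))
```

The same read_core_properties function should also work on the bytes of a real .docx file, since it only looks for the docProps/core.xml part.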
Custom document properties are more advanced versions of these basic document properties. They’re usually required by larger enterprises that need to go beyond automatically created metadata. Examples of these include a citizen ID number or a classification of the sensitivity of a document.
When an organization needs metadata that’s bigger and more complex, it tends to require custom XML. Custom XML is all the data that the user should not mess around with. It shouldn’t be touched by humans, only by systems that have the means to understand it. This is especially true for data drawn from large relational sources (e.g., collections of spreadsheets, databases, etc.).
The great thing about custom XML is that it enables more complex data structures. For example, if you’re using XML together with SharePoint, you can handle more advanced data types and make better relationships between elements within the custom XML parts.
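As a rough illustration of what "more complex data structures" can mean, the sketch below parses a small custom XML fragment that nests several related elements under one case. The element names, the urn:example:case-metadata namespace, and the parse_case helper are all hypothetical; a real custom XML part would follow whatever schema the organization defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom XML part describing a case and its related parties.
CUSTOM_XML = """<case xmlns="urn:example:case-metadata" id="C-1001">
  <classification>confidential</classification>
  <parties>
    <party role="client">Acme Ltd</party>
    <party role="counsel">Jane Doe</party>
  </parties>
</case>"""

# Prefix mapping for the made-up namespace used above.
NS = {"m": "urn:example:case-metadata"}


def parse_case(xml_text: str) -> dict:
    """Turn the nested XML into a plain dictionary a system could act on."""
    root = ET.fromstring(xml_text)
    return {
        "case_id": root.get("id"),
        "classification": root.findtext("m:classification", namespaces=NS),
        # Each party element carries a role attribute and a name as text.
        "parties": {
            party.get("role"): party.text
            for party in root.findall("m:parties/m:party", NS)
        },
    }


if __name__ == "__main__":
    print(parse_case(CUSTOM_XML))
```

A flat document property could hold the case ID or the classification, but the nested list of parties with roles is the kind of relationship that needs a structured format like this.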
Metadata is information about data and data sets. Schema, on the other hand, is the map (i.e., structure and layout) for that data.
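One way to picture the distinction is the sketch below, where the schema says which fields may exist and what type each must be, and the metadata is a set of actual values checked against it. The field names, the SCHEMA table, and the validate function are assumptions made for illustration, not a standard.

```python
from datetime import date

# Hypothetical schema: field name -> (expected type, required?).
SCHEMA = {
    "author": (str, True),
    "department": (str, True),
    "created": (date, True),
    "classification": (str, False),
}


def validate(metadata: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata fits."""
    problems = []
    for field, (expected_type, required) in schema.items():
        if field not in metadata:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        if not isinstance(metadata[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    # Fields the schema does not know about are reported as well.
    for field in metadata:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems


if __name__ == "__main__":
    record = {"author": "Jane Doe", "department": "Legal", "created": date(2023, 1, 15)}
    print(validate(record, SCHEMA))
```

Here the dictionary passed to validate is the metadata, while SCHEMA is the map that describes what valid metadata looks like.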
Why is document metadata important for businesses?
Metadata plays a crucial role in an organization’s overall document governance strategy, which aims to minimize risk by ensuring that content across the board undergoes a validation process and complies with company protocols. Generally, a company’s metadata is most useful when it’s based on well-defined taxonomies.
The use of metadata is normal for all enterprises, but the bigger the enterprise, the bigger its metadata management requirements. If you’re a large or document-heavy enterprise of, say, 100,000 employees, you’re probably creating thousands of documents a day. That stacks up very, very quickly.
To have any control over these documents (i.e., permissions, the lifecycle of data, etc.) in terms of what is going out, at what time, and to whom, you need to find and track a massive number of documents and set clear metadata standards. To do this, you need to create some structure in the document.
And that is where the importance of structural metadata comes in: create a structure so you can find the documents you need and control these files. It’s no different than if you’re filing newspapers: you could file the newspaper according to the topic, year, or writer, and then easily find it later.
Metadata is the same. It’s just digital, so you have even more data to sort through, categorize, use and reuse.
What impact does metadata have on enterprise IT infrastructure?
Modern enterprises work with many different applications. Today, an employee could be using 20 different cloud applications in one working day. Very often, what binds these systems together are documents and the metadata within these documents. It's what allows interoperability (i.e., how different systems ‘talk’ to each other and work together).
As documents get passed on from one solution to the next, their metadata (e.g., meta tags, meta descriptions, title tags, etc.) gives each system the context and information it needs to treat the document accordingly.
Metadata gives structure to a document and assigns values that can easily be found, whether that's a case ID, classification, or keywords.
Metadata allows the systems to know everything they need to know about the document: what the document is about, who created it, who it should be sent to, who it shouldn’t be sent to, what its classification is, and so on. It’s as if you went into a shop and the shop owner knew exactly what you needed from the get-go.
If it’s a new document or you have a document without any metadata, then you would have to start all over again, and your systems wouldn’t talk to each other. It’s metadata that enables systems to work more seamlessly together.
How can metadata deliver more productive workflows?
Metadata allows you to search more efficiently. If you didn’t have metadata, searching for the right document or file would be incredibly difficult. You’d have to use some really complex algorithms, machine learning, and artificial intelligence to help you find what you’re looking for.
Again, this is because metadata gives structure to a document and assigns values that can easily be found, whether that's a case ID, classification, or keywords. When powered by the correct metadata, your document will contain all the information you need and show up instantly when you search for it.
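The idea can be sketched as a simple lookup index: each metadata value points straight to the documents that carry it, so a search becomes a dictionary lookup rather than a scan of document contents. The document IDs, field names, and the build_index and search helpers below are invented for illustration; a real document management system would use its own search infrastructure.

```python
from collections import defaultdict

# Invented sample data: document ID -> metadata.
DOCS = {
    "doc-1": {"case_id": "C-1001", "classification": "internal",
              "keywords": ["contract", "renewal"]},
    "doc-2": {"case_id": "C-1002", "classification": "confidential",
              "keywords": ["contract"]},
    "doc-3": {"case_id": "C-1001", "classification": "confidential",
              "keywords": ["invoice"]},
}


def build_index(documents: dict) -> dict:
    """Map each (field, value) pair to the set of document IDs carrying it."""
    index = defaultdict(set)
    for doc_id, metadata in documents.items():
        for field, value in metadata.items():
            # Treat list-like fields (such as keywords) as several values.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for item in values:
                index[(field, item)].add(doc_id)
    return index


def search(index: dict, **criteria) -> set:
    """Return IDs of documents that match every field=value criterion."""
    matches = [index.get((field, value), set()) for field, value in criteria.items()]
    return set.intersection(*matches) if matches else set()


if __name__ == "__main__":
    idx = build_index(DOCS)
    print(search(idx, case_id="C-1001"))
    print(search(idx, keywords="contract", classification="confidential"))
```

Without the metadata, the only way to answer the same questions would be to inspect the content of every document.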
How is metadata used to increase enterprise security across information systems?
All enterprise security relies on metadata. Metadata also tells us exactly what a document is about, allowing us to determine its sensitivity (e.g., classified, internal-use only, public) and what actions are then attached to that classification level.
If you cannot set the metadata, then none of the systems you use will know anything about your document. They won’t know if they should block it, or encrypt it, or send it on. This also makes metadata crucial for any data loss prevention system, such as Azure Information Protection (AIP).
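A minimal sketch of that decision, assuming a made-up policy table, might look like the following. The mapping from classification to action and the choice to block documents with no classification are illustrative assumptions, not how Azure Information Protection or any specific data loss prevention product behaves.

```python
from enum import Enum


class Action(Enum):
    SEND = "send"
    ENCRYPT = "encrypt"
    BLOCK = "block"


# Hypothetical policy table: classification label -> handling action.
POLICY = {
    "public": Action.SEND,
    "internal-use only": Action.ENCRYPT,
    "classified": Action.BLOCK,
}


def decide_action(metadata: dict) -> Action:
    """Pick a handling action from a document's classification metadata."""
    classification = metadata.get("classification")
    # Assumption: with no usable classification the system cannot judge
    # sensitivity, so this sketch blocks the document (fails closed).
    return POLICY.get(classification, Action.BLOCK)


if __name__ == "__main__":
    print(decide_action({"classification": "public"}))
    print(decide_action({}))
```

The last call shows the problem described above: a document without metadata gives the system nothing to base its decision on.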
How does Templafy work with document metadata?
One of the main issues for most modern companies is that they use many different systems, which means they need a lot of metadata to go into their documents.
As we’ve discussed, metadata is great because it allows systems to work together effectively and makes employees’ lives much easier. However, these systems also rely on the user manually inputting the metadata so their documents can be found and used later.
If an enterprise has to rely on end-users putting in metadata every single time, things start to get very difficult. Not only does an employee have to do this correctly, but if they forget, then the company quickly loses control over who is creating what.
You could be creating the exact same document as your colleague sitting right next to you and never know because there’s no way to correlate or trace links between the two documents. Multiply this by 100,000 people, and it’s complete anarchy.
What Templafy does is make sure that metadata is always there. When you create a document with Templafy, it sets the metadata up from the very start. Based on your gating questions, user profile, or where you’re based, our platform automatically generates the metadata you need. It takes the responsibility away from the user, so they don’t need to think about it.
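The general idea of deriving metadata from what is already known, instead of asking the user to type it, can be sketched as below. This is not Templafy’s API or implementation: the generate_metadata function, the profile and gating-answer field names, and the classification rule are all hypothetical.

```python
from datetime import datetime, timezone


def generate_metadata(user_profile: dict, gating_answers: dict, now=None) -> dict:
    """Build document metadata from known context, with no manual input."""
    now = now or datetime.now(timezone.utc)
    metadata = {
        "author": user_profile["name"],
        "department": user_profile["department"],
        "office": user_profile.get("office", "unknown"),
        "created": now.isoformat(),
        "document_type": gating_answers["document_type"],
    }
    # Hypothetical rule: documents said to contain client data are marked
    # confidential; everything else defaults to internal.
    if gating_answers.get("contains_client_data"):
        metadata["classification"] = "confidential"
    else:
        metadata["classification"] = "internal"
    return metadata


if __name__ == "__main__":
    profile = {"name": "Jane Doe", "department": "Sales", "office": "Copenhagen"}
    answers = {"document_type": "quote", "contains_client_data": True}
    print(generate_metadata(profile, answers))
```

Because every field comes from the profile, the answers, or the clock, two colleagues creating the same kind of document end up with consistent, traceable metadata.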
Templafy makes sure the right foundations are in place to enable a company’s systems and processes to work together, helping employees to simplify and streamline their workflow.
Can you talk us through some examples of Templafy in action?
SharePoint is an excellent example of how Templafy helps companies implement effective metadata practices.
SharePoint deals with a huge volume of metadata. There are so many metadata policies that can be set up, such as who can send content to whom, what should be encrypted, when it should be sent out, and whether a document is in draft form, ready to publish, or ready to archive. With so many types of metadata to set up, the chances of a user remembering to do this (or having the time to do this correctly) are slim.
When a user is working with Templafy, all those different metadata types can be set up automatically, requiring no user input. For example, if you’re working in management, Templafy can pre-set metadata so all your documents will be classified. This then automatically sets up all your metadata correctly and adds visual elements into the document, such as a watermark stating ‘classified.’
Here’s another great benefit of Templafy: we bridge the gap between metadata and visual document content. Although metadata is all the things you don’t see in the document, with Templafy we can set the right metadata to trigger corresponding visual elements such as watermarks, headers, or footers.
Another great example of how Templafy works with metadata is with Salesforce. Say you want to create a quote - you could go into Templafy and pull all the data from Salesforce, including the deal size, stage, industry of the account, and so on, and use all that data to create and populate a document based on enterprise data and intel.
By letting Templafy know you’re creating a quote, and are therefore very early in the sales cycle, you allow it to assign a sensitivity classification to the document to make sure it doesn’t get sent out externally. Templafy also adds the appropriate visual markers to signal to your colleagues that it is in a draft phase. It works with you to set up policies in the document’s AIP section that always encrypt any document created for a certain industry or worth a certain amount of money.
Templafy gets data from so many sources, whether that’s your user profile, gating questions, or the document itself, and this means that the possibilities of what we can do for your enterprise regarding metadata are almost endless.