The term metadata has been thrown around since the late 20th century. But it has claimed a growing share of tech headlines over the past few years, which makes it important to understand what the concept means for your personal data, privacy, and security.
Before making your decision on whether metadata deserves all the attention it’s been getting, you first need to understand what it means and how it can affect your digital life. So what is it? What can metadata reveal about you? And what can you do about it?
A literal translation of the word metadata is “about data”. While metadata is rarely categorized as useful data on its own, it’s often a summary of a much larger data set—anything from an audio file and communication to images and videos. But metadata isn’t just a useless addition to an already complete set of information.
You can think of metadata as the information on the outside of a book along with the table of contents. They don’t spoil the entire book; they allow you to categorize it properly without having to read the whole thing.
As for types, metadata is often categorized depending on the type of information it reveals about the source file. A single file can contain more than one type of metadata in order to allow electronic systems, and users alike, to better organize and categorize files.
As the name suggests, descriptive metadata describes the contents of the file in question. The information within descriptive metadata is typically used for filtering and searching through a large library of files—often of the same type.
It’s the most commonly used type of metadata. Descriptive metadata generally includes the file creator’s name, the date of creation, and other identifying details: the genre, album, and even cover art for an audio file, or the ISBN and author’s name for a book.
Structural metadata provides information about the composition and layout of the data within a specific file. While this information can be used for filtering, it’s often dedicated to more in-depth exploration and categorization of files.
Structural metadata comes in a variety of types such as the length of an audio file, the number of pages in a book, the table of contents, and the titles of chapters.
Administrative metadata is technical in nature. It contains information on how to open and run the file, including info like the file’s format. This type of metadata is present in almost all files and is read by your device and the software or app you use to run the file.
In some instances, administrative metadata is also categorized as rights metadata, covering information regarding the intellectual property of the file and who has rightful access to it.
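To make the technical side of administrative metadata concrete, here is a minimal Python sketch that reads a few details your operating system keeps about any file: its size, its last-modified time, and a format guessed from the extension. The function name describe_file and the chosen fields are illustrative assumptions, not a standard, and the throwaway file exists only so the example runs anywhere.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a few filesystem-level metadata fields for the file at `path`."""
    info = os.stat(path)
    # The type is guessed from the file extension only; the contents are never read.
    guessed_type, _ = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
        "guessed_type": guessed_type,
    }


if __name__ == "__main__":
    # Create a throwaway file so the example does not depend on your own files.
    with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
        handle.write(b"hello")
    print(describe_file(handle.name))
    os.remove(handle.name)
```

Notice that none of these fields required opening the file’s contents, which is exactly what makes metadata so easy to collect.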
Legal metadata strictly provides information regarding the file’s legal status. This includes who or what holds the copyright of the file, the type of public or private licensing it carries, and any additional legally binding agreements.
Regardless of what file format you’re using, and whether you created the file, got it from a friend, or downloaded it, metadata plays a role in your everyday digital life. And while the information metadata contains may be brief and mostly insignificant on its own, it can be manipulated and patched together to breach your privacy and security.
If the metadata of one or more of your files were ever exposed, it wouldn’t reveal the contents of those files. Instead, it would answer foundational questions such as:
- Who does this file belong to?
- What type of information does it contain?
- Where was it created and saved?
- When was it created and was it edited by the current owner?
But the answers to all of those questions combined still mean very little. How much can this surface-level information reveal about a person?
On its own, information collected from a handful of files and web actions is minuscule.
The problem, however, arises when metadata about one person is collected from thousands of sources over a long period of time. This includes who you most frequently call and email, even if whoever collects the data never sees the contents of your conversations.
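As a rough illustration of how little it takes, the Python sketch below summarizes a tiny call log that contains only contact names and the hour of each call. The log entries, the names in it, and the summarize_calls function are all hypothetical and made up for this example; real tracking systems work with far more data and more sophisticated analysis.

```python
from collections import Counter

# Hypothetical call-log metadata: (contact, hour of day). No call content is included.
CALL_LOG = [
    ("Mom", 19),
    ("Dr. Patel's office", 9),
    ("Mom", 20),
    ("Insurance helpline", 10),
    ("Dr. Patel's office", 9),
    ("Mom", 19),
]


def summarize_calls(log):
    """Count calls per contact and find the most common call hour for each contact."""
    totals = Counter(contact for contact, _ in log)
    hours = {}
    for contact, hour in log:
        hours.setdefault(contact, Counter())[hour] += 1
    usual_hour = {
        contact: counts.most_common(1)[0][0] for contact, counts in hours.items()
    }
    return totals, usual_hour


if __name__ == "__main__":
    totals, usual_hour = summarize_calls(CALL_LOG)
    for contact, count in totals.most_common():
        print(f"{contact}: {count} calls, usually around {usual_hour[contact]}:00")
```

Even with six records and no conversation content, the summary already hints at a family routine and repeated contact with a doctor’s office.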
With the introduction of metadata tracking of new types of information, such as images, more information about you is exposed. Regular selfies and photos of your dinner uploaded to Twitter and Instagram can reveal the locations you frequent the most, even if you don’t tag them—that’s why your camera app requests access to your location.
On its own, exposed metadata is a privacy breach. It allows anyone with access to it to track your movements and communication patterns. But with enough information and a well-made AI system, they can even begin to predict your upcoming movements and activities.
While you may find it harmless for someone to track when you talk to your friends and family, the conclusions reached by advanced analytics systems can be more invasive. After all, companies that track you now know when you contact your healthcare and insurance providers, and what type of information you were searching for online.
A survey by Security.org looked into the type of data the biggest websites on the internet collect, even when they don’t have to. The survey included social media websites like Facebook and Twitter, and even Google, and found that the majority of them kept user information that they didn’t need.
Data included unique identifiers, personal information, location, and user activity. More often than not, you can’t opt out of this type of data collection, even if you allow only the strictly necessary tracking and cookies.
How Do You Protect Yourself?
Use a VPN to mask identifiers such as your IP address, accept as few cookies and trackers as you can when visiting a website, and install anti-tracking browser extensions. As for more personal information, make sure you erase metadata from any file before you upload it to the internet or even send it to a friend.
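For photos, erasing metadata usually means removing the Exif block, which is where cameras and phones store details like GPS coordinates and the capture time. The Python sketch below shows one simplified way to do that for a JPEG by dropping its APP1 segments, the part of the file where Exif and XMP data live. The function name strip_exif is an assumption for this example, and the sketch is not a complete tool: it leaves other metadata such as comments or IPTC fields in place and does not handle unusual files with padding bytes between segments. A dedicated metadata removal tool is the safer choice for real use.

```python
def strip_exif(jpeg_bytes):
    """Return a copy of a JPEG with its APP1 segments (Exif and XMP) removed."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")

    output = bytearray(b"\xff\xd8")
    position = 2
    while position + 4 <= len(jpeg_bytes):
        if jpeg_bytes[position] != 0xFF:
            raise ValueError("unexpected byte where a segment marker should be")
        marker = jpeg_bytes[position + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, so copy the rest as-is.
            output += jpeg_bytes[position:]
            return bytes(output)
        # The two length bytes count themselves but not the two marker bytes.
        length = int.from_bytes(jpeg_bytes[position + 2:position + 4], "big")
        segment_end = position + 2 + length
        if marker != 0xE1:  # 0xE1 is APP1, where Exif and XMP metadata are stored
            output += jpeg_bytes[position:segment_end]
        position = segment_end
    raise ValueError("truncated JPEG: no start-of-scan marker found")
```

To use it, read a photo in binary mode, pass the bytes to strip_exif, and write the result to a new file so the original stays untouched.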
While it can be near impossible to entirely avoid leaving data tracks online, especially on websites you have accounts on, you can minimize the information they have on you.