Content Declension: Adaptive content for the Hierarchy of Information Needs

Last week, I wrote a piece called, “A Hierarchy of Information Needs.” It described one way to think about how an information seeker’s question, problem, or interest at any moment can, in effect, blind him or her to other content, no matter how it’s formatted, nor how much the content’s creator wants it to be seen. Usually, the need for the “ephemera” of life becomes the dire matter to be resolved first.

That got me thinking about how else one might use the idea of a hierarchy, which led me to this pondering:

Consider how often you, as a content owner, would like your content to be the most important thing in front of an information seeker at a given moment. Let’s say, further, that the content is a full, well-created, powerful “story,” which will bring lasting value to the information seeker, if only you could get it to shift down the hierarchy, from “story” to “reference” to “ephemera.” If only your content could be fleeting, you reason, and if it could wink out of existence as soon as it’s served its purpose, then it would be seen, explored, and valued in all its fullness and glory.

Content “Declension,” or manifesting your content at each level of the “Hierarchy of Information Needs”

I’m going to call this process—of setting off a content cascade through the hierarchy—“Content Declension,” which I will further call just one process of “Content Grammar.”

In many languages (other than English), nouns “decline” to suit the context in which they’re used. They take different prefixes and suffixes, and sometimes they take on entirely different forms, in order to communicate their role in a sentence, roles that are called “cases.” As a basic example of how this process works, you’ll recognize the vestiges of this process in English:

He is the subject, and the subject is about him, and his story is fascinating. 
(Nominative case he declines to dative case him, and then to genitive case his…)

OK, it’s a thin example, but Content Declension is the process of establishing patterns and formats for the different cases (or “contexts”) in which your content appears.

When you are creating content, it is vital to consider how it will be able to satisfy your information seekers’ most immediate needs, while providing paths deeper into the whole content. In one sense this is about creating useful, meaningful abstracts of your content, but it’s also about establishing consistent formats for each level, so that no matter what the underlying content, it will be clear how it all fits together, and where you are at each level of the content’s inherent “hierarchy.”

Let me use a blog post as an obvious example. This is easy-peasy on a printed page, since the article appears in a fixed position and format. In digital publication, however, the content declensions are complex.

When we think of the full article, standing alone on an HTML page, the answer is easy: We have the full “Story” form, with all its parts, in all their glory: All the text, the byline, the images and videos, as well as the comments, contact links for the author, and perhaps legal information, too.

At the side, however, is another box, called “Related Stories,” which is a “Reference” content component. With a glance, you can see other content you may want to read, but you don’t have to go there if you don’t want to. Inside that container are the stories’ “Ephemera” declensions. They probably include the headline, a thumbnail, a lead-in blurb, and maybe the byline. It just depends on how the designers chose the elements.

So all together, in this example, we have to plan three declensions: the full Story, the Reference, and the Ephemera. It is the same content, in three case forms.

It is vital to consider all the contexts in which your content will appear to the information seeker: In sidebar lists, in search results, in printed documents, in content links, and even in URLs. The more you can plan for the contexts in which your content appears, the better you can present it in a form (and format) that will suit the seeker’s present need.

Why is this important? It’s another step in making your content “adaptive” in preparation for “responsive design.”

But there are contexts, and there are contexts.

For the content of our time, there are infinite possibilities for what content is going to show up where, on what platform, in what physical context, and on and on, as we as content strategists are painfully aware. We have also been introduced recently to “responsive design” as a method of resolving some of that uncertainty and “adaptive content” as a way to teach the content about itself, so it can communicate its topics and other meta-properties to the design, so that it can shift.

But I would say that there is an additional property that we have not yet systematized, which is “content context.”

  • What happens when this content is called as a “link?” What do you, the content designer, want to present as the properties in the link?
  • What if the “link” is in a “related links” container? Should it be the same “link” as when it appears in the “Search Results” list? How can the metadata communicate which content ephemera should appear when it appears in one context or another?
  • How can we ensure that when this story is called from a blog post, it declines in one way, and when it’s called from a Twitter feed, it declines differently?
  • What if you want to provide hooks for other contexts, so that related content is served up in some contexts, but not others, when someone else is specifying the display?


A Call for the Next Evolution of Standards: Content Grammar

Content declension, as a standard, would need to address two issues. First, it would require that content experience designers imagine the functions and contexts in which a full version of content might appear, so that a responsive design could address differences in display for different contexts.

But it would also require that we establish a standard system to name these contexts, like any other evolution of markup. We would need to say that a link.related-links would be different from a link.search-results, to be followed by the fields, attributes, or properties that should appear in those cases. Something like that.
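To make that a little more concrete, here is a minimal sketch in Python of how named contexts might map to the fields that appear in each case. Everything in it is an assumption for illustration: the Story fields, the context names (borrowed from the link.related-links and link.search-results idea above), and the decline function are hypothetical, not part of any existing standard.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """Full 'Story' form of a piece of content (field names are illustrative)."""
    headline: str
    byline: str
    blurb: str
    body: str
    url: str


# Hypothetical context names, each mapped to the fields its declension exposes.
DECLENSIONS = {
    "story.full": ["headline", "byline", "body"],
    "link.related-links": ["headline", "url"],
    "link.search-results": ["headline", "blurb", "url"],
}


def decline(story: Story, context: str) -> dict:
    """Return only the fields that the named context calls for."""
    return {name: getattr(story, name) for name in DECLENSIONS[context]}


post = Story(
    headline="Content Declension",
    byline="Example Author",
    blurb="Adaptive content for the Hierarchy of Information Needs.",
    body="(full text of the article)",
    url="/content-declension",
)

print(decline(post, "link.related-links"))
print(decline(post, "link.search-results"))
```

The same underlying content yields a different “case form” depending on which context asks for it, and adding a new context means adding one entry to the table rather than reworking the content.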

As Content Strategy is evolving, we are uncovering new questions and puzzles related to the “substance” of the digital universe, and I think this is an important next phase. By analogy with the “semantic web,” we might call it the “grammatical web.” I expect that if we sit here some more and think at it long enough, we’ll come up with more “Content Inflections,” like “Conjugations.”

Let me know what you think.

 

The PDF Tar Pits: Where content is trapped, struggles, sinks, and dies…

I’m working on a small government web project at the moment, and I was asked to assess the content to propose some content types. As I looked across the landscape, there were very few content types, really. But then, as I continued my survey, I noticed these suspicious, dark black patches. I couldn’t see beneath their surface, so I started probing, moving closer…THEN, I was caught in sticky, black goo that started to pull me under, as panic rose in my throat…

Caught in the PDF Tar Pits

This is web content that was authored in MS Word, converted to portable document format (PDF) files, and then uploaded to the website, rather than loading it into a content management system (CMS) as text and images. PDF document libraries sprawl insidiously across the internet landscape, trapping living, breathing content in their depths, ossifying into solid rock—unusable, un-reusable—until some content strategist chips away the asphalt to discover the bones of content that is probably extinct, or at least years out of date.

I understand how these PDF libraries are created and why. Really, I do. It’s easier to have content owners whack up content however they want, then just toss the PDFs online, rather than spend the time to consider the content carefully, giving it the time, attention, and respect it deserves.

Let me offer, though, some reasons for helping to pull the poor, thrashing, doomed content back out of the tar.

  1. “Oh, PDFs are searchable, so it’s OK to dump…er…upload them.”

    It’s true. Adobe over the years has made provision for lots and lots of embedded metadata, so it’s easier to find them. But while search engines can index PDFs so that they can be found, the real human beings who are searching for that content cannot scan them to see whether it’s the content they’re seeking without opening them. Don’t make your visitors become content excavators. Don’t make them open that PDF to skim it.

  2. “But this is how I want it to look.”

    It’s true. There are many times when our designers spend hours creating beautiful, high-end, printed publications. That’s good. That’s their art and craft. But creating print-ready publications does not release us from the responsibility of making all that content directly accessible as HTML text and images. You can certainly make BOTH available, as indeed you should.

  3. “We’ll just make a content type for PDFs.”

    It’s true. It is indeed important to have content types to represent files in libraries, ready for download. They need to make their metadata available to the CMS, so that appropriate related files can be offered up alongside other primary content. But when text and image content are caught in the PDF tar, their own content type is masked. Unless the content is pulled from the PDF, your CMS cannot manage the true content types correctly. Files of the “PDF” type will be indistinguishable, one from another, all sticky and black as they are.

  4. “This is how we got it from the content owner, so that’s how we’re going to publish it.”

    It’s true. Content owners own their content. (I know, it sounds silly.) They spend a lot of time, laboring in MS Word to format it just so. When they hand over their œuvre to you for posting, you’re stuck between appreciation of their efforts, compassion that they spent so much time on it, and horror that it’s going to require stripping it of all its format before it can be reformatted for the CMS. If you have content owners who are open to the liberation of “just give me the text,” then you can make their lives (and yours) easier, and the content escapes oblivion. If not, then although it means a longer road, reformatting the content will take you safely around the tar.

  5. PDFs of unstructured documents can never be reused as structured content.

    Finally, the most important reason for eschewing the PDF is that when content owners create MS Word documents, they almost never—like, ever!—understand the difference between “format” and “structure.” So they skip blithely through their document, clicking bold here, italic there, and changing fonts and colors according to how they think it will communicate their intentions, without capturing the meaning of those formatting changes in the structure of the content. If unstructured documents are then converted to PDFs and put online, they will be unusable as structured content, and meaningless to semantic search.

Time to Drain the Tar Pits

The simplest guidance you can give your clients, content owners, and stakeholders is to reserve PDF files for content that has been designed to be printed, and then only as a supplement to the live web content. You can probably get away with making PDFs available for content that no one will ever really need, like legal reports and other specific content types that will actually be easier to consume as printed documents rather than as web documents. Even in those circumstances, abstracts of that content should be posted, so that content consumers will be able to preview the documents before committing themselves to downloading them.

Adaptive Content: Our primary platform is burning; Time to jump.

We were honored at our last enterprise web developers’ conference to welcome Karen McGrane (@karenmcgrane) as our first keynote presenter. I have known Karen since we were both attendees of the “Content Strategy Consortium” at the 2009 Information Architecture Summit, and every encounter, every opportunity to listen to her speak, has been an inspiration to me.

Currently, Karen is giving a talk called “Adapting Ourselves to Adaptive Content,” and many of you may have heard her give it as the closing keynote at the Content Strategy Confab 2012 in Minneapolis. For any who haven’t had the pleasure yet, I’d like to review my principal revelations from that marvelous talk.

As our conference theme was vaguely articulated as “mobile,” she addressed herself to the issues of how to ensure that our content plays well, when we have no idea on what sort of device or in what context people may be encountering and consuming our content. But more important than the “how-to” aspects, my main revelation from the talk was how hard it can be for us as content designers and producers to let go of control—to confront and release the idea that our content has a “primary platform,” from which are derived all the formats for the devices and contexts we can imagine and plan for.

Abandoning the “primary platform”

I think the greatest insight I gained from Karen’s adaptive content talk is the idea that historically, all content has been designed and created for a “primary platform,” whose format is well understood. After its initial publication, it must then be reformatted to meet the design realities of any other contexts in which it is to appear.

For example, a slick sales brochure is created as a print document. In this case, the paper page is its “primary platform.” The designer kerns and justifies, styles and tweaks, until a beautiful product has emerged, ready to be handed out at tradeshows or mailed out to prospective donors.

Then someone says, “Hey, we need to get this ‘up on the web,’” and it is (implicitly or explicitly) understood that it should look as much like the printed piece as possible. The brochure is then exported as a PDF, and on some webpage, there is a link to download it.

But then, someone notices that the brochure PDF doesn’t look right on a phone…or a tablet. The display is either too small to read, or it doesn’t rotate well from portrait to landscape. It is handed back to the designers to be “fixed.”

The design team then becomes trapped in an inescapable cycle of creating multiple formats for every content piece, first for print, then for web, then for mobile devices. The need to rework the design for different contexts multiplies the time and cost of creating the content.

Some designers, feeling the pain of the rework process, recommend “designing for mobile first.” But then “mobile” becomes the “primary platform,” and the need for redesigning and reformatting content for other contexts remains.

Responsive Design: Teaching your design to adapt to its surroundings

Ethan Marcotte has sounded the call for “Responsive Web Design,” which, from the visual designer’s perspective, offers a solid approach to putting intelligence into the CSS code, so that a design “knows” what device is calling it and can respond with the appropriate styling and format to match. By incorporating media queries and relative measures, web designers can teach their designs to accommodate a wide range of devices and formats. This brilliant work is revolutionizing the way we make design decisions and write code.

But if “responsive design” is about teaching the design to know the device, “adaptive content,” according to Karen, is about teaching the content to know itself.

γνῶθι σαυτόν: Teaching your content to “know itself”

“Designers are control freaks,” admits Jared Ponchot at Lullabot in a blog post on responsive design. News Flash: So are writers, editors, and other content producers. “Hello. I’m Stephen, and I’m a content control freak.” I can only say that self-knowledge is the first step toward wisdom.

But it’s time to admit that we’re powerless over technology and its users. We can never know enough about our users, their needs, or their devices—let alone how devices will have changed by next year—to teach our content how to adapt to them. Instead, we must build into the content solid information about its structure and meaning, so that we can allow others to make decisions about how it should look and behave.

(It’s probably more like parenting than we care to admit: Parents do their best to rear their children and help them to know themselves, but eventually they must let go and let them be their own adults. They have to stop following them around to make decisions for them. I can hear my mother saying, “But you’ll always be my content…!”)

Karen points to National Public Radio’s “content API,” which streams no design information, but only content and its structure. Because the API doesn’t know anything about devices, devices can present the content according to their native styling instructions. The NPR website has templates to style the content for the main platforms, but application developers can also write native applications to style the content for their particular target devices and contexts. As technology changes, so will the styling, but the content remains well-structured and ready for anything.
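Here is a rough sketch, in Python, of what that separation can look like. It is not NPR’s API: the story dictionary, its field names, and the two render functions are all hypothetical, made up only to show one structured item being styled independently by two different consumers.

```python
import html

# A structured content item: meaning and structure only, no styling.
# (Hypothetical fields, not taken from any real API.)
story = {
    "title": "Time to jump",
    "teaser": "Our primary platform is burning.",
    "paragraphs": ["First paragraph.", "Second paragraph."],
}


def render_web(item: dict) -> str:
    """One consumer: a web template that wraps the structure in HTML."""
    parts = [f"<h1>{html.escape(item['title'])}</h1>"]
    parts += [f"<p>{html.escape(p)}</p>" for p in item["paragraphs"]]
    return "\n".join(parts)


def render_notification(item: dict, limit: int = 60) -> str:
    """Another consumer: a short plain-text alert for a small screen."""
    text = f"{item['title']}: {item['teaser']}"
    return text if len(text) <= limit else text[: limit - 3] + "..."


print(render_web(story))
print(render_notification(story))
```

Neither renderer needed the content to change, and a third one could be written next year for a device that doesn’t exist yet.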

Design can only be “responsive” when content is “adaptive.”

On reflection, I think the primary message of Karen’s talk is that we’ll get the most out of “responsive” design when we learn to make our content “adaptive.” We’ve long said that structure and presentation—content and design—should be independent of one another. Well, folks, it looks like this time we have to mean it. It will require both disciplines—and facing down our control needs—to provide rich content that plays well across the dizzying array of platforms.

Time for a deep breath. Time to jump…

Content Modeling is more than “fields”

When content management folk talk about “content modeling,” they are usually referring to the process of building templates for a CMS. Besides the Content Management Bible by Bob Boiko, which is a great place to see how a lot of CMSes work, I found a series of excellent overviews of the discipline by Deane Barker of Blend Interactive, Inc., at Gadgetopia.

Barker says:

“Content modeling is the process of converting logical content concepts into content types, attributes, and datatypes.”
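For anyone who hasn’t seen it, this technical sense of modeling looks roughly like the sketch below, written in Python. The “how-to article” content type and every attribute on it are my own hypothetical example, not something taken from Barker’s overviews or from any particular CMS.

```python
from dataclasses import dataclass, field, fields
from datetime import date


@dataclass
class HowToArticle:
    """A hypothetical content type: a logical concept reduced to typed attributes."""
    title: str                # attribute with a text datatype
    summary: str              # short abstract for lists and links
    steps: list[str]          # ordered instructions
    published: date           # date datatype
    tags: list[str] = field(default_factory=list)  # optional metadata


# List the attributes and datatypes that make up the content type.
for attribute in fields(HowToArticle):
    print(attribute.name, "->", attribute.type)
```

It is tidy and useful, and it says nothing at all about why the content exists or what people are supposed to do with it.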

In academia, you can find inscrutably technical research on content modeling as the process of identifying the structure of documents algorithmically. (This gem from MIT scintillates! Content Modeling Using Latent Permutations, by Chen, Branavan, Barzilay, and Karger. 2009.)

But if that’s what is meant by “content modeling,” then there are essential aspects missing.

As content strategists, we face this technical view all the time, which I believe is descended from IT disciplines like “data modeling” for database design. We come on the scene talking about content purpose and process, and technologists ask us for template requirements, metadata fields, and data types. In these days of XML standards and the quest for the Holy Semantic Web, we find ourselves pushed into the thick of technical specification before we’ve had a chance to imagine what the content is supposed to be and do, let alone how it should be structured.

Returning to art

In my view, we’d be nearer the truth of “modeling” if we took our cues from other disciplines:

  • When a painter undertakes a monumental work of art, she doesn’t just run in with paintbrushes blazing. She sketches from life. She does études. She makes early decisions about what works and what doesn’t.
  • Murals often begin as drawings in miniature, which are enlarged to scale, then transferred to the wall.
  • The sculptor “models” in clay before casting in bronze.
  • The industrial designer creates digital “models” before production.
  • Developers create prototypes (just “models” by another name) before turning the coders loose.

Models serve as demonstration and instruction to the producers, the assistants, and the artists themselves. They remind and guide. They provide format and boundaries to inspire greater creativity.

Content must be modeled in this creative sense, as well as in the technical sense.

Some suggestions for modeling

  • Banish the “basic page” from your content types. The “webpage” is the content parallel to the “miscellaneous” category in information architecture. Far from being your standard content type, it should be your very last resort.
  • Ask the simple questions. Why are we creating this content form? What are people supposed to do with it? What does that mean for the other kinds of content we produce? How can they be combined into content “super-types?”
  • Do some content studies and sketches. Before you define technical requirements, spend time whipping up some real content to see how it behaves in your domain. If you already have content, gauge the consistency of its form from one piece to the next.
  • Test the usability of your content. Like a user interface, you should see whether people can actually use your content in the way it was intended. Do they get from it what you hoped they would?
  • Define the “rules” for each content type. You’re establishing conventions for the content creators, so they know what they’re doing, and so they can do it consistently over time.

By modeling your content in the artistic sense—by setting the forms and boundaries even before the content is “designed”—all the technical content management exigencies, like “fields” and “data types,” are set in their proper perspective. Templates are simply the mold into which your material is poured and out of which the sculpture emerges, fully formed.

Taxonomy: A “Disambiguation”

I was not able to attend the several workshops on “taxonomy” at the recent WebContent2010 conference (#wcconf) in Chicago: Tough choices were made. Yet I think I got a lot out of those workshops because of the seriously faithful tweeting coming out of them, and when I said so to some new friends, they almost all said, “How? I didn’t understand any of it…overwhelming.” I replied that when you follow a tweetstream, you only see what people understand, already interpreted for you. (Which is a recommendation, really, to follow conferences you can’t attend: Done well, the tweets will give you at least the essential points.)

Amid the summary tweets of the workshops’ content, however, I saw comments such as these:

“A workshop and a session on taxonomy and I’m still confused. Is it just me? #wcconf” – @EvanKittleton

“Ouch. My head hurts. Taxonomy not an easy beast to wrestle. #wcconf” – @cc_holland

A lot of the confusion centered on how the idea of taxonomy relates to—and differs from—other elements of Information Architecture, such as sitemaps and navigation. Are they the same thing? Is it just your metadata?

With the guidance of my best-bud colleague Becky Bristol (@paintingblue) as technical reviewer, I’m going to try to “disambiguate” it, that is, to explain and clarify.

Disclaimer: I’m an explainer, not a taxonomist, so if you’d like to help with the definition, please by all means chime in.

The Roots of Taxonomy

“Taxonomy” is an ancient scientific practice. It means to find names for things. In naming things, you try to figure out how sets of things are related to one another, so that each unique item will not only have a unique name, but also a reference to the others to which it relates.

Taxonomy creates a hierarchy of inheritance, from general down to specific and back: A giant tree, on which there is a unique place for every item, like the leaves at the ends of twigs at the ends of branches connected to a trunk and running deep into the earth.

In order to build a taxonomy in the scientific sense, you have to create a framework that tells you how to name a thing. This is the “schema.” The most famous schema grew out of the work of Carl Linnaeus, an 18th-century Swedish botanist, to categorize and name life on Earth. In its modern form, it has eight major taxonomic ranks:

Domain -> Kingdom -> Phylum (zoology)/Division (botany) -> Class -> Order -> Family -> Genus -> Species

If you’re REALLY geeky, you can lay it out in Latin:

Regio -> Regnum -> Phylum/Divisio -> Classis -> Ordo -> Familia -> Genus -> Species

There are only certain terms you can put into those fields. Imagine drop-down boxes from which you MUST choose. Let’s try it on ourselves, humans:

Domain   Kingdom   Phylum    Class     Order     Family     Genus  Species
Eukarya  Animalia  Chordata  Mammalia  Primates  Hominidae  Homo   H. sapiens

When the terms don’t apply at a certain point, then you get to pick a new term, which at that point, creates a new branch. If you find a new item in nature, something that hasn’t been named before, you get to name it yourself, but you will use the same set of terms down the tree as far as you can to demonstrate your new species’s relationship to all other life.
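A small Python sketch may make the “same terms down the tree as far as you can” idea easier to see. The rank list uses the zoological name “phylum,” and the chimpanzee and house cat entries are standard classifications that I’ve added for comparison; the shared_ranks function is just my own illustration of where two branches part ways.

```python
RANKS = ["domain", "kingdom", "phylum", "class", "order", "family", "genus", "species"]

# Each organism is classified by one term per rank, from general to specific.
human = ["Eukarya", "Animalia", "Chordata", "Mammalia", "Primates", "Hominidae", "Homo", "H. sapiens"]
chimpanzee = ["Eukarya", "Animalia", "Chordata", "Mammalia", "Primates", "Hominidae", "Pan", "P. troglodytes"]
house_cat = ["Eukarya", "Animalia", "Chordata", "Mammalia", "Carnivora", "Felidae", "Felis", "F. catus"]


def shared_ranks(a: list, b: list) -> list:
    """Walk down the tree and collect the ranks two organisms have in common."""
    shared = []
    for rank, term_a, term_b in zip(RANKS, a, b):
        if term_a != term_b:
            break  # the branches diverge here
        shared.append((rank, term_a))
    return shared


print(shared_ranks(human, chimpanzee)[-1])  # deepest rank shared with the chimpanzee
print(shared_ranks(human, house_cat)[-1])   # deepest rank shared with the house cat
```

The deeper the last shared rank, the more closely related the two items are, which is exactly the information the name is meant to carry.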

Taken all together, this classification system becomes the official way of understanding the whole world of animals, plants, and bacteria. Taxonomy is powerful because it is universally adopted: You could try to work out a new system, but then you’d have to explain it to everyone and get buy-in for it to mean anything to anyone else but you. It is at this point that we make the transition to the Web…

Taxonomy on the Web

Now at some point, the word “taxonomy” was appropriated by information architects to talk about web content. When one discipline borrows from another, the meaning and use of the term can change significantly, and so “taxonomy” doesn’t mean to the web professional quite what it means to the biologist.

A website’s taxonomy describes how all the content items relate to one another. Through its rigidly controlled network of meaning, there is a way to say with confidence:

“Item X and Item Y are in the same group. When you look at Item X, you may also be interested in Item Y.”

We take this kind of connection for granted these days because Amazon and other e-commerce giants have made such ubiquitous and successful use of taxonomy to sell related things, but it’s really quite difficult to establish those kinds of relationships in your content without taxonomy.

In summary to this point, then, “taxonomy” on a website is a classification system that maps all your content to other content. Taxonomy on a website creates a scaffold that holds your content together.

Not one taxonomy, but many

It gets a little more complicated from here. Whereas in a biological taxonomy, we’re dealing with only one dimension of relationship, the ultimate relationship of one species to another through its name, on a website, there can be many classification systems to govern the relationship of content along many dimensions.

Let’s start with a clothing retailer. The most basic taxonomy would divide the products into groups of “kind” to answer the question, “What article of clothing is this?”

Clothing for the upper body

  • Shirts
    • Blouses
    • T-shirts
    • Polos
    • Turtlenecks
  • Jackets
    • Blazer
    • Windbreaker
  • Sweaters
    • Cardigan
    • Pull-over
    • Vest

Clothing for the legs

  • Pants
    • Dress pants
    • Jeans
    • Shorts
  • Skirts
    • Full-length
    • Wraps
    • Culottes (really a hybrid)

Accessories

  • Jewelry
    • Rings
    • Earrings
    • Watches
    • Necklaces
  • Belts
  • Hats
  • Bags

So far, so good. We have a system for identifying items by basic type. But that’s not so good for sales.

There will be, then, additional taxonomies to build up a multidimensional system that organizes products into classes: For women or men, girls or boys; for casual, work or formal contexts; for outdoor or indoor; by color; by season; by ethnic origin; and so on, and so on…
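As a minimal sketch of what “multidimensional” means in practice, here is a handful of made-up products in Python, each tagged along several independent dimensions. The product names, the dimension names, and the find function are all hypothetical; a real catalog would keep this in its metadata rather than in a hand-written dictionary.

```python
# Each product carries one term from each of several independent taxonomies.
products = {
    "linen blazer": {"kind": "Jackets", "audience": "women", "occasion": "work", "season": "summer"},
    "wool cardigan": {"kind": "Sweaters", "audience": "women", "occasion": "casual", "season": "winter"},
    "polo shirt": {"kind": "Shirts", "audience": "men", "occasion": "casual", "season": "summer"},
    "dress pants": {"kind": "Pants", "audience": "men", "occasion": "work", "season": "winter"},
}


def find(**wanted):
    """Return the products that match every requested dimension."""
    return sorted(
        name
        for name, tags in products.items()
        if all(tags.get(dimension) == term for dimension, term in wanted.items())
    )


print(find(occasion="casual"))
print(find(audience="women", season="summer"))
```

The “kind” taxonomy alone could never answer the second question; it takes two other dimensions working together.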

But that’s just the products. There will be other content that accompanies these products, and all that content must also be organized into categories.

  • “How to” content might include tying neckties, caring for leather, and assembling an ensemble for an evening out in Paris.
  • “About us” content might go through all the ways that this company works for environmental activism.
  • Product information might include stories about where the materials came from, or who made them.

The taxonomy must account for all these dimensions of content description and classification, so that when you pull up the product page for that pair of shoes you’re considering, you also can see:

  • What other colors are available?
  • What other shoes are in its class?
  • How do you care for them?
  • What accessories would complete your outfit?
  • How have other customers worn this item? (From their photos)
  • How long would it take to get them if you clicked the button right now?

Taxonomy implemented through metadata

All this work of understanding the interrelationship of content has a specific and practical end: Metadata.

It is beyond the scope of this article to explain the process of developing taxonomic systems and how they are then translated into metadata for your web content. It is crucial, however, to recognize that having a clear, controlled system of metadata, which is then meticulously and consistently connected to your content, is the only way to ensure that your search and coordinated applications serve up the content the user expects, in the language the user expects, in combinations that make sense to the user.
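Without going into the full process, a tiny Python sketch can at least show what “controlled” means here. The vocabulary, the dimension names, and the invalid_terms function are assumptions of mine for illustration, not a description of how any particular CMS enforces its metadata.

```python
# A hypothetical controlled vocabulary: the only terms each dimension may use.
VOCABULARY = {
    "audience": {"women", "men", "girls", "boys"},
    "occasion": {"casual", "work", "formal"},
}


def invalid_terms(metadata: dict) -> list:
    """List every tag on a content item that falls outside the vocabulary."""
    problems = []
    for dimension, term in metadata.items():
        allowed = VOCABULARY.get(dimension)
        if allowed is None:
            problems.append(f"unknown dimension: {dimension}")
        elif term not in allowed:
            problems.append(f"{dimension}: '{term}' is not a controlled term")
    return problems


print(invalid_terms({"audience": "women", "occasion": "casual"}))
print(invalid_terms({"audience": "ladies"}))
```

An item tagged “ladies” would never turn up in a search for “women,” which is why the consistency matters as much as the vocabulary itself.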

Rich, interactive experiences require taxonomy

Creating rich internet applications (RIAs) is partly about the technology to evaluate and serve up all these connections, but it is impossible without care, design, and maintenance of your content’s taxonomy.

Again, unlike our scientific counterparts, we can have no single, universal taxonomy for web content, because each content domain has its own context of purpose, vocabulary, and peculiarity. There are commercially available taxonomic systems to get you started, but they all have to be evaluated for your specific purpose, and there will always be adaptation of the metadata.

Taxonomy, Navigation, and Sitemaps

A lot of the confusion in the workshops dealt with how a website’s taxonomy relates to the other aspects of its information architecture. As we explore these concepts, keep in mind that when done well, the taxonomy is completely invisible to the user. It just makes everything run smoothly.

Sitemaps

The sitemap reveals the website’s overall organization. Every bit of content on a website needs a primary “home.” Ultimately, when you reach a content item, you are (virtually, of course) in a particular location on the site. The information architect’s job is to choose from the infinite range of organizational possibilities to anchor the user experience, which then is the foundation for the richness that the taxonomy creates.

The sitemap probably will reflect some basic aspects of the taxonomy underlying the content, but when you consider the richness and complexity described above, any relation between the sitemap and the taxonomy will be loose.

Navigation

Navigation is more closely related to the sitemap than to the taxonomy. The main navigation provides the user an organized path around the website, intended for browsing. Like the sitemap, it may reflect some aspects of the taxonomy, but it doesn’t have to.

The taxonomy does, however, enable local navigation: access points to content elsewhere on the site, reached through the relatedness of content.

IAs help you put it together!

It’s the job of information architects to work all these intricacies out. The skills for designing the taxonomy and associated metadata are extensive and precise. The content strategist helps to define the content domain and the language that will best represent it, but the IA will be able to build an organizational framework that links the content domain with the technical wizardry that serves up the user experience.

In conclusion, as my best-bud Becky says, “There is no right or wrong way of [creating taxonomy]. The trick is to come up with a taxonomy that works for your users.”

I hope that this article has helped to clarify the definition of taxonomy and its application. Please offer corrections, amplifications, and clarification. It’s a matter of wide importance, and we need to get it right!