Gadgetopia Channel

Content Management

Nov 5

Search is Hard

We’re researching search options for a client this week, and I stumbled across this blog post which spoke volumes to me:

Search is Easy, But Good Search is Hard

So true. Search, in its most basic form, is easy. But there are a lot of subtleties that you find yourself longing for that are harder to pull off:

  • Spelling suggestions
  • META searching
  • Content biasing
  • Incremental indexing
  • Index merging
  • Key matches or “best bets”
  • HTTP spidering
  • Etc.

Also, there’s a vast difference between a “search engine” and a “search application.” An engine is just that: a tool to search an index. Lucene.Net provides this. So do Swish-E and Searcharoo.

But a “search application” is all the stuff that surrounds it. The actual searching interface, the indexing methods, the process of maintaining the index, the metadata scheme of your content domain, the spidering process, the filtering of search results, etc.

This difference was fairly stark when we downloaded Lucene.Net the other day and got it working. Using two command-line executables, we were able to index a folder of text files and search them from the command line. So, we had a search engine.

This was great, but…what then? Being able to search like this is a far cry from actually having a search system on a Web site that provides some value.

Sometimes, it seems the actual search engine is the smallest part of it. Getting that core to work in a larger, integrated whole to deliver some value to the end user is much more difficult.
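To make the engine/application split concrete, here is a minimal sketch in Python. Everything in it is hypothetical: TinyEngine stands in for the engine (it is not how Lucene.Net works), and SearchApplication wraps it with one feature from the list above, “best bets.” The class names and sample documents are made up for illustration.

```python
from collections import defaultdict


class TinyEngine:
    """The 'search engine': nothing more than a tool to search an index."""

    def __init__(self):
        self.index = defaultdict(set)  # term -> set of document IDs

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, query):
        # Return IDs of documents that contain every query term.
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        return sorted(hits)


class SearchApplication:
    """The 'search application': the stuff that surrounds the engine."""

    def __init__(self, engine, best_bets=None):
        self.engine = engine
        self.best_bets = best_bets or {}  # query -> hand-picked document ID

    def search(self, query):
        results = self.engine.search(query)
        pick = self.best_bets.get(query.lower().strip())
        if pick is not None:
            # Promote the editor's pick to the top without duplicating it.
            results = [pick] + [r for r in results if r != pick]
        return results


engine = TinyEngine()
engine.add("faq", "how to reset your password")
engine.add("policy", "password policy for staff")
app = SearchApplication(engine, best_bets={"password": "policy"})

print(engine.search("password"))  # ['faq', 'policy']
print(app.search("password"))     # ['policy', 'faq']
```

The engine is a couple of dozen lines. Every other item on the list (spelling suggestions, spidering, incremental indexing) would be another layer like SearchApplication, which is where the real work piles up.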


Nov 1

Give Me an API for Filtering Content

Here’s something that should be a required feature of any CMS —

You should be able to “wash” a list of content against the permissions model of the CMS. Meaning, if I get a list of content IDs, I should be able to “strain” them through some API call on the CMS, and get back a list of IDs the current user is allowed to see.

This would make my life so much easier. What that really means, though, is that I’d be able to hack systems at a lower level, which probably makes CMS vendors nervous, so they have little impetus to actually implement this. But I can dream.

Here’s an example of where I’m going with this —

Sometimes, you just can’t get “at” the content you want through the exposed APIs. You try, but you need to do something really hairy and the API is just not playing ball with you. So, you do the unthinkable and go straight to SQL. The idea is that you query for content IDs, then use the API to retrieve that content by specific IDs.

This is bad, for the primary reason that you may get back content that the current user should not see, either because they don’t have sufficient permissions or the content isn’t published yet. Essentially, you’re going around the permissions and publishing model of the CMS, and untold horrors await you.

But, what if you could say —

Hey, CMS, I have this list of content IDs here… How did I get them? Yeah, well, that’s not that important right now…

Anyway, can you look at this and tell me which ones I can show Nathaniel Snerpis? Here, just take them all, and give me back the ones I can show him.

In this case, the CMS has filtered your list, and everyone is happy. In the end, it doesn’t much matter how you got the list of content, just so long as you filter it.

Although I mentioned directly querying SQL, that’s just an example. There are a number of different ways you could obtain an arbitrary list of content, and that’s the beauty of this idea: it doesn’t matter how you got the list. A filter just lets you “correct” any list of content, regardless of origin.
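Here is a rough sketch, in Python, of what such a call might look like. This is not any vendor’s API: HypotheticalCMS, filter_content_ids, and the simple “published plus group membership” rule are all assumptions made up for illustration. A real CMS would apply its own permissions and publishing model inside that call.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    content_id: int
    published: bool
    allowed_groups: set = field(default_factory=set)  # empty set = public


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)


class HypotheticalCMS:
    """Stand-in for a CMS; only the filtering call matters here."""

    def __init__(self, items):
        self._items = {item.content_id: item for item in items}

    def can_view(self, user, content_id):
        item = self._items.get(content_id)
        if item is None or not item.published:
            return False
        # Public content is visible to all; otherwise the user needs a matching group.
        return not item.allowed_groups or bool(item.allowed_groups & user.groups)

    def filter_content_ids(self, user, content_ids):
        """Return only the IDs this user may see, preserving input order."""
        return [cid for cid in content_ids if self.can_view(user, cid)]


cms = HypotheticalCMS([
    ContentItem(1, published=True),
    ContentItem(2, published=False),
    ContentItem(3, published=True, allowed_groups={"staff"}),
])
nathaniel = User("Nathaniel Snerpis")

# How did we get these IDs? SQL, an external search index... it doesn't matter.
ids_from_elsewhere = [3, 2, 1, 99]
print(cms.filter_content_ids(nathaniel, ids_from_elsewhere))  # [1]
```

The caller never sees why an ID was dropped (unpublished, restricted, or nonexistent), which is the point: the permissions model stays inside the CMS.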

Another good example is using a search system external to your CMS —

We have three Ektron installs using the Google Mini search appliance. One of the problems with the Mini is that it’s not plugged into Ektron’s permissions model, so it has no idea which content it can show an anonymous user in search results and which it can’t. To get it to play ball, you have to go to complicated lengths to expose user groups in META tags, then filter against them, etc. A good content filtering API would remove this problem.

I wonder if CMS vendors would be disinclined to do this because to do so would open their systems up to being…extended, in ways that they weren’t planning on. In the end, is this a bad thing?

Update: Henri Bergius of Midgard was nice enough to blog with some code of how to do this in his system. If you’re an expert with System X, would you be willing to indicate whether or not it can be done, how difficult it is, and how far you’d be coloring outside the lines to do it?


Oct 17

What Makes a Wiki?

In Sioux Falls this summer, we had something of a scooter revolution. Scooters were everywhere. And I noticed something — some of the scooters were so big they rivaled the size of motorcycles.

So, I got wondering, what’s the real difference between a scooter and a motorcycle? Where is the dividing line? I suspect it’s awfully blurry.

I’m beginning to think the same thing about the line between a wiki and a traditional CMS. What differentiates one from the other? I suspect this line is awfully blurry as well.

Way back in the day, when wikis were new and I was messing around with early versions of Twiki (we had to GlueWordsTogether to make links…), wikis had some pretty clear differentiators:

  1. Everyone could edit any page. To my knowledge, there were no permissions (in fact, the Movable Type wiki (the old, unofficial one) was closed down due to vandalism they couldn’t stop).

  2. There was no structure of pages. They were just all in a big pool.

  3. Revisions were kept for every edit.

  4. There was no WYSIWYG. Just wikitext.

  5. There was no page structure at all — just a title and text.

And that was a wiki, and it was pretty clear that it wasn’t a CMS.

But these days, with a lot of wiki products, those five points of differentiation have been muted quite a bit.

  1. There are permissions models now, in the wiki world. Wikipedia has famously started to massage permissions, and any internal wiki at the enterprise level would almost have to have permissions, since the information held in it can be at varying levels of sensitivity.

  2. You can get wikis with some structure. Google Sites has pages and sub-pages, and I remember a Ruby-based wiki from some time ago called Hierachi that was built in a hierarchy (hence the name).

  3. Revisions are still kept, but this isn’t really a differentiator anymore because any good CMS does this too (even WordPress does it now).

  4. Many wiki products have WYSIWYG capability. Additionally, with the popularity of things like Markdown and Textile, many CMS products are using wikitext-ish syntax.

  5. Page structure is largely a configuration issue in a CMS anyway, so there’s no reason you can’t set up your pages like a wiki.

So, again, what makes a wiki? What are the features that, when present, make you say “this isn’t a CMS, it’s a wiki.” I maintain that this is getting hard to figure out, and a traditional CMS can often lay claim to being a wiki without too many changes.

Ektron includes a wiki module in their latest version. So does Xoops, I just discovered the other day. And I’m poking around EPiServer these days and wondering just how hard it would be to give that some wiki-like functionality.

Is it enough to just have an “edit this page” link on every page? If I did that with a traditional CMS, could I say I had a wiki? I think that in a point-by-point comparison, I’d come pretty close.

Or, is a wiki more of a cultural designation than a technical one? Stewart Mader’s wikipatterns is all about how to get wikis into your enterprise, and it spends about zero time on the technical aspects of it. It’s all about how to overcome mental barriers to using a wiki, and how to get people to embrace the culture.

There’s a case study in the book where some guy says something like this:

You can’t turn a CMS into a wiki

More and more, I disagree with this. The line is fine. A CMS can be very wiki-like, very easily in some cases. The concept of a wiki is largely a mental and cultural one, and I think accepting this fact and embracing this is a real key to getting your organization to embrace them.

Reading Stewart’s book wouldn’t hurt either.


Oct 11

wikipatterns

I’ve had an extended flirtation with wikis. I like the open concept of them, but I find that only about 10% of wiki projects ever really take off due to one important fact: wikis are primarily a challenge of human, rather than technical, engineering.

Technically, there are dozens of proven platforms on the market (try wikimatrix.org for scads of them), but the trick is getting the users to buy into the plan. Wikis constantly suffer from the “empty dance floor syndrome,” where no one wants to be the first one out there, and it’s tough to get anyone to shake their tail feather.

And this is where wikipatterns comes in — the Web site and the book.

What is a “wikipattern”? It’s an observed behavior or feature that lots of people have seen when trying to make wikis work. You have a “pattern” which is positive, and an “anti-pattern” which is negative. Example:

Champion [pattern]: A passionate, enthusiastic champion is essential to the success of wiki because s/he will be able to generate interest, give the appropriate amount of training for each person at the right time, monitor growth of the tool and fix problems that could derail adoption.

Bully [anti-pattern]: A bully is the opposite of a champion, and goes too far in pushing people to use the wiki. A good champion knows how to lead people in adopting the wiki, while a bully might get upset at someone for emailing rather than using the wiki.

(Note that patterns and anti-patterns aren’t always mirrored opposites, as in this example.)

These are fundamentally behavioral issues — how do you engineer the people around you into using the new technology?

There are also adoption patterns and anti-patterns, which are concerned less with the people and more with the implementation of project dynamics. Example:

Flying under the Radar [pattern]: […] hosting the wiki initially through unofficial channels, using a corporate credit card or other “black market” funding to pay for hosting, as well as using a set of community resources who are willing to play the role of Champion, Gardner, and other roles on their spare time.

But the Intranet [anti-pattern]: The “ButTheIntranet” pattern is one propogated by a webpageChampion to discourage wiki use, perhaps because they are familiar or vested in the “old” way of doing things on the world-wide-web.

Put together, these patterns and anti-patterns (there are four or five dozen of them) are a fantastic group of observed phenomena when implementing wikis. And, trust me, you’re going to need all the help you can get, because getting users on-board and contributing is often quite a trick. The wiki minefield is large.

Which brings us to the book “wikipatterns.” The book springboards quite a bit off the Web site, and it’s a good resource for two types of people.

  1. Those who barely know what a wiki is
  2. Those who are faced with a wiki implementation, and don’t know where to start

That said, the book doesn’t offer much in the way of technical advice, though the author’s company’s own product is pimped quite a bit (Confluence, by Atlassian; I’ve looked at it, and it appears exceptional).

However, the technical side of the equation is not really what the book is about. Rather, it’s 150 pages of persuasion that wikis aren’t nearly as scary as you thought, and that they’d probably work just fine for your project.

In between the chapters are case studies from various people who have wikis running in production, with some explanation of how they did it and what specific patterns they saw in their own project.

I disagree with a couple things in the book. First, I’m not a huge fan of wikitext, and I feel like WYSIWYG would give you more buy-in (but, given the author’s experience, who am I to argue?). Additionally, there’s a case study in there that irritated me a little when the author said, essentially, “A CMS can never be a wiki.” I don’t agree with this. I think there’s a very fine line between the two, but that’s another post entirely.

I got everything I expected out of the book. I’m better-equipped now to champion a wiki project, both from what I read in the book and for the introduction it gave me to the Web site, which is just as valuable.


Oct 3

The most important feature of a CMS is...

LinkedIn: Answers: What’s the most important feature that you look for in a CMS?: A simple question posed on LinkedIn. Some good answers, and worth reading for a CMS junkie.

Excerpts:

Community size and support, ease of use, expandability of the CMS, the amount of available “add ons”, licensing types, and most of all -SECURITY and how issues are handled when they arise. […]

Fit for purpose ;-) […]

I think ADAPTABILITY would need to be the most important feature - the ability for the user to have the system adapted to his/her requirements quickly and easily by the developer. […]

The single most important, and often overlooked, feature of any CMS is ease-of-use for content editors and other end users. […]

This one seems a little out in left field.

As a chief of technical development of our CMS (Xmanager) I may say that the most important feature of a CMS is semantic web and ontology compliance.

Wow. Seriously? That’s “the most important feature of a CMS”?


Sep 26

Bureaupedia: Intellipedia for the FBI

FBI creates knowledge wiki: The use of wikis is spreading in government. This has been pushed forward by the success of Intellipedia.

The FBI is testing a new collaborative internal Web site, or wiki, called Bureaupedia that officials say will enable users to create an encyclopedia of lessons learned, best practices and subject-matter expertise.


Sep 17

Content Management as a Practice Re-visited

Content Management as a Practice: Seth has posted a follow-up to my blog post which expanded on a conversation he and I had in Chicago. Our conversation was about teaching content management as an abstract practice rather than as a specific platform integration.

In the course of that, we talked about why some people shun new systems and some people — Seth and I, for instance — embrace them.

Seth nails this topic with these comments.

To understand content management at this level you need to experience lots of different products, tease out concepts and patterns, and do a lot of comparative thinking. This is hard to do because it takes a certain kind of curiosity that I find surprisingly rare and certainly expensive to cultivate. Most people stop looking when they see something that works. They find it frustrating and risky to ignore familiar concepts to look at a problem from a fresh perspective. They dread the steep initial incline of a learning curve. They only suffer through in the hope of flatter pitches ahead. Few have an insatiable need to challenge and be challenged. The instant the learning curve starts to level-off, they are looking for another one to run up against. They like the feeling of being dropped into a totally unfamiliar place and finding their way.

Beyond that, Seth expanded on my original two-level dichotomy of the content management profession. I said content management exists on two levels:

  1. Content management itself
  2. Specific platform integration

Seth takes this a step further (a step prior?) and includes the concept of understanding how a specific vertical or industry uses content (emphasis mine):

The key is to focus on the customer’s higher level business objectives, not a set of capabilities that defines a software market segment. Yes, an understanding of workflow as a foundational concept is useful but even more important is to understand the different editorial processes that happen within a news organization.

This is applying general content management knowledge to a specific domain. I’ve been tossing it around since this morning, and my gut tells me that it all still starts with basic content management knowledge: understanding content management in the abstract. From there, however, there are two ways to apply it:

  1. Applying that knowledge to a specific platform, regardless of domain.
  2. Applying that knowledge to a specific domain, regardless of platform.

In a perfect world, you get both, I suppose. You know content management, you’re an expert in a great platform, and you stick to a lucrative vertical. But my point is that the platform and the domain knowledge are still subservient to core content management principles. It still all starts there.


Sep 14

WorkHabit

Elastic2 | WorkHabit: Adam and company over at WorkHabit are apparently going to take Drupal hosting to a new level.

I can tell you that Adam and Jon have forgotten more about Drupal than most people will ever know. Combine that knowledge with the hosting architecture they’ve put together, and you could likely run Drupal behind something on the order of CNN.

Social media is exploding and Drupal is one of the most commonly used platforms for building social media applications. Getting it to scale has been a significant challenge - until now.

Elastic2 is a new automatically scaling cloud-based hosting service that allows you to serve millions of users and billions of pages.

Millions, billions — any way you slice it, that’s a lot. I’ve seen a simplified version of the architecture they’ve put together, and it’s impressive.

AutoPilot looks interesting too:

AutoPilot is a complete build system that captures code and configuration changes from entire teams of developers, merges and synchronizes it, and then provides a framework for easy, consistent, and optionally, scheduled builds. Manual, error-prone, time-consuming builds are a thing of the past.

Things like this, Acquia and SpikeSource are pushing Drupal steadily into the enterprise. Bigger companies fear open-source more than you think (something we’ve seen firsthand: “We loved your presentation, but come back with something non-open source…”), but things like this can’t help but allay some of those concerns.


Sep 14

Diplopedia

An Internal Wiki That’s Not Classified: A good article about Diplopedia, which is the State Department’s wiki, much like Intellipedia over at the CIA.

In addition to reference material like the 200 biographies of Italian political and business leaders, the more than 4,400 Diplopedia articles reflect the range of the staff’s concerns — among popular articles are high-minded titles like “Foreign Affairs Professional Reading List” and mundane ones like “Building Pass.”

The point of this article is this:

There was a larger point to bringing his message to Wikimania 2008, as the annual conference is called: if wikis can work at the State Department, with its fabled bureaucracy and attention to protocol and word choice, they can work anywhere.

At Blend, we’re having great luck with a wiki built on Google Sites (and, really, that’s what Google Sites is — a very polished wiki with some bells and whistles).

On another note, I’m really fascinated by the idea of the use of wikis, blogs, and social networking in government organizations. If anyone knows of any other links about this, please drop me a note.


Sep 13

Content Rot Re-Visited

Low value content is destroying your website: Interesting article about how Web sites get gunked up with crap. This is a huge problem with intranets. I wonder if Gadgetopia has this problem?

There is a saying: What do you get when you cross a fox with a chicken? A fox. When you manage low-level content and high-quality content on the same website, the low-level content smothers and eats up the high quality content. We must thus manage them separately. We need a website for the low level stuff. But our primary website should be for the high-quality content that people actually need today.

We’ve talked before about Fighting Content Rot, which I think is important. Ektron has a nice feature in it where you “expire” content, but when it expires it just gets added to an “Expired Content” report and/or creates a task for someone to review the content. Done diligently, adding “tickler review” times to content (perhaps forcing it programmatically) has value.
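As a minimal sketch of a “tickler review” check, assuming nothing about Ektron’s actual API: the function name due_for_review, the 180-day default, and the sample items below are all made up. The idea is simply that content is never deleted on expiry. It lands on a list for a human to look at.

```python
from datetime import date, timedelta


def due_for_review(items, today, review_after_days=180):
    """Return titles whose last review is at least review_after_days old.

    items is a list of (title, last_reviewed) pairs. Nothing is removed or
    unpublished here; the result is just a report for a human reviewer.
    """
    cutoff = today - timedelta(days=review_after_days)
    return [title for title, last_reviewed in items if last_reviewed <= cutoff]


items = [
    ("Holiday schedule 2007", date(2007, 11, 1)),
    ("About us", date(2008, 8, 1)),
]
print(due_for_review(items, today=date(2008, 9, 13)))  # ['Holiday schedule 2007']
```

Run on a schedule, the output could feed a report or a task queue, which matches the “expired content report” behavior described above.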

However, a commenter on the original post (the one quoted above) makes a good point too:

I’ve had an experience similar to what you describe - finding old, out of date stuff when I perform a search. But… what about the times a user actually needs old information? Sometimes I go to a web site because I need to know what a company was saying in 2002, or I’m searching for specifications for an old product. One of the benefits of this information being available online is that it is findable, whereas in the past it would require human intervention to dig it up.

Does this stuff boil down to poor search engine implementation? Should there be a way to always isolate content by date?


Sep 11

Content Management Interoperability Services (CMIS)

Content Management Interoperability Services (CMIS): I’m seeing this float around more and more. CMIS is an abstraction layer designed so different content management systems can talk to each other, or so a CMS can separate itself from its repository and work with multiple repositories.

CMIS is designed around a services architecture based on SOAP, REST and Atom to simplify application development. The process of developing content centric applications that are repository independent or that are capable of working with the content from various repositories becomes a viable option as a result of the CMIS specification.

Conceivably, someone could sell a CMS without a repository, and you could buy the repository from another vendor.
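The shape of that idea can be sketched in a few lines of Python. To be clear, this is not the CMIS API or its object model. The Repository protocol, the two vendor classes, and the FrontEnd are hypothetical, and they only show what “a CMS separated from its repository” means in code.

```python
from typing import Protocol


class Repository(Protocol):
    """The only contract the front end knows about (hypothetical, not CMIS itself)."""

    def get_document(self, doc_id: str) -> str: ...


class VendorARepository:
    """Stores documents in a dictionary."""

    def __init__(self):
        self._docs = {"a-1": "Press release"}

    def get_document(self, doc_id: str) -> str:
        return self._docs[doc_id]


class VendorBRepository:
    """Stores documents as rows; different internals, same contract."""

    def __init__(self):
        self._rows = [("b-7", "Product specification")]

    def get_document(self, doc_id: str) -> str:
        for row_id, body in self._rows:
            if row_id == doc_id:
                return body
        raise KeyError(doc_id)


class FrontEnd:
    """A 'CMS without a repository': it shows whatever a repository hands back."""

    def __init__(self, repositories: dict[str, Repository]):
        self.repositories = repositories

    def render(self, repo_name: str, doc_id: str) -> str:
        return f"[{repo_name}] {self.repositories[repo_name].get_document(doc_id)}"


front_end = FrontEnd({"a": VendorARepository(), "b": VendorBRepository()})
print(front_end.render("a", "a-1"))  # [a] Press release
print(front_end.render("b", "b-7"))  # [b] Product specification
```

The front end never learns how either vendor stores content, so swapping or adding a repository doesn’t touch it.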

Two articles on CMSWire today have tossed this in front of me:


Aug 29

Elgg

Elgg.org: Last year, when I was a judge in Packt’s awards, I was in the “social networking” category, which was odd since I didn’t know a thing about it. But I was still impressed with Elgg, and I voted for it.

Flash forward a year, and I know a lot more about social networking. And Elgg has a new version out, and it’s even better than the last one. I set up a demo with it for a client, and they (and I) were just blown away by it.

Elgg is “Facebook in a box.” It installed for me in about five minutes, and the default functionality out of the box is crazy impressive. In working with it, I was struck that this could be a great collaboration platform for a company intranet.

We get requests all the time for “how much would it cost to build Facebook/LinkedIn/whatever.” With Elgg, we now have a huge head start on that answer.


Aug 29

Packt Announces Finalists

2008 Open Source CMS Award Finalists: Packt announced the five finalists in each category for their open-source content management awards.

There are a lot of new faces in the “Most Promising” category, but the only surprise nomination (to me, anyway) was TYPOlight in the “Overall” category.

I’m a judge in the PHP category this year.


Aug 15

Content Management as a Practice

Four years ago, when announcing that the long sought-after title for his profession — that of “interaction architect” — had finally been found, Bruce Tognazzini started off his post with:

This is the most important column I have ever written.

Now, as much as I love hyperbole, I’m not going to go that far. But two months ago, I wrote this

I cut out of [a conference session] to grab a corner with Seth Gottlieb and have what was one of the most professionally meaningful conversations I’ve ever had in my life.

— so I need to deliver. Here goes…

At Web Content 2008, Seth Gottlieb gave a good session about open-source content management. During his introduction, he mentioned that he used to be the “Content Management Practice Director” for Optaros.

That phrase stuck with me. Content Management Practice Director. It resonated over and over for the next 24 hours, until I finally sat down with Seth the next day to talk about it.

Seth’s lofty title at Optaros wasn’t “Development Director” or “Director of CM Integration” or something. Instead, his title evoked something I’ve always been attracted to: the idea of content management as a practice.

What I’ve learned in my years of content management is that it exists on two levels.

  1. Content management itself
  2. Specific platform integration

A lot of people jump right into the second one: platform integration. They learn Drupal. Or eZ publish. Or their company buys Red Dot and they do an integration with that.

Along the way, they’re exposed to some cool features: versioning, workflow, templating, etc. These features make sense, and they get implemented.

However, this person’s knowledge is very brittle. They don’t know content management, they know Drupal. Or eZ publish. Or Red Dot.

And how much do they really know about, say, workflow? If they’ve worked with Ektron, they know that it’s serial approval chains, nothing more. They’ve never been exposed to parallel workflow. Branching workflow. API or code execution steps in workflow. Workflow aliases. Ad-hoc workflow.
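To show how much even the first two of those differ, here is a tiny Python sketch of serial versus parallel approval. The function names and the three-person chain are made up, and this is not how Ektron or any other product models workflow. It only answers “who can act on this content right now?” under each model.

```python
def pending_serial(chain, approved):
    """Serial chain: only the first person who has not approved may act."""
    for approver in chain:
        if approver not in approved:
            return [approver]
    return []


def pending_parallel(approvers, approved):
    """Parallel step: everyone who has not approved may act at once."""
    return [a for a in approvers if a not in approved]


chain = ["editor", "legal", "publisher"]
print(pending_serial(chain, approved={"editor"}))    # ['legal']
print(pending_parallel(chain, approved={"editor"}))  # ['legal', 'publisher']
```

Same people, same content, different answer. Someone who has only ever seen the first function has no reason to suspect the second exists.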

In the end, this person isn’t a content management practitioner, they’re an Ektron integrator.

Now, this isn’t all bad. Doing integrations like this pays the bills, and a lot of people do great work and make a great living at this level (I’m one of them).

But I want to go deeper.

I’m interested in content management as a practice. I’m interested in content management as a transcendent skill. I’m interested in learning, mastering, and teaching the eternal principles of content management, if I can be so dramatic.

Most everything in programming has patterns — ways of doing things that have proven to be pretty well-suited for a particular application. I wrote an entire post about Functional Design Patterns, in fact.

What are the patterns of content management? What are the features, skills, and theories that transcend all platforms? In the end, versioning isn’t about Drupal or eZ publish. Sure, they both have implementations of it with all their quirks and idiosyncrasies, but versioning goes beyond that. It’s an eternal pattern of content management, and something that deserves to be studied and dissected far above the specific implementation level.

When I install and evaluate (read: play around with) a new CMS, I have a mental checklist in my head of what I’m looking for. The checklist looks a lot like the uber-post I made last year about just what comprises a CMS.

I love installing a new system and finding out how it does all of the things on my list. I love digging, prodding, researching, and breaking stuff until I figure out how they implemented Feature X, and how it works.

Seth summed this up in our conversation by saying (I’m paraphrasing from memory), “There are people who like to feel smart, and people who like to feel stupid. People who like to feel smart, never like to use something new because they don’t understand it. People who like to feel stupid, love using something new because it gives them a chance to learn.”

So, the question I posed to Seth, and I’ll pose to you now, is: how do you teach someone the core principles of content management? Have they been defined (my list notwithstanding)? Is there a curriculum? I once wrote a post about wanting a “Masters in Content Management.” What’s the closest thing?

At Blend, my goal has always been to develop a group of solid content management practitioners. While we have our favorite platforms, my hope is that the specific platform we’re currently integrating becomes interchangeable. I want Blend’s people to understand the core principles that transcend those platforms, and not get fixated on one specific environment.

Referring back to the anecdote I opened with, I think I feel the same…relief, as Bruce Tognazzini did when he found the title “interaction architect.” In doing so, he put an identity on an amorphous set of skills and aspirations floating around in his head.

I was just as surprised when Seth tossed out the phrase “Content Management Practice Director” and it started bouncing around in my head. It draws a circle around where I want to be, and where I want my developers to be.

The question becomes: how do we get there? I hope you stick around as I try to answer it.


Aug 15

When is it okay to lose browsability?

One of the things I struggled with in the redesign — and still haven’t completely figured out — is when it’s okay for some content to no longer be browsable. By “browsable,” I mean “non-orphaned” — a page that has an inbound link from some index page.
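That definition is easy to check mechanically. Below is a minimal Python sketch, assuming you already have a map of which index pages link to which pages. The page names and the orphaned_pages function are hypothetical and have nothing to do with how this site is actually built.

```python
def orphaned_pages(all_pages, index_links):
    """Return pages with no inbound link from any index page.

    index_links maps each index page to the pages it links to. Index pages
    themselves are not reported as orphans.
    """
    browsable = set()
    for targets in index_links.values():
        browsable.update(targets)
    return sorted(set(all_pages) - browsable - set(index_links))


all_pages = ["home", "category/cms", "post-1", "post-2", "post-3"]
index_links = {
    "home": ["category/cms", "post-3"],
    "category/cms": ["post-2"],
}
print(orphaned_pages(all_pages, index_links))  # ['post-1']
```

Note that post-1 may still be reachable through search or through links inside other posts. The check only says it has dropped off every index page.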

Consider the New York Times. They put their archives online — every article from 1851 forward. Some 13 million articles.

Do you think every article is browsable? Can you navigate through a series of index pages to find an article about some event in 1912? Probably not.

At a certain point, does content get “retired” to search only? When is it okay for something to not appear on an index page, and only be found via contextual links from other content and search?

While we don’t have 13 million articles, we are approaching 7,000. With the redesign, we removed the browsable category pages from Gadgetopia. You used to be able to scroll back through the categories, but logs show that it wasn’t used that much, and the value of it was questionable anyway. With the current format, there are links to perhaps 50 posts per category, which I’m hoping is enough.

The way it sits now, there are multiple ways to find older articles that have fallen off the index pages:

  1. Links within the text of a post. I link from post-to-post a lot. There are four in this post alone.
  2. Links from the “Related Posts” and the “What Links To/From Here” sections.
  3. Search. The searching system on this site is quite good and rarely fails me.

I’d be interested in opinions on this. When is it okay for something to lose its browsability?


