Thursday, August 3, 2006

Namespaced Extensions in Feeds

Feeds can be used for more than just text; they can embed pictures, podcasts and video. There are even more esoteric bits of data that can be attached to feeds, like the geographic location that a post is about, the number of comments it has received and the (legal) license its contents are available under. To make all of this information easily parseable by computers, it is usually expressed as additional elements and attributes in XML namespaces. For example, the Media RSS namespace is used to add more information about videos and pictures, like dimensions, duration and a thumbnail.
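To make the Media RSS example concrete, here is a minimal Python sketch of how a feed reader might pull those extra fields out of an item. The feed fragment, its example.com URLs and the `media_info` helper are made up for illustration; only the namespace URI and the `media:content` and `media:thumbnail` element names come from Media RSS itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment, invented for this example.
FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>A short clip</title>
      <media:content url="http://example.com/clip.mp4"
                     width="320" height="240" duration="42"/>
      <media:thumbnail url="http://example.com/clip.jpg"/>
    </item>
  </channel>
</rss>"""

# Prefix-to-URI map used for lookups. The prefix is arbitrary;
# what identifies the extension is the namespace URI.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}


def media_info(feed_xml):
    """Return the Media RSS details found on each <item> of an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("item"):
        content = item.find("media:content", MEDIA_NS)
        thumb = item.find("media:thumbnail", MEDIA_NS)
        results.append({
            "title": item.findtext("title"),
            # Attribute values are returned as strings, or None when absent.
            "width": content.get("width") if content is not None else None,
            "height": content.get("height") if content is not None else None,
            "duration": content.get("duration") if content is not None else None,
            "thumbnail": thumb.get("url") if thumb is not None else None,
        })
    return results
```

A reader that does not know about the Media RSS namespace can simply ignore these elements and still show the item's title, which is what makes namespaced extensions safe to add.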

This usually isn't of direct interest to end users; it's just a matter of which namespaced extensions a feed reader supports, and the more the merrier. However, since there are quite a few of them out there, developers must make trade-offs and decisions. One easy way to prioritize extension support is to see which ones are used most often.

I wrote a small MapReduce program to go over our BigTable and get the top 50 namespaces based on the number of feeds that use them. Because the program runs over our BigTable, we only looked at feeds that have at least one subscriber, i.e. the "feeds that matter." Note that the default namespaces for syndication feed formats (e.g. http://www.w3.org/2005/Atom for Atom 1.0) are excluded, since I was interested only in extensions to the elements that are already expected to be in a feed.
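The counting itself can be sketched in a few lines of plain Python. This is not the actual MapReduce program, just a single-machine approximation of the same idea: a "map" step that returns the set of extension namespaces one feed uses, and a "reduce" step that counts feeds per namespace. The function names and the `CORE_NAMESPACES` set are assumptions made for illustration (only the Atom 1.0 URI is named above), and the sketch counts a namespace only when an element or attribute actually uses it, which may differ from how the original program decided that a feed "uses" a namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Default namespaces of the feed formats themselves, which are excluded.
CORE_NAMESPACES = {
    "http://www.w3.org/2005/Atom",  # Atom 1.0, named in the post
    "http://purl.org/rss/1.0/",     # assumption: RSS 1.0 core vocabulary
}


def namespaces_in_feed(feed_xml):
    """Map step: the set of extension namespace URIs used by one feed."""
    found = set()
    for elem in ET.fromstring(feed_xml).iter():
        # ElementTree spells qualified names as "{uri}local", for both
        # element tags and attribute names.
        for name in [elem.tag, *elem.attrib]:
            if isinstance(name, str) and name.startswith("{"):
                found.add(name[1:].split("}", 1)[0])
    return found - CORE_NAMESPACES


def top_namespaces(feeds, limit=50):
    """Reduce step: (uri, percent of feeds) pairs, most common first."""
    counts = Counter()
    for feed_xml in feeds:
        # Each feed contributes at most 1 per namespace, since the map
        # step returns a set.
        counts.update(namespaces_in_feed(feed_xml))
    total = len(feeds)
    return [(uri, 100.0 * n / total) for uri, n in counts.most_common(limit)]
```

Called with a list of feed documents as strings, `top_namespaces` returns rows in the same shape as the table in this post: a percentage of feeds next to each namespace URI.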

We thought this information might be of interest to others, the way our analysis of XML errors and web authoring statistics have been. If I have missed anything, or if you have any feedback, a message in our discussion group or a link to this blog post is the best way to reach us.

% of Feeds | Namespace | URI
29.36% | Dublin Core | http://purl.org/dc/elements/1.1/
15.71% | XHTML | http://www.w3.org/1999/xhtml
11.92% | Blogger Atom API Extensions | http://www.blogger.com/atom/ns#
11.88% | Blogger Draft Extension | http://purl.org/atom-blog/ns#
11.16% | RSS 1.0 Content Module | http://purl.org/rss/1.0/modules/content/
8.39% | Well-Formed Web Comment API | http://wellformedweb.org/CommentAPI/
5.35% | RSS 1.0 Administrative Module | http://webns.net/mvcb/
3.85% | FeedBurner Extensions | http://rssnamespace.org/feedburner/ext/1.0
3.74% | MSN Spaces | http://schemas.microsoft.com/msn/spaces/2005/rss
3.66% | Slash | http://purl.org/rss/1.0/modules/slash/
3.59% | RSS 1.0 Syndication Module | http://purl.org/rss/1.0/modules/syndication/
2.50% | iTunes | http://www.itunes.com/dtds/podcast-1.0.dtd
2.49% | LiveJournal RSS Module 1.0 | http://www.livejournal.org/rss/lj/1.0/
2.33% | Dublin Core Terms | http://purl.org/dc/terms/
2.27% | Microsoft Simple List Extensions | http://www.microsoft.com/schemas/rss/core/2005
2.00% | Yahoo Media RSS | http://search.yahoo.com/mrss/
1.24% | RSS 1.0 Taxonomy Module | http://purl.org/rss/1.0/modules/taxonomy/
1.06% | TrackBack Module for RSS 1.0/2.0 | http://madskills.com/public/xml/rss/module/trackback/
1.04% | creativeCommons RSS Module | http://backend.userland.com/creativeCommonsRssModule
0.92% | OpenSearch | http://a9.com/-/spec/opensearchrss/1.1/
0.68% | Basic Geo (WGS84 lat/long) Vocabulary | http://www.w3.org/2003/01/geo/wgs84_pos#
0.54% | Atom Threading | http://purl.org/syndication/thread/1.0
0.42% | Creative Commons (RDF) | http://web.resource.org/cc/
0.39% | Technorati API | http://api.technorati.com/dtd/tapi-002.xml
0.36% | Google Calendar | http://schemas.google.com/gCal/2005
0.31% | Google GData | http://schemas.google.com/g/2005
0.28% | Feed History | http://purl.org/syndication/history/1.0
0.28% | eBay | urn:ebay:apis:eBLBaseComponents
0.27% | Pheed | http://www.pheed.com/pheed/
0.23% | RSS 1.0 Annotation Module | http://purl.org/rss/1.0/modules/annotation/
0.21% | PRISM | http://prismstandard.org/namespaces/1.2/basic/
0.18% | Bulkfeeds | http://bulkfeeds.net/xmlns#
0.16% | Atom Indexing | urn:atom-extension:indexing
0.15% | AOL Journals | http://journals.aol.com/_atom/aj#
0.14% | Jive Forums | http://www.jivesoftware.com/xmlns/jiveforums/rss
0.13% | Yahoo! Weather | http://xml.weather.yahoo.com/ns/rss/1.0
0.11% | RSSWriter Manifest | http://usefulinc.com/rss/manifest/
0.11% | FOAF Vocabulary | http://xmlns.com/foaf/0.1/
0.10% | Feedster | http://feedster.com/feedstersearch/ext/1.0
0.10% | Google Picasa Web | http://picasaweb.google.com/lh/picasaweb/
0.09% | RSS 1.0 Link Module | http://purl.org/rss/1.0/modules/link/
0.09% | Buzznet | http://www.buzznet.com/1.0/
0.09% | Digg | http://digg.com/docs/diggrss/
0.09% | PubSub | http://pubsub.com/xmlns
0.09% | Snaplog PhotoBlog RSS extension | http://snaplog.com/backend/PhotoBlog.html
0.08% | XSL | http://www.w3.org/1999/XSL/Transform
0.07% | Hatena XML Namespace | http://www.hatena.ne.jp/info/xmlns#
0.07% | iTunes Music Store | http://phobos.apple.com/rss/1.0/modules/itms/
0.07% | Furl | http://www.furl.net/resources/furlRSS.jsp#
0.06% | Google Base | http://base.google.com/cns/1.0
0.06% | Web Wiz Forums | http://syndication.webwizguide.info/rss_namespace/

URL: http://googlereader.blogspot.com/2006/08/namespaced-extensions-in-feeds.html