seXML and Business Integration (what your parents didn’t tell you)

The following is a guest post by Jason Ouellette, CTO and Cofounder of SocialPandas.

My first startup job was as a junior engineer at webMethods in 1997. If you’re the kind of person who judges the passage of time by JDK version (and c’mon, who isn’t?) that’s the JDK 1.1 era. Although the details varied over the years, the mission of the company can be found in its camel cased name: to bring programmatic access to the web. The two technologies it used to do so: XML and HTTP. I’m going to jump into XML here because nothing says enterprise and sexy like XML.

XML had been proposed by the W3C as a kinder, friendlier replacement for SGML. Their XML 1.0 standard was structured, self-describing, “strongly-typed” and cognitively more accessible. The two founders of webMethods were excited about XML and its potential value in business integration, fanboi-level excited. They participated in standards bodies and conferences, and encouraged employees to contribute to new, related standards like XQL (XML Query Language, a precursor to XPath). They even placed XML at the spiritual heart of our first product in the form of WIDL: Web Interface Definition Language, an XML dialect that described an API layer for CGI forms on websites. Or, in the interests of Keeping It Real: a language for writing web scrapers. However unglamorous it sounds, it made an impressive demo to ask the webMethods tool to examine a web page, build an API around it (like CORBA IDL, we would say enthusiastically), test it, and end up with a tidy, reusable XML description of the whole mess, making the functionality of that website available to any program.

A few months into WIDL we found that “Web Automation” (yet another euphemism for web scraping) was sadly not the Big Idea upon which we would grow big and rich like Netscape. No, we needed something less brittle and with more meat than the DHL and FedEx package tracking demos. We found it in B2B, namely Business-to-Business e-commerce. Using XML over HTTP we set out to improve the state of the art at the time, which was EDI (Electronic Data Interchange) documents over the VAN (Value Added Network). In comparison with the open and democratic interwebs, the VAN is a shadowy underworld of leased lines, arcane protocols like X.25, and handshakes involving data formats that only a COBOL program could love.

A quick sidebar on EDI: In those ancient times of relative CPU and network I/O scarcity, many data formats and protocols were lean and mean, quite unfriendly to humans. They were optimized to live in the virtual space between computers, and space was tight. Some examples from the distributed computing wing of the IT museum: CORBA IIOP, COM/DCOM, and much higher up the stack, the ANSI X12 and UN/EDIFACT families of EDI formats. If you’ve ever seen an “850” document (an X12 purchase order) you know it looks like petrified dino dung, but fixed-field documents like that are still hugely important today, helping businesses to exchange structured business information system-to-system, such as purchase orders and invoices.

In contrast to EDI, the human readability of XML business documents was one of XML’s selling points, or so I observed on webMethods sales calls. It was not uncommon for raw XML to be shown in slide decks and demos to business people. While the IT folks in the room grumbled about its piggish appetite for memory and other computing resources, the less technical were transfixed by the beauty of its angle brackets. Fake but pretty purchase orders were often trotted out.

In addition to being transparent in a way that binary and fixed-field formats are not, XML documents can be validated against a Document Type Definition (DTD), the lightweight precursor to the more full-figured XML Schema specification. So before processing your purchase order for “Tasty XML Vittles” you can verify programmatically that it’s not just well-formed but valid, with the elements your system expects in the places it expects them, and will not break your e-commerce back-end (written in NetDynamics or some other cool web application technology of the day).
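
To make the difference between “parses as XML” and “has the structure we agreed on” concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the purchase order, its element names, and the helper functions are invented, not taken from any webMethods product or real e-commerce dialect. Python’s standard-library parser does not validate against a DTD, so the required-children check below is only a hand-rolled stand-in for the kind of rule a DTD would declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; every element and attribute name is invented.
PURCHASE_ORDER = """<?xml version="1.0"?>
<PurchaseOrder number="1001">
  <Buyer>Example Corp</Buyer>
  <Item sku="TXV-1">
    <Description>Tasty XML Vittles</Description>
    <Quantity>12</Quantity>
    <UnitPrice currency="USD">4.50</UnitPrice>
  </Item>
</PurchaseOrder>
"""

# Stand-in for a DTD content model: children every <Item> must contain.
REQUIRED_ITEM_CHILDREN = ("Description", "Quantity", "UnitPrice")


def is_well_formed(xml_text):
    """Return True if the text parses as XML. Says nothing about validity."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def missing_item_fields(xml_text):
    """List required children absent from any <Item> (empty list means OK)."""
    root = ET.fromstring(xml_text)
    missing = []
    for item in root.iter("Item"):
        for name in REQUIRED_ITEM_CHILDREN:
            if item.find(name) is None:
                missing.append(name)
    return missing


if __name__ == "__main__":
    print(is_well_formed(PURCHASE_ORDER))
    print(missing_item_fields(PURCHASE_ORDER))
```

A document such as `<PurchaseOrder><Item><Quantity>1</Quantity></Item></PurchaseOrder>` passes the first check and fails the second, which is exactly the case validation is meant to catch before the order reaches the back-end. A real deployment would hand this job to a validating parser and an actual DTD rather than a hard-coded tuple.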

This was all sexy stuff back then. Many industries adopted XML dialects for e-commerce, like cXML for procurement and RosettaNet for high-tech manufacturers. webMethods sold lots of product built on the premise of XML document exchange over the Internet, went public (with its lovely WEBM ticker symbol), suffered Bubble Burst 1.0 (like $4B market cap one day, threats of Nasdaq delisting the next), and sold to Software AG for over $500M. XML went on to have a prodigious number of children, most with voluminous specifications and names like WS-*. SOAP and Web Services live on today as the most mainstream incarnation of XML.

But like wholesome mid-century American family values, we still pine for the idealized simplicity of early XML. JSON (JavaScript Object Notation), a militant branch of the document simplification movement, has been sexy for a while now. I wonder what will come next, when JSON begins a course of Botox injections. That’s one of the fun parts of being in this industry. Knowing your history is optional, and you’re doomed to repeat it with faster machines.


About Jason

Jason has always preferred computers to people, so starting at age 6 he was coding Commodore BASIC and 6502 assembly. As Chief Architect at Appirio, he wrote three of the most popular apps on Salesforce.com’s AppExchange as well as a book for developers about Force.com. At data virtualization vendor Composite Software, Jason led R&D efforts to build connector products for SAP, Siebel, and Salesforce.com. As a founding engineer at webMethods (now Software AG), he developed the industry’s first XML-based B2B integration server. He has a B.S. in Information & Decision Systems from Carnegie Mellon, and graduated summa cum laude from the San Francisco School of Home Renovation Hard Knocks in 2009 with his thesis, “The Information Asymmetry of Milestone Payments.”