- What is ONIX?
- How was ONIX for Books created?
- Who is responsible for ONIX for Books now?
- So where can I buy it?
- If ONIX is free of charge, how can EDItEUR support it?
- Is ONIX used only for ‘books’?
- What are the benefits of adopting ONIX?
- What is the current version of ONIX?
- What are ONIX tags and codelists?
- What are ‘Reference names’ and ‘Short tags’?
- Does the order of elements matter in ONIX?
- What should I do if I have no information to go in an ONIX data element?
- Can ONIX cope with books and metadata in any language?
- As a publisher, how can I implement ONIX for Books?
- Will ONIX affect the way I store and manage my product information?
- Where can I get more information on implementing ONIX for Books?
- Do I need to include all elements listed in ONIX for Books?
- Is ONIX training available?
Separate FAQ documents are available on:
- ONIX Release 3.0
- ONIX schemas and validation
- Digital products in ONIX 2.1 and 3.0
- Open access monographs in ONIX
ONIX – more specifically ‘ONIX for Books’ – is a standard specification for communicating book metadata between publishers, various intermediaries like distributors, wholesalers and data service organisations, and retailers in the book supply chain.
Metadata – information about each book – has become vitally important to the smooth running of the supply chain and the effective marketing, merchandising and retailing of books and related products. And given both the huge number of books made available in the market and the rich range of information about each book, a standardised method of communicating this information between supply chain partners is critical. It saves money and speeds up the flow of information by being suitable for highly automated exchanges of data.
Because ONIX is intended and optimised for computer-to-computer communication, it’s not particularly easy for humans to read and interpret. And because it’s focused on communication, ONIX files are usually referred to as ‘ONIX messages’. These messages flow between databases, not between people.
ONIX for Books was originally developed jointly by the Digital Issues working group of the Association of American Publishers (AAP) and EDItEUR, in response to the growing importance and value of high-quality metadata to publishers and online booksellers. ONIX stands for ONline Information eXchange, and the first version was published in January 2000.
That first version combined ideas about metadata from earlier work including Book Industry Communication’s ‘BIC Basic’ specification and EDItEUR’s EPICS data dictionary. It was also influenced by the XML specification released in 1998 by the World Wide Web Consortium. ONIX for Books aimed to reduce the difficulty of managing, distributing and updating large volumes of rich and dynamic metadata.
Today, ONIX for Books is just one member of a family of XML-based international standards intended to support computer-to-computer communication between parties involved in creating, distributing, licensing or otherwise making available intellectual property in published form, whether physical or digital. It is by far the most widely adopted and implemented member of the family, but there are specifications intended for the serials (academic journals) trade, for library licensing, and other specialist uses.
While ONIX for Books (from now on, just ‘ONIX’) was originally developed jointly by the AAP and EDItEUR, with collaboration from Book Industry Communication (BIC) in the UK and the Book Industry Study Group (BISG) in the US, subsequent development and support has been the responsibility of EDItEUR.
ONIX is now firmly established around the world as the book trade standard for the communication of ‘rich product metadata’ – the type of metadata that is needed to support the sale of books in the supply chain, not least for online retailing.
Ongoing development is managed by EDItEUR, overseen by an International Steering Committee with representatives of user groups in 15 or more countries including Australia, Belgium, China, Canada, Finland, France, Germany, Italy, Japan, the Netherlands, Norway, Russia, Spain, Sweden and the Republic of Korea, as well as US and UK representatives from BISG and BIC. The ONIX Steering Committee meets twice a year, at the London and Frankfurt Book Fairs, and is responsible for ensuring the standard develops in line with the needs of ONIX users.
It’s important to understand that ONIX is simply a specification for a standard method for communication. It is not in itself a database, a software application, or a product or service that you can purchase. Although many software applications and databases may well implement the ONIX standard, these are third-party applications and services, and are not supplied by EDItEUR.
The specification and associated documents are available and usable without charge: EDItEUR makes all ONIX documentation available for download on its website, and there are no licensing charges, royalties or fees payable if you make use of the standard.
Note, however, that although EDItEUR is committed to ensuring that ONIX is free to use, it is not ‘free’ in every sense: the standard itself and all associated documentation remain © EDItEUR. This is simply to protect the integrity of the standard and the development process.
EDItEUR is a not-for-profit, membership-supported organisation. Members of EDItEUR pay an annual subscription, and in return they have a voice in the governance of the standard and the direction of development, and better access to EDItEUR’s expertise.
Membership of EDItEUR is a visible sign of an organisation’s support for a standards-based approach to building an effective supply chain, for the benefit of all stakeholders. But membership is not required to make use of ONIX or any other EDItEUR standard.
Although the format is formally known as ONIX for Books, it has always covered other media and other products produced by book publishers or other organisations and distributed through the book supply chain. This includes educational software, cartographic products, toys and games, apparel, point-of-sale items and – of course – e-books and electronic devices like e-book readers.
For publishers, experience has shown that ONIX brings two important business benefits. As a communication format, it makes it possible to deliver rich product information into the supply chain in a standard form, to wholesalers and distributors, to larger retailers, to data aggregators, and to affiliate companies. A single set of data can be suitable for all downstream requirements. And by providing a template for the content and structure of the information about a product, ONIX has stimulated the development of better internal information systems, capable of bringing together all the ‘metadata’ needed for the description and promotion of new and backlist titles. The same core data can also be used to produce advance information sheets, catalogues and other promotional material.
For ‘downstream’ supply chain partners such as distributors and wholesalers, data intermediaries and retailers, ONIX means that they can speed up the loading of up-to-date product information into customer-facing systems, with less need for manual intervention and much lower risk of error. It reduces the need to deal with multiple proprietary data formats, and hence reduces support costs. And by enabling third parties such as trade associations or data aggregators to develop metrics for data quality and timeliness, it enables benchmarking and encourages overall improvement of the data available throughout the supply chain.
Over the years since the publication of version 1.0 in January 2000, ONIX has undergone continual development – to improve the format itself, to incorporate new types of data, and to ensure the standard keeps pace with evolving business requirements across the book and e-book market.
That initial version 1.0 and a quick succession of minor updates (International 1.0, 1.1, 1.2, 1.2.1) were followed by Release 2.0 in July 2001. These versions are all obsolete (though 2.0 remains in legacy use in a few implementations).
Release 2.1 – fully compatible with 2.0 – was released two years later in July 2003 and, following the release of a couple of minor updates containing optional extensions, has been stable since mid-2004. Version 2.1 rev.02 remains to this day the most commonly implemented version in some markets (particularly in the US, UK and Germany). There have been two further minor revisions intended for use in specific countries, the last being rev.04 in early 2011 (intended for use only in Japan).
Release 3.0 was published in April 2009, and has been stable since late 2010. There have been two minor updates containing only optional additions, 3.0.1 in early 2012 and 3.0.2 in early 2014.
So there is no single ‘current’ version. Releases 2.1 and 3.0 are both in widespread use.
However, the ONIX International Steering Committee announced at the beginning of 2012 that the level of support for 2.1 will be reduced at the end of 2014. The committee gave three years’ notice of the withdrawal of support, allowing adequate time for planning, budgeting and software development to bring implementations up to version 3.0.
This ‘sunset date’ at the end of 2014 should be seen by organisations using 2.1 as a target for completing their migration to ONIX 3.0. Although 2.1 will continue to be usable after sunset, all ONIX users are strongly encouraged to update their implementations to the latest 3.0 specification.
ONIX is an XML data format, and all of the data it contains is embedded inside ‘XML tags’. Tags are markup constructs between < and > characters, and they always occur in pairs, an opening tag and a closing tag. So for example, the name of an author might be held between an opening <PersonNameInverted> tag and a closing </PersonNameInverted> tag.
The data itself sits between the two tags, and the whole thing is generally termed a ‘data element’. The <PersonNameInverted> element is defined to always hold a (personal) name, family name first, with given names, titles and honorifics following the family name. There are perhaps 400 such ‘tags’ in ONIX. Some hold text data like a name, others hold numbers, like <SequenceNumber>, or data with a restricted range of values (codes), such as <ContributorRole>. And some other tags are purely structural, grouping together other elements that logically belong together. For example, all data elements associated with a particular author are enclosed within a <Contributor> ‘composite element’.
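As a rough illustration of how a <Contributor> composite groups its data elements, the sketch below parses a small fragment with Python's standard library. The fragment is invented for this example (the name 'Doe, Jane' and the values are assumptions, not taken from any real ONIX message), and it is not a complete or validated ONIX record.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: the name and values are invented for this sketch.
FRAGMENT = """
<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <PersonNameInverted>Doe, Jane</PersonNameInverted>
</Contributor>
"""

contributor = ET.fromstring(FRAGMENT)

# A numeric data element.
sequence_number = int(contributor.findtext("SequenceNumber"))
# A coded data element: the value comes from a codelist.
role_code = contributor.findtext("ContributorRole")
# A text data element: family name first, then given names.
name_inverted = contributor.findtext("PersonNameInverted")
family_name, given_names = [part.strip() for part in name_inverted.split(",", 1)]

print(sequence_number, role_code, family_name, given_names)
```

The composite itself carries no data of its own; it only groups the three data elements that describe one contributor.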
Where a data element has a restricted range of values – such as <ContributorRole> – the allowed values are defined in a ‘codelist’ or ‘controlled vocabulary’. There are different codelists for different data elements, about 100 in all. Each lists all the allowed codes, so list 17 (for contributor roles) lists A01, A02 and so on, plus their meanings. A01 means ‘written by’.
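A receiving system typically turns such codes back into readable labels with a simple lookup. The sketch below is a minimal illustration, assuming a small hand-typed subset of list 17: only A01 is taken from the text above, and the other two entries are assumptions that should be checked against the published codelist.

```python
# Partial, illustrative subset of codelist 17 (contributor roles).
# A01 is quoted in the text above; A12 and B01 are assumptions for this sketch.
CONTRIBUTOR_ROLES = {
    "A01": "written by",
    "A12": "illustrated by",
    "B01": "edited by",
}


def describe_role(code):
    """Return the meaning of a contributor role code, or raise if it is unknown."""
    try:
        return CONTRIBUTOR_ROLES[code]
    except KeyError:
        raise ValueError(f"{code!r} is not in the partial list 17 used here") from None


print(describe_role("A01"))
```

Because new codes are added to the lists regularly, a real implementation would probably load the codelists from the published files rather than hard-code them.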
The tags, whether they are mandatory or optional, whether they are repeatable, and their nesting within composites, are defined by the ONIX Specification and codified in the ONIX schema. Different releases of ONIX use slightly different sets of tags. Codelists are shared between all releases of ONIX, but the codelists are revised regularly. Adding new codes (without having to change the tags themselves) is the main way that the functionality of ONIX is extended to cope with new business requirements.
Each element in ONIX has a ‘plain language’ reference name (for example, <PersonNameInverted>) and a short tag (for example, <b037>). So a name enclosed in <PersonNameInverted> tags can also be expressed using <b037> tags.
These are identical in meaning. The schema definitions allow either reference names or short tags to be used to label the elements in an ONIX XML message. They cannot, however, be mixed in the same message. Users can make the choice between readability and conciseness.
Short tags make ONIX files something like a quarter or a third smaller, though they remain equally complex to process – and zipping files before transmission is considerably more effective in reducing the file size. Reference names make debugging simpler. For ONIX 2.1 and for ONIX 3.0, there are tagname converters which translate reference names to short tags, and vice versa.
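To show what such a conversion involves, here is a minimal sketch in Python. It is not one of the tagname converters mentioned above. The mapping is a tiny assumed subset: only <b037> for <PersonNameInverted> comes from this FAQ, and the other short tags should be checked against the specification.

```python
import xml.etree.ElementTree as ET

# Tiny illustrative mapping. Only PersonNameInverted/b037 is taken from the text;
# the other pairs are assumptions for this sketch.
REF_TO_SHORT = {
    "Contributor": "contributor",
    "SequenceNumber": "b034",
    "ContributorRole": "b035",
    "PersonNameInverted": "b037",
}
SHORT_TO_REF = {short: ref for ref, short in REF_TO_SHORT.items()}


def convert_tags(xml_text, mapping):
    """Rename every element using the mapping; the data content is left untouched."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # A KeyError here signals an unmapped tag, or a message mixing both styles.
        element.tag = mapping[element.tag]
    return ET.tostring(root, encoding="unicode")


reference = (
    "<Contributor><ContributorRole>A01</ContributorRole>"
    "<PersonNameInverted>Doe, Jane</PersonNameInverted></Contributor>"
)
short = convert_tags(reference, REF_TO_SHORT)
print(short)
```

Converting the result back with SHORT_TO_REF returns the original text, which reflects the point that the two forms carry identical meaning.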
NB in the very earliest ONIX releases, the initial letters of the short tags indicated an attempted logical grouping of elements. As the format developed, this grouping quickly became impractical, so that there is now no significance to be drawn from the initial letters. (The numeric part of the tag has always been unique by itself.) In ONIX 3.0, all new elements have been assigned short tags of the form <xnnn>, so that they can be immediately recognised as new.
Yes, the order matters. Elements in an ONIX for Books message must be delivered in the sequence defined by the schema. A message in which elements occur out of sequence will not validate. For example, within a <Contributor> composite, the <ContributorRole> must occur before the name. If it follows the name, or if either is missing, the ONIX is not valid.
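The ordering rule can be sketched as a simple check. This is only an illustration under stated assumptions: the expected sequence below is a simplified three-element list written for this example, not the full content model of <Contributor>, and real validation should be done against the ONIX schema itself.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed sequence for this sketch (not the full schema definition).
EXPECTED_ORDER = ["SequenceNumber", "ContributorRole", "PersonNameInverted"]
REQUIRED = ["ContributorRole", "PersonNameInverted"]


def contributor_problems(xml_text):
    """Return a list of problems found in one <Contributor> composite."""
    tags = [child.tag for child in ET.fromstring(xml_text)]
    problems = [f"missing <{tag}>" for tag in REQUIRED if tag not in tags]
    # Compare the actual positions with the positions the sequence requires.
    positions = [EXPECTED_ORDER.index(tag) for tag in tags if tag in EXPECTED_ORDER]
    if positions != sorted(positions):
        problems.append("elements out of sequence")
    return problems


good = ("<Contributor><ContributorRole>A01</ContributorRole>"
        "<PersonNameInverted>Doe, Jane</PersonNameInverted></Contributor>")
swapped = ("<Contributor><PersonNameInverted>Doe, Jane</PersonNameInverted>"
           "<ContributorRole>A01</ContributorRole></Contributor>")

print(contributor_problems(good))
print(contributor_problems(swapped))
```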
On the other hand, where an element or composite is repeated several times (say because there are several contributors, or several subject codes) the order of the repeats is generally not meaningful.
If for a particular product there is no information available or appropriate for an optional data element, the data element should simply be omitted.
So for example, the weight or exact spine thickness of a physical book, or the filesize of an e-book may not be known with any certainty until close to publication. In ONIX sent several months before publication, you should simply omit the relevant <Measure> composite for weight or spine thickness, or the <Extent> composite for the filesize – though you could still include the <Measure> composites for height and width, and the <Extent> for the notional number of pages.
If the element for which the data is missing is mandatory, however, you must not send an ONIX record without providing valid data for the element. In neither case is it permissible to send an ‘empty element’.
Note that if the element for which there is no information available is a mandatory part of a larger composite element within the message, the whole composite element must be omitted.
[In XML, it is possible to deliver an empty element in either of two forms: <Tag></Tag> or <Tag/>. Neither of these forms should ever occur in an ONIX message, except in a few instances where an element (eg the <MainSubject/> flag in Group P.12 in Release 3.0) is specifically defined as always empty, in which case we strongly recommend that the second form <Tag/> should be used. Aside from the handful of ‘always empty’ elements, if there is no data for an element, the element should be omitted completely. Most ‘illegal’ empty elements are detected by validation against the XSD or RNG schemas, but not by validation against the DTD. Consequently there is a risk with the DTD that a message could pass validation even though it includes mandatory elements for which there is no data. This is another reason for using the XSD or RNG schema for validation wherever possible.]
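A sender validating against the DTD could add a check of its own for empty elements. The sketch below is one possible approach, assuming a set of ‘always empty’ elements that contains only <MainSubject>, the one named above. The element names in the sample fragment are used purely for illustration.

```python
import xml.etree.ElementTree as ET

# Elements defined as always empty. Only MainSubject is named in the text;
# a real check would need the complete set from the specification.
ALWAYS_EMPTY = {"MainSubject"}


def illegal_empty_elements(xml_text):
    """List tags of elements with no children and no text (whitespace counts as empty)."""
    root = ET.fromstring(xml_text)
    return [
        element.tag
        for element in root.iter()
        if len(element) == 0
        and not (element.text or "").strip()
        and element.tag not in ALWAYS_EMPTY
    ]


sample = (
    "<Subject><MainSubject/><SubjectCode></SubjectCode>"
    "<SubjectHeadingText>History</SubjectHeadingText></Subject>"
)
print(illegal_empty_elements(sample))
```

Both forms of empty element, <Tag></Tag> and <Tag/>, parse to the same thing, so a single test catches either.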
Yes, ONIX can cope with books and metadata in any language. It is not built around the needs of any one country, or any one supply chain.
Although the reference tag names are in English, this is just a convenience, and the data carried within most data elements is language-independent. Whether a book is known as a hardback (English), a hardcover (American English), or eine gebundene Ausgabe (German), the ONIX data is the same (<ProductForm>BB</ProductForm>).
Where the data is text in a particular language – for example, a contributor’s biography – the text can be provided in one language or in several languages in parallel.
And to facilitate this text in multiple languages, ONIX data can use Unicode: whether your metadata is in Latin script (for English and most European languages), Cyrillic (for Russian), Arabic script, Hanzi or Kanji (for Chinese and Japanese) or Hangul (for Korean), Unicode covers what you need.
The options for publishers who want to adopt ONIX and start sending ONIX messages to their business partners are of three kinds:
- develop or commission bespoke software for managing their product metadata, and build in support for ONIX for Books;
- acquire a third-party application for product data management which supports ONIX; or
- contract to use a web-based service which supports online data entry and delivery of ONIX output to designated recipients.
The availability and practicality of each of these options will vary to some degree from country to country, and the choices made by large publishers may differ from those made by small publishers.
Some ONIX national groups may be able to provide contacts with system and service suppliers, and a selection are also listed on EDItEUR’s ONIX Users and Services Directory. Wherever possible, EDItEUR provides links to national groups and to system vendors from the ONIX for Books pages on this website.
ONIX is a metadata format specifically developed to support the communication of information from one computer system to another. It is not designed as an internal database format. However, it is obvious that communication can only work if the systems at each end are substantially ‘ONIX compliant’ – for example, supporting data element structures and data encoding which are no less exact or granular than those used in ONIX. It is important to verify at the earliest possible stage that any product data management system you plan to use is ‘ONIX compliant’ in this sense.
It is all too easy to deliver something which is technically correct in XML terms, and which validates correctly, but wrong in data content. For example, title fields in some book trade systems are traditionally used to carry added data, such as an edition number, which differentiates one product from another. But in ONIX, title elements should carry only the component parts of the title itself, and there are separate data elements for edition numbers and types. So not only your internal database structures, but also the disciplines followed by your data management staff, need to be ‘ONIX compliant’.
ONIX aims to cover the widest possible range of needs, and it therefore includes many elements which are specialised to particular forms of publishing or particular markets. Nobody uses all the elements available, and only a very small number are mandatory. The technical minimum (which is implicit in the ONIX schema) is entirely useless, and the effective minimum range of tags in practice depends on your business requirements and those of your supply chain partners. The minimum set of data elements to support for a trade publisher is a little different from that of an academic publisher or that of a religious publisher.
EDItEUR provides a comprehensive Implementation and Best Practice Guide that offers advice on the most commonly useful ONIX 3.0 data elements.
And in many countries where ONIX has been adopted, national groups publish their own guidelines for implementation and ‘good practice’, building on the global EDItEUR advice. If you are buying a third-party system or service, you need to check that it complies with the guidelines that are in use in your particular market(s). Some receivers may choose not to use certain elements, or may ask for one option rather than another. However, no receiver should reject a valid message because it includes optional content which they have chosen not to use.
Implementation questions are routinely discussed on the ONIX_Implement mailing list, and notices of ONIX developments are sent out to the list as well as to national groups. If you have not signed up to the list, you can do so via this website.
Some national user groups and other organisations provide implementation help, workshops and formal training. EDItEUR also provides in-house training for its members. If you are interested in receiving – or providing – formal training, please contact EDItEUR.